Absolute Values and the Product Formula in Number Fields

Archimedean Absolute Values on $K$

If $K$ is a number field of degree $d$ over $\mathbb Q$, then there are $r_1$ real embeddings $K \hookrightarrow \mathbb R$ and $r_2$ pairs of complex-conjugate embeddings $K \hookrightarrow \mathbb C$, where $r_1 + 2 r_2 = d$. Each of these embeddings produces an absolute value on $K$ by restricting the usual absolute value on $\mathbb R$ or $\mathbb C$ to the embedded copy of $K$; conjugate embeddings produce the same absolute value.

These absolute values are pairwise incomparable, in the sense that given any two distinct ones $| \cdot |_{\sigma}$ and $| \cdot |_{\tau}$ and any $\epsilon > 0$ we can find some $x \in K$ so that $| x |_{\sigma} < \epsilon$ and $| x |_{\tau} > \epsilon^{-1}$. This is a non-trivial fact that follows from the density of $K$ in $\mathbb R^{r_1} \times \mathbb C^{r_2}$ via the diagonal embedding. The $r_1$ real embeddings thus produce $r_1$ distinct real places; similarly there are $r_2$ complex places, and together these form the set of archimedean places of $K$. We denote the set of archimedean places of $K$ by $\mathcal M_{\infty}(K)$, and if $v \in \mathcal M_{\infty}(K)$ we will write $v | \infty$.

Theorem

If $| \cdot |$ is an archimedean absolute value on $K$, then $|\cdot|$ is in one of the archimedean places of $K$.

Proof

Let $\overline K$ be the completion of $K$ with respect to $| \cdot |$. Then $\overline K$ is a complete archimedean field and hence is isomorphic to $\mathbb R$ or $\mathbb C$. The image of $K \hookrightarrow \overline K$ is dense, and so this must be isomorphic to one of the known real or complex embeddings. Since the completion is determined by the place and not the absolute value, $| \cdot |$ must be in the place associated to the embedding $K \hookrightarrow \overline K$.

An archimedean absolute value on $K$ restricts to an archimedean absolute value on $\mathbb Q.$ Given an archimedean place $v$, we choose the representative $\| \cdot \|_v$ to be the restriction of the usual absolute value on $K$ as embedded in $K_v$ ($=\mathbb R$ or $\mathbb C$). We then define another equivalent absolute value by setting $| \cdot |_v = \| \cdot \|_v^{1/d}$ if $v$ is real, and $\| \cdot \|_v^{2/d}$ if $v$ is complex. This latter choice of normalization implies that if $r \in \mathbb Q$, $$ |r|_{\infty} = \prod_{v | \infty} | r |_v.$$

Non-archimedean Absolute Values on $K$

By Ostrowski’s Theorem the non-archimedean places of $\mathbb Q$ are indexed by the rational primes (see Absolute Values and Completions of $\mathbb Q$). Thus, if $| \cdot |$ is a non-archimedean absolute value on $K$, $| \cdot |$ restricted to $\mathbb Q$ is equivalent to $| \cdot |_p$ for some prime $p$. If $v$ is the place of $| \cdot |$ we will say $v$ lies above $p$, and write $v|p$. We denote the set of places of $K$ above $p$ by $\mathcal M_p(K)$.

If $f(x)$ is the minimal polynomial of a primitive element for $K|\mathbb Q$, then $K = \mathbb Q[x] / f(x) \mathbb Q[x]$. Since $\mathbb Q \subseteq \mathbb Q_p$, we may view $f(x)$ as a polynomial with coefficients in $\mathbb Q_p$, and it may be the case that $f(x)$ is no longer irreducible over the larger field $\mathbb Q_p$. Suppose $f(x)$ factors as $f_1(x) \cdots f_{\ell}(x)$, where the $f_i(x) \in \mathbb Q_p[x]$ are irreducible.

If $a(x)$ and $b(x)$ are two polynomials in $\mathbb Q[x]$ with $a(x) - b(x)$ not divisible by $f(x)$, then in $\mathbb Q_p[x]$, $a(x) - b(x)$ is not divisible by any of the $f_i(x)$. That is, the map $a(x) + f(x) \mathbb Q[x] \mapsto a(x) + f_i(x) \mathbb Q_p[x]$ is injective for each $i=1,2,\ldots, \ell$. It follows that $K$ embeds in each of the $K_{v_i} := \mathbb Q_p[x] / f_i(x) \mathbb Q_p[x]$.

The diagonal embedding of $K$ into $K_{v_1} \oplus \cdots \oplus K_{v_{\ell}}$ is similar to the archimedean situation where we embed $K$ into $\mathbb R^{r_1} \times \mathbb C^{r_2}$. However, unlike the archimedean situation, the local degrees $[K_{v_i}:\mathbb Q_p]$ may be larger than 2. Like in the archimedean case, under the diagonal embedding, $K$ is dense in $K_{v_1} \oplus \cdots \oplus K_{v_{\ell}}$.

We can create an absolute value on $K_{v_i}$ by using the norm $N_{K_{v_i}|\mathbb Q_p}$ (the multiplicative homomorphism from $K_{v_i}$ to $\mathbb Q_p$; see Field Extensions and Number Fields) and then hitting the result with the $p$-adic absolute value. That is, we define $$\|\alpha\|_{v_i} = \left| N_{K_{v_i}|\mathbb Q_p}(\alpha)\right|_p.$$ It remains to show this is a non-archimedean absolute value. As usual, the strong triangle inequality is the tricky part, and before we prove it we need a couple of results.

Hensel’s Lemma

Hensel’s Lemma is the name of a set of related results on how factorization in $\mathbb Z_p[x]$ is related to factorization in $\mathbb F_p[x]$ under the map on coefficients given by $\mathbb Z_p / p \mathbb Z_p \cong \mathbb F_p$. The name usually refers to the result which allows us to (under certain conditions) “lift” roots of the reduced polynomial in $\mathbb F_p[x]$ to roots of the original polynomial in $\mathbb Z_p[x]$. The classic proof is constructive and relies on a variant of Newton’s method for approximating zeros of a function based on derivative information. For the moment, we need a less precise (but still useful) version which allows us to test for factorization of polynomials in $\mathbb Z_p[x]$ by checking the factorization of the reduced polynomial in $\mathbb F_p[x]$.
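To make the root-lifting version concrete, here is a minimal Python sketch of the Newton-style step. It is an illustration under assumptions, not the construction used in the proof below: the function name `hensel_lift_root` and the coefficient-list representation are ours, and we assume the starting root $a$ is simple, that is $f(a) \equiv 0$ and $f'(a) \not\equiv 0 \bmod p$.

```python
def hensel_lift_root(coeffs, a, p, k):
    """Lift a simple root a of f mod p to a root of f mod p^k.

    coeffs holds the integer coefficients of f, lowest degree first
    (a hypothetical representation chosen for this sketch).
    """
    def f(x, m):
        return sum(c * pow(x, i, m) for i, c in enumerate(coeffs)) % m

    def df(x, m):
        return sum(i * c * pow(x, i - 1, m)
                   for i, c in enumerate(coeffs) if i > 0) % m

    # Assumption: a is a simple root modulo p.
    assert f(a, p) == 0 and df(a, p) != 0

    # f'(a)^{-1} mod p; this single inverse suffices for the linear lift.
    inv = pow(df(a, p), -1, p)
    modulus = p
    for _ in range(k - 1):
        modulus *= p
        # Newton step: correct a by a multiple of the previous modulus.
        a = (a - f(a, modulus) * inv) % modulus
    return a
```

For instance, `hensel_lift_root([-2, 0, 1], 3, 7, 4)` lifts the root $3$ of $x^2 - 2$ modulo $7$ to a square root of $2$ modulo $7^4$.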

The form of Hensel’s Lemma we prove here replaces the Newton’s method step with the fact that if $\varphi_1$ and $\varphi_2$ are coprime polynomials in $\mathbb F_p[x]$, then the ideal generated by $\varphi_1$ and $\varphi_2$ is all of $\mathbb F_p[x]$, and hence given any $\eta \in \mathbb F_p[x]$ we can find $\psi_1, \psi_2 \in \mathbb F_p[x]$ such that $$\psi_1 \varphi_2 + \psi_2 \varphi_1 = \eta.$$ The existence of the solution $(\psi_1, \psi_2)$, which can be found constructively using the Division Algorithm, will be necessary for the inductive construction of a factorization for $f \in \mathbb Z_p[x]$ from the factorization of its reduction in $\mathbb F_p[x]$.

Hensel’s Lemma is important in its own right, especially as it gives us a method to look for factorizations of polynomials in $\mathbb Z_p[x]$ (and $\mathbb Q_p[x]$) by considering a much easier factorization problem in $\mathbb F_p[x]$.

Hensel’s Lemma

Suppose $f(x) \in \mathbb Z_p[x]$ and write $\widehat{f}$ for the polynomial in $\mathbb F_p[x]$ formed by reducing the coefficients of $f$ modulo $p \mathbb Z_p$. If there exists a factorization $\widehat f(x) = \varphi_1(x) \varphi_2(x)$ in $\mathbb F_p[x]$, with $\varphi_1$ and $\varphi_2$ coprime, then there exists a factorization $f(x) = f_1(x) f_2(x)$ in $\mathbb Z_p[x]$ with $\varphi_1 = \widehat f_1$, $\varphi_2 = \widehat f_2$ and $\deg f_1 = \deg \varphi_1$.

Proof

We will construct sequences of polynomials $f_1^{(n)}$ and $f_2^{(n)}$ in $\mathbb Z_p[x]$ satisfying

  • $\deg f_1^{(n)} = \deg \varphi_1$;
  • ${f}_1^{(n)} \equiv \varphi_1 \bmod p \mathbb Z_p$; ${f}_2^{(n)} \equiv \varphi_2 \bmod p \mathbb Z_p$;
  • $f_1^{(n+1)} \equiv f_1^{(n)} \bmod p^n \mathbb Z_p$; $f_2^{(n+1)} \equiv f_2^{(n)} \bmod p^n \mathbb Z_p$;
  • $f \equiv f_1^{(n)} f_2^{(n)} \bmod p^n \mathbb Z_p$.

Given such sequences $(f_1^{(n)})$ and $(f_2^{(n)})$, $f_1 = \lim f_1^{(n)}$ and $f_2 = \lim f_2^{(n)}$ are defined and satisfy the conditions of the lemma.

Set $f_1^{(1)} = \varphi_1$ and $f_2^{(1)} = \varphi_2$, where we view $a \in \mathbb F_p$ as the constant base-$p$ expansion, $a + 0 \cdot p + 0 \cdot p^2 + \cdots$ in $\mathbb Q_p$.

The inductive hypothesis guarantees the existence of a polynomial $h^{(n)} \in \mathbb Z_p[x]$ such that $$f = f_1^{(n)} f_2^{(n)} + p^n h^{(n)}.$$

We define $f_i^{(n+1)}$ in terms of $f_i^{(n)}$ as $$f_i^{(n+1)} = f_i^{(n)} + p^n g_i^{(n)}, \quad i=1,2;$$ where $g_i^{(n)} \in \mathbb Z_p[x]$ with $\deg g_i^{(n)} < \deg \varphi_i$, and look for $g_i^{(n)}$ so that $(f_i^{(n)})$ satisfies the conclusions of the lemma.

Computing \begin{align*}f_1^{(n+1)} f_2^{(n+1)} &= f_1^{(n)} f_2^{(n)} + p^n(g_1^{(n)} f_2^{(n)} + g_2^{(n)} f_1^{(n)} ) + p^{2n} g_1^{(n)} g_2^{(n)} \\ &= f + p^n(-h^{(n)} + g_1^{(n)} f_2^{(n)} + g_2^{(n)} f_1^{(n)}) + p^{2n} g_1^{(n)} g_2^{(n)}.\end{align*} If we can find $g_1^{(n)}$ and $g_2^{(n)}$ such that $$p \mid \left(-h^{(n)} + g_1^{(n)} f_2^{(n)} + g_2^{(n)} f_1^{(n)}\right),$$ then, because $n \geq 1$, there is some polynomial in $\mathbb Z_p[x]$, call it $h^{(n+1)}$, such that $$h^{(n)} - g_1^{(n)} f_2^{(n)} - g_2^{(n)} f_1^{(n)} - p^{n} g_1^{(n)} g_2^{(n)}= p h^{(n+1)}, $$ and $f = f_1^{(n+1)} f_2^{(n+1)} + p^{n+1} h^{(n+1)}$, as desired.

To find $g_1^{(n)}$ and $g_2^{(n)}$, let $\eta = \widehat{h}^{(n)} \in \mathbb F_p[x]$ and consider solutions to the equation $$-\eta + \psi_1 \varphi_2 + \psi_2 \varphi_1 \equiv 0 \bmod p, \qquad \psi_1, \psi_2 \in \mathbb F_p[x].$$ By hypothesis $\varphi_1$ and $\varphi_2$ are relatively prime in $\mathbb F_p[x]$ and therefore there is a solution $(\psi_1, \psi_2)$. Let $g_1^{(n)}$ and $g_2^{(n)}$ be any choices of polynomials in $\mathbb Z_p[x]$ with $\widehat{g}_1^{(n)} = \psi_1$ and $\widehat{g}_2^{(n)} = \psi_2$. It follows that $$p \mid \left(-h^{(n)} + g_1^{(n)} f_2^{(n)} + g_2^{(n)} f_1^{(n)}\right),$$ and the lemma is established.
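The inductive step of the proof can be run on a computer if we work modulo $p^N$ instead of in $\mathbb Z_p$. The following Python sketch is one possible way to do this, not a definitive implementation: polynomials are coefficient lists (lowest degree first), and all helper names (`trim`, `add`, `sub`, `scale`, `mul`, `divmod_poly`, `bezout`, `hensel_lift`) are hypothetical names of ours. We assume $\varphi_1$ and $\varphi_2$ are given reduced modulo $p$ with nonzero leading coefficients, and we choose $\psi_1$ with $\deg \psi_1 < \deg \varphi_1$ so that $\deg f_1$ never changes.

```python
def trim(a):
    """Drop trailing zeros; polynomials are coefficient lists, lowest degree first."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b, m):
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return trim([(x + y) % m for x, y in zip(a, b)])

def scale(a, c, m):
    return trim([(c * x) % m for x in a])

def sub(a, b, m):
    return add(a, scale(b, -1, m), m)

def mul(a, b, m):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % m
    return trim(out)

def divmod_poly(a, b, p):
    """Quotient and remainder in F_p[x]; b must be nonzero and reduced mod p."""
    a = trim([x % p for x in a])
    q = [0] * max(len(a) - len(b) + 1, 1)
    inv = pow(b[-1], -1, p)
    while len(a) >= len(b):
        c, k = a[-1] * inv % p, len(a) - len(b)
        q[k] = c
        a = sub(a, scale([0] * k + list(b), c, p), p)
    return trim(q), a

def bezout(a, b, p):
    """Return (s, t) with s*a + t*b = 1 in F_p[x], assuming a and b are coprime."""
    r0, r1, s0, s1, t0, t1 = a, b, [1], [], [], [1]
    while r1:
        q, r = divmod_poly(r0, r1, p)
        r0, r1 = r1, r
        s0, s1 = s1, sub(s0, mul(q, s1, p), p)
        t0, t1 = t1, sub(t0, mul(q, t1, p), p)
    assert len(r0) == 1, "phi1 and phi2 must be coprime"
    inv = pow(r0[0], -1, p)
    return scale(s0, inv, p), scale(t0, inv, p)

def hensel_lift(f, phi1, phi2, p, N):
    """Lift f = phi1*phi2 (mod p) to a factorization f = f1*f2 (mod p^N)."""
    _, t = bezout(phi1, phi2, p)
    f1, f2 = list(phi1), list(phi2)
    for n in range(1, N):
        m = p ** (n + 1)
        # eta is h^(n) reduced mod p, where f = f1*f2 + p^n * h^(n)
        eta = trim([(c // p**n) % p for c in sub(f, mul(f1, f2, m), m)])
        # Solve psi1*phi2 + psi2*phi1 = eta with deg psi1 < deg phi1
        _, psi1 = divmod_poly(mul(t, eta, p), phi1, p)
        psi2, rem = divmod_poly(sub(eta, mul(psi1, phi2, p), p), phi1, p)
        assert rem == []
        f1 = add(f1, scale(psi1, p**n, m), m)
        f2 = add(f2, scale(psi2, p**n, m), m)
    return f1, f2
```

As a usage example, `hensel_lift([-2, 0, 1], [4, 1], [3, 1], 7, 4)` starts from $x^2 - 2 \equiv (x+4)(x+3) \bmod 7$ and returns two linear factors whose product is $x^2 - 2$ modulo $7^4$.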

The Ring of Integers in $K_v$

Fix $v = v_i$ and define $\mathfrak O_v = \{ \alpha \in K_v : \| \alpha \|_v \leq 1 \}$.

Theorem

$\mathfrak O_v$ is a ring. Moreover, it is the integral closure of $\mathbb Z_p$ in $K_v$.

Proof of Theorem

Suppose $\alpha \in \mathfrak O_v$ and $f(x) \in \mathbb Q_p[x]$ is a monic irreducible polynomial such that $f(\alpha) = 0$. We need to show that, in fact, $f(x) \in \mathbb Z_p[x]$. Suppose $$f(x) = \sum_{i=0}^n c_i x^i \qquad c_n = 1.$$ We can find $b \in \mathbb Z_p$ such that $b c_i \in \mathbb Z_p$ for all $i=0,\ldots, n$. Indeed we can do this so that not all the $b c_i$ are in $p \mathbb Z_p$. However, notice that if $f$ has a coefficient not in $\mathbb Z_p$, then $b \in p \mathbb Z_p$ and so the leading coefficient $b c_n = b$ is in $p \mathbb Z_p$. It follows that when we reduce the coefficients of $b f(x)$ modulo $p \mathbb Z_p$ we get a polynomial of lower degree. That is, $\widehat{bf}(x) = \varphi_1(x) \cdot 1$ where $\varphi_1(x) \in \mathbb F_p[x]$ with $\deg \varphi_1 < \deg f$. Moreover, $N_{K_v|\mathbb Q_p}(\alpha)$ is, up to sign, a power of $c_0$, so $\| \alpha \|_v \leq 1$ forces $|c_0|_p \leq 1$; hence $b c_0 \in p \mathbb Z_p$ as well, and $\varphi_1$ is a nonzero polynomial with zero constant term, so $\deg \varphi_1 \geq 1$.

The form of Hensel’s Lemma proved above then implies that $b f(x)$ factors as $f_1(x) f_2(x)$ in $\mathbb Z_p[x]$ where $\deg f_1(x) = \deg \varphi_1 < \deg f$. But then this implies that $f(x)$ is not irreducible in $\mathbb Q_p[x]$. This is a contradiction, and hence $f(x) \in \mathbb Z_p[x]$.

Finally, to see that $\mathfrak O_v$ is a ring, we need to show that it is closed under addition (the other ring axioms are immediate). Since $\| -\beta \|_v = \| \beta \|_v$, it suffices to show closure under subtraction. Let $\alpha, \beta$ be nonzero elements of $\mathfrak O_v$, and assume without loss of generality that $\| \alpha \|_v \leq \| \beta \|_v$. If we denote the linear operators given by multiplication by $\alpha$ and $\beta$ on $K_v \cong \mathbb Q_p^{d_v}$ as $T_{\alpha}$ and $T_{\beta}$, then \begin{align*} \| \alpha - \beta \|_v &= | N_{K_v|\mathbb Q_p}(\alpha - \beta) |_p = | \det(T_{\alpha - \beta}) |_p \\ &= | \det (T_{\alpha} - T_{\beta}) |_p = | \det T_{\beta} |_p |\det (T_{\alpha/\beta} - I)|_p. \end{align*} Note that $\gamma = \alpha/\beta$ satisfies $\| \gamma \|_v \leq 1$, so $\gamma$ is in $\mathfrak O_v$ and hence the characteristic polynomial of $\gamma$, $f_{\gamma}$, has all coefficients in $\mathbb Z_p$. Note that $f_{\gamma}(1) = \det (I - T_{\gamma}) = \pm \det (T_{\gamma} - I)$, and hence $$\| \alpha - \beta \|_v = \| \beta \|_v |f_{\gamma}(1)|_p. $$ But by assumption $\| \beta \|_v \leq 1$ and $|f_{\gamma}(1)|_p \leq 1$, because $f_{\gamma}(1)$ is the sum of the coefficients of $f_{\gamma}$, all of which have $| \cdot |_p \leq 1$. It follows that $\| \alpha - \beta \|_v \leq 1$, and hence $\mathfrak O_v$ is closed under addition.

$\| \cdot \|_v$ is an Absolute Value

We finally can prove that $\| \cdot \|_v$ is a non-archimedean absolute value.

Theorem

The function $\| \cdot \|_v : K_v \rightarrow [0, \infty)$ given by $\| \alpha \|_v = \left| N_{K_v | \mathbb Q_p}(\alpha)\right|_p$ is a non-archimedean absolute value on $K_v$.

Proof

The only non-immediate property is the strong triangle inequality. Suppose $\alpha, \beta \in K_v$ with $\beta \neq 0$ and $\| \alpha \|_v \leq \| \beta \|_v$.

Defining $\gamma = \alpha/\beta$, note that $\| \gamma \|_v = \| \alpha \|_v/\|\beta\|_v \leq 1$ and hence $\gamma \in \mathfrak O_v$. Then, $$\det(T_{\beta - \alpha}) = \det(T_{\beta} - T_{\alpha}) = \det T_{\beta} \det(I - T_{\alpha/\beta}) = \det(T_{\beta}) f_{\gamma}(1),$$ where $f_{\gamma}(x) \in \mathbb Z_p[x]$ is the monic characteristic polynomial of $\gamma$. It follows that $\|\beta - \alpha\|_v = \| \beta \|_v |f_{\gamma}(1)|_p \leq \| \beta \|_v$ as desired.

The Places of $K$ above $p$

The $i$th irreducible factor of $f(x) = f_1(x) \cdots f_{\ell}(x) \in \mathbb Q_p[x]$ gives rise to the absolute value $\| \cdot \|_{v_i}$ on $K$ as restricted from $K_{v_i}$. Define the local degree of $K_{v_i}$ to be $d_i = [K_{v_i} : \mathbb Q_p] = \deg f_i.$ If $r \in \mathbb Q$, then multiplication by $r$ on $K_{v_i} \cong \mathbb Q_p^{d_i}$ is given by the scalar matrix $r I$ where $I$ is the $d_i \times d_i$ identity matrix. It follows that $N_{K_{v_i} | \mathbb Q_p}(r) = r^{d_i}$ and hence, restricted to $\mathbb Q$, $\| \cdot \|_{v_i} = | \cdot |_{p}^{d_i}$. These absolute values represent all the different places of $K$ that lie above $p$, a set we denote $\mathcal M_p(K)$. If $v \in \mathcal M_p(K)$ we will write $v|p$.

We choose a canonical absolute value for $v \in \mathcal M_p(K)$ by setting $| \cdot |_{v} = \| \cdot \|_{v}^{1/d}$. This choice is motivated by the fact that, since $d_1 + \cdots + d_{\ell} = d$, if $r \in \mathbb Q$, $$|r|_p = \prod_{v | p} | r |_v. $$

In fact, we have the following prelude to the Product Formula.

Lemma

Suppose $\alpha \in K$, then $$\prod_{v \in \mathcal M_p(K)} |\alpha|_v = \left| N_{K|\mathbb Q}(\alpha)\right|^{1/d}_p.$$

Proof

Suppose $\alpha \in K$. Then multiplication by $\alpha$ gives a linear transformation on $K \cong \mathbb Q^d$. Write $T_{\alpha}$ for this linear transformation. By definition $N_{K|\mathbb Q}(\alpha)$ is the determinant of $T_{\alpha}$. Now $T_{\alpha}$ also makes sense as a linear transformation on $\mathbb Q_p[x]/f(x)\mathbb Q_p[x]$, which as a vector space is isomorphic to $K \otimes_{\mathbb Q} \mathbb Q_p$, and we write $\overline T_{\alpha}$ for the linear transformation on $K \otimes_{\mathbb Q} \mathbb Q_p$. Note that $\det \overline T_{\alpha} = \det T_{\alpha}$ because any basis for $K$ will lift to a basis for $K \otimes_{\mathbb Q} \mathbb Q_p$, and the matrices for $T_{\alpha}$ and $\overline T_{\alpha}$ are identical for these bases.

The Fundamental Theorem of Finitely Generated Modules over Principal Ideal Domains means that, as vector spaces, $$\mathbb Q_p[x]/f(x)\mathbb Q_p[x] \cong \mathbb Q_p[x]/f_1(x) \mathbb Q_p[x] \oplus \cdots \oplus \mathbb Q_p[x]/f_{\ell}(x) \mathbb Q_p[x],$$ or what amounts to the same thing $$K \otimes_{\mathbb Q} \mathbb Q_p \cong K_{v_1} \oplus \cdots \oplus K_{v_{\ell}}.$$ Moreover, as vector subspaces the $K_{v_i}$ are invariant under $\overline T_{\alpha}$. It follows again from the Fundamental Theorem that $\overline T_{\alpha}$ decomposes as a direct sum $\overline T^{(1)}_{\alpha} \oplus \cdots \oplus \overline T^{(\ell)}_{\alpha}$, and $$\det T_{\alpha} = \det \overline T^{(1)}_{\alpha} \cdots \det \overline T^{(\ell)}_{\alpha}.$$ That is, $$N_{K|\mathbb Q}(\alpha) = N_{K_{v_1}|\mathbb Q_p}(\alpha) \cdots N_{K_{v_\ell}|\mathbb Q_p}(\alpha).$$ And hence $$| N_{K|\mathbb Q}(\alpha)|_p = \prod_{v \in \mathcal M_p(K)} |N_{K_{v}|\mathbb Q_p}(\alpha)|_p = \prod_{v \in \mathcal M_p(K)} \|\alpha\|_v = \prod_{v \in \mathcal M_p(K)} |\alpha|^d_v .$$

In summary, we can lift the linear transformation $T_{\alpha}$ to $\overline T_{\alpha}$, which decomposes as $\overline T^{(1)}_{\alpha} \oplus \cdots \oplus \overline T^{(\ell)}_{\alpha}$ because each $K_{v_i}$ is an invariant subspace. From this it follows that $\det T_{\alpha} = \det \overline T_{\alpha} = \det \overline T^{(1)}_{\alpha} \cdots \det \overline T^{(\ell)}_{\alpha}$.

To calculate $|\alpha|_v$ we note that there is some basis for $K \otimes_{\mathbb Q} \mathbb Q_p$ for which the matrix of $T_{\alpha}$ is given by the Frobenius companion matrix of the associated irreducible factor of $f(x)$ when factored over $\mathbb Q_p$. The determinant of this matrix is, up to sign, the constant coefficient, say $c_0$, of that irreducible factor, and $|\alpha|_v = |c_0|_p^{1/d}$.

The Product Formula

The Product Formula For Number Fields

Let $\mathcal M(K)$ denote the set of places of $K$. Suppose $v \in \mathcal M(K)$, let $d_v = [K_v : \mathbb Q_p]$ be the local degree, and let $| \cdot |_v$ be the absolute value in $v$ which is equal to $| \cdot |_p^{d_v/d}$ on $\mathbb Q$. Then, for all nonzero $\alpha \in K$, $$\prod_{v \in \mathcal M(K)} | \alpha |_v = 1.$$

Proof

The proof reduces to the Product Formula for the rational numbers (see Absolute Values and Completions of $\mathbb Q$). Specifically, $$ \prod_v | \alpha |_v = \prod_{p \in \mathcal M(\mathbb Q)} \prod_{v \in \mathcal M_p(K)} |\alpha|_v = \prod_{p \in \mathcal M(\mathbb Q)} |N_{K|\mathbb Q}(\alpha)|^{1/d}_p = 1,$$ where the penultimate equality is the lemma (and the analogous fact for archimedean places of $K$) and the final equality is the invocation of the Product Formula for $N_{K|\mathbb Q}(\alpha) \in \mathbb Q$.
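As a sanity check of this reduction, here is a small Python sketch for the particular field $K = \mathbb Q(i)$, where $d = 2$ and $N_{K|\mathbb Q}(a + bi) = a^2 + b^2$. The choice of field and the helper names (`padic_abs`, `prime_factors`, `norm_gaussian`, `product_over_places_of_Q`) are assumptions made for illustration. The code verifies exactly, with rational arithmetic, that the product of $|N_{K|\mathbb Q}(\alpha)|_p$ over all places $p$ of $\mathbb Q$ is $1$; its $1/d$ power is then the product of the $|\alpha|_v$.

```python
from fractions import Fraction

def padic_abs(r, p):
    """|r|_p for a nonzero rational r."""
    r = Fraction(r)
    v, a, b = 0, r.numerator, r.denominator
    while a % p == 0:
        a //= p
        v += 1
    while b % p == 0:
        b //= p
        v -= 1
    return Fraction(1, p**v) if v >= 0 else Fraction(p**(-v))

def prime_factors(n):
    """Set of primes dividing the nonzero integer n (trial division)."""
    n, out, q = abs(n), set(), 2
    while q * q <= n:
        while n % q == 0:
            out.add(q)
            n //= q
        q += 1
    if n > 1:
        out.add(n)
    return out

def norm_gaussian(a, b):
    """N_{Q(i)|Q}(a + b i) = a^2 + b^2 for rational a, b."""
    return Fraction(a) ** 2 + Fraction(b) ** 2

def product_over_places_of_Q(r):
    """|r|_infinity times the product of |r|_p over the primes where |r|_p != 1."""
    primes = prime_factors(r.numerator) | prime_factors(r.denominator)
    total = abs(r)
    for p in primes:
        total *= padic_abs(r, p)
    return total
```

For example, $\alpha = 3 + 4i$ has norm $25$, and the only nontrivial factors are $|25|_{\infty} = 25$ and $|25|_5 = 1/25$.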

The General Setup

So far, in the non-archimedean situation, we have considered $K | \mathbb Q$ with places $v | p$. However we can replace $\mathbb Q$ with some other base field $k$ and an appropriate place $u$ of $k$ with $v | u$ and have many of our previous results still hold. Usually the necessary changes to the proofs to produce these more general results are minor, and so we will not prove all details other than to wave our hands at any necessary changes from the $k = \mathbb Q$, $u = p$ case.

Let $k$ be a number field and suppose $u$ is a place of $k$ lying above $p \in \mathcal M(\mathbb Q)$. If $\| \cdot \|$ is any absolute value in the place $u$, then we may form the completion $k_u$ by taking the ring of Cauchy sequences in $k$ (Cauchy with respect to $\| \cdot \|$) and taking the quotient by the maximal ideal formed from sequences converging to 0 (again with respect to $\| \cdot \|$). This is a field, and a generic element is an equivalence class of Cauchy sequences in $k$, any two of which differ by a sequence which converges to 0. The field $k$, represented by constant sequences, is dense in $k_u$, and we may extend $\| \cdot \|$ to $k_u$. If $u | \infty$ (that is, $\| \cdot \|$ is archimedean), then $k_u$ is equal to $\mathbb R$ or $\mathbb C$ and $\| \cdot \|$ is some power of the usual absolute value. We will circle back to that case, but for now we will assume that $p$ is a rational prime, and $u$ is a non-archimedean place.

First we associate $\| \cdot \|$ to a prime ideal $\mf p$ in the ring of integers $\mf o$ of $k$.

Theorem

  • If $\alpha \in \mf o$, then $\| \alpha \| \leq 1$.
  • $\mf p := \{ \alpha \in \mf o : \| \alpha \| < 1 \}$ is a prime ideal in $\mf o$.

Proof

The strong triangle inequality implies that $\| c \| \leq 1$ for all $c \in \mathbb Z$. If $\alpha$ is in $\mf o$ then there exists a polynomial $x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0$ with integer coefficients that vanishes at $\alpha$. If $\|\alpha\| > 1$ then $$0 = \| \alpha^n + c_{n-1} \alpha^{n-1} + \cdots + c_1 \alpha + c_0 \| = \| \alpha^n \| > 1,$$ where the penultimate relation follows from the case of equality in the strong triangle inequality because $\| c_j \alpha^j\| \leq \|\alpha\|^j < \| \alpha^n \|$ for every $j < n$. This is an obvious contradiction, and thus $\| \alpha \| \leq 1$.

For the second statement, it is clear that $\mf p$ is an ideal of $\mf o$; the only questionable axiom is additivity, but as always the strong triangle inequality comes to the rescue. Now, if $\alpha \in \mf p$ and $\alpha = \beta \delta$ for some $\beta, \delta \in \mf o$, then clearly either $\| \beta \| < 1$ or $\| \delta \| < 1$ and hence $\mf p$ is a prime ideal.

It will turn out that this association between non-archimedean places and prime ideals is in fact a bijection, and we will often use $\mf p$ to represent the place associated to the prime ideal. This is, more-or-less, the content of Ostrowski’s Theorem for number fields. In this situation $\mf p | p$ has one meaning in terms of ideals and a different meaning in terms of places. However, $\mf p | p$ is simultaneously true or simultaneously false for these different interpretations.

One canonical representative of the place indexed by $\mf p$ is the $\mf p$-adic absolute value given by $$\| \cdot \|_{\mf p} = (\mathbb N \mf p)^{-v_{\mf p}(\cdot)}, $$ where $v_{\mf p}(\alpha)$ is the valuation of $\alpha$; the largest integer $n$ with $\alpha \in \mf p^n$. Here $\mathbb N \mf p$ is the norm of $\mf p$, that is $\mathbb N \mf p = [ \mf o : \mf p ]$. We also define the canonical absolute value $| \cdot |_{\mf p} = \| \cdot \|_{\mf p}^{1/d}$. We will see that this notation is consistent with our previous definitions for $\| \cdot \|_v$ and $| \cdot |_v$ when $v = \mf p$.

Completing $k$ and $\mf o$ with respect to $\mf p$ produces the completions $k_{\mf p}$ and $\mf o_{\mf p}$. The units $U_{\mf p}$ in $\mf o_{\mf p}$ consist of all elements of absolute value 1. In particular, prime ideals not equal to $\mf p$ embed as subsets of $U_{\mf p}$ and hence do not maintain their identity as ideals under the embedding. Many authors continue to use $\mf p \subset \mf o_{\mf p}$ for the maximal ideal formed by completing $\mf p$. However, we will distinguish the completion by denoting it $\mf m_{\mf p}$. This is the unique maximal ideal in $\mf o_{\mf p}$. This ideal is principal, and we choose $\pi = \pi_{\mf p}$ to be a generator, or uniformizer, of $\mf m_{\mf p}$.

The quotient $\mf o_{\mf p} / \mf m_{\mf p}$ is a finite field, and we will denote by $q$ and $f_{\mf p}$ the integers $$q := [\mf o_{\mf p} : \mf m_{\mf p}] = p^{f_{\mf p}}.$$ We identify $\mf o_{\mf p} / \mf m_{\mf p}$ with $\mathbb F_q$. As in $\mathbb Q_p$, there is a series representation for the elements of $k_{\mf p}$. The proof follows mutatis mutandis from the $\mathbb Q_p$ case (see The Algebra and Geometry of $\mathbb Q_p$).

Theorem

Suppose $\alpha \in k_{\mf p}$. Then there exist an integer $n_0$ and $c_{n_0}, c_{n_0+1}, \ldots \in \mathbb F_q$ such that $\alpha$ can be represented by the sequence of partial sums of $$\sum_{n=n_0}^{\infty} c_n \pi^n. $$ If $\alpha \in \mf o_{\mf p}$ then $n_0$ can be taken to be 0.

Notation

Number fields

  • $f(x), g(x), h(x),$ etc.: Polynomials, often in $\mathbb Q[x]$ or $k[x]$.
  • $k, K, L$: Number fields.
  • $\alpha, \beta, \gamma,$ etc.: Generic field elements. The field depends on context.
  • $[K : k]$, $d$: The degree of a field extension. The fields depend on context.
  • $T_{\alpha}$: The linear transformation on $K|k$ (fields context dependent) given by multiplication by $\alpha$.
  • $r_1, r_2$: The number of real and complex embeddings (respectively) of $K$ (context dependent).
  • $N_{K|k}$, $\mathrm{Tr}_{K|k}$: The Norm and Trace maps $K \rightarrow k$ given by $\alpha \mapsto \det( T_{\alpha})$ and $\alpha \mapsto \mathrm{Tr}( T_{\alpha})$.
  • $\mf o$, $\mf O$: Rings of integers in $k$ and $K$.
  • $\mf a, \mf b, \mf A, \mf B,$ etc.: Ideals in rings of integers. We often use lower case fraktur letters for ideals in $\mf o$ and capital fraktur letters for ideals in $\mf O$.
  • $\mf p, \mf q, \mf P, \mf Q$: Prime ideals in $\mf o$ and $\mf O$.
  • $\mathbb N \mf a$, etc.: The ideal norm $\mathbb N \mf a = [\mf o : \mf a]$.

Absolute Values

  • $| \cdot |$, $| \cdot |_p$, $| \cdot |_{\infty}$: A generic absolute value, the $p$-adic absolute value on $\mathbb Q_p$ and the usual absolute value on $\mathbb R$ (respectively).
  • $\mathcal M(K)$, $\mathcal M_p(K)$, $\mathcal M_{\infty}(K)$: The places of $K$, the places of $K$ over the rational place $p$, the archimedean places of $K$.
  • $v | p, v | \infty$: Shorthand for $v \in \mathcal M_p(K)$ and $v \in \mathcal M_{\infty}(K)$.
  • $\mf p | p$, $\mf p \in \mathcal M_p(K)$: Non-archimedean places indexed by prime ideals.
  • $\mathbb Q_p$, $K_v$: The completion of $\mathbb Q$ with respect to the place $p$, and the completion of $K$ with respect to the place $v$.
  • $\mf o_{\mf p}$, $\mf o_v$, $\mf O_{\mf p}$, $\mf O_v$: Local integers. The completion of the integers $\mf o$ or $\mf O$ with respect to $v$ or $\mf p$.
  • $\mf m_{\mf p}$, $\mf m_v$: The maximal ideal in the local integers.
  • $d_v = [K_v : \mathbb Q_p], d = [K : \mathbb Q]$: The local and global degrees of the place $v \in \mathcal M_p(K)$.
  • $\| \cdot \|_v, \| \cdot \|_{\mf p}$: The absolute value in the place $v = \mf p$ given by $| N_{K_v|\mathbb Q_p} |_p$.
  • $| \cdot |_v, | \cdot |_{\mf p}$: The canonical absolute value in the place $v = \mf p$, given by $\| \cdot \|_v^{1/d}$.

The Algebra and Geometry of $\mathbb Q_p$

In Absolute Values and Completions of $\mathbb Q$ we looked at the completions of $\mathbb Q$, and in particular the non-archimedean completions $\mathbb Q_p$, from the viewpoint of analysis and topology. Here we investigate the algebraic and geometric properties of the $p$-adic numbers, though we will not ignore the topology completely.

The archimedean property of the real numbers asserts that for any real number $x$ there is an integer $n$ such that $| x - n |_{\infty} \leq 1/2$. That is, every real number is at most $1/2$ unit away from its closest integer. We contrast this with $\mathbb Q_p$ where $| n |_p \leq 1$ for all $n \in \mathbb Z$, and where we can find $x \in \mathbb Q_p$ with $|x|_p$ as large (or small) as we want. Suffice it to say $\mathbb Q_p$ does not have the archimedean property, hence the adjective non-archimedean for such fields.

The completion of $\mathbb Z$ in $\mathbb Q_p$ is denoted $\mathbb Z_p$ and explicitly given by $$\mathbb Z_p = \{ x \in \mathbb Q_p : |x|_p \leq 1 \}.$$ Put another way, $p$-adic integers are equal to the closed unit ball in $\mathbb Q_p$.

It is easy to verify that $\mathbb Z_p$ is a ring; the only property in doubt is closure under addition, but this comes from the strong triangle inequality. If $x, y \in \mathbb Z_p$ then $$|x+y|_p \leq \max\{ |x|_p, |y|_p\} \leq 1.$$ This same argument shows that $\mathfrak m_p = \{ x : |x|_{p} < 1 \}$ is closed under addition; in fact it is a maximal ideal of $\mathbb Z_p$. We also define $U_p = \{ x : |x|_p = 1\}$. Note that $U_p$ is not a ring, but it is a group under multiplication: the group of units in $\mathbb Z_p$. We note that the definitions of $\mathbb Q_p, \mathbb Z_p, \mathfrak m_p$ and $U_p$ are invariant under substitution of an equivalent absolute value. That is, the $p$ that indexes these sets is associated to the place indexed by $p$, not the specific choice of absolute value $| \cdot |_p$.

Commutative rings and maximal ideals quotient to make fields, and we define the residue field of $\mathbb Q_p$ to be $\mathbb F_p \cong \mathbb Z_p/\mathfrak m_p$, which as the notation suggests is the finite field with $p$ elements. We will prove this in the next section.

An explicit construction of $\mathbb Q_p$

We have defined the $p$-adic numbers as equivalence classes of Cauchy sequences. It is useful to have a prescribed choice of representative for each equivalence class. This is done using series. Consider the formal series $$ \sum_{m=v}^{\infty} a_m p^m$$ where each $a_m \in \{0,1,\ldots,p-1\}$. The $M$th partial sum is a rational number $$n_M = \sum_{m=v}^M a_m p^m,$$ and $|n_M - n_{M-1}|_p \leq p^{-M}$. In fact, $$|n_{M + n} - n_M|_p \leq \max_{\ell=1,\ldots n}\{|n_{M+\ell} - n_{M+\ell-1}|_p\} \leq p^{-M}$$ and hence $(n_M)$ is Cauchy with respect to $| \cdot |_p$. It follows that $$ x = \lim n_M = \sum_{m=v}^{\infty} a_m p^m$$ defines a $p$-adic number with $|x|_p \leq p^{-v}$. Note that when $x$ is a positive rational integer, its series representation has finitely many terms (indexed starting at 0) and is simply its base-$p$ expansion.

So every power series of this form produces a $p$-adic number. What about the converse? Given a $p$-adic number, can we find a representative given as a “base $p$ series”?

Theorem

Suppose $x \in \mathbb Q_p$. Then there exist an integer $v$ and a sequence of integers $(a_n)_{n=v}^{\infty}$ with each $0 \leq a_n < p$ such that $x$ is represented by the sequence of partial sums of $$ \sum_{n=v}^{\infty} a_n p^n.$$ Moreover each such series defines an element of $\mathbb Q_p$.

Proof

We first do this for sequences of integers. Let $(\ell_m)$ be a sequence of integers Cauchy with respect to $| \cdot |_p$. We may assume, by taking a subsequence if necessary, that $|\ell_m - \ell_{m-1}|_p \leq p^{-m}$. We make a new sequence of integers $(n_m)$ by taking the base-$p$ expansion of $\ell_m$ and truncating it at the $m$th term. That is, if $\ell_m = \sum_{j=0}^J a_j p^j$ then $n_m = \sum_{j=0}^m a_j p^j$. We note that $|\ell_m - n_m|_p \leq p^{-m-1}$, and that \begin{align}|n_m - n_{m-1}|_p &= |n_m - \ell_m + \ell_m - \ell_{m-1} + \ell_{m-1} - n_{m-1}|_p \\ &\leq \max\{|n_m - \ell_m|_p, |\ell_m - \ell_{m-1}|_p, |\ell_{m-1} - n_{m-1}|_p \} \leq p^{-m}.\end{align} All this is to say that $(\ell_m)$ and $(n_m)$ are equivalent Cauchy sequences. It remains to show that $(n_m)$ is the sequence of partial sums of an infinite base-$p$ expansion. Currently we know that $n_m$ is a polynomial in $p$ of degree $m$ with coefficients in $\{0, 1, \ldots, p-1\}$, but we don’t know if these coefficients agree with those (of degree $<m$) of $n_{m-1}$ for all $m$. That is, we need to verify that for each $m$, $n_m = n_{m-1} + a_{m} p^m$ for some $a_m \in \{0, 1, \ldots, p-1\}$. Because $|n_m - n_{m-1}|_p \leq p^{-m}$ we know that $n_m = n_{m-1} + A_m p^m$ for some integer $A_m$. We may replace $A_m$ with the $a_m \in \{0, 1, \ldots, p-1\}$ congruent to it modulo $p$, and by replacing $n_m$ with $n_{m-1} + a_m p^m$ (if necessary), we find that the sequence $(n_m)$ is the sequence of partial sums of $$\sum_{m=0}^{\infty} a_m p^m$$ as desired.

We are almost done; we now simply need to show that any rational sequence $(r_m)$ that is Cauchy with respect to $| \cdot |_p$ can be represented by the partial sums of an infinite base-$p$ expansion of the form $$\sum_{m=v}^{\infty} a_m p^m$$ for some integer $v$. We could bust out the previous analysis, but here we remark that a Cauchy sequence is bounded, so there is some largest (least negative) integer $v$ such that $p^{v} \mathbb Z_p$ contains every $r_m$. That is, $x_m = p^{-v} r_m$ defines a sequence such that $|x_m|_p \leq 1$. Scaling by $p^{-v}$ is continuous, and so $(x_m)$ is a Cauchy sequence. We define $y_m$ to be the degree $m$ truncation of the base-$p$ expansion of $x_m$. This is exactly what we did before by defining $n_m$ in terms of the $\ell_m$, except here we do not know that the $x_m$ are integers. However, we just proved that they have (possibly infinite) base-$p$ expansions because they are all in the closure of the integers, and these can be truncated to produce the $y_m$. Regardless, all the analysis works and we find that $(y_m)$ is equivalent to $(x_m)$ and is the sequence of partial sums of some $$\sum_{j=0}^{\infty} b_j p^j.$$ We then define $$s_m = p^v y_m = p^v \sum_{j=0}^{m} b_j p^j.$$ Scaling is still continuous, and so $(s_m)$ is Cauchy and equivalent to $(r_m)$ and is the sequence of partial sums of an infinite base-$p$ expansion (allowing for finitely many negative powers of $p$).
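For a rational number the digits of the expansion can be computed directly, which gives a concrete feel for the theorem. The following Python sketch (the function name `padic_digits` and its return convention are our own assumptions) pulls out the power $p^v$ and then reads off the first $N$ digits of the remaining unit using a modular inverse modulo $p^N$.

```python
from fractions import Fraction

def padic_digits(x, p, N):
    """Return (v, digits) so that x is congruent to
    p^v * sum(digits[i] * p^i) up to terms of order p^(v+N)."""
    x = Fraction(x)
    if x == 0:
        return 0, [0] * N
    a, b, v = x.numerator, x.denominator, 0
    # Pull every factor of p out of the numerator and the denominator.
    while a % p == 0:
        a //= p
        v += 1
    while b % p == 0:
        b //= p
        v -= 1
    # a/b is now a unit in Z_p, so b is invertible modulo p^N.
    u = a * pow(b, -1, p**N) % p**N
    digits = []
    for _ in range(N):
        digits.append(u % p)
        u //= p
    return v, digits
```

For example, `padic_digits(Fraction(1, 5), 5, 3)` returns `(-1, [1, 0, 0])`, reflecting the single negative power of $5$, and `padic_digits(-1, 5, 4)` returns `(0, [4, 4, 4, 4])`.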

So far we have only constructed positive numbers. Negative numbers can be represented in base-$p$ expansion. In particular, $$ -1 = \sum_{n=0}^{\infty} (p-1) p^n.$$ To compute the negative of a generic number $x$ we simply compute $0-x$ base-$p$.

Example

The base-5 expansion of 429 is $429 = 4 \cdot 5^0 + 0 \cdot 5^1 + 2 \cdot 5^2 + 3 \cdot 5^3$. To compute $-429$ we wish to add powers of 5 that always cause us to “carry the one”.

We find $-429 = (1 \cdot 5^0 + 4 \cdot 5^1 + 2 \cdot 5^2 + 1 \cdot 5^3) + 4 \cdot 5^4 + 4 \cdot 5^5 + \cdots$.
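The “carry the one” rule amounts to replacing the first nonzero digit $d$ by $p - d$ and every later digit $d$ by $p - 1 - d$. Here is a short Python sketch of that rule; the function name `negate_digits` and the digit-list convention (lowest power first) are assumptions made for this illustration.

```python
def negate_digits(digits, p, N):
    """First N base-p digits of -x, given the base-p digits of x (lowest power first)."""
    digits = list(digits) + [0] * (N - len(digits))
    out, seen_nonzero = [], False
    for d in digits[:N]:
        if not seen_nonzero:
            # Digits up to and including the first nonzero one: d -> (p - d) mod p.
            out.append((p - d) % p)
            seen_nonzero = d != 0
        else:
            # After the first carry, every digit d becomes p - 1 - d.
            out.append(p - 1 - d)
    return out
```

Applied to the digits $(4, 0, 2, 3)$ of $429$ in base $5$, the first eight digits of $-429$ come out as $(1, 4, 2, 1, 4, 4, 4, 4)$, matching the expansion above.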

We return to our claim about $\mathbb Z_p/ \mathfrak m_p$.

Theorem

$\mathbb Z_p/ \mathfrak m_p$ is isomorphic to the field with $p$ elements.

The proof is now obvious because $\mathfrak m_p = p \mathbb Z_p$ and thus two base-$p$ expansions are the same modulo $\mathfrak m_p$ if and only if they have the same constant coefficient, and hence $\mathbb Z_p/\mathfrak m_p$ is a field with $p$ elements. It is easy to see that, in fact $p^n \mathbb Z_p/p^{n+1} \mathbb Z_p \cong \mathbb F_p$ (for all $n \in \mathbb Z$).

The Geometric Picture of $\mathbb Z_p$

Here we want to think of the coefficients of $x = \sum_{n=0}^{\infty} a_n p^n \in \mathbb Z_p$ not as coefficients of a power series, but as an address. Imagine driving in a strange town of one-way roads, where at each intersection you have $p$ choices of roads ahead of you (numbered in some consistent way using $0,1, \ldots, p-1$). Then by telling you a sequence of numbers $(a_n)$ with $a_n \in \{0, 1, \ldots, p-1\}$ I am giving you instructions to an address at the end of an infinite sequence of roads.

Six Corners in Chicago. Coming into this intersection along the red arrow, there are five choices, labelled $0, 1,2,3,4$ leaving the intersection. This choice of labelling will be convenient for visualizing elements in $\mathbb Z_5$.

This analogy is not very apt, because we allow no loops in our strange city, but the point remains: we may think of the $(a_n)$ as an “address” for $x = \sum a_n p^n \in \mathbb Z_p$. Each $x \in \mathbb Z_p$ has a unique address, and we may visualize the network of roads as a complete, infinite $p$-nary tree.

Schematic of the “roads” in $\mathbb Z_5$ and $\mathbb Z_3$. These are different embeddings of the complete $5$-nary and $3$-nary trees. The points in $\mathbb Z_p$ are the “boundary” of these trees.

What if we drive only part way to an address? Suppose we start down the roads labelled $(3, 4, 2)$ in $\mathbb Z_5$. This finite tuple then gives us the address of a neighborhood—that consisting of all infinite addresses that start $(3, 4, 2, \ldots )$. Note that $3 + 4 \cdot 5 + 2 \cdot 25 = 73$, and so we can think of this neighborhood as the ball of radius $1/125$ around $73$.

The “roads” to $0, 1$ and $-1$ in $\mathbb Z_5$. Slide left to see the road to the neighborhood defined by the finite tuple $(3,4,2)$. Where is $73$ in that neighborhood?

The positive rational integers can be seen inside $\mathbb Z_p$ as the destination of itineraries which eventually have no turns to the left or right. Negative integers follow itineraries that eventually have a clockwise spiral like that of $-1$. In either event we see visually how $\mathbb Z$ (and indeed $\mathbb N$ and $-\mathbb N$ individually) are dense in $\mathbb Z_p$.

A Bijection Between Balls and Cosets

There is a bijection between the balls in $\mathbb Z_p$ of radius $p^{-n}$ and the cosets of $\mathbb Z_p / p^n \mathbb Z_p$. In the schematic for $\mathbb Z_5$ we may think of a ball as one of the naturally appearing pentagons (of any size). This is not quite right, the ball is actually the fractal bits of the boundary of the tree contained in such a pentagon. In general for $\mathbb Z_p$ there would be a similar schematic with the pentagons (and their fractal tree boundaries) replaced with $p$-gons.

The correspondence between balls and cosets when $p = 5$ and $n=1,2$.

This allows us to index balls of radius $p^{-n}$ by the integers $\{0, 1, \ldots, p^n-1\}$. For instance, the neighborhood in $\mathbb Z_5$ indexed by $(3,4,2)$ is exactly the coset $73 + 125 \mathbb Z_5$.

$\mathbb Z_p$ as a Pro-finite Completion

Another way of specifying the directions to a point $x \in \mathbb Z_p$ is to record the neighborhoods one passes through on the way to $x$. By the correspondence between neighborhoods and cosets we can identify that point in $\mathbb Z_p$ with a sequence of cosets $(c_n)$ with $c_n \in \mathbb Z_p/p^n \mathbb Z_p \cong \mathbb Z/p^n \mathbb Z$ represented as integers $0 \leq c_n < p^n$. The relationship between $(c_n)$ and the coefficients $(a_m)$ of the base-$p$ expansion of $x$ is $$c_n = \sum_{m=0}^{n-1} a_m p^m.$$ Notice the congruence relations $$c_n \equiv c_{\ell} \bmod p^{\ell}, \qquad \ell=1,\ldots,n.$$ More generally, there is an inverse system of projections $\pi_{\ell \leftarrow n}: \mathbb Z/p^n \mathbb Z \rightarrow \mathbb Z/p^{\ell} \mathbb Z$, for $1 \leq \ell \leq n$, such that for any $\ell \leq m \leq n$, $\pi_{\ell \leftarrow n} = \pi_{\ell \leftarrow m} \circ \pi_{m \leftarrow n}$. We may thus identify $\mathbb Z_p$ with $$\lim_{\leftarrow n} \mathbb Z/p^n \mathbb Z := \left\{ (c_n) \in \prod_{n=1}^{\infty} \mathbb Z/p^n \mathbb Z : c_{\ell} = \pi_{\ell \leftarrow n}(c_n) \mbox{ for all } 1 \leq \ell \leq n \right\}.$$ This set is exactly the inverse limit of the inverse system, and it is what we mean by the pro-finite completion.
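To make the passage from digits to cosets concrete, here is a small sketch. It assumes the convention that $c_n$ is the sum of the first $n$ terms of the expansion (so that $0 \leq c_n < p^n$); the helper names `coset_sequence` and `is_compatible` are hypothetical.

```python
def coset_sequence(digits, p):
    """Given digits a_0, a_1, ..., return [c_1, c_2, ...] with c_n = sum_{m<n} a_m p^m."""
    cosets, total = [], 0
    for m, a in enumerate(digits):
        total += a * p**m
        cosets.append(total)      # this entry is c_{m+1}, an integer in [0, p^(m+1))
    return cosets

def is_compatible(cosets, p):
    """Check that c_n reduces to c_l modulo p^l for all 1 <= l <= n."""
    return all(cosets[n] % p**(l + 1) == cosets[l]
               for n in range(len(cosets)) for l in range(n + 1))

c = coset_sequence([3, 4, 2], 5)
print(c)                    # [3, 23, 73]
print(is_compatible(c, 5))  # True
```

The last entry, $73$, is the representative of the neighborhood with address $(3,4,2)$ in $\mathbb Z_5$.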

 

A geometric representation of the profinite completion view of the $3$-adic integers. Elements in $\mathbb Z_3$ are not indexed by the “roads” that you take to get there, but by the cosets, aka neighborhoods (here represented as disks), you must pass through to get to your destination. The destination is the same, but the data you use to get there is (only slightly) different.

Addition and multiplication in the pro-finite completion are done componentwise, and sums and products remain in $\lim_{\leftarrow } \mathbb Z/p^n \mathbb Z$ because the $\pi_{\ell \leftarrow n}$ are ring homomorphisms. These operations are the same addition and multiplication that come from the base-$p$ expansion representation of $\mathbb Z_p$.


Absolute Values and Completions of $\mathbb Q$

An absolute value on a field $K$ is a function $|\cdot| : K \rightarrow [0, \infty)$ such that for any $x, y \in K$,

  • $|x| = 0$ if and only if $x = 0$;
  • $|x y| = |x| |y|$;
  • $|x + y| \leq |x| + |y|$

These properties are called respectively positive definiteness, multiplicativity and the triangle inequality. If the absolute value satisfies the stronger condition (called the strong triangle inequality)

  • $| x + y| \leq \max\{ |x|, |y| \}$

we say it is a non-archimedean absolute value. An absolute value that does not satisfy the strong triangle inequality is called an archimedean absolute value.

The usual absolute values on $\mathbb Q$, $\mathbb R$ and $\mathbb C$ are all archimedean absolute values.

Every field has a trivial absolute value given by $| 0 | = 0$ and $| x | = 1$ for all $x \neq 0$. This absolute value is not very interesting and we will usually concentrate on the non-trivial absolute values of a field.

Equality in the Strong Triangle Inequality

It is often easy to determine when the strong triangle inequality is actually an equality. The proof of the following lemma is easy, but the result is surprisingly powerful.

Lemma

Suppose $| \cdot |$ is a non-archimedean absolute value and $x, y \in K$ are such that $| x | < | y |$. Then $| x + y | = | y |$. That is, if $| x | \neq | y |$, then $|x + y| = \max\{ |x|, |y| \}.$

Proof

Suppose $|x| < |y|$ and $|x + y| < |y|$. Then, $$ |y| = |x + y - x| \leq \max\{|x+y|,|x|\} < |y|;$$ an obvious contradiction.

Absolute Values on $\mathbb Q$

To distinguish the usual absolute value from new ones we may construct we will denote it $| \cdot |_{\infty}$. That is $$ |x|_{\infty} = \left\{ \begin{array}{rl} x & x \geq 0; \\ -x & x < 0. \end{array}\right.$$

If $p$ is a prime integer then define the valuation $v_p : \mathbb Q^{\times} \rightarrow \mathbb Z$ by $$ v_p(x) = v \qquad \mbox{where} \qquad x = p^v \frac{a}{b}, \quad \mathrm{GCD}(a, b) = 1, \quad p \nmid ab.$$ That is, we determine the valuation of a rational number by determining the highest power of $p$ that divides it. This power is positive if the numerator has $p$ as a factor, and is negative if the denominator has $p$ as a factor (when written in lowest terms). If the valuation is 0 then the rational number does not have $p$ in its factorization.

The valuation $v_p$ is a homomorphism from $(\mathbb Q^{\times}, \cdot) \rightarrow (\mathbb Z, +)$. That is $v_p$ is additive: $v_p(xy) = v_p(x) + v_p(y)$. It is common to take $v_p(0) = \infty$ with the justification that $0$ is “infinitely divisible” by $p$.

We may write a non-zero rational number $x$ in terms of the various valuations $\{v_p(x) : p \mbox{ prime} \}$ by $$x = \pm \prod_p p^{v_p(x)},$$ where the sign is that of $x$. Note that $v_p(x) \neq 0$ for only the primes that appear in the factorization of $x$. It follows that this product is actually a finite product.

We define the $p$-adic absolute value on $\mathbb Q$ by $$| x |_p = p^{-v_p(x)}.$$ Of course, we need to verify that this is an absolute value. Positive definiteness is a matter of definition, multiplicativity comes from the additivity of $v_p$. Only the triangle inequality remains, and we will in fact show that $| \cdot |_p$ satisfies the strong triangle inequality. Suppose $x = p^v a/b$ and $y = p^u c/d$ where $a/b$ and $c/d$ are written in lowest terms. Then $$ x + y = \frac{p^v d a + p^u b c}{bd}.$$ Because $bd$ is relatively prime to $p$, we see that the minimum power of $p$ we can pull out of the numerator is $\min\{u, v\}$. That is $v_p(x + y) \geq \min\{v_p(x), v_p(y) \}$. This is equivalent to the strong triangle inequality.

Note that we actually proved something slightly stronger than the strong triangle inequality here. Written in terms of valuations and absolute values, $$v_p(x + y) > \min\{v_p(x), v_p(y) \} \quad \mbox{only if} \quad v_p(x) = v_p(y)$$ $$|x + y|_p < \max\{|x|_p, |y|_p \} \quad \mbox{only if} \quad |x|_p = |y|_p.$$

Thus, the $p$-adic absolute value is a non-archimedean absolute value.
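The definitions of $v_p$ and $| \cdot |_p$ translate directly into exact rational arithmetic. The sketch below (function names `v_p` and `abs_p` are our own) uses Python's `fractions.Fraction`, and then tests the strong triangle inequality on one pair with different absolute values and one pair with equal absolute values.

```python
from fractions import Fraction

def v_p(x, p):
    """p-adic valuation of a non-zero rational x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("v_p(0) is +infinity by convention")
    num, den, v = abs(x.numerator), x.denominator, 0
    while num % p == 0:        # powers of p in the numerator count positively
        num //= p
        v += 1
    while den % p == 0:        # powers of p in the denominator count negatively
        den //= p
        v -= 1
    return v

def abs_p(x, p):
    """p-adic absolute value |x|_p = p^(-v_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(p) ** (-v_p(x, p))

x, y = Fraction(50, 3), Fraction(7, 5)
print(abs_p(x, 5), abs_p(y, 5), abs_p(x + y, 5))   # 1/25 5 5
print(abs_p(2, 5), abs_p(3, 5), abs_p(2 + 3, 5))   # 1 1 1/5
```

In the first line the absolute values differ and the sum takes the larger one; in the second they agree and the sum is strictly smaller, which is the only situation in which strict inequality can occur.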

The Places of $\mathbb Q$

We say two absolute values $| \cdot |_0$ and $| \cdot |_1$ on a field $K$ are equivalent if there is a positive real number $c$ such that $|\cdot |_0 = |\cdot|_1^c$. It is easily verified that this gives an equivalence relation on the set of absolute values of $K$, and we call the equivalence classes the places of $K$. The place corresponding to the trivial absolute value (which is the only representative in its class) is called the trivial place, and is often excluded from attention.

We will eventually talk about how to complete $K$ with respect to an absolute value, using the same methods as when we construct $\mathbb R$ out of $\mathbb Q$ with respect to the usual absolute value. We will see that equivalent absolute values produce the same completion, and different places produce different completions. The completion of $\mathbb Q$ with respect to the trivial absolute value is $\mathbb Q$ itself, which is another hint that nothing interesting happens with trivial absolute values.

The set of non-trivial places of $K$ is denoted $\mathcal M_K$.

Ostrowski’s Theorem

If $| \cdot |$ is a non-trivial absolute value on $\mathbb Q$ then $|\cdot|$ is equivalent to either the usual absolute value $| \cdot |_{\infty}$ or is equivalent to $| \cdot |_p$ for some prime $p$. That is $\mathcal M_{\mathbb Q}$ is in correspondence with $\mathcal P = \{ \mbox{ primes } \} \cup \{ \infty \}$.

For a proof, see https://en.wikipedia.org/wiki/Ostrowski%27s_theorem#Proof

It is hard to overstate the importance of this result. It is surprising (though less so after reading the proof) to see a correspondence between the primes and a set of analytic objects. It suggests (rightly) that there is progress to be made in understanding the primes by understanding absolute values. Indeed, as we will see, this correspondence extends to number fields, and we will see a bijection between non-archimedean places and prime ideals. The archimedean absolute values in that setting correspond to the real and complex embeddings of the number field. It all hangs together very nicely.

The Product Formula

The product formula is the trivial observation that for any rational number $x \neq 0$, $$ \prod_{p \in \mathcal P} | x |_p = 1.$$

This can be verified by calculation: $$ | x |_{\infty} = \prod_{p \mbox{ prime} } p^{v_p(x)} \quad \mbox{and} \quad \prod_{p \mbox{ prime}} | x |_p = \prod_{p \mbox{ prime}} p^{-v_p(x)},$$ and the product formula follows.
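For the skeptical reader, the product formula can be tested on any particular rational number. This is a self-contained sketch using trial division to find the relevant primes; all function names are ours, and only primes dividing the numerator or denominator are visited because every other factor equals $1$.

```python
from fractions import Fraction

def prime_factors(n):
    """Set of primes dividing the positive integer n (trial division)."""
    primes, q = set(), 2
    while q * q <= n:
        while n % q == 0:
            primes.add(q)
            n //= q
        q += 1
    if n > 1:
        primes.add(n)
    return primes

def abs_p(x, p):
    """|x|_p for a non-zero rational x."""
    num, den, v = abs(x.numerator), x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)

def product_of_absolute_values(x):
    """Product of |x|_p over p = infinity and all primes p, for non-zero x."""
    x = Fraction(x)
    result = abs(x)   # the archimedean factor |x|_infinity
    for p in prime_factors(abs(x.numerator)) | prime_factors(x.denominator):
        result *= abs_p(x, p)
    return result

print(product_of_absolute_values(Fraction(-360, 77)))  # 1
```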

In spite of its triviality, it is an important observation and we will return to a version of the product formula for number fields shortly.

Completions

Recall these two definitions from elementary analysis.

Definition

A sequence of rational numbers $(x_n)$ converges to 0 with respect to the absolute value $| \cdot |$ if, for all $\epsilon > 0$, there exists a positive integer $N$ such that if $n \geq N$ then $|x_n| < \epsilon$. In this situation we write $\lim x_n = 0$ or $(x_n) \rightarrow 0$.

Definition

A sequence of rational numbers $(x_n)$ is Cauchy with respect to $| \cdot |$ if, for all $\epsilon > 0$, there exists a positive integer $N$ such that if $n, m \geq N$ then $|x_n - x_m| < \epsilon$. We will denote the set of Cauchy sequences by $\mathcal C$.

If you recall from elementary analysis, Cauchy sequences are convergent sequences. However, not every convergent sequence of rational numbers converges to a rational number. That is, taking limits can take you out of $\mathbb Q$. These new limit points live in the completion, and that completion depends on the absolute value (in fact place) used in the definition of convergence and Cauchy.

The equivalence of Cauchy sequences and convergent sequences (once you account for new limit points) is useful because the condition of Cauchyness depends only on the rational numbers in the sequence. That is we can determine if something is Cauchy without having to know what its limit is or where that limit lives.

The other thing that is useful about Cauchy sequences, is that they form a ring under coordinate-wise addition and multiplication. This is essentially equivalent to the limit laws: If $(x_n), (y_n) \in \mathcal C$ then $(x_n + y_n) \in \mathcal C$ and $(x_n y_n) \in \mathcal C$. The identically zero sequence $(0)$ is the additive identity, and $(1)$ is the multiplicative identity. Cauchyness of the coordinate-wise quotient of two sequences also follows from the limit laws, though one has to be careful that the Cauchy sequence in the denominator does not converge to 0, and to “throw out” any quotients in the sequence where the denominator may be 0. We may embed $\mathbb Q \hookrightarrow \mathcal C$ by sending $x$ to the constant sequence $(x)$.

In the real numbers there are many different sequences of rational numbers which converge to the same real number. For instance, much mathematics has been made of discovering interesting rational sequences that converge to some of our favorite irrational numbers, like $\pi$. Notice that if we have two sequences $(x_n)$ and $(y_n)$ with $(x_n) \rightarrow \pi$ and $(y_n) \rightarrow \pi$ (and here $\pi$ can be replaced by any real number), then $(x_n - y_n) \rightarrow 0$. Thus we can determine when two convergent sequences converge to the same number, simply by determining whether their difference converges to 0.

Returning to $\mathcal C$, we define an equivalence relation $(x_n) \equiv (y_n)$ if $(x_n - y_n) \rightarrow 0$. In our head we should think of an equivalence class as the set of all Cauchy sequences that converge to the same number, and there is one equivalence class for every possible limit point. And indeed, that is the definition of the completion of $\mathbb Q$ with respect to $| \cdot |$. It is the field formed from the equivalence classes of Cauchy sequences. We add, subtract, multiply and divide by choosing representatives of the equivalence classes and performing the appropriate operation coordinate-wise, and returning the equivalence class of that new Cauchy sequence. The rational number $x$ is represented by the equivalence class of the constant sequence $(x)$.

Let us denote the completion by $\overline{\mathbb Q}$ (this obviously depends on the absolute value, but for the moment we are working with the distinguished absolute value $| \cdot |$). Suppose we have another equivalent absolute value $| \cdot |_0 = | \cdot |^c$ with $c > 0$. We wish to argue that both absolute values produce the same completion, that is, that the completion is something more appropriately associated to a place rather than a single absolute value. This is done by showing that the absolute values $| \cdot |$ and $| \cdot |_0$ determine the same set of Cauchy sequences, and that they determine the same equivalence relations on those Cauchy sequences.

Suppose $(x_n)$ is Cauchy with respect to $| \cdot |$, and given $\epsilon$ let $N(\epsilon)$ be the guaranteed integer such that $n, m \geq N(\epsilon)$ implies $| x_n - x_m | < \epsilon$. It follows that if we set $N = N(\epsilon^{1/c})$ then for $n, m \geq N$ we have $| x_n - x_m |_0 = |x_n - x_m|^c < (\epsilon^{1/c})^c = \epsilon$. Thus if $(x_n)$ is Cauchy with respect to $| \cdot |$ it is Cauchy with respect to $| \cdot |_0$. The reverse containment is established similarly (but setting $N = N(\epsilon^c)$), and we see both absolute values produce the same Cauchy sequences.

We also need to establish that $(x_n) \rightarrow 0$ with respect to $| \cdot |$ if and only if it does the same with respect to $| \cdot |_0$. The argument is almost identical to that used to show that both absolute values produce the same Cauchy sequences.

The places are in correspondence with $\mathcal P$ and we denote the completion with respect to $| \cdot |_p$ by $\mathbb Q_p$. In particular $\mathbb Q_{\infty}$ is the real numbers.

Extending Absolute Values to $\mathbb Q_p$

Given an element in $\mathbb Q_p$ as represented by a Cauchy sequence $x = (x_n)$, we define $|x|_p = \lim |x_n|_p$ where we take the limit in the real numbers as usual. Upon showing this is well-defined, we arrive at an absolute value (which we continue to denote $| \cdot |_p$) on $\mathbb Q_p$ which restricts to $| \cdot |_p$ on $\mathbb Q$.

Suppose $x = (x_n)$ and $y = (y_n)$ are in the same equivalence class, that is $\lim |x_n - y_n|_p = 0$. Then $$| x |_p = \lim | x_n |_p = \lim |y_n + (x_n -y_n)| _p \leq \lim |y_n|_p + \lim |x_n - y_n|_p = |y|_p.$$ But symmetry implies that $|y|_p \leq |x|_p$ so in fact $|x|_p = |y|_p$ and $| \cdot |_p$ is well defined on $\mathbb Q_p.$

To verify $| \cdot |_p$ is an absolute value on $\mathbb Q_p$, notice that if $| x |_p = 0$ then $(x_n)$ is equivalent to the sequence $(0)$, that is $x$ is in the zero equivalence class. Multiplicativity follows from the multiplication limit law for sequences. The triangle inequality likewise follows since $$ | x + y |_p = \lim |x_n + y_n|_p \leq \lim |x_n|_p + \lim|y_n|_p = |x|_p + |y|_p$$ as does the strong inequality (via the continuity of the max function) in the case $| \cdot |_p$ is non-archimedean.

Equipping a field with an absolute value also allows us to define distances, and hence a metric topology. This topology in turn generates a $\sigma$-algebra (which is equal to the Borel $\sigma$-algebra when $p = \infty$) and translation invariant measures (a la Lebesgue measure) on $(\mathbb Q_p, +)$ and $(\mathbb Q^{\times}_p, \cdot)$. I am getting ahead of myself, but the point is that $\mathbb Q_p$ isn’t just a new field with a few limit points filled in from $\mathbb Q$, but rather it is a metric space and a measure space and exhibits many features in common with $\mathbb R$.

Different Places Produce Different Completions

Before digging into the topology of $\mathbb Q_p$ we want to justify our claim that the completions of $\mathbb Q$ are in correspondence with $\mathcal P$. So far we have seen that every completion is equal to $\mathbb Q_p$ for some $p \in \mathcal P$, but we have not yet established that different elements in $\mathcal P$ produce different completions.

This is a consequence of the Weak Approximation Theorem for $\mathbb Q$, which in our case says that if $| \cdot |$ and $| \cdot |_0$ are non-equivalent absolute values, then you can find an element $x \in \mathbb Q$ such that $| x |$ is large and $| x |_0$ is small (and vice-versa). This will in turn imply that different sequences are Cauchy or converge to 0 with respect to these different absolute values.

The Weak Approximation Theorem

Suppose $p_1, \ldots, p_N \in \mathcal P$ index any finite number of places of $\mathbb Q$ and $1 \leq n < N$. Given $a \in \mathbb Q$ and any $\epsilon > 0$ there exists $x \in \mathbb Q$ such that $|x - a|_{p_1}, \ldots, |x - a|_{p_n} \in (0, \epsilon)$ and $|x-a|_{p_{n+1}}, \ldots, |x-a|_{p_N} \in [\epsilon^{-1}, \infty).$

We do this for $a=0$ as follows. First we suppose that $p_1, \ldots, p_N$ are all primes (so that the corresponding absolute values are all non-archimedean), and then discuss how to modify the argument if we also want $| x |_{\infty}$ either large or small. Consider $$ x = \frac{p_1^{m_1} \cdots p_n^{m_n}}{p_{n+1}^{m_{n+1}} \cdots p_N^{m_N}}. $$ Then $|x|_{p_1} = p_1^{-m_1}, \ldots, |x|_{p_n} = p_n^{-m_n}$ and these can be made as small as desired by choosing $m_1, \ldots, m_n$ as large as necessary. Similarly, $|x|_{p_{n+1}} = p_{n+1}^{m_{n+1}}, \ldots, |x|_{p_N} = p_N^{m_N}$ which can be made as large as desired by choosing $m_{n+1}, \ldots, m_N$ as large as necessary.

How would we simultaneously make $|x|_{\infty}$ as large or small as specified? We may multiply $x$ by a rational number $r$ whose factorization avoids $p_1, p_2, \ldots, p_N$ without changing any of the absolute values $|\cdot|_{p_1}, \ldots, |\cdot|_{p_N}$. That is $|r x|_{p_1} = |x|_{p_1}, \ldots, |r x|_{p_N} = |x|_{p_N}$. Note however, that $|r x|_{\infty} = |x|_{\infty} |r|_{\infty}$, and hence by choosing $r$ sufficiently large or small, the rational number $r x$ will satisfy the conclusions of the theorem when $a=0$. The general case is a consequence of the Chinese Remainder Theorem.
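The construction for $a = 0$ with only finite primes is explicit enough to compute. Below is a sketch under the assumption that the two lists of primes are disjoint; the function name `small_and_large` and its interface are ours. It does not attempt the archimedean adjustment by $r$ or the general case.

```python
from fractions import Fraction
from math import prod

def small_and_large(small_primes, large_primes, eps):
    """Return x with |x|_p < eps for p in small_primes and |x|_q > 1/eps for q in large_primes.

    Assumes the two lists consist of distinct primes and eps > 0.
    """
    eps = Fraction(eps)

    def exponent(p):
        # Smallest m >= 1 with p^(-m) < eps.
        m = 1
        while Fraction(1, p**m) >= eps:
            m += 1
        return m

    numerator = prod(p ** exponent(p) for p in small_primes)
    denominator = prod(q ** exponent(q) for q in large_primes)
    return Fraction(numerator, denominator)

x = small_and_large([2, 3], [5], Fraction(1, 100))
print(x)  # 31104/125
```

Here $x = 2^7 3^5 / 5^3$, so $|x|_2 = 1/128$, $|x|_3 = 1/243$ and $|x|_5 = 125$.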

Completions as Topological and Measurable Spaces

Here we are mostly going to assume that $p < \infty$, though we will compare and contrast the situation with the $\mathbb Q_{\infty} = \mathbb R$ case.

The Borel Topology on $\mathbb Q_p$

Once we have an absolute value on a field we can make $\epsilon$-neighborhoods, and these form the basis for a topology. In the case of $x \in \mathbb Q_p$, given $\epsilon > 0$, we define $$B_{\epsilon}(x) = \{ y \in \mathbb Q_p : |y - x|_p < \epsilon \}.$$ When $p = \infty$ these are the usual neighborhoods of $x \in \mathbb R$. The topology generated by all such sets is called the Borel topology. It is easy to see that the collection of open neighborhoods does not depend on which absolute value you use from a particular place. An $\epsilon$-neighborhood with respect to $| \cdot |^c$ is an $\epsilon^{1/c}$-neighborhood of $| \cdot |$ and thus the set of neighborhoods is the same. So from the topological point of view $\mathbb Q_p$ is more naturally associated to a place than to a specific absolute value.

When $p < \infty$ something interesting happens that does not happen in $\mathbb R$. First note that, unlike $| \cdot |_{\infty}$, the non-archimedean absolute values are discrete. Namely, on non-zero elements $| \cdot |_p$ takes values in $\{p^n : n \in \mathbb Z\}$. This means that any open ball $B_{\epsilon}(x)$ can also be described as a closed ball $\overline{B}_{\epsilon'}(x)$ for a suitable $\epsilon'$ (for instance, the largest power of $p$ strictly less than $\epsilon$). The language sometimes used is that the balls in $\mathbb Q_p$ are clopen.

The balls in $\mathbb Q_p$ are nested in a way that they are not for $\mathbb R$. Namely two balls in $\mathbb Q_p$ are either disjoint, or one is a subset of the other. That is, for $p < \infty$, $\mathbb Q_p$ is totally disconnected. This, like all other differences is driven by the strong triangle inequality. To see this, suppose $B_{\epsilon}(x)$ and $B_{\delta}(y)$ are balls in $\mathbb Q_p$ with $z \in B_{\epsilon}(x) \cap B_{\delta}(y)$. Without loss of generality we may assume $\epsilon \leq \delta$.

First note $x \in B_{\delta}(y)$: $|x - z|_p < \epsilon \leq \delta$ and $|z - y|_p < \delta$. It follows that $|x-y|_p \leq \max\{|x-z|_p, |z-y|_p\} < \delta$.

If $x \not \in B_{\delta}(y)$ then the strong triangle inequality is violated (dotted distance in purple).

Next, if $w \in B_{\epsilon}(x)$ then $|w - x|_p < \epsilon \leq \delta$, so $|w-y|_p \leq \max\{|w - x|_p, |x - y|_p\} < \delta$ and hence $w \in B_{\delta}(y)$ as claimed. It follows that $B_{\epsilon}(x) \subset B_{\delta}(y)$.

If $w \not \in B_{\delta}(y)$ then the strong triangle inequality is violated (dotted distance in red).

Another property of $\mathbb Q_p$ (this time shared between $p$ finite and infinite) is local compactness. Recall the definition: a space is locally compact if every point $x$ has a neighborhood which is contained in a compact set. It turns out that $\mathbb Q_p$ has the Heine-Borel property (closed and bounded sets are compact), and so for any $x$ and any $\epsilon > 0$ we have $x \in B_{\epsilon}(x) \subset \overline B_{\epsilon}(x)$, and the closed ball $\overline B_{\epsilon}(x)$ does the job.

Haar measure on $\mathbb Q_p$

In the standard manner we set $\mathcal B$ to be the $\sigma$-algebra on $\mathbb Q_p$ generated by all balls. When $p < \infty$ the total disconnectivity of $\mathbb Q_p$ means that the generic sets in $\mathcal B$ are much easier to describe than in the $\mathbb R = \mathbb Q_{\infty}$ situation. Namely, because a countable intersection of balls is either empty, another ball, or a singleton (a set containing a single point), we see that a generic set in $\mathcal B$ looks like a countable union of balls and singletons. This is very tidy in comparison to the nightmarishness of a general Borel subset of $\mathbb R$.

$(\mathbb Q_p, +)$ and $(\mathbb Q_p^{\times}, \cdot)$ are locally compact abelian groups. Locally compact abelian groups are important kinds of topological and measurable spaces because they can be equipped with a translation invariant measure. Specifically, if $B \in \mathcal B$ is a Borel set and $x \in \mathbb Q_p$ then we call $$ x + B = \{ x + b : b \in B \} \qquad \mbox{and} \qquad x B = \{ x b : b \in B\}$$ the additive and multiplicative translations of $B$ by $x$. A measure $\mu$ on $(\mathbb Q_p, \mathcal B)$ is said to be translation invariant if $\mu(x + B) = \mu(B)$ for all $B \in \mathcal B$ and all $x \in \mathbb Q_p$. In the case of $(\mathbb Q_p^{\times}, \mathcal B^{\times})$ the corresponding condition for the multiplicative translation invariant measure $\mu^{\times}$ is $\mu^{\times}(x B) = \mu^{\times}(B)$ for all $B \in \mathcal B^{\times}$ and all $x \in \mathbb Q_p^{\times}$. (I have not formally introduced the $\sigma$-algebra $\mathcal B^{\times}$ on $\mathbb Q_p^{\times}$, but as a set of points $\mathbb Q_p^{\times}$ is simply $\mathbb Q_p$ with 0 removed. We may take $\mathcal B^{\times}$ to be the $\sigma$-algebra generated by all balls except those containing 0.)

We may make $\mu$ and $\mu^{\times}$ unique by specifying the measure of a single clopen set (or in the case of $\mathbb Q_{\infty} = \mathbb R$ a single (non-singleton) closed interval). Thus we define the measures $\mu_p$ and $\mu^{\times}_p$ to be the unique translation invariant measures on $(\mathbb Q_p, \mathcal B)$ and $(\mathbb Q_p^{\times}, \mathcal B^{\times})$ normalized so that $$ \mu_p \{ x : |x|_p \leq 1 \} = 1 \qquad \mbox{and} \qquad \mu_p^{\times} \{x : |x|_p = 1 \} = \frac{p-1}{p}.$$ These measures are referred to as the (normalized) Haar measures for $\mathbb Q_p$ and $\mathbb Q_p^{\times}$.

The normalization on $\mu_p^{\times}$ may look a little strange. This choice was motivated by the fact that for any $x \in \mathbb Q_p$ and $B \in \mathcal B$, $$\mu_p(x B) = |x|_p \mu_p(B).$$ This implies that $$\mu_p^{\times}(dx) = \frac{\mu_p(dx)}{|x|_p}.$$ Moreover, if we set $C$ to be the closed unit ball, we have $p C = \{ x \in \mathbb Q_p : |x|_p < 1 \}$ and $\mu_p(pC) = \mu_p(C)/p$. It follows that $$\mu_p\{ x : |x|_p = 1 \} = \mu_p(C) - \mu_p(pC) = \frac{p-1}{p}.$$ The normalization for $\mu_p^{\times}$ makes it equal to $\mu_p$ on $\{ x : |x|_p = 1 \}$, a situation which can be advantageous when both measures come into play.

Example

Let $B = \{ x : 0< |x|_p < 1\}$ and $\overline B = \{ x : 0< |x|_p \leq 1\}$, and let $U = \overline B \setminus B = \{x : |x|_p = 1\}$. Then $$ \overline B = U \sqcup B \quad \mbox{and} \quad B = p\overline{B}.$$ Induction then implies that $$\overline B = \bigsqcup_{n=0}^{\infty} p^n U; \qquad \mbox{indeed} \qquad \mathbb Q^{\times}_p = \bigsqcup_{n \in \mathbb Z} p^n U.$$ This is the decomposition of $\mathbb Q^{\times}_p$ into sets of equal absolute value. That is, $p^n U$ is exactly the set on which $|x|_p = p^{-n}$.

Suppose $s > 0$ and consider $$ \int_{\overline B} |x|_p^s \, \mu_p^{\times}(dx) .$$ Using the decomposition, $$\int_{\overline B} |x|_p^s \, \mu_p^{\times}(dx) = \sum_{n=0}^{\infty} \int_{p^n U} |x|_p^s \, \mu_p^{\times}(dx).$$ The integrand is constant (and equal to $p^{-n s}$) on $p^n U$, and $\mu_p^{\times}(p^n U) = \mu_p^{\times}(U) = (p-1)/p$. Hence, $$\int_{\overline B} |x|_p^s \, \mu_p^{\times}(dx) = \sum_{n=0}^{\infty}p^{-ns} \left(\frac{p-1}{p}\right) = \left(\frac{p-1}{p}\right) \frac{1}{1 - p^{-s}}.$$ This is an important calculation in the theory of the Riemann $\zeta$-function.
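The last step is just a geometric series, and it can be sanity checked numerically. The sketch below restricts to positive integer $s$ so that everything stays in exact rational arithmetic (that restriction, and the function names, are our own choices); it compares a partial sum over the shells $p^n U$ with the closed form.

```python
from fractions import Fraction

def partial_integral(p, s, terms):
    """Sum over n < terms of p^(-n s) * (p-1)/p, the contribution of the shells p^n U."""
    shell_measure = Fraction(p - 1, p)
    return sum(Fraction(1, p ** (n * s)) * shell_measure for n in range(terms))

def closed_form(p, s):
    """The value ((p-1)/p) / (1 - p^(-s)) of the full integral."""
    return Fraction(p - 1, p) / (1 - Fraction(1, p ** s))

p, s = 5, 2
print(closed_form(p, s))                                       # 5/6
print(float(closed_form(p, s) - partial_integral(p, s, 20)))   # tiny positive tail
```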


Field Extensions and Number Fields

Here I am storing various basic facts about Number Fields that are useful in other notes. I hope this becomes more complete as time goes on.

Number Fields

Recall that a number field $K$ is a finite extension of $\mathbb Q$. While we often think of number fields as $\mathbb Q(\alpha)$ for some algebraic number $\alpha$ embedded in $\mathbb C$, it is useful to recall the general (unembedded) construction. $\mathbb Q[x]$ is the ring of polynomials with rational coefficients in the indeterminate $x$. If $f(x) \in \mathbb Q[x]$ is irreducible, then $f(x) \mathbb Q[x]$, the ideal formed from all rational polynomials divisible by $f(x)$, is a maximal ideal in $\mathbb Q[x]$. It follows that $K = \mathbb Q[x]/f(x) \mathbb Q[x]$ is a commutative ring with all non-zero elements invertible, that is, a field.

In this construction, the elements of $K$ are cosets of the form $g(x) + f(x) \mathbb Q[x]$. If $g(x)$ and $h(x)$ generate the same coset, then we will write $g(x) \equiv h(x)$ (or $g(x) \equiv h(x) \bmod f(x)$ if more clarity is necessary). In this situation $f(x) | (g(x) - h(x))$.

Given the coefficients of $f(x)$, the arithmetic in $K$ is easy to perform. Suppose $a_0, \ldots, a_{d-1}$ are the rational coefficients of $$f(x) = x^d + \sum_{n=0}^{d-1} a_n x^n,$$ then $$ x^d \equiv -a_0 - a_1 x - \cdots - a_{d-1} x^{d-1}.$$ Now suppose $g(x) + f(x) \mathbb Q[x]$ is an arbitrary coset. By replacing monomials $x^n$ in $g(x)$ when $n \geq d$ (serially, if necessary) using this congruence, we see that $g(x) \equiv h(x)$ for some $h(x) \in \mathbb Q[x]$ with $\deg(h) < d$. The polynomial $h(x)$ is exactly the remainder produced by the Division Algorithm in $\mathbb Q[x]$ when $g(x)$ is divided by $f(x)$.

That is, as a group (in fact, as a vector space) $K$ is isomorphic to $\mathbb Q^d$ where the isomorphism is given by $$(b_0, \ldots, b_{d-1}) \mapsto \sum_{m=0}^{d-1} b_m x^m + f(x) \mathbb Q[x].$$ The only thing missing in this description is the multiplication. If we want to multiply two vectors $\mathbf b, \mathbf c \in \mathbb Q^d$, we set $g(x)$ to be the polynomial with coefficient vector $\mathbf b$ and $h(x)$ to be the polynomial with coefficient vector $\mathbf c$. We first multiply $g(x)$ and $h(x)$ as usual in $\mathbb Q[x]$, and then we use the equivalence $ x^d \equiv -a_0 - a_1 x - \cdots - a_{d-1} x^{d-1}$ to replace monomials in $g(x) h(x)$ (repeatedly if necessary) until we arrive at a polynomial $p(x)$ of degree $< d$. The coefficient vector of this polynomial in $\mathbb Q^d$ is the product of $\mathbf b$ and $\mathbf c$.
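The multiplication rule just described is short enough to write out. This is a minimal sketch assuming $f$ is monic and given by its lower coefficients $a_0, \ldots, a_{d-1}$, with elements of $K$ stored as coefficient vectors of length $d$ (constant term first); the function name `multiply_mod_f` is hypothetical.

```python
from fractions import Fraction

def multiply_mod_f(b, c, a):
    """Multiply coefficient vectors b, c (length d) in Q[x]/(f), f = x^d + sum a[n] x^n."""
    d = len(a)
    # Ordinary polynomial product in Q[x]; degree at most 2d - 2.
    product = [Fraction(0)] * (2 * d - 1)
    for i, bi in enumerate(b):
        for j, cj in enumerate(c):
            product[i + j] += Fraction(bi) * Fraction(cj)
    # Replace x^k (k >= d) by -sum a[n] x^(k-d+n), working down from the top degree.
    for k in range(2 * d - 2, d - 1, -1):
        lead = product[k]
        product[k] = Fraction(0)
        for n in range(d):
            product[k - d + n] -= lead * a[n]
    return product[:d]

# f(x) = x^2 - 2: (1 + x)^2 = 1 + 2x + x^2 is congruent to 3 + 2x.
print(multiply_mod_f([1, 1], [1, 1], [-2, 0]) == [3, 2])           # True
# f(x) = x^3 - x - 1: x^2 * x^2 = x^4 is congruent to x + x^2.
print(multiply_mod_f([0, 0, 1], [0, 0, 1], [-1, -1, 0]) == [0, 1, 1])  # True
```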

$K$, What is it Good for?

First, note that $\mathbb Q \hookrightarrow K$ by the map $r \mapsto r + f(x) \mathbb Q[x]$, and by definition (the fact that $K$ is a vector space of dimension $d$ over $\mathbb Q$) it is a number field of degree $d$ over $\mathbb Q$. This implies $\mathbb Q[x] \hookrightarrow K[x]$, and in particular, $f(x)$ has a life in $K[x]$. Because $f(x)$ is irreducible in $\mathbb Q[x]$ it has no zeroes in $\mathbb Q$ (at least when $d > 1$). However, we will show that this is no longer the case in $K[x]$. And that is what $K$ is good for: producing a number field where $f(x)$ has a zero.

The element $x + f(x) \mathbb Q[x]$ is the root of $f(x)$ in $K$. To see this, we need only calculate $$f(x + f(x) \mathbb Q[x]) = f(x) + f(x) \mathbb Q[x] = 0 + f(x) \mathbb Q[x].$$

The element $x + f(x) \mathbb Q[x]$ is important as well because if we know how to multiply by this element, then we know how to multiply by arbitrary elements (which are, after all, simply linear combinations of its powers).

Multiplication by $x$ is a linear operator on $\mathbb Q[x]$, indeed $x( a g(x) + h(x) ) = a x g(x) + x h(x)$, and multiplication by $x + f(x) \mathbb Q[x]$ is a linear operator on $K$. We know $K$ is a vector space with basis $( x^n + f(x) \mathbb Q[x] : n=0,\ldots, d-1)$, so it makes sense to talk of the matrix of the multiplication operator, call it $T$, with respect to this basis. Note that, if we denote the standard basis of $\mathbb Q^d$ (with coordinates indexed from 0 to $d-1$ for consistency) by $\mathbf e_0, \ldots, \mathbf e_{d-1}$, then for $n < d-1$, $T \mathbf e_{n} = \mathbf e_{n+1}$. This corresponds to the multiplication $x x^{n} = x^{n+1}$ which remains true in $K$ if $n < d-1$. The final calculation, using the same equivalence that has gotten us so far $ x^d \equiv -a_0 - a_1 x - \cdots - a_{d-1} x^{d-1}$, shows that $T \mathbf e_{d-1} = -a_0 \mathbf e_0 - a_1 \mathbf e_1 - \cdots - a_{d-1} \mathbf e_{d-1}$. It follows that the matrix of $T$ with respect to the basis $(\mathbf e_n)$ is $$ \begin{pmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \cdots & 0 & -a_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_{d-1}\end{pmatrix}.$$

If this matrix looks familiar it is because it is the (Frobenius) companion matrix to $f(x)$ and the characteristic polynomial of this matrix (and hence the operator $T$) is $f(x)$. Indeed, the irreducibility of $f(x)$ implies that the minimal polynomial of $T$ is $f(x)$ as well.
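As a sanity check, we can build the companion matrix and confirm that $f(T) = 0$, as the Cayley-Hamilton Theorem predicts. The following is a minimal sketch; the helper names `companion`, `mat_mul` and `evaluate_f_at` are invented here, and `a` again holds $a_0, \ldots, a_{d-1}$ with the leading coefficient 1 implicit.

```python
from fractions import Fraction

def companion(a):
    """Companion matrix of f(x) = x^d + a[d-1] x^(d-1) + ... + a[0]."""
    d = len(a)
    T = [[Fraction(0)] * d for _ in range(d)]
    for n in range(d - 1):
        T[n + 1][n] = Fraction(1)        # T e_n = e_{n+1}
    for i in range(d):
        T[i][d - 1] = -Fraction(a[i])    # T e_{d-1} = -a_0 e_0 - ... - a_{d-1} e_{d-1}
    return T

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def evaluate_f_at(T, a):
    """Compute f(T) = T^d + a[d-1] T^(d-1) + ... + a[0] I by Horner's rule."""
    d = len(a)
    identity = [[Fraction(int(i == j)) for j in range(d)] for i in range(d)]
    R = identity                          # leading coefficient 1
    for coeff in reversed(a):
        R = mat_mul(R, T)
        R = [[R[i][j] + coeff * identity[i][j] for j in range(d)] for i in range(d)]
    return R

a = [-2, 0, 0]                            # f(x) = x^3 - 2
print(companion(a))
print(evaluate_f_at(companion(a), a))     # every entry should vanish
```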

But $f(x)$ has roots in $\mathbb C$. What about them?

The Fundamental Theorem of Algebra (ironically a theorem in analysis) guarantees that $f(x)$ has $d$ roots (counting multiplicity) in $\mathbb C$. How are they related to the root of $f(x)$ in $K$?

Let’s start with our favorite $\alpha \in \mathbb C$ such that $f(\alpha) = 0$. We know that $\alpha$ is either in $\mathbb R$ or it comes paired with a distinct complex conjugate root (more about that later). We can embed $K$ into $\mathbb C$ by sending $x + f(x) \mathbb Q[x] \mapsto \alpha$. That is, $$ b_{d-1} x^{d-1} + \cdots + b_1 x + b_0 + f(x) \mathbb Q[x] \quad \mapsto \quad b_{d-1} \alpha^{d-1} + \cdots + b_1 \alpha + b_0.$$ We denote the image of this embedding by $\mathbb Q(\alpha) \subset \mathbb C$. Notice that if $\alpha \in \mathbb R$, then $\mathbb Q(\alpha) \subset \mathbb R$ and we call it a real embedding of $K$.
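To see such an embedding in action numerically, here is a small floating point sketch. The choice $f(x) = x^3 - 2$ with the non-real root $\alpha = 2^{1/3} e^{2 \pi i/3}$ is an assumption made for illustration, and `embed` is a hypothetical helper name.

```python
import cmath

# Assumed example: f(x) = x^3 - 2 and its non-real root alpha = 2^(1/3) e^(2 pi i / 3).
alpha = 2 ** (1 / 3) * cmath.exp(2j * cmath.pi / 3)

def embed(b, root):
    """Image of b[0] + b[1] x + ... + f(x)Q[x] under the map x -> root."""
    return sum(coeff * root ** n for n, coeff in enumerate(b))

# In K we have x^2 * x^2 = x^4 = 2x, and the embedding should respect this product.
lhs = embed([0, 0, 1], alpha) * embed([0, 0, 1], alpha)
rhs = embed([0, 2, 0], alpha)
print(abs(lhs - rhs) < 1e-12)

# f itself maps to (numerically) zero, so the map is well defined on cosets.
print(abs(embed([-2, 0, 0, 1], alpha)) < 1e-12)
```

Since the arithmetic here is in floating point, the comparisons are made up to a small tolerance rather than exactly.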

A count of the real and complex embeddings produces the first of the classical invariants of a number field.

Invariants: Number of real and complex embeddings $r_1$ and $r_2$

Let us distinguish the real and complex roots of $f(x)$ by setting $\alpha_1, \ldots, \alpha_{r_1}$ to be the real roots and $\beta_1, \overline{\beta_1}, \ldots, \beta_{r_2}, \overline{\beta_{r_2}}$ to be the non-real complex roots. Since the non-real roots come in conjugate pairs, $r_1 + 2 r_2 = d$. Then the embeddings $\mathbb Q(\alpha_1), \ldots, \mathbb Q(\alpha_{r_1}) , \mathbb Q(\beta_1) , \ldots, \mathbb Q(\beta_{r_2})$ are called the archimedean embeddings of $K$.
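The invariants $r_1$ and $r_2$ can be computed exactly from the coefficients of $f(x)$. One way (swapped in here for illustration, it is not used elsewhere in these notes) is Sturm's Theorem, which counts the distinct real roots of a squarefree polynomial; an irreducible $f(x)$ over $\mathbb Q$ is squarefree. In this sketch the function names are made up, and polynomials are given by their full coefficient list, lowest degree first, including the leading coefficient.

```python
from fractions import Fraction

def poly_rem(num, den):
    """Remainder of num divided by den (coefficient lists, lowest degree first)."""
    num = list(num)
    while len(num) >= len(den):
        factor = num[-1] / den[-1]
        shift = len(num) - len(den)
        for i, c in enumerate(den):
            num[shift + i] -= factor * c
        num.pop()                         # the leading term has been cancelled
        while num and num[-1] == 0:       # strip any further zero leading terms
            num.pop()
    return num

def count_real_roots(f):
    """Number of distinct real roots of a squarefree polynomial (Sturm's Theorem)."""
    f = [Fraction(c) for c in f]
    derivative = [k * c for k, c in enumerate(f)][1:]
    chain = [f, derivative]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            break
        chain.append([-c for c in r])

    def sign_changes(signs):
        return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

    # The sign of each polynomial near +infinity and -infinity is governed by
    # its leading coefficient and the parity of its degree.
    at_plus = [1 if p[-1] > 0 else -1 for p in chain]
    at_minus = [s * (-1) ** (len(p) - 1) for s, p in zip(at_plus, chain)]
    return sign_changes(at_minus) - sign_changes(at_plus)

def signature(f):
    """The pair (r_1, r_2) for an irreducible polynomial f."""
    d = len(f) - 1
    r1 = count_real_roots(f)
    return r1, (d - r1) // 2

print(signature([-2, 0, 0, 1]))   # f(x) = x^3 - 2
```

For $x^3 - 2$ there is one real cube root of 2 and one conjugate pair of non-real roots, consistent with $r_1 + 2 r_2 = 3$.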

The Norm and Trace

Here we wish to work in some generality and consider a field extension $K | k$ where both are number fields. Little generality is lost by keeping the example $k = \mathbb Q$ at the front of your mind. However, as many properties of number fields ‘factor through’ intermediate fields (for instance $[ K : \mathbb Q] = [K : k] [k : \mathbb Q]$) it is useful to maintain some generality in notation.

We will also abandon our attempt to denote elements of $K | k$ as cosets in $k[x] / f(x) k[x]$, writing for instance $\alpha, \beta, \gamma, \ldots $ for generic field elements. Often we will implicitly identify $K$ with $k(\alpha)$ for some algebraic number $\alpha$ of degree $d$ over $k$. In this situation $\{1, \alpha, \ldots, \alpha^{d-1} \}$ is a basis for $K$ over $k$, and the matrix of multiplication by $\alpha$ with respect to this basis is exactly the matrix of $T$ as before (the Frobenius companion matrix of the minimal polynomial of $\alpha$ over $k$).

More generally, given any $\gamma \in K$, we can make the linear operator “multiplication by $\gamma$” $T_{\gamma}$. If $\gamma$ is given as a $k$-linear combination of $\{1, \alpha, \ldots, \alpha^{d-1}\}$ then it is relatively easy to compute the matrix of $T_{\gamma}$ with respect to this basis. Note that this matrix has entries in $k$.

Norm and trace of $K|k$

The norm $N_{K|k} : K \rightarrow k$ and trace $\mathrm{Tr}_{K|k} : K \rightarrow k$ are the maps sending $\gamma \in K$ to, respectively, the determinant and trace of $T_{\gamma}$.

This definition is independent of basis, but can be computed explicitly in the basis $\{1, \alpha, \ldots, \alpha^{d-1}\}$.

If $\beta, \gamma \in K$, then $T_{\beta \gamma} =T_{\beta} \circ T_{\gamma}$ and $T_{\beta + \gamma} = T_{\beta} + T_{\gamma}$. The multiplicativity of the determinant and the additivity of the trace imply that $$N_{K|k}(\beta \gamma) = N_{K|k}(\beta) N_{K|k}(\gamma)$$ and $$\mathrm{Tr}_{K|k}(\beta + \gamma) = \mathrm{Tr}_{K|k}(\beta) + \mathrm{Tr}_{K|k}(\gamma).$$

The norm is a natural homomorphism from $K^{\times}$ into $k^{\times}$ (it need not be surjective) and the trace is a natural homomorphism from the additive group $(K, +)$ onto $(k,+)$.
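To make the definition concrete, here is a sketch that builds the matrix of $T_{\gamma}$ in the basis $1, x, \ldots, x^{d-1}$ for $k = \mathbb Q$ and reads off the norm and trace. The function names are invented for this illustration, and `a` holds the coefficients $a_0, \ldots, a_{d-1}$ of the monic minimal polynomial.

```python
from fractions import Fraction

def times_x(v, a):
    """Coordinates of x * v modulo f, using x^d = -(a[0] + ... + a[d-1] x^(d-1))."""
    top = v[-1]
    shifted = [Fraction(0)] + v[:-1]
    return [s - top * ai for s, ai in zip(shifted, a)]

def mult_matrix(gamma, a):
    """Matrix of T_gamma; column n holds the coordinates of gamma * x^n."""
    d = len(a)
    col = [Fraction(c) for c in gamma]
    cols = []
    for _ in range(d):
        cols.append(col)
        col = times_x(col, a)
    return [[cols[j][i] for j in range(d)] for i in range(d)]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    n, result = len(M), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            result = -result              # a row swap flips the sign
        result *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return result

# K = Q(sqrt 2), f(x) = x^2 - 2, gamma = 1 + sqrt 2.
T = mult_matrix([1, 1], [-2, 0])
print(det(T), trace(T))
```

In this example the two conjugates of $\gamma$ are $1 \pm \sqrt 2$, whose product is $-1$ and whose sum is $2$, which is what the determinant and trace return.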

Notation

Number fields
$f(x), g(x), h(x)$, etc.: Polynomials, often in $\mathbb Q[x]$ or $k[x]$.
$k, K, L$: Number fields.
$\alpha, \beta, \gamma$, etc.: Generic field elements. The field depends on context.
$[K : k]$, $d$: The degree of a field extension. The fields depend on context.
$T_{\alpha}$: The linear transformation on $K|k$ (fields context dependent) given by multiplication by $\alpha$.
$r_1, r_2$: The number of real and complex embeddings (respectively) of $K$ (context dependent).
$N_{K|k}$, $\mathrm{Tr}_{K|k}$: The Norm and Trace maps $K \rightarrow k$ given by $\alpha \mapsto \det( T_{\alpha})$ and $\alpha \mapsto \mathrm{Tr}( T_{\alpha})$.
$\mf o$, $\mf O$: Rings of integers in $k$ and $K$.
$\mf a, \mf b, \mf A, \mf B$, etc.: Ideals in rings of integers. We often use lower case fraktur letters for ideals in $\mf o$ and capital fraktur letters for ideals in $\mf O$.
$\mf p, \mf q, \mf P, \mf Q$: Prime ideals in $\mf o$ and $\mf O$.
$\mathbb N \mf a$, etc.: The ideal norm $\mathbb N \mf a = [\mf o : \mf a]$.

Recalling Galois Theory

This is a brief reminder of the main ideas of Galois theory. Any proofs purported here are meant to be suggestive. I learned Galois theory out of Dummit and Foote, which I thought was pretty good. I also have Classical Galois Theory by Gaal on my shelf. This book is in essence one giant worksheet. I have not completed many of the exercises, but I suspect anyone who did would gain a remarkable intuition as to how the theory hangs together.

At any rate, this is mostly for me, since I seem to need to be reminded of the basics of Galois theory every few years.

It may be useful to review Field Extensions and Number Fields before continuing.

Automorphisms of Field Extensions

We want to work in a bit of generality here, so we assume $K | k$ is an extension of number fields. Little generality is lost at this point if you take $k = \mathbb Q$.

Definition

An isomorphism from $K$ to itself is called an automorphism. The set of automorphisms of $K$, $\mathrm{Aut}(K)$, forms a group under composition. An automorphism $\sigma$ is said to fix $k$ if $\sigma \gamma=\gamma$ for all $\gamma \in k$. The set of automorphisms of $K$ which fix $k$ is denoted $\mathrm{Aut}(K|k)$ and is a subgroup of $\mathrm{Aut}(K)$.

$\mathrm{Aut}(K|k)$ is a finite group of order at most $d = [K:k]$. We will use this fact, though the proof would take us too far afield.

Automorphisms of $K$ which fix $k$ permute the roots in $K$ of polynomials with coefficients in $k$.

Proposition

Suppose $g(x) \in k[x]$ is irreducible and there exists $\beta \in K$ such that $g(\beta) = 0$. Then $g( \sigma \beta) = 0 \quad \mbox{for all} \quad \sigma \in \mathrm{Aut}(K|k)$.

Proof

Suppose $g(x) = \sum_m b_m x^m$, then $\sigma b_m = b_m$, and hence $$0= \sigma g(\beta) = \sum_m \sigma b_m (\sigma \beta)^m = \sum_m b_m (\sigma \beta)^m = g(\sigma \beta). \qquad \square$$
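A tiny computational illustration of the proposition, under the assumption $K = \mathbb Q(\sqrt 2)$, $k = \mathbb Q$, with elements $a + b \sqrt 2$ stored as pairs $(a, b)$ and $\sigma$ the map $\sqrt 2 \mapsto -\sqrt 2$. The function names are made up for this sketch.

```python
from fractions import Fraction

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    """(a + b sqrt2)(c + d sqrt2) = (ac + 2bd) + (ad + bc) sqrt2."""
    a, b = u
    c, d = v
    return (a * c + 2 * b * d, a * d + b * c)

def sigma(u):
    """The non-trivial automorphism of Q(sqrt2) fixing Q."""
    return (u[0], -u[1])

def g(u):
    """Evaluate g(x) = x^2 - 2 at the element u."""
    square = mul(u, u)
    return (square[0] - 2, square[1])

beta = (Fraction(0), Fraction(1))            # beta = sqrt2 is a root of g
u, v = (Fraction(1), Fraction(3)), (Fraction(2, 5), Fraction(-1))

print(g(beta), g(sigma(beta)))               # both should be the zero element
print(sigma(mul(u, v)) == mul(sigma(u), sigma(v)))
print(sigma(add(u, v)) == add(sigma(u), sigma(v)))
```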

One particularly important isomorphism is complex conjugation. Suppose that $K | \mathbb Q$ is a number field, and $K \cong \mathbb Q(\alpha)$ for some non-real $\alpha$ (that is, the minimal polynomial of $K$ has a non-real root in $\mathbb C$). Then, since complex conjugation is an automorphism of $\mathbb C | \mathbb R$, it restricts to an isomorphism from $\mathbb Q(\alpha)$ onto $\mathbb Q(\overline \alpha)$ which fixes $\mathbb Q$. It follows that if $\beta \in \mathbb Q(\alpha)$ then $\overline \beta \in \mathbb Q(\overline \alpha)$, and algebraic operations in one of these embeddings can be found from the other by complex conjugation. As sets, $\mathbb Q(\alpha)$ and $\mathbb Q(\overline \alpha)$ need not coincide (take $\alpha$ to be a non-real cube root of 2), though they do when $K | \mathbb Q$ is Galois, in which case complex conjugation is an automorphism of $\mathbb Q(\alpha)$.

Let us distinguish the real and complex roots of $f(x)$ by setting $\alpha_1, \ldots, \alpha_{r_1}$ to be the real roots and $\beta_1, \overline{\beta_1}, \ldots, \beta_{r_2}, \overline{\beta_{r_2}}$ to be the non-real complex roots. Since the non-real roots come in conjugate pairs, $r_1 + 2 r_2 = d$. Then the embeddings $\mathbb Q(\alpha_1), \ldots, \mathbb Q(\alpha_{r_1}) , \mathbb Q(\beta_1) , \ldots, \mathbb Q(\beta_{r_2})$ are called the archimedean embeddings of $K$.

Splitting Fields

Galois theory is concerned with the zeros of rational polynomials and how these zeros are permuted by the automorphisms of certain extensions of $\mathbb Q$ (which will come to be called Galois extensions). We already noted that the automorphisms in $\mathrm{Aut}(K|k)$ preserve the set of zeros in $K$ of any given polynomial $g(x) \in k[x]$. However, the construction of $K$ makes no guarantee that a generic polynomial $g(x)$ will have a zero in $K$, and even for the minimal polynomial $f(x)$, the construction of $K$ only guarantees the existence of a single zero of $f(x)$.

In general, if we wanted an extension of $k$ that contains all the zeros of $f(x)$, we would first compute $K = k[x]/ f(x) k[x]$. $K$ contains at least one zero of $f(x)$, and if we factor $f(x)$ in $K[x]$ there will be a linear factor for each of those zeros. We can then sequentially extend $K$ by constructing field extensions from the remaining irreducible factors of $f(x)$. Each time we extend fields by another irreducible factor, we add another zero of $f(x)$ to the resulting field extension. The process terminates after at most $d$ steps to produce the splitting field of $f(x)$. The splitting field of $f(x)$ has degree bounded by $d!$.

It is possible that the degree of the splitting field is as small as $d$, since it is possible, depending on the nature of $f(x)$, that $K = k[x]/f(x) k[x]$ itself contains $d$ zeros of $f(x)$.

Example

Suppose $p$ is prime and consider the $p$th cyclotomic polynomial $$\Phi_p(x) = x^{p-1} + x^{p-2} + \cdots + x + 1.$$ Suppose $\zeta$ is a zero of $\Phi_p(x)$ in $\mathbb Q[x]/\Phi_p(x) \mathbb Q[x]$, then it is easily verified that $\zeta^p = 1$. It follows that, if $\ell = 1, \ldots, p-1$, \begin{eqnarray}\Phi_p(\zeta^{\ell}) &=& \zeta^{\ell(p-1)} + \zeta^{\ell(p-2)} + \cdots + \zeta^{\ell} + 1 \\ &=& \zeta^{p-1} + \zeta^{p-2} + \cdots + \zeta + 1 \\ &=& 0.\end{eqnarray} Here the second equality holds because $\zeta^p = 1$ and multiplication by $\ell$ permutes the nonzero residues modulo $p$. It follows that $\zeta, \zeta^2, \ldots, \zeta^{p-1}$ are all $p-1$ roots of $\Phi_p(x)$ and hence $K = \mathbb Q[x]/\Phi_p(x)\mathbb Q[x]$ is the splitting field of $\Phi_p(x)$.
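The computation in this example can be checked by machine for small primes. The sketch below works in $\mathbb Q[x]/\Phi_p(x) \mathbb Q[x]$ with coordinates relative to $1, x, \ldots, x^{p-2}$ (integers suffice here); the function names are invented for the illustration.

```python
def zeta_power(n, p):
    """Coordinates of x^n in Q[x]/Phi_p(x) with respect to 1, x, ..., x^(p-2)."""
    v = [1] + [0] * (p - 2)
    for _ in range(n):
        top = v[-1]
        # Multiply by x, then use x^(p-1) = -(1 + x + ... + x^(p-2)).
        v = [s - top for s in [0] + v[:-1]]
    return v

def phi_p_at_zeta_power(ell, p):
    """Coordinates of Phi_p(zeta^ell) = sum over k = 0..p-1 of zeta^(ell * k)."""
    total = [0] * (p - 1)
    for k in range(p):
        term = zeta_power(ell * k, p)
        total = [t + s for t, s in zip(total, term)]
    return total

p = 5
print(zeta_power(p, p))                                    # zeta^p, expected to be 1
print([phi_p_at_zeta_power(ell, p) for ell in range(1, p)])
```

Each entry of the second list is the zero vector exactly when the corresponding power $\zeta^{\ell}$ is a root of $\Phi_p(x)$.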

Galois Theory

Definition

If $K | k$ is a splitting field for a polynomial $g(x) \in k[x]$, then $K$ is said to be Galois over $k,$ and the group of automorphisms of $K$ which fix $k$ is called the Galois group and denoted $\mathrm{Gal}(K|k)$.

Claim

$K | k$ is Galois if and only if $\# \mathrm{Aut}(K|k) = [K : k]$.

We won’t prove this claim (though the only if direction is easy) because it is a bit fiddly with separability and involves a diversion into character theory. Some (most?) authors give this as the definition of Galois and prove that it implies the splitting field definition.

The main result in Galois Theory is a correspondence between intermediate fields of $K | k$ and subgroups of $\mathrm{Gal}(K|k)$. Let us write $G = \mathrm{Gal}(K|k)$ and suppose $H < G$ is a subgroup. Define $$K_H = \{ \gamma \in K : \sigma(\gamma) = \gamma \mbox{ for all } \sigma \in H \}.$$ It is easily verified that $K_H$ is a field, and $k \subset K_H \subset K$ (which we might abbreviate $K | K_H | k$). It will turn out that $H \leftrightarrow K_H$ will be a bijection (called the Galois correspondence) between subgroups of $G$ and intermediate fields of $K | k$.

This correspondence goes beyond a bijection, because there is an interpretation for $H$ and $G/H$ (as a group in the case where $H$ is normal, but to some extent even as a set of cosets in the non-normal case) in terms of the groups of automorphisms $\mathrm{Gal}(K|K_H)$ and $\mathrm{Aut}(K_H|k)$. I hope you objected to the notational switch between $\mathrm{Gal}$ and $\mathrm{Aut}$ in the previous sentence, but it is correct. The fact that $K$ is a splitting field for a polynomial in $k[x]$ means that it is also the splitting field for a polynomial in $K_H[x]$ (namely the original polynomial, viewed as an element of $K_H[x]$) and hence $K | K_H$ is Galois and we use the notation $\mathrm{Gal}(K | K_H)$ for the group of automorphisms of $K$ fixing $K_H$. This is, unsurprisingly, equal to $H$. On the other hand, just because $K$ is the splitting field of a polynomial in $k[x]$ doesn’t imply that an intermediate field, such as $K_H$, must be a splitting field for that or any other polynomial in $k[x]$. Thus, in general we need to refer to the automorphism group of $K_H | k$ by $\mathrm{Aut}(K_H | k)$. It will turn out that when $H$ is normal in $G$ then $K_H | k$ is Galois, and $\mathrm{Gal}(K_H | k) \cong G/H$. This will all be enumerated in the Fundamental Theorem of Galois Theory, but we need to develop a few results first.

Given $\gamma \in K$, we call the elements $\sigma \gamma$, $\sigma \in G$, the Galois conjugates of $\gamma$. Moreover, if $L$ is any intermediate field extension, $K | L | k$, then $\sigma$ gives an isomorphism from $L$ onto $\sigma(L)$ (which fixes $k$). In particular $K_H$ is isomorphic to its image $\sigma K_H$. Notice then that if $\psi \in \mathrm{Aut}(K_H | k)$, then $\sigma \psi \sigma^{-1}$ is an element of $\mathrm{Aut}(\sigma K_H | k)$.

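Similarly, $\sigma H \sigma^{-1} = \mathrm{Gal}(K | \sigma K_H)$, since $\psi \in G$ fixes $\gamma$ exactly when $\sigma \psi \sigma^{-1}$ fixes $\sigma \gamma$. We can make this more evocative by denoting the action of $G$ by conjugation as $\sigma \cdot \psi = \sigma \psi \sigma^{-1}$, in which case, $$\sigma \cdot \mathrm{Gal}(K | K_H) = \mathrm{Gal}(K | \sigma K_H) \quad \mbox{and} \quad \sigma \cdot \mathrm{Aut}(K_H | k) = \mathrm{Aut}(\sigma K_H | k).$$ If $\sigma K_H = K_H$ for all $\sigma \in G$, then $\sigma H \sigma^{-1} = H$ for all $\sigma \in G$, that is $H$ is normal in $G$. On the other hand, if $H$ is normal in $G$, then $\sigma \psi \sigma^{-1} \in H$ for all $\psi \in H$, and $\sigma K_H = K_{\sigma H \sigma^{-1}} = K_H$ for all $\sigma \in G$.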

Now, suppose $H$ is normal and $g(x) \in k[x]$ is a polynomial so that $K_H = k[x]/g(x)k[x]$. From the previous discussion, $x + g(x) k[x]$ is a zero of $g(x)$ in $K_H$, as are $\sigma (x + g(x) k[x])$ for all $\sigma \in G$. To establish $K_H | k$ is Galois, we need to show that the orbit of $x + g(x) k[x]$ under $G$ has size $[K_H : k]$. We know for $\sigma \in H$, $\sigma(x + g(x) k[x]) = x + g(x)k[x]$. On the other hand, if $\sigma(x + g(x) k[x]) = x + g(x)k[x]$ then $\sigma \in H$ because the restriction of $\sigma$ to $K_H$ is completely determined by its action on $x + g(x) k[x]$. Thus, the elements of $\mathrm{Aut}(K_H|k)$ are in correspondence with $G/H$. We thus have $[K : K_H] = \#H$, $[K : k] = \#G$ and $[K_H : k] = \#(G/H)$. It follows that $\# \mathrm{Aut}(K_H|k) = [K_H : k]$ and $K_H | k$ is thus Galois.

To be sure, we have glossed over many details. However, many important observations are captured in the Fundamental Theorem of Galois Theory.

Fundamental Theorem of Galois Theory

Suppose $K | k$ is Galois and $G = \mathrm{Gal}(K|k)$.

Correspondence

There is an inclusion reversing correspondence between intermediate fields $L$ of $K|k$ and subgroups $H$ of $G$.

Normality $\leftrightarrow$ Galois

If $H \leftrightarrow L$, then $H$ is normal in $G$ if and only if $L|k$ is Galois. In this situation $\mathrm{Gal}(L|k) \cong G/H$.

The Correspondence Preserves Lattices

Suppose $H_1 \leftrightarrow L_1$ and $H_2 \leftrightarrow L_2$ for $H_1, H_2 \leq G$ and $L_1, L_2$ intermediate fields of $K|k$. Then $\langle H_1, H_2 \rangle \leftrightarrow L_1 \cap L_2$ and $H_1 \cap H_2 \leftrightarrow L_1 L_2$. (Here $\langle H_1, H_2 \rangle$ is the smallest subgroup of $G$ containing both $H_1$ and $H_2$ and $L_1 L_2$ is the smallest field containing both $L_1$ and $L_2$). Moreover the inclusions (e.g. $L_1 \cap L_2 \subset L_1 \subset L_1 L_2$) are reversed under the correspondence.

Here we write arrows for the inclusion map. The correspondence reverses inclusion.

The correspondence between subgroups of $\mathrm{Gal}(K|k)$ and subfields of $K|k$ is illustrated by the lattices of subfields of $\mathbb{Q}(i, \sqrt[8]{2})$ and subgroups of $G = \langle \sigma, \tau : \sigma^8 = \tau^2 = 1, \sigma \tau = \tau \sigma^3 \rangle$. This example was cribbed from Abstract Algebra, second edition by Dummit and Foote.
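A much smaller example can be worked out by brute force. The sketch below takes $K = \mathbb Q(\sqrt 2, \sqrt 3)$ over $k = \mathbb Q$ and assumes the standard fact that its Galois group consists of the four sign changes $\sqrt 2 \mapsto \pm \sqrt 2$, $\sqrt 3 \mapsto \pm \sqrt 3$. Each automorphism acts diagonally on the basis $1, \sqrt 2, \sqrt 3, \sqrt 6$, so the fixed field of a subgroup is spanned by the basis elements it fixes. All names are made up for this illustration.

```python
from itertools import product

BASIS = ["1", "sqrt2", "sqrt3", "sqrt6"]

def action(s2, s3):
    """Signs by which sqrt2 -> s2*sqrt2, sqrt3 -> s3*sqrt3 scales the basis."""
    return (1, s2, s3, s2 * s3)

G = list(product((1, -1), repeat=2))

def compose(g, h):
    return (g[0] * h[0], g[1] * h[1])

def is_subgroup(H):
    # A finite subset containing the identity and closed under products is a subgroup.
    return (1, 1) in H and all(compose(g, h) in H for g in H for h in H)

def subgroups():
    subsets = (frozenset(g for i, g in enumerate(G) if mask >> i & 1)
               for mask in range(1, 2 ** len(G)))
    return [H for H in subsets if is_subgroup(H)]

def fixed_basis(H):
    """Basis elements of K fixed by every automorphism in H; they span K_H."""
    return [b for i, b in enumerate(BASIS) if all(action(*g)[i] == 1 for g in H)]

for H in sorted(subgroups(), key=len):
    print(sorted(H), fixed_basis(H))
```

In each case the number of fixed basis elements is $[K_H : \mathbb Q] = \#G / \#H$, and larger subgroups have smaller fixed fields, as the correspondence requires.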


From Measures to Metrics on Pro-finite Completions

The complete 3-nary tree represents the family tree where each individual in a generation spawns (asexually) exactly three progeny in the subsequent generation. The image to the left represents 7 generations beginning from a single ancestor (the root) at the center of the image.

If we imagine the generations continuing ad infinitum, then we arrive at an object called the pro-finite completion of the tree. Loosely speaking this is the topological space which consists of all infinite paths from the root down through the (infinitely many) generations.

The pro-finite completion of the complete 3-nary tree can be put in correspondence with sequences of the form $(m_n)$ where each $m_n \in \{0,1,2\}$ as follows: Each descendant of an individual is labelled either 0, 1 or 2 (this can be done consistently by ordering, say, counterclockwise in the embedding above, but it doesn’t really matter so long as the labels are fixed for all time). A sequence starting, say, $(1,0,2,1,\ldots)$ represents a child of (reading right to left) the first child of the second child of the zeroth child of the first child of the root. Admittedly ‘zeroth child’ sounds awkward, but we think of these as labels and not ordinals.

Visually, we may think of the pro-finite completion to be the boundary of the infinite graph, and the corresponding sequence $(m_n)$ as an address containing the information necessary to describe how to traverse the tree to get to that point on the boundary.

There are other embeddings of the complete 3-nary tree, including the ‘balloon embedding’ on the right. In this embedding the pro-finite completion is visualized as its (fractal) boundary. This embedding gives another construction of the fractal known as Sierpinski’s Triangle.

In this embedding you may think of an ‘address’ of a point on the boundary as given by a sequence of ‘Left’, ‘Right’ and ‘Forward’ directions were you to drive to that point from the root along the edges of the graph. A bijection between $\{0,1,2\}$ and {Left, Right, Forward} will produce the sequence $(m_n)$.

Of course there’s nothing special about the 3-nary tree. We could start with any number of descendants per individual per generation. Indeed, we could let the number of descendants vary either between generations, or within a generation. We will see some examples of this soon.

Balloon embeddings of the complete 2-nary (binary), 4-nary and 5-nary trees. In each case the pro-finite completion is the fractal boundary of these graphs, and not the depicted edges and vertices.

Random Trees

There are lots of ways to make random graphs and trees, but here we will concentrate on a sort of random tree that will arise in the study of prime splitting in towers of number fields. We will suppose we start with a single ancestor (the root), and that each individual in the $n$th generation has an independent, identically distributed, bounded number of children. Note that the bound on the number of children may grow with generations, but for each generation there is some upper bound on the number of children an individual may have.

A simple example of a random tree where each individual has an equal chance of having 1, 2 or 3 offspring. The images are different embeddings of the same random tree.

Suppose the largest number of children an individual in the $n$th generation may have is $b_n$ (for instance, the random tree above has $b_n=3$ for all $n$). We call the sequence $(b_n)$ the sequence of generation bounds, and we call the tree where each individual in the $n$th generation has exactly $b_n$ children the complete $(b_n)$-nary tree. Every random tree with generation bounds $(b_n)$ can be embedded as a subtree in the complete $(b_n)$-nary tree.

As in the non-random case, a point in the pro-finite completion of a random tree is specified by its address, given as the directions necessary to traverse the tree from the root to a point on the ‘boundary’. Another way of representing the information given in the address is provided by the list of vertices $(v_n)$ one passes through on the voyage from the root. Here one assumes that the vertices are uniquely labelled. If $(v_n)$ is such a list of vertices we will write $v_m | v_n$ for all $m > n$. Loosely speaking, a vertex $v$ divides the vertex $w$ if $v$ is a vertex further down the tree from $w$. Put even more simply, $v | w$ if $v$ is descended from $w$. We will denote the root of the tree by $v_0$ and note that $v | v_0$ for all vertices $v$.

Let $B$ be the pro-finite completion of our (possibly random) tree as represented by sequences of vertices $(v_n)$, one per generation, with $v_m | v_n$ for all pairs $m > n$. For any vertex $w$ we define $$B(w) = \{ (v_n) \in B : w = v_m \mbox{ for some } m \}.$$ Loosely speaking $B(w)$ is the set of points in the pro-finite completion (boundary of the tree) that are downstream from vertex $w$.

$B(w)$ represents the part of the pro-finite completion that lies in the blue disk (left) or arc (right). Note that these are different representations of the same $B(w)$ on different embeddings of the same random tree.

Note that if $u$ and $w$ are different vertices, then either $B(u)$ and $B(w)$ are disjoint, or one is a subset of the other. It is worth supplying your own proof of this, or at least understanding why it is true from a picture.
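If a picture is not convincing, a brute force check on a finite truncation may be. The sketch below is an approximation under stated assumptions: it uses the complete 3-nary tree cut off after `DEPTH` generations, labels each vertex by its address (a tuple of labels, so the root is the empty tuple), and lets the vertices of the last generation stand in for points of the boundary. In this model $u | w$ exactly when the address of $w$ is a prefix of the address of $u$.

```python
from itertools import product

LABELS = (0, 1, 2)
DEPTH = 3   # truncation depth (assumption: a stand-in for the infinite tree)

BOUNDARY = list(product(LABELS, repeat=DEPTH))
VERTICES = [v for n in range(DEPTH + 1) for v in product(LABELS, repeat=n)]

def B(w):
    """Boundary points downstream from vertex w (w is a prefix of the address)."""
    return frozenset(x for x in BOUNDARY if x[:len(w)] == w)

def nested_or_disjoint(u, w):
    Bu, Bw = B(u), B(w)
    return Bu.isdisjoint(Bw) or Bu <= Bw or Bw <= Bu

print(all(nested_or_disjoint(u, w) for u in VERTICES for w in VERTICES))
```

The underlying reason is visible in the code: two prefixes of addresses are either comparable (one extends the other) or they disagree at some position, and in the latter case no address can extend both.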

$\sigma$-algebras and measures on $B$

We eventually want to talk about measures (and metrics) on the pro-finite completion of a random tree, but first we need a suitable $\sigma$-algebra. As usual, we actually define a nice collection of sets that we want to be in our $\sigma$-algebra and consider the smallest $\sigma$-algebra that does the trick. Dynkin’s $\pi$-$\lambda$ Theorem seems particularly salient here, and we define $\mathcal P$ to be the $\pi$-system given by all $B(w)$ for all vertices of our tree. That is $$\mathcal P = \{ B(v) : v \mbox{ is a vertex} \} \cup \{\emptyset\}.$$ We have to throw in the empty set, because a $\pi$-system is a collection of sets closed under intersection, and by our previous remarks, it is possible (common, in fact) for elements of $\mathcal P$ to be disjoint. We set $\mathcal D$ to be the $\sigma$-algebra on $B$ generated by $\mathcal P$. And we take $(B, \mathcal D)$ to be the measurable space in which all calculations occur.

Notice that, since the intersection of any two elements of $\mathcal P$ is again an element of $\mathcal P$ we see that elements of $\mathcal D$ are simply (possibly countable) disjoint unions of elements of $\mathcal P$. That is, for each set $A \in \mathcal D$ there is a (finite or) countable collection of vertices $V$, and a (finite or) countable $X \subset B$ such that $$A = \bigsqcup_{v \in V} B(v) \sqcup \bigsqcup_{x \in X} \{x \}.$$ The disjointness of this union implies we may do this in such a way that for any $u, v \in V$, $u \nmid v$, and for any $x \in X$ and $v \in V$, $x \not \in B(v)$. We call $V$ a reduced set of vertices for $A$.

The $\pi$-$\lambda$ Theorem implies that a measure $\mu$ on $(B, \mathcal D)$ is determined completely by its values on $\mathcal P$. Note that if $w_1, \ldots, w_d$ are the child-vertices of vertex $w$, then $\mu(B(w)) = \mu(B(w_1)) + \cdots + \mu(B(w_d))$ (and conversely, any collection of $\{m_v \in [0, \infty] : v \mbox{ a vertex}\}$ satisfying all consistency conditions of the form $m_w = m_{w_1} + \cdots + m_{w_d}$ will determine a measure on $(B, \mathcal D)$). There may be special measures on $(B, \mathcal D)$ depending on the construction of your tree, but for now we maintain complete generality, and see how various aspects of the measure interact to potentially give a metric on $B$.
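The consistency conditions are easy to test on a finite piece of a tree. In the following sketch everything is hypothetical: the tree `CHILDREN` is a made-up example truncated after two generations, `LEAF_MASS` assigns arbitrary masses to the last generation, and the masses of the other vertices are obtained by summing downstream.

```python
from fractions import Fraction

# Hypothetical tree, truncated after two generations; vertices without an entry are leaves.
CHILDREN = {"root": ["a", "b"], "a": ["a0", "a1", "a2"], "b": ["b0"]}
LEAF_MASS = {"a0": Fraction(1, 4), "a1": Fraction(1, 4),
             "a2": Fraction(0), "b0": Fraction(1, 2)}

def mass(v):
    """m_v, obtained by summing the masses of the leaves downstream from v."""
    kids = CHILDREN.get(v, [])
    if not kids:
        return LEAF_MASS[v]
    return sum(mass(k) for k in kids)

M = {v: mass(v) for v in ["root", "a", "b", *LEAF_MASS]}

def is_consistent(m):
    """Check m_w = m_{w_1} + ... + m_{w_d} at every vertex w with children."""
    return all(m[w] == sum(m[k] for k in kids) for w, kids in CHILDREN.items())

print(M["root"], is_consistent(M))
```

Changing a single value, say setting `M["a"]` to 1 without touching its children, breaks consistency, which is the finite shadow of the fact that not every assignment of numbers to vertices comes from a measure.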

Recall that an atom of the measure $\mu$ is a set $A \in \mathcal D$ such that $\mu(A) > 0$ and if $C \in \mathcal D$ is a proper subset of $A$ then $\mu(C) = 0$. By our construction of elements of $\mathcal D$ we see that if $A$ is an atom of $\mu$ then either $A = \{x \}$ for some $x \in B$, or $A = B(v)$ for some vertex $v.$ In fact, we will see that this latter situation can only occur when $B(v)$ is itself a singleton. To see why, suppose the vertices $v_1, \ldots, v_d$ are the immediate descendants of $v$. Then, $$\mu(B(v)) = \mu(B(v_1)) + \cdots + \mu(B(v_d)).$$ If $d > 1$, it is not possible for $\mu(B(v)) > 0$ and $\mu(B(v_n)) = 0$ for all $n=1,\ldots, d$. Hence, if $v$ has more than one immediate descendant, $B(v)$ cannot be an atom of $\mu$. If, on the other hand $d = 1$ then $B(v) =B(v_1)$ and we can repeat our argument to show that either $B(v_1)$ is not an atom, or $v_1$ has only one immediate descendant. It follows that if $B(v)$ is an atom then each descendant of $v$ has only one immediate descendant. That is, $B(v)$ contains only one $x \in B$, and hence if $B(v)$ is an atom, then in fact $B(v) = \{x \}$.

$B(v)$ can be a singleton only when all descendants of $v$ have only one immediate descendant. The only $B(v)$ that can be atoms of $\mu$ are singletons.

If $\mu$ has no atoms, then it is said to be diffuse. If $\mu(B(v)) > 0$ for all $B(v)$ that are not singletons, then we say $\mu$ is a full measure.

A pseudo-ultrametric formed from $\mu$

A metric on $B$ is a function $\delta: B \times B \rightarrow [0, \infty]$ such that for all $x,y,z \in B$,

  1. $\delta(x,x) = 0$
  2. $\delta(x, y) = 0$ implies $x = y$
  3. $\delta(x,y) = \delta(y,x)$
  4. $\delta(x,z) \leq \delta(x,y) + \delta(y,z)$

Note that we allow the possibility that $\delta$ is infinite. This is a slight generalization of the usual notion of a metric, but it disturbs very little. If we enforce the stronger requirement $$4'. \quad \delta(x,z) \leq \max\{\delta(x,y), \delta(y,z)\}$$ then we say $\delta$ is an ultrametric. If we instead drop requirement (2), then we say $\delta$ is a pseudometric. Thus a pseudo-ultrametric $\delta$ satisfies for all $x,y,z \in B$,

  • $\delta(x,x) = 0$
  • $\delta(x,y) = \delta(y,x)$
  • $\delta(x,z) \leq \max\{ \delta(x,y), \delta(y,z) \}$

The third condition is called the ultrametric inequality or the strong triangle inequality.

Theorem

Given a measure $\mu$ on $(B, \mathcal D)$, define $\delta : B \times B \rightarrow [0,\infty]$ by $\delta(x,x) = 0$ for all $x \in B$, and for $x \neq y$ , $$\delta(x,y) = \inf\{ \mu(A) : A \in \mathcal P \mbox{ with } x, y \in A\}.$$ Then $\delta$ is a pseudo-ultrametric. Moreover, if $\mu$ is a full measure, then $\delta$ is an ultrametric.

Another way of defining $\delta$ is to first define the least common ancestor vertex of any two $x, y \in B$ by $a(x,y)$ in which case $\delta(x,y) = \mu(B(a(x,y)))$. This definition only makes sense when $x \neq y$.

Proof

To show that $\delta$ is a pseudo-ultrametric, the only nontrivial condition to check is the ultrametric inequality. That is, for any $x, y, z \in B$, $\delta(x,z) \leq \max\{ \delta(x,y), \delta(y,z) \}$.

There are two cases. In the first (slide up) we have $y \not \in B(a(x,z))$. In this case $B(a(x,z)) \subset B(a(y,z)) = B(a(x,y))$ and hence $\delta(x,z) \leq \max\{ \delta(x,y), \delta(y,z)\}$. (Note that equality is still possible in this case, because it is possible that $\mu(B(a(x,z))) = \mu(B(a(y,z)))$ even when $B(a(x,z))$ is a proper subset of $B(a(y,z))$.)

In the second case (slide down) we have $y \in B(a(x,z))$. This case is a bit more delicate, but we see that, exchanging the labels $x$ and $z$ if necessary, $a(x,z) = a(y,z)$. It follows that $\delta(x,z) = \delta(y,z)$ and $\delta(x,y) \leq \delta(y,z)$ which together yield $\delta(x,z) \leq \max\{\delta(x,y), \delta(y,z)\}$ as desired.

Notice that if $\mu$ is full and $x \neq y$, then $B(a(x,y))$ is not a singleton, so $\delta(x,y) = \mu(B(a(x,y))) > 0$ and hence $\delta$ is in fact an ultrametric. $\square$
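The theorem can be tested exhaustively on a small example. The sketch below makes several assumptions for the sake of illustration: the tree is the complete 3-nary tree truncated at depth `DEPTH`, points of the boundary are replaced by addresses of that length, and $\mu$ is the product measure in which the children labelled 0, 1, 2 receive the (arbitrarily chosen) fractions $1/2, 1/3, 1/6$ of their parent's mass.

```python
from fractions import Fraction
from itertools import product

WEIGHTS = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))   # assumed branch weights
DEPTH = 3                                                     # truncation depth

def mu(v):
    """mu(B(v)) for the vertex with address v: the product of the weights along v."""
    result = Fraction(1)
    for label in v:
        result *= WEIGHTS[label]
    return result

def common_ancestor(x, y):
    """Least common ancestor a(x, y): the longest common prefix of the addresses."""
    n = 0
    while n < len(x) and n < len(y) and x[n] == y[n]:
        n += 1
    return x[:n]

def delta(x, y):
    return Fraction(0) if x == y else mu(common_ancestor(x, y))

points = list(product(range(3), repeat=DEPTH))
ultrametric = all(delta(x, z) <= max(delta(x, y), delta(y, z))
                  for x in points for y in points for z in points)
print(ultrametric)
```

Because every branch weight is positive, this $\mu$ is a full measure, and accordingly `delta` is positive on distinct points.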

We say that $x \in B$ is isolated with respect to $\delta$ if there exists $\epsilon > 0$ such that $\delta(x,y) > \epsilon$ for all $y \neq x$. As the next result shows, isolated points come from atoms of $\mu$.

Lemma

Suppose $x \in B$ is such that $\{x\}$ is an atom of $\mu$. Then $x$ is isolated with respect to $\delta$. In particular, if $y \neq x$ then $\delta(x,y) \geq \mu\{x\}$.

Proof

By definition $x \in B(a(x,y))$. It follows from the monotonicity of $\mu$ that $\mu\{x\} \leq \mu(B(a(x,y))) = \delta(x,y)$. $\square$


Clark Honors College, Faculty in Residence application

When I was last on the academic job market in 2008, I was torn between positions at liberal arts colleges and research universities. I had offers from excellent liberal arts schools, including Claremont McKenna College and Bucknell College, but ultimately decided to come to UO so that I had an opportunity to supervise graduate students. I enjoy supervising undergraduate students as well, and have advised four CHC Honors theses, three departmental Honors theses, and several other undergraduate research/reading projects. Supervising students is my favorite aspect of the job. Beyond the usual reward one finds in sharing knowledge with others, getting to know our varied students—understanding their knowledge and skills, their likes and dislikes, and their dreams for the future—is the major driving force keeping me in academia.

I am applying for a Clark Honors College, Faculty in Residence position so that I can pursue the academic work I love in an environment where it is rewarded.

My research lies at the intersection of number theory, probability and mathematical statistical physics. This is a fascinating genre of mathematics research, with many opportunities for undergraduate research. The connection with physics allows intuition to be brought to bear on mathematical problems, which in turn allows undergraduates to make meaningful contributions to mathematical research—at least in the form of conjectures, and discovery of new phenomena.

I also enjoy reading mathematics broadly, and have experience supervising students on mathematics research that is either outside my educational background or applied to other domains of knowledge.

Besides supervision of research, I am also interested in undergraduate mathematics education, especially for students who may not ultimately pursue a degree in a quantitative/scientific field. Mathematics is simultaneously the language of the universe and a ubiquitous tool in modern life. Mathematics education tends to favor the latter, but it is in the former where the rich beauty of mathematics lies. The aesthetics of mathematics is often invisible to individuals who view it only as a tool. I would like to bring this aesthetic vision of mathematics to undergraduates (and others) who may not otherwise experience the sublime beauty of mathematics.

An example of a seminar I would like to offer would be the Development of New Numbers. Such a seminar could trace the history and necessity of new kinds of numbers (natural, integer, rational, algebraic, transcendental, real, complex, etc) as human knowledge has developed. I see such a seminar lying at the intersection of history, philosophy and mathematics, and I would interweave group exercises/projects to motivate the mathematics and inform the necessity (and beauty) of the development of new numbers.

Besides teaching, supervision and research, I also engage heavily in university service. Currently I am the Past President of the University Senate and the President of United Academics, as well as a member of many other committees (including chair of the Core Ed Council). I see some of my current service as fulfillment of certain projects/initiatives started as Senate President. My experience working on core education may be useful in any curricular redesign happening in CHC. While I expect to always be involved in university service, I also expect the level to subside from the current high-water mark. I enjoy the challenge of leadership, but I also wistfully dream of a time when I can fill my days reading, doing math, working with students and doing a sensible amount of service, and hopefully earning the rank of full professor.

Finally, I would like to underscore my commitment to the diversification of mathematics (and science more broadly). Much of this problem arises from enculturation of expectations by society at large, but many issues arise from an old guard of mathematicians who propagate racial and gender disparity via preferential treatment for men and microaggression towards others. These attitudes are incongruent with how I view myself as an educator and scholar, and I look forward to working in a unit that values the various backgrounds and experiences of our students, faculty and staff.




Post-tenure Review Statement

Research

I study the distribution of algebraic numbers, mathematical statistical physics and roots/eigenvalues of random polynomials/matrices. 

Projects in Progress

1. The distribution of values of the non-archimedean absolute Vandermonde determinant and the non-archimedean Selberg integral (with Jeff Vaaler). The Mellin transform of the distribution function of the non-archimedean absolute Vandermonde (on the ring of integers of a local field) is related to a non-archimedean analog of the Selberg/Mehta integral. A recursion for this integral allows us to find an analytic continuation to a rational function on a cylindrical Riemann surface. Information about the poles of this rational function allows us to draw conclusions about the range of values of the non-archimedean absolute Vandermonde.

2. Non-archimedean electrostatics. This is the study of charged particles in a non-archimedean local field whose interaction energy is proportional to the log of the distance between particles, at fixed coldness $\beta$. The microcanonical, canonical and grand canonical ensembles are considered, and the partition function is related to the non-archimedean Selberg integral considered in project 1. Probabilities of cylinder sets are explicitly computable in both the canonical and grand canonical ensembles.

3. Adèlic electrostatics and global zeta functions (with Joe Webster). The non-archimedean Selberg integral/canonical partition function are examples of Igusa zeta functions, and as such are local Euler factors in a global zeta function. This global zeta function (the exact definition of which is yet to be determined) is also the partition function for a canonical electrostatic ensemble defined on the adèles of a number field. The archimedean local factors relate to the ordinary Selberg integral, the Mehta integral, and the partition function for the complex asymmetric $\beta$ ensemble. The dream would be a functional equation for the global zeta function via Fourier analysis on the idèles, though any analytic continuation would tell us something about the distribution of energies in the adèlic ensemble.

4. Pair correlation in circular ensembles when $\beta$ is an even square integer (with Nate Wells and Elisha Hulbert). This can be expressed in terms of a form in a grading of an exterior algebra, the coefficients of which are products of Vandermonde determinants in integers. Hopefully an understanding of the asymptotics of these coefficients will lead to scaling limits for the pair correlation function for an infinite family of coldnesses via hyperpfaffian/Berezin integral techniques. This would partially generalize the Pfaffian point process arising in COE and CSE. There is a lot of work to do, but there is hope.

5. Martingales in the Weil height Banach space (with Nathan Hunter). Allcock and Vaaler produce a Banach space in which $\overline{\mathbb Q}^{\times}/\mathrm{Tor}$ embeds densely in a co-dimension 1 subspace, the (Banach space) norm of which extends the logarithmic Weil height. Field extensions of the maximal abelian extension of $\mathbb Q$ correspond to $\sigma$-algebras, and towers of fields to filtrations. Elements in the Banach space (including those from $\overline{\mathbb Q}^{\times}/\mathrm{Tor}$) represent random variables, and the setup is ready for someone to come along and use martingale techniques, including the optional stopping theorem, to tell us something about algebraic numbers.
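
As a rough sketch of how projects 1 and 2 fit together, the objects involved can be written as follows. The notation is assumed here purely for illustration and may differ from that of the papers in progress: $\mathfrak o$ is the ring of integers of the local field, $\mu$ is Haar measure normalized so that $\mu(\mathfrak o) = 1$, $|\cdot|$ is the absolute value on the local field, $N$ is the number of particles, and the proportionality constant in the energy is taken to be $1$. With $\Delta(x) = \prod_{i<j} (x_i - x_j)$ the Vandermonde determinant, the log-interaction energy and the canonical partition function at coldness $\beta$ are
$$ E(x) = -\sum_{1 \le i < j \le N} \log | x_i - x_j |, \qquad Z_N(\beta) = \int_{\mathfrak o^N} e^{-\beta E(x)} \, d\mu^N(x) = \int_{\mathfrak o^N} | \Delta(x) |^{\beta} \, d\mu^N(x). $$
Since $|\Delta(x)| \le 1$ on $\mathfrak o^N$, writing $F(t) = \mu^N \{ x \in \mathfrak o^N : |\Delta(x)| \le t \}$ for the distribution function of the absolute Vandermonde gives
$$ Z_N(s) = \int_0^1 t^{s} \, dF(t), $$
which is the sense in which a Mellin-type transform of the distribution function coincides with a Selberg-type integral and with the partition function of the electrostatic ensemble.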

Instruction

I have three current PhD students and one current departmental Honors student. I have supervised two completed PhDs and six completed honors theses. You can find a list of current and completed PhD and honors students on my CV.

My teaching load has been reduced for the last five years (or so) due to an FTE release for serving on the Executive Council of United Academics. As President of United Academics and Immediate Past President of the University Senate, I am not teaching in the 2018 academic year. In AY2019, I am scheduled to teach a two-quarter sequence on mathematical statistical physics.

I take my teaching seriously. I prepare detailed lecture notes for most courses (exceptions being introductory courses, where my notes are better characterized as well-organized outlines). When practical and appropriate I use active learning techniques, mostly through supervised group work. I am a tough but fair grader.

Service

Service encompasses pretty much everything that an academic does outside of teaching and research. This includes advising, serving on university and departmental committees, reviewing papers, writing letters of recommendation, organizing seminars and conferences, serving on professional boards, etc. The impossibility of doing it all allows academics to decide what types of service they are going to specialize in based on their interests and abilities.

I have spent the last three years heavily engaged in university level service. I currently serve as the president of United Academics of the University of Oregon, and I am the immediate-past president of the University Senate. Before that I was the Vice President of the Senate and the chair of the Committee on Committees. All of these roles are difficult and require a large investment of thought and energy. The reward for this hard work is a good understanding of how the university works, who to go to when issues need resolution, and who can be safely ignored.

I know what academic initiatives are underway, being involved in several of them. I am spearheading, with the new Core Education Council, the reform of general education at UO. I am working on the New Faculty Success Program (an onboarding program for new faculty) with the Office of the Provost and United Academics. I am currently on the Faculty Salary Equity Committee and its Executive Committee. I have been a bit player in many other projects and initiatives including student evaluation reform, the re-envisioning of the undergraduate multicultural requirement, and the creation of an expedited tenure process to allow the institution alacrity when recruiting eminent scholars. This list is incomplete.

Next year, with high probability, I will be the chair of the bargaining committee for the next collective bargaining agreement between United Academics and the University of Oregon (this assumes I am elected UA president). I will also be working with the Core Ed Council to potentially redefine the BA/BS distinction, with a personal focus on ensuring quantitative/data/information literacy is distributed throughout our undergraduate curriculum. I will also be working to help pilot (and hopefully scale) the Core Ed “Runways” (themed, cohorted clusters of gen ed courses) with the aspirational goal of having 100% of traditional undergraduates in a high-support, high-engagement, uniquely-Oregon first-year experience within the next 3-5 years.

As important as the service I am doing is the service I am not doing. I do little to no departmental service (though part of this derives from the CAS dean’s interpretation of the CBA) and I avoid non-required departmental functions (for reasons). I do routinely serve on academic committees for graduate/honors students, etc. I decline most requests to referee papers/grant applications, and serve on no editorial boards. The national organizations for which I am an officer are not mathematical organizations, but rather organizations dedicated to shared governance.

Diversity & Equity

The two principles which drive my professional work are truth and fairness.

I remember after a particularly troubling departmental vote, a senior colleague attempted to assuage my anger at the department by explaining that “the world is not fair.” I hate this argument because it removes responsibility from those participating in such decisions, and places blame instead on a stochastic universe. And, while there is stochasticity in the universe, we should be working toward ameliorating inequities caused by chance, and in instances where we have agency, making decisions which do not compound them.

I do not think the department does a very good job at recognizing or ameliorating inequities. Indeed, there are individuals, policies and procedures that negatively impact diversity. See my recent post Women & Men in Mathematics for examples.

My work on diversity and equity issues has been primarily through the University Senate and United Academics. As Vice-president of the UO Senate, I sat on the committee which vetted the Diversity Action Plans of academic units. I also worked on, or presided over several motions put forth by the University Senate which address equity, diversity and inclusion. Obviously, the work of the Senate involves many people, and in many instances I played only a bit part, but nonetheless I am proud to have supported/negotiated/presided over the following motions which have addressed diversity and equity issues on campus:

Besides my work with the Senate, I have also participated in diversity activities through my role(s) with United Academics of the University of Oregon. United Academics supports both a Faculty of Color and LGBTQ* Caucus which help identify barriers and propose solutions to problems affecting those communities on campus. United Academics bargained a tenure-track faculty equity study, and I am currently serving on a university committee identifying salary inequities based on protected class and proposing remedies for them.

I have attended innumerable rallies supporting social justice, and marched in countless marches. I flew to Washington D.C. to attend the March for Science. I’ve participated in workshops and trainings on diversity provided by the American Federation of Teachers and the American Association of University Professors.

I recognize that I am not perfect. I cannot represent all communities nor emulate the diversity of thought on campus. I have occasionally used outmoded words and am generally terrible at using preferred pronouns (though I try). I recognize my shortcomings and continually work to address them.

There are different tactics for turning advocacy into action, and individuals may disagree on their appropriateness and if/when escalation is called for. My general outlook is to work within a system to address inequities until it becomes clear that change is impossible from within. In such instances, if the moral imperative for change is sufficient then I work for change from without. This is my current strategy when tackling departmental diversity issues; I work with administrative units, the Senate and the union to put forth/support policies which minimize bias, discrimination and caprice in departmental decisions. I ensure that appropriate administrators know when I feel the department has fallen down on our institutional commitment to diversity, and I report incidents of bias, discrimination and harassment to the appropriate institutional offices (subject to the policy on Student Directed Reporters).

Fairness is as important to me as truth, and I look forward to the day when I can focus more of my time uncovering the latter instead of continually battling for the former.


Diversity and Equity

Notice

This post is part of my post-tenure review. If it seems self-serving, that is because it is.

The two principles which drive my professional work are truth and fairness.

I remember after a particularly troubling departmental vote, a senior colleague attempted to assuage my anger at the department by explaining that “the world is not fair.” I hate this argument because it removes responsibility from those participating in such decisions, and places blame instead on a stochastic universe. And, while there is stochasticity in the universe, we should be working toward ameliorating inequities caused by chance, and in instances where we have agency, making decisions which do not compound them.

I do not think the department does a very good job at recognizing or ameliorating inequities. Indeed, there are individuals, policies and procedures that negatively impact diversity. See my recent post Women & Men in Mathematics for examples.

My work on diversity and equity issues has been primarily through the University Senate and United Academics. As Vice-president of the UO Senate, I sat on the committee which vetted the Diversity Action Plans of academic units. I also worked on, or presided over several motions put forth by the University Senate which address equity, diversity and inclusion. Obviously, the work of the Senate involves many people, and in many instances I played only a bit part, but nonetheless I am proud to have supported/negotiated/presided over the following motions which have addressed diversity and equity issues on campus:

Besides my work with the Senate, I have also participated in diversity activities through my role(s) with United Academics of the University of Oregon. United Academics supports both a Faculty of Color and LGBTQ* Caucus which help identify barriers and propose solutions to problems affecting those communities on campus. United Academics bargained a tenure-track faculty equity study, and I am currently serving on a university committee identifying salary inequities based on protected class and proposing remedies for them.

I have attended innumerable rallies supporting social justice, and marched in countless marches. I flew to Washington D.C. to attend the March for Science. I’ve participated in workshops and trainings on diversity provided by the American Federation of Teachers and the American Association of University Professors.

I recognize that I am not perfect. I cannot represent all communities nor emulate the diversity of thought on campus. I have occasionally used outmoded words and am generally terrible at using preferred pronouns (though I try). I recognize my shortcomings and continually work to address them.

There are different tactics for turning advocacy into action, and individuals may disagree on their appropriateness and if/when escalation is called for. My general outlook is to work within a system to address inequities until it becomes clear that change is impossible from within. In such instances, if the moral imperative for change is sufficient then I work for change from without. This is my current strategy when tackling departmental diversity issues; I work with administrative units, the Senate and the union to put forth/support policies which minimize bias, discrimination and caprice in departmental decisions. I ensure that appropriate administrators know when I feel the department has fallen down on our institutional commitment to diversity, and I report incidents of bias, discrimination and harassment to the appropriate institutional offices (subject to the policy on Student Directed Reporters).

Fairness is as important to me as truth, and I look forward to the day when I can focus more of my time uncovering the latter instead of continually battling for the former.
