The perfect Nullstellensatz

Question: to what extent can we recover a polynomial from its zeros?

Our goal in this post is to give several answers to this question and its generalisations. In order to obtain elegant answers, we work over the complex field $\mathbb{C}$ (e.g., there are many polynomials, such as $x^{2n} + 1$, that have no real zeros; the fact that they don’t have real zeros tells us something about these polynomials, but there is no way to “recover” these polynomials from their nonexistent zeros). We will write $\mathbb{C}[z]$ for the algebra of polynomials in one complex variable with complex coefficients, and consider a polynomial as a function of the complex variable $z \in \mathbb{C}$. We will also write $\mathbb{C}[z_1, \ldots, z_d]$ for the algebra of polynomials in $d$ (commuting) variables, and think of polynomials in $\mathbb{C}[z_1, \ldots, z_d]$ – at least initially – as functions of the variable $z = (z_1, \ldots, z_d) \in \mathbb{C}^d$.

[Update June 24, 2019: contrary to what I thought, the main theorem presented below holds over arbitrary fields, not just over the complex numbers, very much by the same proof. See this post.]

1. Background

Let us begin by recalling that by the Fundamental Theorem of Algebra, every (one variable) polynomial $p$ decomposes into a product of linear factors. Thus, if we know the zeros including their multiplicities then we can determine the polynomial up to a multiplicative factor. Moreover, if we know that the zeros of some polynomial $p$ are $c_1, \ldots, c_k$, then we know that $p$ must have the form

(*) $p(z) = a(z-c_1)^{n_1} \cdots (z-c_k)^{n_k}$,

where the $n_1, \ldots, n_k$ can, in principle, be any positive integers.

Let us reformulate the above observation in a slightly different language, which generalizes well to the multivariable setting. If $p$ is a polynomial, we write

$Z(p) = \{z \in \mathbb{C} : p(z) = 0\}.$

Every polynomial $p$ generates a principal ideal $\langle p \rangle = \{qp : q \in \mathbb{C}[z]\}$. Conversely, every ideal $J \triangleleft \mathbb{C}[z]$ in $\mathbb{C}[z]$ is principal. For an ideal $J$ we write

$Z(J) = \{z \in \mathbb{C} : q(z) = 0$  for all $q \in J \}$.

If $J = \langle p \rangle$, then $Z(J) = Z(p)$. Now, if we begin with a polynomial $p$ as in (*), and we are given $q \in \mathbb{C}[z]$ such that $Z(q) = Z(p)$, what can we say about $q$? Well, if we knew that the zeros of $q$ have the same multiplicities as those of $p$, then we would know that $q = \alpha p$ for some nonzero scalar $\alpha$, and in particular we would know that $q \in \langle p \rangle$ (and vice versa, of course). However, in general, $Z(q) = Z(p)$ only implies that $q \in \langle (z - c_1) \cdots (z - c_k) \rangle$, which is usually larger than $\langle p \rangle$. Note that if $N = \max\{n_1, \ldots, n_k\}$, then $q^N \in \langle p \rangle$, because $q^N$ is clearly equal to the product of $p$ and some other polynomial.
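To make the last observation concrete, here is a minimal sketch in plain Python. The polynomials $p(z) = (z-1)^2(z-2)$ and $q(z) = (z-1)(z-2)$, and the helper names `poly_mul`, `poly_rem` and `divides`, are illustrative choices and not part of the discussion above. Polynomials are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_rem(a, b):
    """Remainder of a divided by b (b must have a nonzero leading coefficient)."""
    a = [Fraction(x) for x in a]
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, y in enumerate(b):
            a[shift + i] -= factor * y
        a.pop()  # the leading coefficient has just been cancelled
    return a

def divides(b, a):
    """True when b divides a, i.e. when a lies in the principal ideal <b>."""
    return all(c == 0 for c in poly_rem(a, b))

z_minus_1 = [-1, 1]
z_minus_2 = [-2, 1]
p = poly_mul(poly_mul(z_minus_1, z_minus_1), z_minus_2)  # (z-1)^2 (z-2)
q = poly_mul(z_minus_1, z_minus_2)                        # (z-1)(z-2)

# Z(q) = Z(p) = {1, 2}, but q itself is not a multiple of p ...
q_in_ideal = divides(p, q)
# ... while q^N is, with N = max{2, 1} = 2.
q_squared_in_ideal = divides(p, poly_mul(q, q))

print(q_in_ideal, q_squared_in_ideal)
```

Here $q^2 = p \cdot (z-2)$, which is the "some other polynomial" in the remark above.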

Now let us consider the much richer case of polynomials in several commuting variables. For brevity, let us write $z$ for the vector variable $(z_1, \ldots, z_d)$, and let us write $A = \mathbb{C}[z_1, \ldots, z_d]$. Since this algebra is not a principal ideal domain (that’s an easy exercise), it turns out to be more appropriate to talk about ideals rather than single polynomials. Let us define the zero locus $Z(J)$ as above:

$Z(J) = \{z \in \mathbb{C}^d : p(z) = 0$ for all $p \in J\}$.

We also introduce the following notation: given $S \subseteq \mathbb{C}^d$, we write

$I(S) = \{p \in A : p(z) = 0$ for all $z \in S\}$.

Note that $I(S)$ is always an ideal.

The question now becomes: to what extent can we recover $J$ from $Z(J)$? A slightly different but related question is: what is the gap between $J$ and $I(Z(J))$? We know already from the one variable case that we cannot hope to fully recover an ideal from its zero locus, but it turns out that a rather satisfactory solution can be given.

Suppose that $f$ is a polynomial which is not necessarily contained in $J$, but that $f^n \in J$ for some $n$ (think, for example, of $J = \langle z_1^2, z_2 \rangle$ and $f(z) = z_1$). Then for every $z \in Z(J)$, since $f(z)^n = 0$, we also have that $f(z) = 0$, so $f \in I(Z(J))$. So the ideal $I(Z(J))$ contains at least all polynomials $f$ such that $f^n \in J$.

Definition: Let $J \triangleleft A$. The radical of $J$ is the ideal

$rad(J) = \sqrt{J} := \{f \in A :$ there exists some $n$ such that $f^n \in J \}$.

(On the left hand side, there are two different commonly used notations for the radical).

Exercise: The radical of an ideal is an ideal.

Theorem (Hilbert’s Nullstellensatz): For every $J \triangleleft A$,

$I(Z(J)) = \sqrt{J}$.
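As a sanity check, here is the theorem worked out by hand for the example ideal mentioned above, assuming $d = 2$ (the choice of $d$ is only for illustration):

```latex
\begin{align*}
J &= \langle z_1^2, z_2 \rangle \subseteq \mathbb{C}[z_1, z_2], \\
Z(J) &= \{ z \in \mathbb{C}^2 : z_1^2 = 0,\ z_2 = 0 \} = \{(0,0)\}, \\
I(Z(J)) &= \{ p \in \mathbb{C}[z_1, z_2] : p(0,0) = 0 \} = \langle z_1, z_2 \rangle, \\
\sqrt{J} &= \langle z_1, z_2 \rangle \supsetneq J .
\end{align*}
```

The last line holds because $z_1^2 \in J$ and $z_2 \in J$ put both generators in $\sqrt{J}$, while $\langle z_1, z_2 \rangle$ is maximal and $\sqrt{J}$ is a proper ideal. The zero locus cannot tell $J$ apart from its radical.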

Nullstellensatz means “theorem of zero locus” in German, and we can all agree that this is an appropriate name for this theorem. We shall not prove this theorem; it is usually proved in a first or second graduate course in commutative algebra. It is a beautiful theorem, indeed, but it is not perfect. Below we shall obtain a perfect Nullstellensatz, that is, one in which the ideal is completely recovered by the zeros, with no need to take a radical. Of course, we will need to change the meaning of “zeros”.

2. An introduction to noncommutative commutativity

My recent work in operator algebras and noncommutative analysis has led me, together with my collaborators Guy Salomon and Eli Shamovich, to discover another Nullstellensatz (actually, we have a couple of Nullstellensätze, but I’ll tell you only about one). This result has already been known to some algebraists in one form or another – after we proved it, we found that it can be dug out of a paper of Eisenbud and Hochster – but does not seem to be well known. I will write the result and its proof in a language that I (and therefore, hopefully, anyone who’s had some graduate commutative algebra) can understand and appreciate.

Let $M_d(n)$ denote the set of all $d$-tuples of $n \times n$ matrices. We let $M_d = \sqcup_{n=1}^\infty M_d(n)$ be the disjoint union of the sets $M_d(n)$, where $n$ runs over all positive integers. That is, we are looking at all $d$-tuples of matrices of all sizes. This set $M_d$ is referred to in some places as “the noncommutative universe”. Elements of $M_d$ can be plugged into polynomials in noncommuting variables, and subsets of $M_d$ are where most of the action in “noncommutative function theory” takes place. We leave that story to be told another day.

Similarly, we let $CM_d(n)$ denote the set of all commuting $d$-tuples of $n \times n$ matrices. Note that we can consider $M_d(n)$ to be the space $\mathbb{C}^{n^2 d}$, and then $CM_d(n)$ is an algebraic variety in $\mathbb{C}^{n^2 d}$ given as the joint zero locus of $\frac{d(d-1)}{2}n^2$ quadratic equations in $n^2 d$ variables. We let $CM_d = \sqcup_{n=1}^\infty CM_d(n)$. Now we are looking at all commuting $d$-tuples of matrices of all sizes. This can be considered as the “commutative noncommutative universe”, or the “free commutative universe”. Another way of thinking about $CM_d$ is as the “noncommutative variety” cut out in $M_d$ by the $\frac{d(d-1)}{2}$ equations (in $d$ noncommuting variables)

$Z_i Z_j - Z_j Z_i = 0$    ,    $1 \leq i < j \leq d$.

Points in $CM_d$ can simply be plugged into any polynomial $p \in A$. For example, if $d = 2$ and $p(z) = 1+ 2 z_1 + 3z_1^2z_2^3$, then for $X = (X_1, X_2) \in CM_2$, we put

$p(X) = I + 2 X_1 + 3X_1^2 X_2^3$,

where $I$ is the identity of the same size as $X$ (that is, if $X \in CM_d(n)$, then the correct identity to use is $I_n$). In fact, points in $CM_d$ can be naturally identified with the space of finite dimensional representations of $A$, by

$X \in CM_d \longleftrightarrow \Big(ev_X : p \mapsto p(X)\Big)$.

(We shall use the word “representation” to mean a homomorphism of an algebra or ring into $M_n(\mathbb{C})$ for some $n$).
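The evaluation $p(X)$ can be sketched in a few lines of plain Python. The commuting pair `X1`, `X2` below and the helper names are illustrative assumptions, chosen only so that the arithmetic can be followed by hand; matrices are lists of rows.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(c, a):
    return [[c * x for x in row] for row in a]

def mat_pow(a, k):
    result = identity(len(a))
    for _ in range(k):
        result = mat_mul(result, a)
    return result

def p(x1, x2):
    """p(z) = 1 + 2 z_1 + 3 z_1^2 z_2^3, evaluated at a commuting pair (x1, x2)."""
    n = len(x1)
    term = mat_mul(mat_pow(x1, 2), mat_pow(x2, 3))
    return mat_add(mat_add(identity(n), mat_scale(2, x1)), mat_scale(3, term))

# A commuting pair of 2 x 2 matrices, i.e. a point of CM_2(2).
X1 = [[1, 1], [0, 1]]
X2 = [[2, 5], [0, 2]]

commute = mat_mul(X1, X2) == mat_mul(X2, X1)
value = p(X1, X2)
print(commute, value)
```

The diagonal entries of `value` equal the scalar value $p(1, 2) = 27$, as they should for upper triangular matrices with diagonals $1$ and $2$. Because the pair commutes, the order of the factors in $X_1^2 X_2^3$ does not matter, which is what makes $p(X)$ well defined for $p \in A$.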

Now, given an ideal $J \triangleleft A$, we can consider its zero set in $CM_d$:

$Z(J) = Z_{CM_d}(J) = \{X \in CM_d : p(X) = 0$ for all $p \in J\}$.

(We will omit the subscript $CM_d$ for brevity.) In the other direction, given a subset $S \subseteq CM_d$, we can define the ideal of functions that vanish on it:

$I(S) = \{p \in A : p(X) = 0$ for all $X \in S\}$.

Tautologically, for every ideal $J \triangleleft A$,

$I(Z(J)) \supseteq J$,

because every polynomial in $J$ annihilates every tuple on which every polynomial in $J$ is zero, right? The beautiful (and maybe surprising) fact is the converse.
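A small worked example suggests why matrix points can see more than scalar points. Take $d = 2$ and the ideal $J = \langle z_1^2, z_2 \rangle$ from Section 1, with $f(z) = z_1$ (the particular matrix point below is chosen for illustration):

```latex
\[
X = (X_1, X_2) = \left( \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \right) \in CM_2(2),
\qquad
X_1^2 = 0,\quad X_2 = 0 \ \Longrightarrow\ X \in Z(J),
\]
\[
\text{but}\quad f(X) = X_1 \neq 0,
\quad\text{so}\quad f = z_1 \notin I(Z(J)).
\]
```

So although $z_1 \in \sqrt{J}$ vanishes at the only scalar zero $(0,0)$ of $J$, it is already separated from $J$ by a single $2 \times 2$ matrix zero.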

3. The perfect Nullstellensatz – statement and full proof

We are now ready to state the free commutative Nullstellensatz. The following formulation is taken from Corollary 11.7 from the paper “Algebras of bounded noncommutative analytic functions on subvarieties of the noncommutative unit ball” by Guy Salomon, Eli Shamovich and myself (which I already advertised in an earlier blog post).

Theorem (free commutative Nullstellensatz): For every $J \triangleleft A$,

$J = I(Z(J))$.

Proof: This proof should be accessible to someone who took a graduate course in commutative algebra (but not too long ago!). We shall split it into several steps, including some review of required material. Someone who is fluent in commutative algebra will be able to understand the proof by just reading the headlines of the steps without going into the explanations. Recall that we are using the notation $A = \mathbb{C}[z_1, \ldots, z_d]$.

Step I: Changing slightly the point of view: what we shall prove is the following proposition:

Proposition: Let $p \in A$, and suppose that for every unital representation $\varphi : A/J \longrightarrow M_n(\mathbb{C})$,

$\varphi(p+J) = 0$.

Then $p \in J$.

Noting that

1. Representations of $A/J$ are precisely the representations of $A$ that annihilate $J$, and
2. Representations of $A$ are precisely point evaluations at points $X \in CM_d$, thus
3. Representations of $A/J$ are precisely points in $Z(J)$,

we see that the proposition says precisely that if $p \in I(Z(J))$ then $p \in J$, which is the direction of the Nullstellensatz that we need to prove.

Thus our goal is to prove the proposition.

Step II: A refresher on localization.

We shall require the notion of a localization of a ring. Let $R$ be a commutative ring with unit (any ring we shall consider henceforth will be commutative and with unit) and let $m$ be a maximal ideal in $R$.  Define $S = R \setminus m$ (the complement – not quotient – of $m$ in $R$). The localization of $R$ at $m$ is a ring that is denoted as $R_m$ (or $S^{-1}R$) that contains “a copy of $R$” and in which, loosely speaking, all elements of $S$ are invertible. Thus, still loosely speaking, the localization $R_m$ is the ring formed from all fractions $\frac{r}{s}$ where $r \in R$ and $s \in S$.

More precisely, $R_m$ is the quotient of the set $R \times S$ by the equivalence relation

$(r,s) \sim (r',s')$  if and only if  there exists $u \in S$ such that $u(rs' - r's) = 0$.

The equivalence class of the pair $(r,s)$ is written as $\frac{r}{s}$, and addition and multiplication are defined so as to agree with the usual formulas for addition and multiplication of fractions, that is,

$\frac{r}{s} + \frac{r'}{s'} = \frac{rs'+sr'}{ss'}$ ,

and

$\frac{r}{s} \cdot \frac{r'}{s'} = \frac{rr'}{ss'}$ .

We define a map $i_m : R \to R_m$ by $i_m(a) = \frac{a}{1}$. Clearly, $i_m(1)$ is the unit of $R_m$, and $R_m$ is again a commutative ring with unit.
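The role of the extra element $u$ in the equivalence relation, and the fact that $i_m$ need not be injective, can be seen in a small example. The ring and the maximal ideal below are chosen purely for illustration:

```latex
\[
R = \mathbb{C}[x,y] / \langle xy \rangle, \qquad
m = \langle x - 1,\ y \rangle \quad (\text{the point } (1,0)), \qquad
S = R \setminus m \ni x .
\]
\[
x \cdot (y \cdot 1 - 0 \cdot 1) = xy = 0
\ \Longrightarrow\ (y, 1) \sim (0, 1)
\ \Longrightarrow\ i_m(y) = \tfrac{y}{1} = 0 \ \text{in } R_m ,
\quad \text{although } y \neq 0 \text{ in } R .
\]
```

Localizing at the point $(1,0)$ forgets the line $x = 0$, on which $y$ lives. At the maximal ideal $\langle x, y \rangle$ the element $y$ does survive, which is consistent with Fact II below.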

We shall require the following two facts, which can be taken as exercises:

Fact I: The localization $R_m$ of $R$ at a maximal ideal is a local ring, that is, it is a ring with a unique maximal ideal.

Fact II: If $a \in R$ is such that $i_m(a) = 0$ for every maximal ideal $m$ in $R$, then $a = 0$.

As we briefly mentioned in Fact I above, we remind ourselves that a local ring is a ring that has a unique maximal ideal. A commutative ring $R$ is said to be Noetherian if every ideal in $R$ is finitely generated.

We shall also require the following theorem, which is not really an exercise. If $I$ is an ideal in a ring $R$, we write $I^n$ for the ideal generated by all elements of the form $a_1 a_2 \cdots a_n$, where $a_i \in I$ for all $i=1, \ldots, n$.

Krull’s intersection theorem: Let $R$ be a commutative Noetherian local ring with identity. If $m$ is the maximal ideal in $R$, then

$\cap_{n=0}^\infty m^n = \{0\}$.

Take it on faith for now (or see Wikipedia).

Step III: A lemma on local algebras.

Recall that a ring is said to be a $\mathbb{C}$-algebra (or an algebra over $\mathbb{C}$) if it is also a complex vector space, in a way that is compatible with the ring multiplication. We say that an algebra is local if it is a local ring. A commutative algebra $R$ is said to be Noetherian if its underlying ring is a Noetherian ring.

Lemma: Let $R$ be a Noetherian local $\mathbb{C}$-algebra with maximal ideal $m$, and fix $a \in R$. Suppose that for every homomorphism $\pi : R \to M_n(\mathbb{C})$,

$\pi(a) = 0$.

Then $a = 0$.

Proof: First, note that $a \in m$, because the quotient $R/m$ is isomorphic to $\mathbb{C} = M_1(\mathbb{C})$ (this is the case for the localizations of $A/J$ to which we shall apply the lemma), so $a$ must be mapped to zero under the quotient map $R \to R/m$. Since $R$ is Noetherian, $m$ is finitely generated, as is also every power $m^n$. It follows by induction that for every $n$, the algebra $R/m^n$ is a finite dimensional vector space. Hence the quotient map $R \to R/m^n$ can also be considered as a finite dimensional representation (let $R/m^n$ act on itself by multiplication), so it annihilates $a$. Thus $a \in m^n$ for all $n$. By Krull’s intersection theorem, $a = 0$.
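To see the mechanism of the proof in the simplest case, consider the following illustration; the ring $R$ here is an assumption made for the example, not part of the lemma:

```latex
\[
R = \mathbb{C}[x]_{\langle x \rangle}, \qquad m = xR, \qquad
R / m^n \cong \mathbb{C}[x] / \langle x^n \rangle, \qquad
\dim_{\mathbb{C}} R/m^n = n .
\]
\[
\text{In the basis } 1, x, \ldots, x^{n-1}: \qquad
x \longmapsto
\begin{pmatrix}
0 &        &        &   \\
1 & 0      &        &   \\
  & \ddots & \ddots &   \\
  &        & 1      & 0
\end{pmatrix}
\in M_n(\mathbb{C}) .
\]
```

An element $a = f/g \in R$ is killed by this representation exactly when $x^n$ divides $f$. If that happens for every $n$, then $f = 0$, which is Krull’s intersection theorem in this special case. The nilpotent matrix above is (the transpose of) a Jordan block with eigenvalue zero.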

Step IV and conclusion: proof of the proposition.

We now prove the above proposition, which, as explained in Step I, proves the free commutative Nullstellensatz. Let $J$ be an ideal in $A = \mathbb{C}[z_1, \ldots, z_d]$, and let $p \in A$ be an element such that $\varphi(p+J) = 0$ for every representation of $A/J$. We wish to prove that $p \in J$, or equivalently, that $a:= p+J = 0$. By Fact II above, it suffices to show that $i_m(a) = 0$ for every maximal ideal $m$ in $R:= A/J$.

Now let $m$ be any maximal ideal in $R$. The lemma of Step III is applicable to $R_m$: it is a local ring by Fact I, it is Noetherian because it is a localization of the Noetherian ring $R = A/J$, and its residue field is $R/m \cong \mathbb{C}$ by the classical Nullstellensatz. By the lemma, $i_m(a) = 0$ if and only if its image under every representation of $R_m$ is zero. But every representation $\pi : R_m \to M_n(\mathbb{C})$ gives rise to a representation $\varphi = \pi \circ i_m : R \to M_n(\mathbb{C})$, which, by assumption, annihilates $a$. It follows that $i_m(a) = 0$ for every maximal $m$ in $A/J$, whence (Fact II) $a = 0$, and $p \in J$ as required. That concludes the proof.

Remark: The proof presented here is from my paper with Guy and Eli. I mentioned above that the theorem follows from the results in a paper of Eisenbud and Hochster. Our proof is simpler than theirs, but they prove more: our result says that $p \in J$ if $p(X) = 0$ for every $d$-tuple $X$ of commuting $n \times n$ matrices that annihilates $J$, where in principle one might have to consider all $n \in \mathbb{N}$. Eisenbud and Hochster’s result implies that there exists some $N$ (depending on $J$ of course) such that, if $p(X) = 0$ for all $X \in Z(J)$ of size less than or equal to $N$, then $p \in J$. (If you are asking yourself why we are proving in our paper a weaker result than one that already appears in the literature, let me say that this theorem is a rather peripheral result in our paper, and serves a motivational and contextual purpose, rather than supporting the main line of investigation).

4. The perfect Nullstellensatz in the one variable case

We now treat the Theorem (the free commutative Nullstellensatz) in the case of one variable. This really should be understood by everyone. The short explanation is that matrix zeros of a polynomial determine not only the location of the zeros but also their multiplicity.

Let $p(x) = a(x-\alpha_1)^{n_1} \cdots (x-\alpha_k)^{n_k} \in \mathbb{C}[x]$, and let $q \in I(Z(p))$. So we know that $q(A) = 0$ for every square matrix $A$ that annihilates $p$ (that is, every $A$ such that $p(A) = 0$). Our goal is to understand why this is equivalent to $q$ belonging to the ideal $\langle p \rangle$ generated by $p$. One direction is immediate: if $q = g p$, and $p(A) = 0$, then $q(A) = g(A) p(A) = 0$.

In the other direction, we need to show that if $q \in I(Z(p))$, then $(x-\alpha_i)^{n_i}$ is a factor of $q$ for all $i=1, \ldots, k$. Everything boils down to understanding how polynomials operate on Jordan blocks. Consider a $d \times d$ Jordan block

$J= J_d(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ & & \ddots & & \\ & & & & \lambda \end{pmatrix}$ ,

and consider the polynomial $g(x) = (x-\mu)^l$. Then one checks readily:

1. $g(J)$ is invertible if and only if $\mu \neq \lambda$.
2. $g(J) = 0$ if and only if $\mu = \lambda$ and $l\geq d$.
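Both claims can be checked numerically on small cases. The following is a minimal sketch in plain Python with integer matrices; the specific values of $\lambda$, $\mu$, $d$ and $l$, and the helper names, are arbitrary choices for illustration. It uses the fact that $J_d(\lambda) - \mu I = J_d(\lambda - \mu)$.

```python
def jordan_block(lam, d):
    """The d x d Jordan block with eigenvalue lam (ones on the superdiagonal)."""
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(d)]
            for i in range(d)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(a, k):
    n = len(a)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, a)
    return result

def g_of_J(lam, d, mu, l):
    """Evaluate g(x) = (x - mu)^l at the Jordan block J_d(lam)."""
    shifted = jordan_block(lam - mu, d)  # J - mu*I is again a Jordan block
    return mat_pow(shifted, l)

def is_zero(m):
    return all(entry == 0 for row in m for entry in row)

def det_upper_triangular(m):
    """Determinant of an upper triangular matrix: the product of its diagonal."""
    det = 1
    for i in range(len(m)):
        det *= m[i][i]
    return det

same_root_low_power = g_of_J(2, 3, 2, 2)   # mu = lam, l = 2 < d = 3
same_root_high_power = g_of_J(2, 3, 2, 3)  # mu = lam, l = 3 >= d = 3
different_root = g_of_J(2, 3, 5, 4)        # mu != lam

print(is_zero(same_root_low_power))          # nonzero: the power is too small
print(is_zero(same_root_high_power))         # zero: the power reaches the block size
print(det_upper_triangular(different_root))  # nonzero determinant: invertible
```

When $\mu \neq \lambda$ the matrix $g(J)$ is upper triangular with $(\lambda - \mu)^l \neq 0$ on the diagonal, hence invertible; when $\mu = \lambda$ it is a power of the nilpotent shift, which vanishes exactly from the $d$-th power on.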

It follows (assuming the form $p(x) = a(x-\alpha_1)^{n_1} \cdots (x-\alpha_k)^{n_k}$) that $p(J) = 0$ if and only if $\lambda = \alpha_i$ for some $i$, and $n_i \geq d$. Since every matrix $A$ has a unique canonical Jordan form (up to a permutation of the blocks), we can understand precisely what matrices belong to $Z(p)$: they are those matrices all of whose Jordan blocks have eigenvalues in the set $\{\alpha_1, \ldots, \alpha_k\}$, where every block with eigenvalue $\alpha_i$ has size no bigger than $n_i$.

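So, if $q \in I(Z(p))$, then $q(J) = 0$ for every Jordan block $J$ for which $p(J) = 0$. Let $J = J_{n_i}(\alpha_i)$ and factor $q$ into powers of distinct linear factors. By the first claim above, the factors $(x - \mu)^l$ with $\mu \neq \alpha_i$ are invertible at $J$, so the factor with $\mu = \alpha_i$ must annihilate $J$, and by the second claim its exponent is at least $n_i$. Thus $(x-\alpha_i)^{n_i}$ must be a factor of $q$, that is, $q$ has the form $q(x) = (x-\alpha_i)^{n_i}f(x)$. Since this holds for all $i = 1, \ldots, k$, and the factors $(x-\alpha_i)^{n_i}$ are pairwise coprime, we have that $q \in \langle p \rangle$.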

Remark: Note that the proof also shows that to conclude that $q \in \langle p \rangle$, one needs to know only that $q(A) = 0$ for all $A \in Z(p)$ of size less than or equal to $N \times N$ for $N = \max\{n_1, \ldots, n_k\}$.
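Here is a small sketch of the whole argument on a concrete pair, in plain Python. The polynomials $p(x) = (x-1)^2(x-2)$ and $q(x) = (x-1)(x-2)$ and the helper names are illustrative assumptions. The two polynomials have the same scalar zeros, and a single $2 \times 2$ Jordan block (so $N = 2$) already witnesses that $q \notin \langle p \rangle$.

```python
def jordan_block(lam, d):
    """The d x d Jordan block with eigenvalue lam."""
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(d)]
            for i in range(d)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def eval_from_roots(roots, x):
    """Evaluate prod_i (t - r_i)^{k_i} at the square matrix x.

    roots is a list of (root, multiplicity) pairs.
    """
    n = len(x)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for r, k in roots:
        shifted = [[x[i][j] - (r if i == j else 0) for j in range(n)]
                   for i in range(n)]
        for _ in range(k):
            result = mat_mul(result, shifted)
    return result

p_roots = [(1, 2), (2, 1)]  # p(x) = (x-1)^2 (x-2)
q_roots = [(1, 1), (2, 1)]  # q(x) = (x-1)(x-2)

# On 1 x 1 matrices (scalars) the two polynomials have the same zeros.
p_at_scalar = eval_from_roots(p_roots, [[1]])
q_at_scalar = eval_from_roots(q_roots, [[1]])

# The 2 x 2 Jordan block J_2(1) is a zero of p but not of q.
J = jordan_block(1, 2)
p_at_J = eval_from_roots(p_roots, J)
q_at_J = eval_from_roots(q_roots, J)

print(p_at_scalar, q_at_scalar)
print(p_at_J, q_at_J)
```

Since $J_2(1) \in Z(p)$ but $q(J_2(1)) \neq 0$, we get $q \notin I(Z(p))$, in agreement with $q \notin \langle p \rangle$.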

5. Further questions

The beautiful theorem we proved raises two important questions:

1. Why is it interesting (besides the plain reason that it is evidently interesting)? What questions does this kind of theorem help to answer?
2. What does the set of commuting tuples of matrices look like? In order for the above theorem to be “useful” we will need to understand this set well.

I hope to write two posts addressing these issues soon.

Remark: I should also mention the following very well known observation, which also explains how evaluation on Jordan blocks can identify the zeros of a polynomial including their multiplicity. If $J$ is a Jordan block:
$J= J_n(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ & & \ddots & & \\ & & & & \lambda \end{pmatrix}$ ,
and $f$ is an analytic function, then
$f(J) = \begin{pmatrix} f(\lambda) & f'(\lambda) & \frac{f''(\lambda)}{2!} & \cdots & \frac{f^{(n-1)}(\lambda)}{(n-1)!} \\ 0 & f(\lambda) & f'(\lambda) & \cdots & \frac{f^{(n-2)}(\lambda)}{(n-2)!} \\ & & \ddots & & \\ & & & & f(\lambda) \end{pmatrix}$ .
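For a polynomial $f$ this formula is easy to check by direct computation. The sketch below, in plain Python, compares Horner evaluation of $f(J)$ with the matrix of scaled derivatives; the polynomial $f(x) = x^4 - 3x + 1$, the block $J_3(2)$ and the helper names are illustrative choices only.

```python
from fractions import Fraction
from math import factorial

def jordan_block(lam, n):
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def f_of_J_direct(coeffs, lam, n):
    """Evaluate f at J_n(lam) by Horner's rule; coeffs are lowest degree first."""
    J = jordan_block(lam, n)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, J)
        for i in range(n):
            result[i][i] += c  # add c times the identity
    return result

def derivative(coeffs):
    return [k * c for k, c in enumerate(coeffs)][1:]

def poly_eval(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

def f_of_J_formula(coeffs, lam, n):
    """Entry (i, j) with j >= i is f^{(j-i)}(lam) / (j-i)!; zero below the diagonal."""
    scaled = []
    current = list(coeffs)
    for k in range(n):
        scaled.append(Fraction(poly_eval(current, lam), factorial(k)))
        current = derivative(current)
    return [[scaled[j - i] if j >= i else Fraction(0) for j in range(n)]
            for i in range(n)]

f = [1, -3, 0, 0, 1]  # f(x) = x^4 - 3x + 1
direct = f_of_J_direct(f, 2, 3)
formula = f_of_J_formula(f, 2, 3)
print(direct)
print(direct == formula)
```

In particular, $f(J_n(\lambda)) = 0$ exactly when $f(\lambda) = f'(\lambda) = \cdots = f^{(n-1)}(\lambda) = 0$, that is, when $\lambda$ is a zero of $f$ of multiplicity at least $n$.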