The complex matrix cube problem – new results from summer projects

by Orr Shalit

In this post I will summarize the results obtained by my group in the “Summer Projects Week” that took place two weeks ago at the Technion. As last time (see here for a summary of last year’s project), the title of the project I suggested was “Numerical Explorations of Open Problems from Operator Theory”. This time I was lucky to have Malte Gerhold and Satish Pandey, my postdocs, volunteer to help me with the mentoring. The students who chose our project were Matan Gibson and Ofer Israelov, and they did fantastic work.

1. Background

(I will repeat some background material from these two (one, two) posts from last year).

If A is an operator on a Hilbert space H (we will henceforth write this as A \in B(H)), another operator B on a larger space K (we assume K \supseteq H) is said to be a dilation of A if

B = \begin{pmatrix} A & * \\ * & * \end{pmatrix}

where we have written B in block operator form with respect to the decomposition K = H \oplus H^\perp. In this case, A is said to be a compression of B. We then write A \prec B. If A = (A_1, \ldots, A_d) and B = (B_1, \ldots, B_d) are d-tuples of operators, we say that B is a dilation of A, and that A is a compression of B, if A_i \prec B_i for all i=1, \ldots, d (it is implicit that we use the same embedding of H in K). We then write A \prec B.

A d-tuple B = (B_1, \ldots, B_d) is said to be normal if every B_i is normal and B_i B_j = B_j B_i for all i,j=1, \ldots, d. Normal tuples of operators are the best understood ones, thanks to the spectral theorem. When a normal tuple B acts on a finite dimensional space, it is simultaneously unitarily diagonalizable.

For an operator A \in B(H) we define its norm \|A\| to be the operator norm of A, that is: \|A\| = \sup\{\|Ax\| : \|x\|=1, x \in H\}. An operator A is said to be a contraction if \|A\| \leq 1.

If A = (A_1, \ldots, A_d) is a tuple of contractions, we write C(A) = C(A_1, \ldots, A_d) for the minimal constant C such that there exists a normal tuple N = (N_1, \ldots, N_d) such that A \prec N and \|N_i\| \leq C for all i.

In other words, C(A) is the price we have to pay (in terms of norm) in order to be able to dilate A to a normal tuple. We call C(A) the dilation constant for A.

Now we can define the universal dilation constant for d-tuples of contractions to be

C_d = \sup\{C(A) : A is a d-tuple of contractions \}.

That is, C_d is a dilation constant that works for all d-tuples of contractions.

The complex matrix cube problem: What is C_d? In particular, what is C_2?

In other words, we ask: what is the minimal constant C such that, given a tuple A = (A_1, \ldots, A_d) of contractions, one can find a normal dilation N = (N_1, \ldots, N_d) of A such that \|N_i\| \leq C for all i=1, \ldots, d?

It was proved that the constant C = \sqrt{d} always works if A_1, \ldots, A_d are all selfadjoint, and that this constant is optimal for selfadjoints; see this paper by Passer, Solel, and myself. Passer later proved that the constant C = \sqrt{2d} works for arbitrary tuples; however, it is not known whether Passer’s constant is optimal. In particular, we knew that C_2 \in [\sqrt{2},2]. In last year’s project we found that C_2 is (at least slightly) strictly bigger than \sqrt{2}, and during the past year we improved that to C_2 \in [1.54,2]; but this still leaves a large interval in which C_2 may lie, and we do not know its exact value.

2. The basic approach

Suppose we are given a d-tuple of contractions A = (A_1, \ldots, A_d). We wish to know whether it is true or false that A has a normal dilation N such that \|N_i\| \leq C for all i=1, \ldots, d.

The first observation is that it is enough to consider only tuples of unitaries. Indeed, if T is a contraction (meaning that \|T\|\leq 1) then

\begin{pmatrix} T & \sqrt{I - TT^*} \\ \sqrt{I - T^* T} & -T^* \end{pmatrix}

is a unitary dilation of T. So given a d-tuple A of contractions, we can find a d-tuple U of unitaries such that A \prec U. Thus, we may as well assume that A is a tuple of unitaries, and ask whether we can dilate it.
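To make this concrete, here is a minimal numpy sketch (my own illustration for this post, not part of the project code) that builds the unitary dilation displayed above; it assumes scipy for the matrix square roots:

import numpy as np
from scipy.linalg import sqrtm

def unitary_dilation(T):
    # Return the 2n x 2n unitary dilation of an n x n contraction T,
    # as in the block matrix displayed above.
    n = T.shape[0]
    I = np.eye(n)
    return np.block([
        [T,                         sqrtm(I - T @ T.conj().T)],
        [sqrtm(I - T.conj().T @ T), -T.conj().T],
    ])

One can check numerically that the output U satisfies U^* U = I and that the top-left n \times n block of U is T.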

In order to be able to carry out calculations on a computer, we approximated a universal normal tuple (which can dilate anything, up to a scale factor) by normal tuples of matrices N = (N_1, \ldots, N_d) with joint eigenvalues at the vertices of the polytope Q = P_k \times \cdots \times P_k, where P_k is a regular polygon with k vertices inscribed in the unit circle \mathbb{T} = \partial\overline{\mathbb{D}}. When k is moderately large, the boundary of Q is very close to \mathbb{T} \times \cdots \times \mathbb{T}, and one can easily bound the error that arises from this approximation.
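To spell the error bound out: a regular k-gon inscribed in the unit circle contains the disc of radius \cos(\pi/k), so after scaling by \cos(\pi/k)^{-1} the polygon contains the closed unit disc. The relative error incurred by the approximation is therefore at most

\cos(\pi/k)^{-1} - 1,

which, for example, is approximately 0.0094 for k = 23 (this is the error margin quoted in the experiments below).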

Our first approach was to randomly select a tuple of n \times n unitaries A and to check whether it has a normal dilation of norm at most \rho. Up to a small error that arises from the approximation mentioned in the previous paragraph, the question is whether A can be dilated to some ampliation of \rho N, where N is the tuple of normals constructed above. Basically, modulo some equivalences within the theory, we know that A has the required dilation of norm at most \rho if and only if there exists a UCP map sending \rho N_i to A_i for i=1,\ldots, d. This, modulo some more equivalences (and as was noted in this paper of Helton, Klep and McCullough), is equivalent to the existence of positive semidefinite n \times n matrices C_1, \ldots, C_m such that

\sum_{j=1}^m \rho \lambda_{i}^j C_j = A_i for i=1, \ldots, d

where N_i = \mathrm{diag}(\lambda_i^1, \lambda_i^2, \ldots, \lambda_i^m), for i=1,\ldots, d, and

\sum_{j=1}^m C_j = I_n.

The existence of such semidefinite matrices C_1, \ldots, C_m can be interpreted as the feasibility of a certain semidefinite program (SDP). In fact, we decided to treat the full semidefinite program, as follows:

minimize \rho > 0

such that

C_j \geq 0 , j = 1, \ldots, m

\sum_{j=1}^m \lambda_i^j C_j = \rho^{-1} A_i , \quad i=1,\ldots, d

\sum_{j=1}^m C_j = I_n.

Note that we moved \rho to the right hand side, to make the equality constraint affine in the variables C_1, \ldots, C_m and \sigma := \rho^{-1}. Recall that N_i = \mathrm{diag}(\lambda_i^1, \ldots, \lambda_i^m) and A_i are all fixed. In the implementation we actually solved this as a maximization problem:

maximize \sigma > 0

such that

C_j \geq 0 , j = 1, \ldots, m

\sum_{j=1}^m \lambda_i^j C_j = \sigma A_i , \quad i=1,\ldots, d

\sum_{j=1}^m C_j = I_n.

This is the same semidefinite optimization problem that we used last year as well. (Last year we solved it using CVX on MATLAB. This year we used the package cvxpy in Python for the semidefinite problem. Our code is available on Google Colaboratory, see the links below).
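To give the flavor of the implementation, here is a minimal cvxpy sketch of the maximization problem above. This is a reconstruction written for this post (the helper name dilation_constant is mine), not the project code itself:

import numpy as np
import cvxpy as cp
from itertools import product

def dilation_constant(A, k=12):
    # A is a list of d complex n x n contractions (numpy arrays).
    d = len(A)
    n = A[0].shape[0]
    # Joint eigenvalues of N: all m = k**d vertices of P_k x ... x P_k.
    roots = np.exp(2j * np.pi * np.arange(k) / k)
    vertices = list(product(roots, repeat=d))
    m = len(vertices)
    # Variables: PSD matrices C_1, ..., C_m summing to I, and sigma = 1/rho.
    C = [cp.Variable((n, n), hermitian=True) for _ in range(m)]
    sigma = cp.Variable(nonneg=True)
    constraints = [Cj >> 0 for Cj in C]
    constraints.append(sum(C) == np.eye(n))
    for i in range(d):
        constraints.append(
            sum(vertices[j][i] * C[j] for j in range(m)) == sigma * A[i]
        )
    cp.Problem(cp.Maximize(sigma), constraints).solve()
    return 1.0 / sigma.value  # the (approximate) dilation constant rho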

3. Failure/success of the basic approach and some observations

Currently the best known lower bound for C_2 is C_2 \geq 1.54, which is obtained from C(U_\theta, V_\theta) \approx 1.54, where \theta \approx 2 \pi (\sqrt{2} - 1) and U_\theta, V_\theta is a pair of unitaries such that V_\theta U_\theta =  e^{i\theta} U_\theta V_\theta. We say that U_\theta and V_\theta are \theta-commuting unitaries, or that they \theta-commute. The dilation constants of all \theta-commuting unitary pairs were determined in a paper by Malte Gerhold and myself. It is worth noting that that paper was inspired by last year’s project, in which Mattya Ben-Efraim and Yuval Yifrach showed that C(U_{2 \pi/3}, V_{2 \pi/3}) > \sqrt{2}. However, we do not believe that C_2 \approx 1.54; we think that it is bigger.
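(As an aside, for rational angles \theta = 2\pi p/q a concrete q \times q \theta-commuting pair is given by the familiar clock and shift matrices; here is a small sketch, with naming of my own:)

import numpy as np

def theta_commuting_pair(p, q):
    # q x q unitaries satisfying V U = exp(2*pi*i*p/q) U V.
    omega = np.exp(2j * np.pi * p / q)
    U = np.roll(np.eye(q), 1, axis=0)    # cyclic shift: U e_j = e_{j+1 mod q}
    V = np.diag(omega ** np.arange(q))   # clock: V e_j = omega^j e_j
    return U, V

U, V = theta_commuting_pair(1, 3)   # a 2*pi/3-commuting pair, as in last year's project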

So the first easy-to-state goal for this year was to search long enough to find an example of a pair of unitaries U, V such that C(U,V) > 1.54, or at least as big as possible. Let me say straight away that we did not find such an example. In fact, for all the random pairs U,V that we tested, we typically got C(U,V) significantly lower, usually near \sqrt{2} (see below).

However, already last year we observed that concentration of measure phenomena form an obstruction to finding high dilation constants. Roughly speaking, “concentration of measure” means that a Lipschitz function on a high dimensional probability space will take values close to its mean with very high probability.

It turns out that the above heuristic has strong theoretical evidence behind it. A paper of Collins and Male (which relies on an earlier breakthrough paper of Haagerup and Thorbjornsen) states that a d-tuple U^N = (U^N_1, \ldots, U^N_d) of independently sampled random N \times N unitaries (sampled from the group \mathcal{U}_N equipped with the Haar probability measure) converges strongly to the tuple u_f = (u_{f1}, \ldots, u_{fd}) given by the generators of the free group F_d inside its reduced C*-algebra. I won’t define what strong convergence is (see the papers mentioned above), but it implies that \liminf C(U^N_1, U^N_2) \geq C(u_{f1}, u_{f2}), and we think that we should be able to prove that there is actually equality in the limit.

4. Some results and additional approaches

One chief outcome of our project is that we collected significant evidence that for a random (Haar distributed) pair of unitaries U,V, the dilation constant C(U,V) tends to be close to \sqrt{2} (up to the error bounds in our finite dimensional approximation). As remarked at the end of the previous section, this is also strong evidence that C(u_{f1}, u_{f2}) = \sqrt{2}. Indeed, we found that as the size of the matrices increases, the standard deviation goes down. For example, during one of the nights of the project-week, we ran an experiment with the following outcome. (Recall that k is the number of vertices in the polygon approximating the disc, so that the dilation constant that we calculate has a relative error of at most \cos(\pi/k)^{-1} - 1 \approx 0.0094. This means that for every pair U,V for which we compute a dilation constant C, the true dilation constant C(U,V) lies within a factor of 1.0094 of C. Also, d is just the number of operators in our d-tuples, and n is the size of the sampled unitaries.)

k = 23, d = 2, no. of random samples for each n is 50
-----------------------------------------------------

 n   max_C      mean_C     std
 2   1.397229   1.252607   0.089505
 3   1.435168   1.338410   0.062456
 4   1.442803   1.377033   0.049362
 5   1.431040   1.398623   0.015876
 6   1.433795   1.404585   0.012104
 7   1.428610   1.408119   0.012655
 8   1.439640   1.410235   0.010174
 9   1.425398   1.406535   0.008882
10   1.427183   1.408760   0.006846
11   1.425355   1.409243   0.007134
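An experiment of this kind can be reproduced along the following lines (a sketch assuming the dilation_constant helper from the SDP sketch above; scipy's unitary_group samples from the Haar measure on \mathcal{U}_n):

import numpy as np
from scipy.stats import unitary_group

def experiment(n, k=23, d=2, samples=50):
    # Sample random d-tuples of Haar unitaries and collect dilation constants.
    results = np.array([
        dilation_constant([unitary_group.rvs(n) for _ in range(d)], k=k)
        for _ in range(samples)
    ])
    print(f"n = {n}")
    print(f"max_C = {results.max():.6f}, mean_C = {results.mean():.6f}, "
          f"std = {results.std():.6f}")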

Here are some more results, with a smaller value of k, which allows us to run more experiments with larger n. Here we use the value k = 12, which gives a relative error of 0.035. The results below suggest that C(U^N_1, U^N_2) for large N tends to be between 1.379 and 1.427 (and we believe it is \sqrt{2}):

k = 12, d = 2, no. of random samples for each n is 50
-----------------------------------------------------

 n   max_C      mean_C     std
 2   1.389380   1.216731   0.095579
 3   1.426683   1.337595   0.047033
 4   1.417624   1.356411   0.031138
 5   1.418676   1.362433   0.027081
 6   1.397214   1.372076   0.014774
 7   1.406423   1.372238   0.012304
 8   1.412893   1.376826   0.010334
 9   1.395714   1.377344   0.008770
10   1.390549   1.377275   0.006920
11   1.395763   1.377383   0.009049
12   1.398410   1.379734   0.006276
13   1.398447   1.379338   0.005721
14   1.390360   1.379050   0.005425
15   1.393251   1.380928   0.005420
16   1.389973   1.379776   0.004730
17   1.393275   1.379678   0.005481
18   1.392536   1.380155   0.004127
19   1.389623   1.379491   0.003916
20   1.390470   1.379793   0.004657
21   1.391533   1.379091   0.004251
22   1.390274   1.379855   0.004138
23   1.385381   1.379334   0.002905

Note that the maximal values of the dilation constant are (perhaps counter-intuitively) attained for small values of n. This is because the standard deviation diminishes with n, so one is less likely to find a large counterexample by chance for large n. On the other hand, since the standard deviation becomes smaller as n increases, we do not need to run many tests to get a good feel for what happens: we can sample a single pair U_1, U_2 of random unitaries of large size (say 50 \times 50) and compute their dilation constant. With very high probability, we will get something that is equal to \sqrt{2} up to the error due to the fact that we are using a finite k. You can try it yourself, with the code linked to at the end of this post.
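In the notation of the sketches above, this one-off test is simply:

from scipy.stats import unitary_group

U1, U2 = unitary_group.rvs(50), unitary_group.rvs(50)
print(dilation_constant([U1, U2], k=16))
# expect a value near sqrt(2) = 1.4142..., up to the polygon error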

Here is a quick test I ran, with k = 16 and one sampled pair of unitaries for every value of n = 10, 20, 30, 40, 50 (this took one hour to run):

n = 10, C = 1.3991621945198882

n = 20, C = 1.4022442529059544

n = 30, C = 1.3974791375625744

n = 40, C = 1.4001629040203714

n = 50, C = 1.400105201371202

Although we have not found new examples with large dilation constants, the results we obtained have implications also for the dilation constant C_2: in joint work-in-progress with Gerhold, Pandey and Solel, we proved that C_2 \leq \frac{2}{\sqrt{3}} C(u_{f1},u_{f2}). In particular, C(u_{f1}, u_{f2}) < \sqrt{3} would imply that C_2 < 2, which would be an improvement over the best known upper bound C_2 \leq 2.

The numerical results mentioned above made us gain confidence in the conjecture C(u_{f1},u_{f2}) = \sqrt{2}, and hence that C_2 < 2. We were quick to guess that C(u_{f1}, \ldots, u_{fd}) = \sqrt{d}. We then ran some tests for the case d = 3 and found that this is probably not true, and that probably C(u_{f1}, u_{f2}, u_{f3}) < \sqrt{3} (for the reader’s convenience: \sqrt{3} \approx 1.732):

d = 3, k = 12 (error margin = 0.035276), no. of random samples for each n is 20
-------------------------------------------------------------------

 n   max_C      mean_C     std
 2   1.596420   1.410624   0.100716
12   1.603867   1.589800   0.011682
20   1.605948   1.591090   0.007812

Finally, instead of randomly sampling pairs of Haar distributed unitaries, we constructed finite dimensional compressions of u_{f1}, u_{f2} (we compressed to the subspace of \ell^2(F_2) spanned by all words of length less than or equal to some m) and computed the dilation constants of these compressions. The idea was that if C(u_{f1}, u_{f2}) > \sqrt{2}, then we should be able to detect this by checking a certain (large enough, but finite) compression. However, the running time and memory requirements blow up already for quite small m, and the results do not teach us much:

k = 10 throughout:

m (max word length)   n (dimension)   C          runtime
1                     5               0.951057   1 second
2                     17              1.164818   4 seconds
3                     53              1.242592   45 seconds
4                     161             1.279138   657 seconds

For m = 5 the computer chokes. Perhaps it would be worth trying this with more computational resources.
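For the curious, here is a sketch of how such compressions can be built (my reconstruction for this post, not the project's code). We enumerate the reduced words of length at most m in F_2 and compress the left-regular representation of each generator to their span:

import numpy as np

def ball_F2(m):
    # Reduced words of length <= m in F_2; letters are +1, -1, +2, -2
    # (the generators a, b and their inverses); the empty tuple is the unit.
    words, frontier = [()], [()]
    for _ in range(m):
        new = [w + (g,) for w in frontier
               for g in (1, -1, 2, -2) if not (w and w[-1] == -g)]
        words += new
        frontier = new
    return words

def compressed_generator(words, g):
    # The compression P_m u_g P_m, where u_g delta_w = delta_{gw} (reduced).
    index = {w: i for i, w in enumerate(words)}
    U = np.zeros((len(words), len(words)))
    for w, i in index.items():
        gw = w[1:] if (w and w[0] == -g) else (g,) + w
        if gw in index:
            U[index[gw], i] = 1.0
    return U

words = ball_F2(2)   # 17 words, matching n = 17 in the table above
A1, A2 = compressed_generator(words, 1), compressed_generator(words, 2)

The dimensions len(ball_F2(m)) for m = 1, 2, 3, 4 come out as 5, 17, 53, 161, matching the table above.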

5. The numerical range of random unitary matrices

A final service that Matan and Ofer did for us was to help us understand the shape of a certain intriguing geometrical object that arises in the theory.

Given two operators A, B \in B(H), their numerical range is the set

W_1(A,B) := \{(\phi(A), \phi(B)) : \phi : B(H) \to \mathbb{C} is a state \}.

This set contains significant information on the structure of the operator space generated by A and B. We can prove (work-in-progress with Gerhold) that for pairs of unitaries sampled randomly and independently from the Haar measure on \mathcal{U}_N, the numerical range W_1(U^N_1, U^N_2) converges in the Hausdorff metric to W_1(u_{f1}, u_{f2}) almost surely. Thus, W_1(u_{f1}, u_{f2}) is what a random matrix range looks like. A paper of Collins, Gawron, Litvak and Zyczkowski showed that if one samples a sequence of pairs of independent matrices A^N_1, A^N_2 from the Gaussian Unitary Ensemble (for example), then W_1(A^N_1, A^N_2) converges almost surely to the ball. We asked: what does W_1(U^N_1, U^N_2) converge to (almost surely)?

My first guess was that the limit (which we can prove is W_1(u_{f1}, u_{f2})) is a bidisc. But rather quickly, using formulas from a paper of Lehner, we realized that it is not a bidisc. What does it look like? Is it a ball?
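Before describing the exact computation, note that by the convergence result mentioned above one can already draw an approximate picture from a single large random pair. The projection of W_1(A, B) onto the real parts is a convex set whose support function in the direction (\cos t, \sin t) is the top eigenvalue of \cos(t) \mathrm{Re}(A) + \sin(t) \mathrm{Re}(B); the following sketch (mine, not the method Matan and Ofer used) traces its boundary:

import numpy as np
from scipy.stats import unitary_group

def real_projection_boundary(A, B, num_angles=360):
    # Boundary points of the projection of W_1(A, B) onto (Re z1, Re z2).
    ReA, ReB = (A + A.conj().T) / 2, (B + B.conj().T) / 2
    pts = []
    for t in np.linspace(0, 2 * np.pi, num_angles, endpoint=False):
        w, V = np.linalg.eigh(np.cos(t) * ReA + np.sin(t) * ReB)
        x = V[:, -1]   # top eigenvector: a state attaining the support value
        pts.append((np.real(x.conj() @ ReA @ x), np.real(x.conj() @ ReB @ x)))
    return np.array(pts)

pts = real_projection_boundary(unitary_group.rvs(200), unitary_group.rvs(200))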

This was computed for us by Matan and Ofer (who used the formulas from Lehner’s paper in the computation). Here is the projection onto the real parts:

[Figure: the projection of W_1(u_{f1}, u_{f2}) onto the real parts]

This looks like the unit ball in the \ell^p norm for p>2 (the “TV screen”), so they tested to see what \ell^p ball this is closest to. It seems to be somewhere between p=4 and p=5. However, we don’t really expect this to be an \ell^p ball.

Actually, Lehner’s formulas do not directly give us W := W_1(u_{f1}, u_{f2}) but only its polar dual W^\circ = \{(z_1, z_2) : \mathrm{Re}(z_1 \overline{w_1} + z_2 \overline{w_2}) \leq 1 for all (w_1, w_2) \in W\}. So Matan and Ofer first computed W^\circ, (the real part of) which is illustrated below, and then they computed the polar dual.

[Figure: the polar dual of W_1(u_{f1}, u_{f2})]

6. Summary of outcomes

The outcomes (and non-outcomes) of the project are:

  1. We have not found A,B with C(A,B) larger than 1.5, and in particular we do not have a bigger lower bound for C_2 than what we already knew.
  2. On the other hand, we have evidence that C(U^N_1, U^N_2) converges to \sqrt{2} as N \to \infty. Repeating our experiments gives the same result. Therefore,
  3. Using a theoretically-backed heuristic, we have evidence that C(u_{f1}, u_{f2}) = \sqrt{2}. Therefore,
  4. Using proved theoretical results, we have evidence that C_2 \leq 2 \sqrt{\frac{2}{3}} < 2.
  5. We have not been able to obtain non-random lower bounds for C(u_{f1}, u_{f2}) by compressing u_{f1}, u_{f2} to the subspace of words of length at most m. The best we get from this, adding the error, is a lower bound of at most 1.33.
  6. We have a program to draw W_1(u_{f1}, u_{f2}) to any precision, and we know that it looks kind of like a “TV screen”, or an “Android app” icon.
  7. We have open source (courtesy of Matan and Ofer) code that is able to reproduce our results and expand our data.

A short presentation of the project by Matan and Ofer can be found here (their presentation combined slides and whiteboard, so the above text hopefully makes up for the missing oral component).

Our code for calculating the dilation constants can be found on Colaboratory here. Our code for calculating the numerical range of the free Haar unitaries can be found on Colaboratory here. NOTE: in both files one has to play with the parameters to see significant results; in the code files the values of n, for example, are small, so that the code runs fast. (The same remark holds for the resolution used to draw the numerical range in the second link.) If you want to carry out serious experimentation, you need to give some thought to the parameters.