Reconstruction Bias

The claim by Efstathiou, Ma, and Hanson that reconstructing the full sky is better than using a partial sky is clearly nonsense on many levels. I have been trying to understand and quantify this. To this end I have run some simulations which show that things (obviously) go wrong, but I am still trying to understand where their argument breaks down.

Simulations

In the simulations I have generated $\Lambda$CDM skies and analyzed them at various resolutions. Each sky has been analyzed both with a reconstruction and with spice (called pixel based in the figures). To study how the value of $C_2$ affects the results I rescaled $C_2$ in the simulations. The procedure is as follows:

  1. Generate a random sky at Nside=512.
  2. Extract the $a_{2,m}$ and calculate $C_2$ from them.
  3. Rescale the $a_{2,m}$ so that $C_2$ has a fixed value, for example, $100\;(\mathrm{\mu K})^2$.
  4. Repeat this rescaling for a range of $C_2$ values. Thus the exact same map is used with only the magnitude of the $a_{2,m}$ changed.
  5. Smooth and degrade the map as required for the given test. Note that in one set of tests I replace the masked region with the corresponding region of the WMAP ILC7 map prior to smoothing and degrading. This shows the leakage of the masked region into the unmasked region.
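
The rescaling in steps 2 to 4 can be sketched as follows. This is a minimal illustration, not the code used for the runs: it assumes the $a_{\ell,m}$ are stored as a packed complex array for $m\ge0$ in the ordering that healpy uses, the random array is only a stand-in for the coefficients of a real simulated sky, and the helper names (`alm_index`, `c_ell_from_alm`, `rescale_c_ell`) are made up for this example.

```python
import numpy as np

def alm_index(lmax, ell, m):
    """Position of a_{ell,m} (m >= 0) in a packed, m-major alm array (healpy ordering)."""
    return m * (2 * lmax + 1 - m) // 2 + ell

def c_ell_from_alm(alm, lmax, ell):
    """C_ell = sum_m |a_{ell,m}|^2 / (2 ell + 1), with the m < 0 terms equal to the m > 0 ones."""
    idx = np.array([alm_index(lmax, ell, m) for m in range(ell + 1)])
    power = np.abs(alm[idx]) ** 2
    return (power[0] + 2.0 * power[1:].sum()) / (2 * ell + 1)

def rescale_c_ell(alm, lmax, ell, target):
    """Return a copy of alm with multipole ell scaled so that C_ell equals target."""
    idx = np.array([alm_index(lmax, ell, m) for m in range(ell + 1)])
    out = np.array(alm, dtype=complex)
    out[idx] *= np.sqrt(target / c_ell_from_alm(alm, lmax, ell))
    return out

# Toy stand-in for the a_{ell,m} of one simulated sky (hypothetical values).
rng = np.random.default_rng(0)
lmax = 4
n_alm = (lmax + 1) * (lmax + 2) // 2
alm = rng.normal(size=n_alm) + 1j * rng.normal(size=n_alm)

# Step 4: same sky, only the magnitude of the quadrupole changes.
rescaled = {c2: rescale_c_ell(alm, lmax, 2, c2) for c2 in (100.0, 500.0, 1000.0)}
```

Because only the overall amplitude of the five quadrupole coefficients changes, the orientation of the quadrupole and every other multipole are identical across the set of maps.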

For the simulations I use Nside=16 to be consistent with the previous work. I also use Nside=128, which is the maximum Nside at which I can reasonably do the calculation. I want the highest resolution possible because I prefer not to smooth, and the higher the resolution, the less important I think smoothing must be.

Results

Here are plots of the results. There are 3 plots per file. All show the 68 percentile ranges of the values. The first plot is of $C_2$. I have (I think) 10,000 realizations per $C_2$ bin. The second plot is of $C_3$. The last is of $S_{1/2}$. The true value is the known value from the simulation.
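
For reference, $S_{1/2}$ can be computed directly from a set of $C_\ell$. The sketch below assumes the usual definition, $S_{1/2}=\int_{-1}^{1/2}[C(\theta)]^2\,d(\cos\theta)$ with $C(\theta)=\sum_\ell \frac{2\ell+1}{4\pi}C_\ell P_\ell(\cos\theta)$. The function name `s_half` and the use of Gauss-Legendre quadrature are choices made for this example and are not necessarily how the plotted values were obtained.

```python
import numpy as np
from numpy.polynomial import legendre

def s_half(cl, x_max=0.5):
    """Integral of C(theta)^2 over cos(theta) in [-1, x_max], for C_ell listed from ell = 0."""
    cl = np.asarray(cl, dtype=float)
    ell = np.arange(cl.size)
    coeff = (2 * ell + 1) / (4 * np.pi) * cl       # Legendre coefficients of C(theta)
    # C(theta)^2 is a polynomial of degree 2*lmax in cos(theta);
    # lmax + 1 Gauss-Legendre nodes integrate it exactly.
    t, w = legendre.leggauss(cl.size)
    half_width = 0.5 * (x_max + 1.0)
    x = half_width * t + 0.5 * (x_max - 1.0)       # map nodes from [-1, 1] to [-1, x_max]
    return half_width * np.sum(w * legendre.legval(x, coeff) ** 2)

# A sky with only a quadrupole of C_2 = 100 (muK)^2; the result is in (muK)^4.
s_quadrupole_only = s_half([0.0, 0.0, 100.0])
```

Since the sum runs over all $\ell$ supplied, the list should stop at whatever maximum multipole is meant to be included.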

[Figure: results for the KQ75y7 mask]
[Figure: results for the KQ85y7 mask]

Just looking at $C_2$ we see interesting results. Clearly the reconstruction is biased (more below) whereas the pixel based estimate is largely unbiased. It is true that for $C_2$ near the expected theoretical value, $\sim1000\;(\mathrm{\mu K})^2$, the reconstruction does a good job. Unfortunately, near the actual value, $\sim100\;(\mathrm{\mu K})^2$, the reconstruction is biased upward. Also notice that the error bars on the reconstructed values are smaller, as they should be. Clearly if the sky had turned out to be closer to the theoretical expectation the reconstruction would be fine. Since it hasn't, the reconstruction biases the results.

When we place the ILC in the masked region and smooth, the results are even more peculiar. The information from the masked region clearly leaks out. Given that even the full sky ILC has a low $C_2$, the reconstruction suppresses power when the actual value is high and enhances it when it is low. This can clearly be related to the leakage. Even in this case the pixel based approach is less sensitive.

The other plots show similar biases. Since there is not an equal number of realizations in each bin in these plots, the ranges are much more ragged.
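
A sketch of how such ranges might be built makes the raggedness easy to see: bin the realizations by true value, then take the 16th and 84th percentiles of the estimate in each bin. The function name, the bin edges, and the minimum count are illustrative assumptions, not the settings used for the plots.

```python
import numpy as np

def binned_68_range(true_vals, est_vals, edges, min_count=10):
    """Central 68% range (16th to 84th percentile) of est_vals in bins of true_vals."""
    true_vals = np.asarray(true_vals, dtype=float)
    est_vals = np.asarray(est_vals, dtype=float)
    which = np.digitize(true_vals, edges) - 1     # bin i holds edges[i] <= value < edges[i+1]
    nbins = len(edges) - 1
    lo = np.full(nbins, np.nan)
    hi = np.full(nbins, np.nan)
    counts = np.zeros(nbins, dtype=int)
    for i in range(nbins):
        sel = est_vals[which == i]
        counts[i] = sel.size
        if sel.size >= min_count:                 # sparsely filled bins are left as NaN
            lo[i], hi[i] = np.percentile(sel, [16, 84])
    return lo, hi, counts

# Deterministic toy input: 101 realizations in the first bin, none in the second.
lo, hi, counts = binned_68_range(np.full(101, 0.5), np.arange(101), edges=[0.0, 1.0, 2.0])
```

Returning the counts alongside the ranges makes it possible to see which bins are too thinly populated for the percentiles to be trusted.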

Explanation?

On the one hand these results are obvious. On the other, it is not clear where the argument saying that the reconstruction should be an unbiased, optimal estimator breaks down.

To begin, the reconstruction is actually designed to produce an unbiased, optimal estimator for the $a_{\ell,m}$, not the $C_\ell$. I have verified that this is the case. Given that the $C_\ell$ are quadratic in the $a_{\ell,m}$ the bias seems obvious; an unbiased, optimal estimator for a quantity does not provide an unbiased, optimal estimator for the square of that quantity.
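
To make the bias explicit, here is a short sketch under the simplest assumptions. At fixed true sky, write the estimate as $\hat a_{\ell,m} = a_{\ell,m} + \epsilon_{\ell,m}$ with $\langle\epsilon_{\ell,m}\rangle = 0$ and error variance $\sigma_{\ell,m}^2 \equiv \langle|\epsilon_{\ell,m}|^2\rangle$ (the hat, $\epsilon$, and $\sigma$ are notation introduced only for this illustration). Then \[ \langle |\hat a_{\ell,m}|^2\rangle = |a_{\ell,m}|^2 + \sigma_{\ell,m}^2, \] so the naive power estimate satisfies \[ \left\langle \frac1{2\ell+1}\sum_m |\hat a_{\ell,m}|^2 \right\rangle = C_\ell + \frac1{2\ell+1}\sum_m \sigma_{\ell,m}^2 . \] The extra term is positive and, if the error variance does not depend on the true sky, it is the same additive offset whatever the true $C_\ell$, so its relative size is largest when $C_\ell$ is small.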

Of course Efstathiou and others know this! An argument is given in the paper (and elsewhere) that the correlations are small. In particular, it is claimed that the Fisher matrix is nearly diagonal so that \[ F_{\ell,\ell'} \approx \frac{2\ell+1}{2 C_\ell^2}\delta_{\ell,\ell'}. \] I have tried to check this and it does seem to be the case.

Why is this relevant? Without going into the gory details, the reconstruction procedure is as follows. Let $Y$ be the matrix of spherical harmonics of the reconstructed modes evaluated at the pixels, and let $C$ be the correlation matrix. This is the two point correlation function but only includes modes that we are not reconstructing. In my case I have reconstructed up to $\ell=10$. Then let \[ \Sigma= (Y^TC^{-1} Y)^{-1}, \quad W=\Sigma Y^T C^{-1}. \] With this an unbiased estimator for the $a_{\ell,m}$ is \[ \vec a = W\vec x \] where $\vec x$ represents the map (or the part of the map we are using for the reconstruction). The $C_\ell$ can then properly be reconstructed as \[ C_\ell = [F^{-1}]_{\ell,\ell'} y_{\ell'} \] where \[ y_\ell \equiv \frac12 \vec a_\ell^T \Sigma_\ell^{-2} \vec a_{\ell}. \] Here the $\ell$ subscript on the right hand side means we sum over the appropriate $m$ but the $\ell$ value is fixed. It thus seems that the relevant quantity is not just the Fisher matrix but also $\Sigma$. I find that $\Sigma$ is definitely not diagonal. I have to better understand what Efstathiou is saying in the paper. I find it hard to imagine that he is making a trivial mistake.
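
The two matrices are easy to build and check numerically. The following is a toy sketch only: it assumes a real-valued basis matrix $Y$ (for example real spherical harmonics sampled at the unmasked pixels) and uses random stand-ins for $Y$ and $C$; the function name `reconstruction_operators` and the problem sizes are hypothetical.

```python
import numpy as np

def reconstruction_operators(Y, C):
    """Sigma = (Y^T C^-1 Y)^-1 and W = Sigma Y^T C^-1 for a real basis matrix Y."""
    Cinv_Y = np.linalg.solve(C, Y)        # C^-1 Y without forming C^-1 explicitly
    Sigma = np.linalg.inv(Y.T @ Cinv_Y)
    W = Sigma @ Cinv_Y.T                  # equals Sigma Y^T C^-1 because C is symmetric
    return Sigma, W

# Toy problem: 40 "pixels", 5 modes to reconstruct, random symmetric positive definite C.
rng = np.random.default_rng(1)
n_pix, n_modes = 40, 5
Y = rng.normal(size=(n_pix, n_modes))
A = rng.normal(size=(n_pix, n_pix))
C = A @ A.T + n_pix * np.eye(n_pix)

Sigma, W = reconstruction_operators(Y, C)

a_true = rng.normal(size=n_modes)
a_hat = W @ (Y @ a_true)                  # a map that contains only the reconstructed modes
off_diag = Sigma - np.diag(np.diag(Sigma))
```

In this construction `W @ Y` is the identity, which is the sense in which the estimator of the $a_{\ell,m}$ is unbiased, and `W @ C @ W.T` equals `Sigma`, so $\Sigma$ is the covariance of the error contributed by the modes that are not reconstructed. Comparing `off_diag` with the diagonal is one way to see how far $\Sigma$ is from diagonal for a given mask.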

To fill in some of the details (pages 8 and 9, roughly equations 22-29), Efstathiou et al. define things as above (I have used the notation of de Oliveira-Costa and Tegmark 2006). They call the estimated values $a_{\ell,m}^e$ and use these to calculate $C_\ell^e$ in the usual way, \[ C_\ell^e=\frac1{2\ell+1}\sum_m |a_{\ell,m}^e|^2. \] They then say that if we instead define weighted harmonic coefficients \[ \vec\beta = \Sigma^{-1} \vec a^e \] the power spectrum can be calculated from \[ y_\ell = \frac12 \sum_m |\beta_{\ell,m}|^2. \] This is the same expression we have above. Thus all of this is consistent. They then go on to say that for a complete sky with noiseless data \[ \Sigma_{\mathrm{full}} = C_\ell \delta_{\ell,\ell'} \delta_{m,m'}, \] which, of course, is true, and that the Fisher matrix is nearly diagonal (which also seems to be true). However, they do not say any more about $\Sigma$, in particular what its structure is on a partial sky. This seems to be the flaw in their argument. Am I missing something?
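
As a consistency check, here is a sketch that takes the quoted full-sky $\Sigma$ and the diagonal Fisher matrix at face value. On the full sky $\beta_{\ell,m} = a^e_{\ell,m}/C_\ell$, so \[ y_\ell = \frac12\sum_m \frac{|a^e_{\ell,m}|^2}{C_\ell^2} = \frac{2\ell+1}{2C_\ell^2}\,C_\ell^e, \qquad [F^{-1}]_{\ell,\ell}\,y_\ell = \frac{2C_\ell^2}{2\ell+1}\,\frac{2\ell+1}{2C_\ell^2}\,C_\ell^e = C_\ell^e . \] The two ways of computing the power spectrum therefore agree when $\Sigma$ and $F$ have these diagonal forms, and any difference on a partial sky has to come from how $\Sigma$ and $F$ depart from them.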