Spectral Factorization and Innovations

Why Spectral Factorization?

The non-causal Wiener filter was easy because convolution over all of $\mathbb{Z}$ turned into multiplication in the frequency domain. The causal Wiener filter is harder because the Wiener-Hopf equation now holds only for $\ell \geq 0$, and a half-axis convolution equation has no one-line Fourier solution.

The trick, due to Wiener and Hopf themselves and refined by Kolmogorov, is to whiten the observation first. If we can write $Y_n$ as a causal, causally invertible LTI image of a white process $J_n$, then causal estimation from $Y$ becomes causal estimation from $J$. Because $J$ is uncorrelated across time, causal estimation from $J$ is trivial: just project on each $J_k$ independently. Spectral factorization is the tool that performs this whitening.

Definition:

Paley-Wiener Condition

A WSS process $\{Y_n\}$ with PSD $P_y(f)$ satisfies the Paley-Wiener condition if

$$\int_{-1/2}^{1/2} \log P_y(f)\, df > -\infty.$$

This is the integrability condition that $\log P_y$ must satisfy for the causal factor to exist. It fails, for example, if $P_y(f)$ vanishes on a set of positive measure, such as an interval.

Paley-Wiener is the condition that separates processes that can be generated as causal LTI outputs of a white-noise driver (and hence have a strictly positive one-step prediction error) from those that cannot. Band-limited processes, for instance, do not satisfy the Paley-Wiener condition.
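The contrast between a process that satisfies the condition and one that does not can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names (`log_integral`, `ar1_plus_noise_psd`, `band_limited_psd`), the parameter values, and the midpoint-grid approximation of the integral are all illustrative assumptions.

```python
import numpy as np

def log_integral(psd, n_grid=4096):
    """Approximate the integral of log P_y(f) over [-1/2, 1/2) on a midpoint grid."""
    f = (np.arange(n_grid) + 0.5) / n_grid - 0.5
    with np.errstate(divide="ignore"):      # log(0) = -inf is the point here
        return float(np.mean(np.log(psd(f))))

def ar1_plus_noise_psd(f, a=0.8, sigma_u2=1.0, sigma_z2=0.1):
    """PSD of an AR(1) signal plus white noise (illustrative parameter values)."""
    return sigma_u2 / np.abs(1 - a * np.exp(-2j * np.pi * f)) ** 2 + sigma_z2

def band_limited_psd(f, cutoff=0.25):
    """Ideal band-limited PSD: exactly zero outside |f| <= cutoff."""
    return np.where(np.abs(f) <= cutoff, 1.0, 0.0)

pw_ar1 = log_integral(ar1_plus_noise_psd)   # finite: the condition holds
pw_bl = log_integral(band_limited_psd)      # -inf: the condition fails
print(pw_ar1, pw_bl)
```

The white-noise floor keeps the first PSD bounded away from zero, so its log-integral is finite; the band-limited PSD is zero on a set of positive measure, so the log-integral diverges to $-\infty$.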

Theorem: Spectral Factorization

Let $\{Y_n\}$ be WSS with PSD $P_y(f)$ satisfying the Paley-Wiener condition. Then there exist functions $P_y^+(f)$ and $P_y^-(f)$ such that

$$P_y(f) = P_y^+(f)\, P_y^-(f), \qquad P_y^-(f) = \big(P_y^+(f)\big)^*,$$

where (i) $P_y^+(f)$ is causal: its inverse DTFT $p^+[n]$ is supported on $n \geq 0$; (ii) $P_y^-(f)$ is anti-causal: its inverse DTFT is supported on $n \leq 0$; (iii) $1/P_y^+(f)$ is also causal, and $1/P_y^-(f)$ is also anti-causal. We call $P_y^+(f)$ the minimum-phase (or causal, or spectral) factor.

Think of $P_y^+(f)$ as the frequency response of a stable, causal, minimum-phase filter whose squared magnitude is $P_y(f)$. Because both the filter and its inverse are causal and stable, passing the observation through $1/P_y^+(f)$ whitens it without losing any causal information: the filtering is invertible in real time.
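For a PSD known only through samples, one common numerical route to the minimum-phase factor is the cepstral (Kolmogorov) method: take the log of the PSD, keep the causal half of its Fourier coefficients, and exponentiate. The sketch below is an illustration under stated assumptions, not the text's own algorithm: it assumes a real, even, strictly positive PSD sampled on an even-length FFT grid, and the function name `cepstral_factor` and the test PSD parameters are hypothetical.

```python
import numpy as np

def cepstral_factor(psd_samples):
    """Minimum-phase factor on the grid f_k = k/N from PSD samples (N even)."""
    n = len(psd_samples)
    cep = np.fft.ifft(np.log(psd_samples)).real   # Fourier coefficients of log P_y
    fold = np.zeros(n)
    fold[0] = 0.5 * cep[0]                        # split the zero-lag term evenly
    fold[1:n // 2] = cep[1:n // 2]                # keep strictly causal lags
    fold[n // 2] = 0.5 * cep[n // 2]              # split the Nyquist term evenly
    return np.exp(np.fft.fft(fold))               # exp of a causal series is causal

# Illustrative AR(1)+noise PSD (parameter values are assumptions)
N = 1024
f = np.arange(N) / N
a, sigma_u2, sigma_z2 = 0.8, 1.0, 0.1
psd = sigma_u2 / np.abs(1 - a * np.exp(-2j * np.pi * f)) ** 2 + sigma_z2

factor = cepstral_factor(psd)
impulse = np.fft.ifft(factor).real          # approximates p^+[n]
inv_impulse = np.fft.ifft(1.0 / factor).real  # approximates the whitening filter w[n]

print(np.max(np.abs(np.abs(factor) ** 2 - psd)))   # squared magnitude matches the PSD
print(np.max(np.abs(impulse[N // 2 + 1:])))        # negligible anti-causal part
print(np.max(np.abs(inv_impulse[N // 2 + 1:])))    # inverse is causal too
```

On a length-$N$ FFT grid, indices above $N/2$ play the role of negative time, so near-zero values there indicate that both the factor and its inverse are causal up to aliasing error.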


Example: Spectral Factorization of the AR(1)+Noise PSD

For the observation $Y_n = X_n + Z_n$ with $X$ an AR(1) process of coefficient $a$ and innovation variance $\sigma_u^2$, and $Z$ white of variance $\sigma_z^2$, find the spectral factor $P_y^+(f)$ explicitly.
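One way to work this example, sketched here under the assumptions of real-valued processes and $0 < |a| < 1$: put the PSD over a common denominator and match its numerator $\sigma_u^2 + \sigma_z^2 (1 - a z^{-1})(1 - a z)$ to the form $c^2 (1 - b z^{-1})(1 - b z)$. Equating coefficients gives $c^2 b = \sigma_z^2 a$ and $c^2 (1 + b^2) = \sigma_u^2 + \sigma_z^2 (1 + a^2)$, so $b$ is the root of $b^2 - K b + 1 = 0$ inside the unit circle, and $P_y^+(z) = c\,(1 - b z^{-1})/(1 - a z^{-1})$. The symbols $K$ and $c$ and the function name `ar1_noise_factor` are notation introduced for this sketch.

```python
import numpy as np

def ar1_noise_factor(a, sigma_u2, sigma_z2):
    """Return (b, c) such that P_y^+(z) = c (1 - b z^-1) / (1 - a z^-1).

    Assumes 0 < |a| < 1 and sigma_z2 > 0 (illustrative derivation, see lead-in).
    """
    K = (sigma_u2 + sigma_z2 * (1 + a * a)) / (sigma_z2 * a)
    b = (K - np.sign(K) * np.sqrt(K * K - 4)) / 2   # root with |b| < 1
    c = np.sqrt(sigma_z2 * a / b)                   # a and b share a sign
    return b, c

# Numerical check on a frequency grid (parameter values are illustrative)
a, sigma_u2, sigma_z2 = 0.8, 1.0, 0.1
b, c = ar1_noise_factor(a, sigma_u2, sigma_z2)

f = np.linspace(-0.5, 0.5, 2001)
zinv = np.exp(-2j * np.pi * f)
psd = sigma_u2 / np.abs(1 - a * zinv) ** 2 + sigma_z2
factor = c * (1 - b * zinv) / (1 - a * zinv)

print(b, c)
print(np.max(np.abs(np.abs(factor) ** 2 - psd)))    # should be at round-off level
```

The pole at $z = a$ and the zero at $z = b$ both lie inside the unit circle, which is what makes this factor and its inverse causal and stable.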

Definition:

Innovations Process

Let $\{Y_n\}$ be WSS with a PSD satisfying the Paley-Wiener condition, and let $P_y^+(f)$ be its minimum-phase spectral factor. The innovations process $\{J_n\}$ is the output of passing $Y_n$ through the whitening filter $1/P_y^+(f)$:

$$J_n = \sum_{k \geq 0} w[k]\, Y_{n-k}, \qquad \text{where } \sum_{k \geq 0} w[k] e^{-j 2\pi f k} = \frac{1}{P_y^+(f)}.$$

The innovations satisfy (i) $\mathbb{E}[J_n] = 0$, (ii) $\mathbb{E}[J_n J_m^*] = \delta[n-m]$ (white, unit variance), and (iii) $\text{span}\{J_k : k \leq n\} = \text{span}\{Y_k : k \leq n\}$ (causal information equivalence).

Because $1/P_y^+$ is causal and has a causal inverse, $J_n$ can be computed from $\{Y_k : k \leq n\}$ in real time, and vice versa. The innovations are the "fresh news" contained in each new observation: the part that could not be predicted from the past. This is the innovations representation associated with Wold's decomposition (1938) of discrete-time stationary sequences.
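The whitening and its real-time invertibility can be illustrated by simulation. The sketch below assumes the AR(1)-plus-white-noise observation model with Gaussian drivers and a rational factor of the form $c\,(1 - b z^{-1})/(1 - a z^{-1})$; the closed-form expressions for `b` and `c`, the parameter values, the sample count, and the burn-in length are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
a, sigma_u2, sigma_z2 = 0.8, 1.0, 0.1          # illustrative parameters
n_samples, burn = 60_000, 1_000

# Assumed closed-form factor P_y^+(z) = c (1 - b z^-1) / (1 - a z^-1)
K = (sigma_u2 + sigma_z2 * (1 + a * a)) / (sigma_z2 * a)
b = (K - np.sqrt(K * K - 4)) / 2
c = np.sqrt(sigma_z2 * a / b)

u = rng.normal(scale=np.sqrt(sigma_u2), size=n_samples)
z = rng.normal(scale=np.sqrt(sigma_z2), size=n_samples)

x = np.zeros(n_samples)
y = np.zeros(n_samples)
j = np.zeros(n_samples)
for n in range(n_samples):
    x_prev = x[n - 1] if n > 0 else 0.0
    y_prev = y[n - 1] if n > 0 else 0.0
    j_prev = j[n - 1] if n > 0 else 0.0
    x[n] = a * x_prev + u[n]                    # AR(1) signal
    y[n] = x[n] + z[n]                          # noisy observation
    # Whitening filter 1/P_y^+: j[n] = b j[n-1] + (y[n] - a y[n-1]) / c
    j[n] = b * j_prev + (y[n] - a * y_prev) / c

# Inverse (shaping) filter P_y^+: recover y from j, again causally
y_rec = np.zeros(n_samples)
for n in range(n_samples):
    y_prev = y_rec[n - 1] if n > 0 else 0.0
    j_prev = j[n - 1] if n > 0 else 0.0
    y_rec[n] = a * y_prev + c * (j[n] - b * j_prev)

jj, yy = j[burn:], y[burn:]                     # drop the start-up transient
j_var = np.mean(jj ** 2)
j_lag1 = np.mean(jj[1:] * jj[:-1])
y_lag1_corr = np.mean(yy[1:] * yy[:-1]) / np.mean(yy ** 2)
print(j_var, j_lag1, y_lag1_corr)
```

In this sketch the sample variance of `j` should be close to 1 and its lag-1 correlation close to 0, while `y` itself remains strongly correlated; `y_rec` reproduces `y`, which is the "vice versa" direction.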

Pole-Zero Map of the Spectral Factors

For AR(1)+noise, plot the poles and zeros of $P_y^+(z)$ (inside the unit circle: minimum phase) and $P_y^-(z)$ (outside: reciprocal partners). The reciprocal symmetry is the signature of spectral factorization. Vary $a$ and SNR to see the zero $b$ move.
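The pole and zero locations behind such a plot can be computed directly. The following sketch makes several assumptions that are not stated in the text: SNR is taken to mean $\mathrm{Var}(X_n)/\sigma_z^2$ in dB, $\sigma_z^2$ is normalized to 1, and $0 < |a| < 1$; the function name `pole_zero_map` is hypothetical. The zeros are the roots of the numerator of $P_y(z)$ after clearing negative powers of $z$.

```python
import numpy as np

def pole_zero_map(a, snr_db, sigma_z2=1.0):
    """Zeros and poles of P_y(z) for AR(1)+noise (assumed SNR = Var(X)/sigma_z2)."""
    snr = 10.0 ** (snr_db / 10.0)
    sigma_u2 = snr * sigma_z2 * (1.0 - a * a)    # since Var(X) = sigma_u2 / (1 - a^2)
    # z * [sigma_u2 + sigma_z2 (1 - a/z)(1 - a z)] as a polynomial in z
    zeros = np.roots([-sigma_z2 * a,
                      sigma_u2 + sigma_z2 * (1.0 + a * a),
                      -sigma_z2 * a])
    poles = np.array([a, 1.0 / a])               # from (1 - a/z)(1 - a z)
    return zeros, poles

zeros, poles = pole_zero_map(a=0.8, snr_db=10.0)
inside_zero = zeros[np.abs(zeros) < 1.0]         # belongs to P_y^+
outside_zero = zeros[np.abs(zeros) > 1.0]        # belongs to P_y^-
print(inside_zero, outside_zero, poles)

# How the inside zero b moves as the SNR changes
for snr_db in (-10.0, 10.0, 30.0):
    zs, _ = pole_zero_map(0.8, snr_db)
    print(snr_db, zs[np.abs(zs) < 1.0])
```

The two zeros come out as a reciprocal pair $b$ and $1/b$, mirroring the pole pair $a$ and $1/a$. Under the assumed SNR definition, the inside zero moves toward the origin as the SNR grows and toward the pole $a$ as the noise dominates.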

Parameters: $a$ (default 0.80, range 0 to 0.95) and SNR (default 10, range -10 to 30).

Spectral Factorization and the Causal Cone

Geometric unfolding of $P_y(f) = |P_y^+(f)|^2$ via pole-zero placement. The minimum-phase factor $P_y^+(f)$ collects all poles and zeros inside the unit circle; its inverse is the whitening filter that produces the innovations $J_n$. This whitening is the key step that converts a colored-noise estimation problem into a white-noise one.

Historical Note: Wiener at MIT During the War

1940-1945

Norbert Wiener (1894-1964) derived what we now call the Wiener filter in 1942 as part of a classified report to the National Defense Research Committee, titled "Extrapolation, Interpolation, and Smoothing of Stationary Time Series." The motivating problem was anti-aircraft fire control: given a noisy radar track, predict where the aircraft would be when the shell arrived. Wiener's report circulated only among insiders (engineers called it "the yellow peril" because of its difficulty and its yellow binding) and was not openly published until 1949.

The filter was ahead of its time in several ways. It was one of the first systematic uses of second-order statistics in signal processing, and it introduced the spectral factorization machinery that would later become a central tool in control theory, filter design, and spectral estimation. Wiener, already famous for his work on Brownian motion and later known as the founder of cybernetics, characteristically wrote the report in a dense style that required engineer Julian Bigelow to produce an accessible companion document for practitioners.

Historical Note: Kolmogorov's Independent Discovery

1941

Independently of Wiener and a year earlier, Andrey Kolmogorov (1903-1987) published "Stationary Sequences in Hilbert Space" in the Bulletin of Moscow State University in 1941. Kolmogorov worked in the discrete-time setting (which is where we are in this chapter) and approached the problem from pure Hilbert-space geometry: the optimal linear predictor of $X_n$ from its past is the orthogonal projection onto the closed subspace generated by $\{X_m : m < n\}$. He derived what we now call the Kolmogorov-Szegő formula for the one-step prediction variance, $\sigma_p^2 = \exp\int_{-1/2}^{1/2} \log P_y(f)\, df$, a result of remarkable elegance.
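The one-step prediction variance formula can be checked numerically on a rational PSD. This is a minimal sketch under assumptions that are not in the text: it uses an AR(1)-plus-white-noise PSD with illustrative parameters, and it compares against $c^2$, the squared leading coefficient of an assumed closed-form factor $c\,(1 - b z^{-1})/(1 - a z^{-1})$, which is what the prediction variance should equal when the innovations have unit variance.

```python
import numpy as np

a, sigma_u2, sigma_z2 = 0.8, 1.0, 0.1           # illustrative parameters

# Left side: exp of the integral of log P_y over one period (uniform grid average)
n_grid = 4096
f = np.arange(n_grid) / n_grid - 0.5
psd = sigma_u2 / np.abs(1 - a * np.exp(-2j * np.pi * f)) ** 2 + sigma_z2
ks_variance = np.exp(np.mean(np.log(psd)))

# Right side: c^2 from the assumed closed-form spectral factor
K = (sigma_u2 + sigma_z2 * (1 + a * a)) / (sigma_z2 * a)
b = (K - np.sqrt(K * K - 4)) / 2                # zero inside the unit circle
c2 = sigma_z2 * a / b

print(ks_variance, c2)                          # the two values should agree
```

The agreement reflects that the zero-lag coefficient $p^+[0] = c$ of the minimum-phase factor scales the unit-variance innovation, which is the only part of the next sample that the past cannot predict.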

Kolmogorov's work reached the West only after the war. For a period in the 1950s there was a gentle priority dispute, but the dust settled on calling the continuous-time smoothing filter "Wiener" and the discrete-time predictor "Kolmogorov" or "Wiener-Kolmogorov." The fact that the same structure was discovered twice, on opposite sides of a world war, from a radar problem and from pure probability, is characteristic of deep mathematical ideas.

Common Mistake: Spectral Factorization Is Not Unique Without Causality

Mistake:

Writing $P_y(f) = |G(f)|^2$ for any $G$ and calling $G$ the spectral factor.

Correction:

There are infinitely many ways to factor $P_y(f)$ as a squared magnitude: multiply any such $G(f)$ by an all-pass filter $A(f)$ with $|A(f)| = 1$ and you get another factor. The unique choice is the minimum-phase factor $P_y^+(f)$, which is the one that is causal and stable and whose inverse is also causal and stable. Causality of the factor alone is not enough, since a causal all-pass times $P_y^+(f)$ is still causal; requiring a causal inverse as well pins down the factor up to a unit-modulus constant.
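The ambiguity is easy to exhibit numerically. In the sketch below, the values of `a`, `b`, and `c` are arbitrary illustrative choices (not derived from any particular PSD), and the first-order all-pass $(z^{-1} - b)/(1 - b z^{-1})$ is just one convenient example of a filter with unit magnitude on the unit circle.

```python
import numpy as np

a, b, c = 0.8, 0.3, 1.0                       # arbitrary illustrative values
f = np.linspace(-0.5, 0.5, 1001)
zinv = np.exp(-2j * np.pi * f)

min_phase = c * (1 - b * zinv) / (1 - a * zinv)   # zero at z = b (inside)
allpass = (zinv - b) / (1 - b * zinv)             # unit magnitude for real b
other = min_phase * allpass                       # = c (z^-1 - b) / (1 - a z^-1)

# Both factors have the same squared magnitude on the unit circle
print(np.max(np.abs(np.abs(min_phase) ** 2 - np.abs(other) ** 2)))

# Zero of each numerator as a polynomial in z: (z - b) versus (1 - b z)
zero_min_phase = np.roots([1.0, -b])[0]           # z = b
zero_other = np.roots([-b, 1.0])[0]               # z = 1/b
print(zero_min_phase, zero_other)
```

Both `min_phase` and `other` are causal and stable and reproduce the same PSD, but `other` has its zero at $1/b$, outside the unit circle, so its causal inverse is unstable and it cannot serve as a whitening filter.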

Innovations

The white process $J_n$ obtained by passing an observation $Y_n$ through the causal whitening filter $1/P_y^+(f)$. The innovation $J_n$ represents the new information in $Y_n$ beyond what could be predicted from $Y_{n-1}, Y_{n-2}, \ldots$.

Related: Minimum-Phase Spectral Factor, Whitening Filter

Minimum-Phase Spectral Factor

The function $P_y^+(f)$ in the factorization $P_y(f) = P_y^+(f) P_y^-(f)$ whose inverse DTFT is supported on $n \geq 0$ (causal) and whose reciprocal $1/P_y^+(f)$ is also causal. For rational PSDs it corresponds to placing all poles and zeros inside the unit circle.

Related: Innovations, Paley-Wiener Condition

Paley-Wiener Condition

The requirement $\int_{-1/2}^{1/2} \log P_y(f)\,df > -\infty$, which in particular rules out a PSD that vanishes on a set of positive measure. It is the necessary and sufficient condition for spectral factorization to exist and for the process to admit a causal white-noise representation.

Related: Minimum-Phase Spectral Factor

Whitening Filter

A causal filter whose output has unit-variance white autocorrelation when driven by a given colored input. For WSS $Y_n$ the whitening filter has frequency response $1/P_y^+(f)$.

Related: Innovations