Ferkans — Interactive Telecom Tutor

Why Spectral Factorization?

The non-causal Wiener filter was easy because convolution over all of $\mathbb{Z}$ turned into multiplication in the frequency domain. The causal Wiener filter is harder because the Wiener-Hopf equation now holds only for $\ell \geq 0$ , and a half-axis convolution equation has no one-line Fourier solution.

The trick — due to Wiener and Hopf themselves, refined by Kolmogorov — is to whiten the observation first. If we can write $Y_n$ as a causal LTI image of a white process $J_n$ , then causal estimation from $Y$ becomes causal estimation from $J$ , and because $J$ is uncorrelated across time, causal estimation from $J$ is trivial: just project on each $J_k$ independently. Spectral factorization is the tool that performs this whitening.

Definition: Paley-Wiener Condition

A WSS process $\{Y_n\}$ with PSD $P_y(f)$ satisfies the Paley-Wiener condition if $\int_{-1/2}^{1/2} \log P_y(f)\, df > -\infty.$ This is the integrability condition that $\log P_y$ must satisfy for the causal factor to exist. It fails, for example, if $P_y(f)$ vanishes on an interval, or more generally on any set of positive measure.

The Paley-Wiener condition separates processes that can be generated as causal LTI outputs of a white-noise driver, and whose one-step prediction error is therefore strictly positive, from those that cannot. Band-limited processes, for instance, fail the condition: their PSD vanishes on a whole band, and they can in principle be predicted perfectly from their infinite past.
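As a rough numerical illustration, the Paley-Wiener integral can be approximated on a frequency grid. The sketch below is not part of the original text: the helper name `paley_wiener_integral`, the two example PSDs, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def paley_wiener_integral(psd, num_points=4096):
    """Midpoint-rule approximation of the integral of log P_y(f) over [-1/2, 1/2]."""
    # Midpoints of a uniform grid covering one period (the period has length 1)
    f = (np.arange(num_points) + 0.5) / num_points - 0.5
    with np.errstate(divide="ignore"):  # log(0) = -inf is the intended outcome here
        return float(np.mean(np.log(psd(f))))

# Assumed example 1: AR(1) signal plus white noise, a strictly positive PSD
a, sigma_u2, sigma_z2 = 0.8, 1.0, 0.1

def psd_ar1_noise(f):
    return sigma_u2 / np.abs(1 - a * np.exp(-2j * np.pi * f)) ** 2 + sigma_z2

# Assumed example 2: ideal band-limited PSD, identically zero for |f| > 1/4
def psd_bandlimited(f):
    return np.where(np.abs(f) <= 0.25, 2.0, 0.0)

pw_regular = paley_wiener_integral(psd_ar1_noise)
pw_bandlimited = paley_wiener_integral(psd_bandlimited)

print("AR(1)+noise :", pw_regular)      # finite: the condition holds
print("band-limited:", pw_bandlimited)  # -inf: the condition fails
```

A finite grid can only suggest divergence. Here the band-limited PSD is exactly zero on grid points, so the approximation returns `-inf` directly.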

Theorem: Spectral Factorization

Let $\{Y_n\}$ be WSS with PSD $P_y(f)$ satisfying the Paley-Wiener condition. Then there exist functions $P_y^+(f)$ and $P_y^-(f)$ such that $P_y(f) = P_y^+(f)\, P_y^-(f), \qquad P_y^-(f) = \big(P_y^+(f)\big)^*,$ where (i) $P_y^+(f)$ is causal: its inverse DTFT $p^+[n]$ is supported on $n \geq 0$ ; (ii) $P_y^-(f)$ is anti-causal: its inverse DTFT is supported on $n \leq 0$ ; (iii) $1/P_y^+(f)$ is also causal, and $1/P_y^-(f)$ is also anti-causal. We call $P_y^+(f)$ the minimum-phase (or causal, or spectral) factor.

Think of $P_y^+(f)$ as the frequency response of a stable, causal, minimum-phase filter whose squared magnitude is $P_y(f)$ . Because both the filter and its inverse are causal and stable, passing the observation through $1/P_y^+(f)$ whitens it without losing any causal information: the filtering is invertible in real time.

Proof

Define the log-spectrum

Let $\Psi(f) = \log P_y(f)$ . By Paley-Wiener, $\Psi \in L^1[-1/2, 1/2]$ . Expand in Fourier series: $\Psi(f) = \sum_{n \in \mathbb{Z}} c_n e^{-j 2\pi f n}$ , with $c_n = \int_{-1/2}^{1/2} \Psi(f) e^{j 2\pi f n} df$ . Since $P_y$ is real and positive, $\Psi$ is real, so $c_{-n} = c_n^*$ .

Split into causal and anti-causal halves

Write $\Psi(f) = c_0 + \Psi^+(f) + \Psi^-(f)$ where $\Psi^+(f) = \sum_{n \geq 1} c_n e^{-j 2\pi f n}$ (causal, strictly) and $\Psi^-(f) = \sum_{n \leq -1} c_n e^{-j 2\pi f n}$ (anti-causal, strictly). Note $\Psi^-(f) = (\Psi^+(f))^*$ by the conjugate symmetry of $c_n$ .

Exponentiate and identify the factors

Define $P_y^+(f) = e^{c_0/2} \exp(\Psi^+(f))$ and $P_y^-(f) = e^{c_0/2} \exp(\Psi^-(f))$ . Then $P_y^+(f) P_y^-(f) = e^{c_0} \exp(\Psi^+ + \Psi^-) = e^{c_0} \cdot e^{\Psi - c_0} = e^{\Psi(f)} = P_y(f)$ .

Verify causality of $P_y^+$

$\Psi^+$ is a sum of $e^{-j 2\pi f n}$ with $n \geq 1$ only. Its exponential $\exp(\Psi^+) = \sum_{k=0}^{\infty} (\Psi^+)^k / k!$ is therefore a power series in $e^{-j 2\pi f}$: the $k$-th term $(\Psi^+)^k$ contains only harmonics $e^{-j 2\pi f n}$ with $n \geq k \geq 0$. The Fourier coefficients of $P_y^+$ therefore vanish for $n < 0$, establishing causality. The same argument applied to $\log(1/P_y^+) = -c_0/2 - \Psi^+$ shows that $1/P_y^+$ is causal as well.
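The construction in the proof translates almost line by line into an FFT-based procedure (often called the cepstral method). The following is a minimal sketch under stated assumptions: the PSD is sampled on the grid $f_k = k/N$ with $N$ even, the samples are strictly positive, and the function name `spectral_factor_cepstral` and the AR(1)+noise test parameters are illustrative choices, not taken from the text.

```python
import numpy as np

def spectral_factor_cepstral(psd_samples):
    """Samples of P_y^+ on the grid f_k = k/N, from strictly positive samples P_y(f_k).

    Assumes N = len(psd_samples) is even.
    """
    N = len(psd_samples)
    psi = np.log(psd_samples)             # Psi(f_k) = log P_y(f_k)
    # c_n = integral of Psi(f) e^{+j 2 pi f n} df, approximated by the inverse FFT
    c = np.fft.ifft(psi)
    # Keep c_0 / 2 plus the strictly causal coefficients c_n, n >= 1
    c_plus = np.zeros(N, dtype=complex)
    c_plus[0] = c[0] / 2
    c_plus[1:N // 2] = c[1:N // 2]
    c_plus[N // 2] = c[N // 2] / 2        # split the Nyquist bin evenly
    # log P_y^+(f_k) = sum_n c_plus[n] e^{-j 2 pi k n / N}; then exponentiate
    return np.exp(np.fft.fft(c_plus))

# Assumed test case: AR(1) signal plus white noise
N = 1024
f = np.arange(N) / N
a, sigma_u2, sigma_z2 = 0.8, 1.0, 0.1
psd = sigma_u2 / np.abs(1 - a * np.exp(-2j * np.pi * f)) ** 2 + sigma_z2

P_plus = spectral_factor_cepstral(psd)
p_plus = np.fft.ifft(P_plus)              # impulse response p^+[n] (indices wrap mod N)

factor_error = np.max(np.abs(np.abs(P_plus) ** 2 - psd))
anticausal_leak = np.max(np.abs(p_plus[N // 2 + 1:]))  # entries for n = -N/2+1, ..., -1

print("max | |P+|^2 - P_y | :", factor_error)
print("max |p+[n]|, n < 0   :", anticausal_leak)
```

Both printed quantities should be at rounding-error level for this smooth PSD. For spectra with sharp features, a larger $N$ is needed to keep time-domain aliasing of the coefficients $c_n$ small.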


Example: Spectral Factorization of the AR(1)+Noise PSD

For the observation $Y_n = X_n + Z_n$ with $X$ an AR(1) process of coefficient $a$ and innovation variance $\sigma_u^2$ , and $Z$ white of variance $\sigma_z^2$ , find the spectral factor $P_y^+(f)$ explicitly.

Solution

Write the PSD as a ratio of trig polynomials

$P_y(f) = \dfrac{\sigma_u^2}{|1 - a e^{-j 2\pi f}|^2} + \sigma_z^2 = \dfrac{\sigma_u^2 + \sigma_z^2 |1 - a e^{-j 2\pi f}|^2}{|1 - a e^{-j 2\pi f}|^2}.$ Expanding: $|1 - a e^{-j 2\pi f}|^2 = 1 - 2a \cos(2\pi f) + a^2$ . The numerator becomes $N_0 - 2 N_1 \cos(2\pi f)$ with $N_0 = \sigma_u^2 + \sigma_z^2(1 + a^2)$ and $N_1 = \sigma_z^2 a$ .

Match a minimum-phase numerator

Write the numerator as $|\beta(1 - b e^{-j 2\pi f})|^2 = \beta^2 (1 + b^2 - 2 b \cos(2\pi f))$ and match: $\beta^2(1 + b^2) = N_0$ , $\beta^2 b = N_1 = \sigma_z^2 a$ . The latter gives $\beta^2 = \sigma_z^2 a / b$ ; substituting into the former yields a quadratic in $b$ : $b^2 - \dfrac{N_0}{\sigma_z^2 a}\, b + 1 = 0$ . The two roots are reciprocals; pick the one with $|b| < 1$ .

Write the spectral factor

$P_y^+(f) = \dfrac{\beta (1 - b e^{-j 2\pi f})}{1 - a e^{-j 2\pi f}}.$ Both the zero at $z = b$ and the pole at $z = a$ lie strictly inside the unit circle, so $P_y^+$ and $1/P_y^+$ are both causal and stable.
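The three steps of the solution can be checked numerically. This is a small sketch, assuming $a \neq 0$ and $\sigma_u^2 > 0$; the function name `ar1_noise_spectral_factor` and the parameter values are illustrative and do not come from the text.

```python
import numpy as np

def ar1_noise_spectral_factor(a, sigma_u2, sigma_z2):
    """Return (beta, b) with P_y^+(f) = beta (1 - b e^{-j2 pi f}) / (1 - a e^{-j2 pi f}).

    Assumes a != 0 and sigma_u2 > 0, so the quadratic has two distinct real roots.
    """
    N0 = sigma_u2 + sigma_z2 * (1 + a ** 2)
    N1 = sigma_z2 * a
    # Roots of b^2 - (N0 / N1) b + 1 = 0 are reciprocals; keep the one with |b| < 1
    roots = np.roots([1.0, -N0 / N1, 1.0])
    b = float(np.real(roots[np.argmin(np.abs(roots))]))
    beta = float(np.sqrt(N1 / b))         # from beta^2 b = N1
    return beta, b

a, sigma_u2, sigma_z2 = 0.8, 1.0, 0.1     # assumed example values
beta, b = ar1_noise_spectral_factor(a, sigma_u2, sigma_z2)

# Compare |P_y^+(f)|^2 with P_y(f) on a frequency grid
f = np.linspace(-0.5, 0.5, 1001)
w = np.exp(-2j * np.pi * f)
psd = sigma_u2 / np.abs(1 - a * w) ** 2 + sigma_z2
P_plus = beta * (1 - b * w) / (1 - a * w)
max_error = np.max(np.abs(np.abs(P_plus) ** 2 - psd))

print("beta =", beta, " b =", b)
print("max | |P+|^2 - P_y | =", max_error)
```

Selecting the root of smaller modulus is what makes the numerator minimum phase. The other root, $1/b$, gives the same magnitude response but a zero outside the unit circle.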

Definition: Innovations Process

Let $\{Y_n\}$ be a zero-mean WSS process whose PSD satisfies the Paley-Wiener condition, and let $P_y^+(f)$ be its minimum-phase spectral factor. The innovations process $\{J_n\}$ is the output of passing $Y_n$ through the whitening filter $1/P_y^+(f)$: $J_n = \sum_{k \geq 0} w[k]\, Y_{n-k}, \qquad \text{where } \sum_k w[k] e^{-j 2\pi f k} = \frac{1}{P_y^+(f)}.$ The innovations satisfy (i) $\mathbb{E}[J_n] = 0$, (ii) $\mathbb{E}[J_n J_m^*] = \delta[n-m]$ (white, unit variance), and (iii) $\text{span}\{J_k : k \leq n\} = \text{span}\{Y_k : k \leq n\}$ (causal information equivalence).

Because $1/P_y^+$ is causal with a causal inverse, $J_n$ can be computed from $Y_n$ in real time, and vice versa. The innovations are the "fresh news" contained in each new observation: the part that could not be predicted from the past. This is the innovations representation that underlies the Wold decomposition of a stationary sequence (Wold, 1938).
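For the AR(1)+noise example, the whitening filter $1/P_y^+(f) = (1 - a e^{-j 2\pi f}) / \big(\beta (1 - b e^{-j 2\pi f})\big)$ corresponds to the recursion $J_n = b J_{n-1} + (Y_n - a Y_{n-1})/\beta$. The simulation below is a sketch under assumptions: Gaussian driving noises, $a > 0$, and illustrative parameter values, sample size, seed, and burn-in length that are not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed, for reproducibility only
a, sigma_u2, sigma_z2 = 0.8, 1.0, 0.1     # assumed example values, with a > 0
num_samples, burn_in = 100_000, 1_000

# Spectral factor parameters from the quadratic b^2 - (N0/N1) b + 1 = 0
N0 = sigma_u2 + sigma_z2 * (1 + a ** 2)
N1 = sigma_z2 * a
r = N0 / N1
b = (r - np.sqrt(r ** 2 - 4)) / 2         # root inside the unit circle (a > 0)
beta = np.sqrt(N1 / b)

# Simulate Y_n = X_n + Z_n with X an AR(1) process
u = rng.normal(0.0, np.sqrt(sigma_u2), num_samples)
z = rng.normal(0.0, np.sqrt(sigma_z2), num_samples)
x = np.zeros(num_samples)
for n in range(1, num_samples):
    x[n] = a * x[n - 1] + u[n]
y = x + z

# Causal whitening recursion: J_n = b J_{n-1} + (Y_n - a Y_{n-1}) / beta
j = np.zeros(num_samples)
for n in range(1, num_samples):
    j[n] = b * j[n - 1] + (y[n] - a * y[n - 1]) / beta

def sample_autocorr(v, max_lag):
    """Sample autocorrelation at lags 0..max_lag, after discarding the start-up transient."""
    v = v[burn_in:]
    return np.array([np.mean(v[k:] * v[:len(v) - k]) for k in range(max_lag + 1)])

r_y = sample_autocorr(y, 3)
r_j = sample_autocorr(j, 3)
print("r_y[0..3] =", r_y)                 # clearly correlated across lags
print("r_j[0..3] =", r_j)                 # close to [1, 0, 0, 0]
```

The recursion uses only present and past samples of $Y$, which is the practical meaning of "computed in real time".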

Pole-Zero Map of the Spectral Factors

For AR(1)+noise, plot the poles and zeros of $P_y^+(z)$ (inside the unit circle: minimum phase) and $P_y^-(z)$ (outside: reciprocal partners). The reciprocal symmetry is the signature of spectral factorization. Vary $a$ and SNR to see the zero $b$ move.

Parameters: AR coefficient $a$ (slider from 0 to 0.95, shown at 0.80); SNR in dB (slider from -10 to 30, shown at 10).

Spectral Factorization and the Causal Cone

Geometric unfolding of $P_y(f) = |P_y^+(f)|^2$ via pole-zero placement. The minimum-phase factor $P_y^+(f)$ collects all poles and zeros inside the unit circle; its inverse is the whitening filter that produces the innovations $J_n$. This whitening is the key step that converts a colored-noise estimation problem into a white-noise one.

Historical Note: Wiener at MIT During the War

1940-1945

Norbert Wiener (1894-1964) derived what we now call the Wiener filter in 1942 as part of a classified report to the National Defense Research Committee, titled "Extrapolation, Interpolation, and Smoothing of Stationary Time Series." The motivating problem was anti-aircraft fire control: given a noisy radar track, predict where the aircraft would be when the shell arrived. Wiener's report circulated only among insiders — engineers called it "the yellow peril" because of its difficulty and its yellow binding — and was not openly published until 1949.

The filter was ahead of its time in several ways. It was one of the first systematic uses of second-order statistics in signal processing, and it introduced the spectral factorization machinery that would later become a central tool in control theory, filter design, and spectral estimation. Wiener, already famous for his work on Brownian motion and harmonic analysis, characteristically wrote the report in a dense style that required engineer Julian Bigelow to produce an accessible companion document for practitioners.

Historical Note: Kolmogorov's Independent Discovery

1941

Independently of Wiener and a year earlier, Andrey Kolmogorov (1903-1987) published "Stationary Sequences in Hilbert Space" in the Bulletin of Moscow State University in 1941. Kolmogorov worked in the discrete-time setting (which is where we are in this chapter) and approached the problem from pure Hilbert-space geometry: the optimal linear predictor of $Y_n$ from its past is the orthogonal projection onto the closed subspace generated by $\{Y_m : m < n\}$. He derived what we now call the Kolmogorov-Szegő formula for the one-step prediction error variance, $\sigma_p^2 = \exp\left(\int_{-1/2}^{1/2} \log P_y(f)\, df\right)$, a result of remarkable elegance.

Kolmogorov's work reached the West only after the war. For a period in the 1950s there was a gentle priority dispute, but the dust settled on calling the continuous-time smoothing filter "Wiener" and the discrete-time predictor "Kolmogorov" or "Wiener-Kolmogorov." The fact that the same structure was discovered twice, on opposite sides of a world war, from a radar problem and from pure probability, is characteristic of deep mathematical ideas.
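The prediction-variance formula quoted in this note can be checked against the AR(1)+noise example, where the constant $c_0$ of the log-spectrum equals $\log \beta^2$ and the formula should therefore return $\beta^2$. The sketch below assumes $a > 0$ and uses illustrative parameter values and a midpoint-rule grid; none of these choices come from the text.

```python
import numpy as np

a, sigma_u2, sigma_z2 = 0.8, 1.0, 0.1     # assumed example values, with a > 0

# Left side: exp of the integral of log P_y(f) over one period (midpoint rule)
num_points = 4096
f = (np.arange(num_points) + 0.5) / num_points - 0.5
psd = sigma_u2 / np.abs(1 - a * np.exp(-2j * np.pi * f)) ** 2 + sigma_z2
sigma_p2_formula = float(np.exp(np.mean(np.log(psd))))

# Right side: beta^2 from the closed-form spectral factor of the example
N0 = sigma_u2 + sigma_z2 * (1 + a ** 2)
N1 = sigma_z2 * a
r = N0 / N1
b = (r - np.sqrt(r ** 2 - 4)) / 2         # root inside the unit circle (a > 0)
beta2 = N1 / b

print("exp(integral of log P_y) =", sigma_p2_formula)
print("beta^2                   =", beta2)
```

The agreement reflects that $p^+[0] = \beta$ is the gain applied to the newest innovation, so $\beta^2$ is the variance of the part of $Y_n$ that the past cannot predict.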

Common Mistake: Spectral Factorization Is Not Unique Without Causality

Mistake:

Writing $P_y(f) = |G(f)|^2$ for any $G$ and calling $G$ the spectral factor.

Correction:

There are infinitely many ways to factor $P_y(f)$ as a squared magnitude: multiply any such $G(f)$ by an all-pass filter $A(f)$ with $|A(f)| = 1$ and you get another factor. The canonical choice is the minimum-phase factor $P_y^+(f)$, the one that is causal and stable and whose inverse is also causal and stable. Causality of $G$ alone is not enough, since a causal all-pass $A$ keeps $G A$ causal; requiring both the factor and its inverse to be causal pins it down up to a unit-modulus constant.
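A concrete instance of this non-uniqueness, sketched for the AR(1)+noise example: multiplying the minimum-phase factor by a first-order all-pass section moves the zero from $z = b$ to $z = 1/b$ without changing the squared magnitude. The parameter values and variable names below are illustrative assumptions, with $a > 0$.

```python
import numpy as np

a, sigma_u2, sigma_z2 = 0.8, 1.0, 0.1     # assumed example values, with a > 0
N0 = sigma_u2 + sigma_z2 * (1 + a ** 2)
N1 = sigma_z2 * a
r = N0 / N1
b = (r - np.sqrt(r ** 2 - 4)) / 2         # minimum-phase zero, |b| < 1
beta = np.sqrt(N1 / b)

f = np.linspace(-0.5, 0.5, 1001)
w = np.exp(-2j * np.pi * f)               # w = e^{-j 2 pi f}
psd = sigma_u2 / np.abs(1 - a * w) ** 2 + sigma_z2

G_min = beta * (1 - b * w) / (1 - a * w)  # zero at z = b (inside the unit circle)
allpass = (w - b) / (1 - b * w)           # unit modulus on the unit circle (b real)
G_alt = G_min * allpass                   # zero moved to z = 1/b (outside)

err_min = np.max(np.abs(np.abs(G_min) ** 2 - psd))
err_alt = np.max(np.abs(np.abs(G_alt) ** 2 - psd))
phase_gap = np.max(np.abs(G_min - G_alt))

print("| |G_min|^2 - P_y | max:", err_min)
print("| |G_alt|^2 - P_y | max:", err_alt)
print("max |G_min - G_alt|    :", phase_gap)  # nonzero: the factors differ in phase
print("zero of G_alt at z =", 1 / b)          # outside the unit circle
```

Both factors reproduce $P_y(f)$, but the causal inverse of `G_alt` would have a pole at $1/b$ outside the unit circle and so cannot be stable. This is why only `G_min` yields a usable whitening filter.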

Innovations

The white process $J_n$ obtained by passing an observation $Y_n$ through the causal whitening filter $1/P_y^+(f)$ . The innovation $J_n$ represents the new information in $Y_n$ beyond what could be predicted from $Y_{n-1}, Y_{n-2}, \ldots$ .

Minimum-Phase Spectral Factor

The function $P_y^+(f)$ in the factorization $P_y(f) = P_y^+(f) P_y^-(f)$ whose inverse DTFT is supported on $n \geq 0$ (causal) and whose reciprocal $1/P_y^+(f)$ is also causal. For rational PSDs it corresponds to placing all poles and zeros inside the unit circle.

Paley-Wiener Condition

The requirement $\int_{-1/2}^{1/2} \log P_y(f)\,df > -\infty$, which in particular rules out a PSD that vanishes on a set of positive measure. It is the necessary and sufficient condition for spectral factorization to exist and for the process to admit a causal white-noise representation.

Related: Minimum-Phase Spectral Factor

Whitening Filter

A causal filter whose output has unit-variance white autocorrelation when driven by a given colored input. For WSS $Y_n$ the whitening filter has frequency response $1/P_y^+(f)$ .

Related: Innovations

Spectral Factorization and Innovations