This introduction is directly adapted from Matlab documentation regarding the Signal Processing Toolbox. More detailed description, including the whole document referred before, can be found typing:
>> doc signal
in the Matlab terminal.

Background Information

The goal of spectral estimation is to describe the distribution (over frequency) of the power contained in a signal, based on a finite set of data. Estimation of power spectra is useful in a variety of applications, including the detection of signals buried in wide-band noise.

Power Spectral Density Function

The power spectral density (PSD) of a stationary random process xn is mathematically related to the correlation sequence by the discrete-time Fourier transform. In terms of normalized frequency, this is given by

This can be written as a function of physical frequency f (e.g., in hertz) by using the relation ω = 2πf/fs, where fs is the sampling frequency.

The correlation sequence can be derived from the PSD by use of the inverse discrete-time Fourier transform:

The average power of the sequence xn over the entire Nyquist interval is represented by

The average power of a signal over a particular frequency band , , can be found by integrating the PSD over that band:

You can see from the above expression that Pxx(ω) represents the power content of a signal in an infinitesimal frequency band, which is why it is called the power spectral density.

The units of the PSD are power (e.g., watts) per unit of frequency. In the case of Pxx(ω), this is watts/radian/sample or simply watts/radian. In the case of Pxx(f), the units are watts/hertz. Integration of the PSD with respect to frequency yields units of watts, as expected for the average power .

For real signals, the PSD is symmetric about DC, and thus Pxx(ω) for is sufficient to completely characterize the PSD. However, to obtain the average power over the entire Nyquist interval, it is necessary to introduce the concept of the one-sided PSD.

The one-sided PSD is given by

The average power of a signal over the frequency band , , can be computed using the one-sided PSD as

Cross-Spectral Density Function

The PSD is a special case of the cross spectral density (CPSD) function, defined between two signals xn and yn as

As is the case for the correlation and covariance sequences, the toolbox estimates the PSD and CPSD because signal lengths are finite.

To estimate the cross-spectral density of two equal length signals x and y using Welch's method, the cpsd function forms the periodogram as the product of the FFT of x and the conjugate of the FFT of y. Unlike the real-valued PSD, the CPSD is a complex function. cpsd handles the sectioning and windowing of x and y in the same way as the pwelch function:

Sxy = cpsd(x,y,nwin,noverlap,nfft,fs)

Transfer Function Estimate

One application of Welch's method is nonparametric system identification. Assume that H is a linear, time invariant system, and x(n) and y(n) are the input to and output of H, respectively. Then the power spectrum of x(n) is related to the CPSD of x(n) and y(n) by

An estimate of the transfer function between x(n) and y(n) is

This method estimates both magnitude and phase information. The tfe function uses Welch's method to compute the CPSD and power spectrum, and then forms their quotient for the transfer function estimate. Use tfe the same way that you use the cpsd function.

Coherence Function

The magnitude-squared coherence between two signals x(n) and y(n) is

This quotient is a real number between 0 and 1 that measures the correlation between x(n) and y(n) at the frequency ω.

The cohere function takes sequences x and y, computes their power spectra and CPSD, and returns the quotient of the magnitude squared of the CPSD and the product of the power spectra. Its options and operation are similar to the cpsd and tfestimate functions.

If the input sequence length nfft, window length window, and the number of overlapping data points in a window numoverlap, are such that cohere operates on only a single record, the function returns all ones. This is because the coherence function for linearly dependent data is one.

Spectral Estimation Methods

The various methods of spectrum estimation available in the toolbox are categorized as follows:

Nonparametric methods are those in which the PSD is estimated directly from the signal itself. The simplest such method is the periodogram. An improved version of the periodogram is Welch's method [8].

Nonparametric Methods

The following sections discuss the periodogram, modified periodogram, and Welch methods of nonparametric estimation.

Periodogram

In general terms, one way of estimating the PSD of a process is to simply find the discrete-time Fourier transform of the samples of the process (usually done on a grid with an FFT) and take the magnitude squared of the result. This estimate is called the periodogram.

The periodogram estimate of the PSD of a length-L signal xL[n] is

where

The actual computation of XL(f) can be performed only at a finite number of frequency points, N, and usually employs the FFT. In practice, most implementations of the periodogram method compute the N-point PSD estimate

where

It is wise to choose N > L so that N is the next power of two larger than L. To evaluate XL[fk], we simply pad xL[n] with zeros to length N. If L > N, we must wrap xL[n] modulo-N prior to computing XL[fk].

Performance of the Periodogram

The following sections discuss the performance of the periodogram with regard to the issues of leakage, resolution, bias, and variance.

Spectral Leakage.   Consider the PSD of a finite-length signal xL[n], as discussed in the Periodogram section. It is frequently useful to interpret xL[n] as the result of multiplying an infinite signal, x[n], by a finite-length rectangular window, wR[n]:

Because multiplication in the time domain corresponds to convolution in the frequency domain, the Fourier transform of the expression above is

The expression developed earlier for the periodogram,

illustrates that the periodogram is also influenced by this convolution.

The effect of the convolution is best understood for sinusoidal data. Suppose that x[n] is composed of a sum of M complex sinusoids:

Its spectrum is

which for a finite-length sequence becomes

So in the spectrum of the finite-length signal, the Dirac deltas have been replaced by terms of the form , which corresponds to the frequency response of a rectangular window centered on the frequency fk.

The frequency response of a rectangular window has the shape of a sinc signal, as shown below.

The plot displays a main lobe and several side lobes, the largest of which is approximately 13.5 dB below the mainlobe peak. These lobes account for the effect known as spectral leakage. While the infinite-length signal has its power concentrated exactly at the discrete frequency points fk, the windowed (or truncated) signal has a continuum of power "leaked" around the discrete frequency points fk.

Because the frequency response of a short rectangular window is a much poorer approximation to the Dirac delta function than that of a longer window, spectral leakage is especially evident when data records are short.

It is important to note that the effect of spectral leakage is contingent solely on the length of the data record. It is not a consequence of the fact that the periodogram is computed at a finite number of frequency samples.

Resolution.   Resolution refers to the ability to discriminate spectral features, and is a key concept on the analysis of spectral estimator performance.

In order to resolve two sinusoids that are relatively close together in frequency, it is necessary for the difference between the two frequencies to be greater than the width of the mainlobe of the leaked spectra for either one of these sinusoids. The mainlobe width is defined to be the width of the mainlobe at the point where the power is half the peak mainlobe power (i.e., 3 dB width). This width is approximately equal to fs / L.

In other words, for two sinusoids of frequencies f1 and f2, the resolvability condition requires that

In the example above, where two sinusoids are separated by only 10 Hz, the data record must be greater than 100 samples to allow resolution of two distinct sinusoids by a periodogram.

The above discussion about resolution did not consider the effects of noise since the signal-to-noise ratio (SNR) has been relatively high thus far. When the SNR is low, true spectral features are much harder to distinguish, and noise artifacts appear in spectral estimates based on the periodogram. The example below illustrates this:

randn('state',0)
fs = 1000;                  % Sampling frequency
t = (0:fs/10)./fs;          % One-tenth second worth of samples
A = [1 2];                  % Sinusoid amplitudes
f = [150;140];              % Sinusoid frequencies
xn = A*sin(2*pi*f*t) + 2*randn(size(t));
Hs=spectrum.periodogram;
psd(Hs,xn,'Fs',fs,'NFFT',1024)

Bias of the Periodogram.   The periodogram is a biased estimator of the PSD. Its expected value can be shown to be

which is similar to the first expression for XL(f) in Spectral Leakage, except that the expression here is in terms of average power rather than magnitude. This suggests that the estimates produced by the periodogram correspond to a leaky PSD rather than the true PSD.

Note that essentially yields a triangular Bartlett window (which is apparent from the fact that the convolution of two rectangular pulses is a triangular pulse). This results in a height for the largest sidelobes of the leaky power spectra that is about 27 dB below the mainlobe peak; i.e., about twice the frequency separation relative to the non-squared rectangular window.

The periodogram is asymptotically unbiased, which is evident from the earlier observation that as the data record length tends to infinity, the frequency response of the rectangular window more closely approximates the Dirac delta function (also true for a Bartlett window). However, in some cases the periodogram is a poor estimator of the PSD even when the data record is long. This is due to the variance of the periodogram, as explained below.

Variance of the Periodogram.   The variance of the periodogram can be shown to be approximately

which indicates that the variance does not tend to zero as the data length L tends to infinity. In statistical terms, the periodogram is not a consistent estimator of the PSD. Nevertheless, the periodogram can be a useful tool for spectral estimation in situations where the SNR is high, and especially if the data record is long.

The Modified Periodogram

The modified periodogram windows the time-domain signal prior to computing the FFT in order to smooth the edges of the signal. This has the effect of reducing the height of the sidelobes or spectral leakage. This phenomenon gives rise to the interpretation of sidelobes as spurious frequencies introduced into the signal by the abrupt truncation that occurs when a rectangular window is used. For nonrectangular windows, the end points of the truncated signal are attenuated smoothly, and hence the spurious frequencies introduced are much less severe. On the other hand, nonrectangular windows also broaden the mainlobe, which results in a net reduction of resolution.

The periodogram function allows you to compute a modified periodogram by specifying the window to be used on the data. For example, compare a default rectangular window and a Hamming window:

randn('state',0)
fs = 1000;                  % Sampling frequency
t = (0:fs/10)./fs;          % One-tenth second worth of samples
A = [1 2];                  % Sinusoid amplitudes
f = [150;140];              % Sinusoid frequencies
xn = A*sin(2*pi*f*t) + 0.1*randn(size(t));
Hrect = spectrum.periodogram;
psd(Hrect,xn,'Fs',fs,'NFFT',1024);

Hhamm = spectrum.periodogram('Hamming');
psd(Hhamm,xn,'Fs',fs,'NFFT',1024);

You can verify that although the sidelobes are much less evident in the Hamming-windowed periodogram, the two main peaks are wider. In fact, the 3 dB width of the mainlobe corresponding to a Hamming window is approximately twice that of a rectangular window. Hence, for a fixed data length, the PSD resolution attainable with a Hamming window is approximately half that attainable with a rectangular window. The competing interests of mainlobe width and sidelobe height can be resolved to some extent by using variable windows such as the Kaiser window.

Nonrectangular windowing affects the average power of a signal because some of the time samples are attenuated when multiplied by the window. To compensate for this, the periodogram function normalizes the window to have an average power of unity. This way the choice of window does not affect the average power of the signal.

The modified periodogram estimate of the PSD is

where U is the window normalization constant

which is independent of the choice of window. The addition of U as a normalization constant ensures that the modified periodogram is asymptotically unbiased.

Welch's Method

An improved estimator of the PSD is the one proposed by Welch [8]. The method consists of dividing the time series data into (possibly overlapping) segments, computing a modified periodogram of each segment, and then averaging the PSD estimates. The result is Welch's PSD estimate.

Welch's method is implemented in the toolbox by the spectrum.welch object or pwelch function. By default, the data is divided into eight segments with 50% overlap between them. A Hamming window is used to compute the modified periodogram of each segment.

The averaging of modified periodograms tends to decrease the variance of the estimate relative to a single periodogram estimate of the entire data record. Although overlap between segments tends to introduce redundant information, this effect is diminished by the use of a nonrectangular window, which reduces the importance or weight given to the end samples of segments (the samples that overlap).

However, as mentioned above, the combined use of short data records and nonrectangular windows results in reduced resolution of the estimator. In summary, there is a trade-off between variance reduction and resolution. One can manipulate the parameters in Welch's method to obtain improved estimates relative to the periodogram, especially when the SNR is low.

For a more detailed discussion of Welch's method of PSD estimation, see Kay [2] and Welch [8].

Bias and Normalization in Welch's Method

Welch's method yields a biased estimator of the PSD. The expected value can be found to be

where Ls is the length of the data segments and U is the same normalization constant present in the definition of the modified periodogram. As is the case for all periodograms, Welch's estimator is asymptotically unbiased. For a fixed length data record, the bias of Welch's estimate is larger than that of the periodogram because Ls < L.

The variance of Welch's estimator is difficult to compute because it depends on both the window used and the amount of overlap between segments. Basically, the variance is inversely proportional to the number of segments whose modified periodograms are being averaged.

References

[1] Hayes, M.H. Statistical Digital Signal Processing and Modeling. New York: John Wiley & Sons, 1996.

[2] Kay, S.M. Modern Spectral Estimation. Englewood Cliffs, NJ: Prentice Hall, 1988.

[3] Marple, S.L. Digital Spectral Analysis. Englewood Cliffs, NJ: Prentice Hall, 1987.

[4] Orfanidis, S.J. Introduction to Signal Processing. Upper Saddle River, NJ: Prentice Hall, 1996.

[5] Percival, D.B., and A.T. Walden. Spectral Analysis for Physical Applications: Multitaper and Conventional Univariate Techniques. Cambridge: Cambridge University Press, 1993.

[6] Proakis, J.G., and D.G. Manolakis. Digital Signal Processing: Principles, Algorithms, and Applications. Englewood Cliffs, NJ: Prentice Hall, 1996.

[7] Stoica, P., and R. Moses. Introduction to Spectral Analysis. Upper Saddle River, NJ: Prentice Hall, 1997.

[8] Welch, P.D. " The Use of Fast Fourier Transform for the Estimation of Power Spectra: A Method Based on Time Averaging Over Short, Modified Periodograms.IEEE Trans. Audio Electroacoust. Vol. AU-15 (June 1967). Pgs.70-73.