When ``colored" continuum power spectrum components resulting from source variability are present, the statistical distribution of the corresponding power estimates cannot be derived in general from first principles. In the presence of extensive and repeated observations, the statistical properties of these components could be obtained directly from the data. In practice this is difficult to do, because of the limited duration of the observation and the characteristic red-noise spectra that are commonly found. An additional limitation derives from the fact that many cosmic sources display different activity states, often characterised by different luminosity and/or energy spectrum properties. A given activity state can last for time intervals as short as minutes; in same cases this imposes the tightest constraint on the lowest frequencies that can be studied in the sample spectrum, without violating the hypothesis of stationarity.
A single sample spectrum is often calculated over the
entire observation duration, T, in order to explore the lowest possible
frequencies, while maintaining the highest Fourier resolution
(
). In this
case only one power estimate is obtained for each Fourier frequency and the
statistical distribution of the noise component(s) from the source remains
unexplored.
Alternatively the observation can be divided in a series of M consecutive
intervals and the distribution of the power estimates investigated over the
ensemble of the sample spectra from individual intervals of duration T/M.
Clearly this approach does not allow the sampling of the
frequencies between 1/T and M/T, i.e. the lowest frequencies in the
power spectrum and degrades by a factor of M the original Fourier
resolution. To illustrate this, a
5.5 hr long observation
of the accreting black hole candidate Cyg X-1 in its so-called ``low
state'', one of the most variable X-ray binary sources in the sky, was
analysed.
The 7.8 ms resolved X-ray light curve (1-20 keV energy range) was
divided into M=1244 intervals of 16 s and a sample spectrum calculated
for each interval. The sample spectrum obtained by averaging these M spectra
is given in Figure 3.1 (upper panel).
Figure 3.1 (lower panels) shows the distribution of the power estimates for
selected frequencies over the M sample spectra. Each distribution is
normalised by
, where
is the estimate of the average
power at the j-th Fourier frequency
. It is apparent that in
all cases the distribution is close to a
probability
distribution function (hereafter pdf) (also plotted for comparison). A
Kolmogorov-Smirnov test
gives a probability of
20-60% that the observed distributions
result from a
pdf. Similar results were obtained for
the sample spectra from the light curves of a few other accreting
compact stars in X-ray binaries.
By extrapolating these results, we assume that, (for a given
activity state), the ``colored" noise components in the sample spectra of
cosmic sources also follow a rescaled
-distribution (see also van der Klis 1989b). Some caution is
necessary for red noise spectra with a power law slope steeper than -2.
In these cases the source variability on timescales comparable to those
over which the sample spectrum is calculated, can cause a substantial
low-frequency leakage, which in turn might alter the distribution of the
power estimates. To limit the effects of this leakage the technique
includes the possibility of subtracting polynomial trends from the light
curves (see Deeter 1984).
where
is the mean of y(t) and E[z(t)]=0.
The power spectrum,
of a linear process is given by:
where
is the power spectrum of z(t) and
is the frequency response function of the linear filter h(t).
The power spectrum is the average over the realisations of the
sample spectrum, i.e.
and
. The late equality implies that the sample
spectrum of the input white noise is normalised such as to follow a
pdf
(e.g. Jenkins & Watts 1968).
Given a white noise source and a suitable linear filter it is then
possible to
generate a random process with arbitrary spectrum.
In particular, it follows from eq. (3.2) that for a given frequency
,
the sample spectrum,
, of the linear process follows the same
distribution of the sample spectrum of the input white noise
, except for a rescaling factor of
.
Therefore, the pdf of
is
Based on the discussion above we adopt linear processes to model the sample power spectra (and their pdf) resulting from ``colored" noise variability of cosmic sources.
In practical applications the sample spectrum of astronomical time series
include a white noise component resulting from measurement
uncertainties (Poisson noise in the case of photon counting detectors). The
power estimates of the white noise resulting from Poisson statistics are
distributed according to a
pdf, if the normalisation
is adopted, where
is the total number of photons in the light
curve and
the complex Fourier amplitudes (see e.g. Leahy et al.
1983).
In the case of a Gaussian instrumental noise with mean zero and
variance
,
is to be replaced by
, where N is the number of points in the light curve.
Therefore in the regions of the
sample spectrum which are dominated by instrumental (white) noise,
the power estimates will follow a
pdf.
We assume that this instrumental white noise component
can be interpreted as the input process z(t), such that
eq. (
)
still holds. This assumption involves no
(statistical) approximation and allows to simplify considerably
the treatment.
In particular
it follows that if the square modulus of the frequency response function
were known, then multiplying
eq. (
) by
the spectrum of the (instrumental)
white
noise would be recovered. In this case
the search for significant power spectrum peaks
arising from a periodic signal could be carried out by using standard
techniques.
In practice
must be estimated through the sample spectrum.
One possibility would be to model the power spectrum continuum components
by adopting an appropriate maximum likelihood technique (Anderson,
Duvall & Jeffries 1990; Stella et al. 1996; Arlandi, Stella & Tagliaferri
1996) and use the best fit function to estimate
. This approach,
however, faces difficulties with the subjective choice of the model function
and, more crucially, the estimate of the statistical uncertainties of
the best fit at any given frequency.
Therefore we prefer to evaluate
through a suitable smoothing
algorithm.
As the goal of
any periodicity search
is to detect a sharp peak over the underlying sample spectrum continuum,
the power in a (possible)
peak should not affect the estimate of the continuum
(otherwise the sensitivity of the search would be reduced).
This implies that for each frequency
the continuum should be
estimated through an interpolation of the
sample spectrum which excludes
itself and uses the power
estimates over a range of nearby frequencies at the left and right of
. In the language of the smoothing functions, this corresponds to
a well-known class of spectral windows which are zero-valued at the central
frequency.
We adopt for simplicity a rectangular window (with a central gap)
that extends over a total of I Fourier frequencies, giving a
width of
.
A far better result is obtained if the smoothing over I Fourier frequencies
is distributed such that its logarithmic frequency width is
(approximately) the
same on both sides of
, i.e.
.
This approach builds on the obvious fact that a power law
spectrum is a straight line in a log-log representation.
Considering that
=
it follows that
In this scheme the smoothed spectrum
,
that we adopt as the estimator of 2
, is calculated as follows
where
and
(rounded to the nearest integers)
are the number of Fourier frequencies
in
and
, respectively.
By propagating eq. (
) we obtain the variance
of the
variables over the smoothing formula (cf. eq.
)
The solid lines in Figure 3.2 show the estimate of the continuum power
spectrum components (and therefore of 2
) obtained by using the
above technique; it is apparent that also the low-frequency end of
spectra B and C is reproduced quite well, and that edge effects are
nearly absent.
In general I, the number of Fourier frequencies defining the smoothing
width, is
to be adjusted so as to closely follow the sharpest continuous features
of the
sample spectrum (something that favors low values of I), while
maintaining the noise of the smoothed spectrum as low as possible
(something that favors high
values of I). To this aim we consider the divided sample spectrum
, i.e. the ratio of the sample spectrum and the smoothed
spectrum for a
range of different values of I. If
provides a reasonably good
estimate of
, then
will approximately follow the
-distribution of the input white noise, at least for
relatively small values of
(
). A
Kolmogorov-Smirnov (KS) test can be used in order to derive out of
different trial values of the width
I the one that makes the distribution of
closest to a
pdf. The KS test is especially sensitive to differences away from the
tails of the distributions (see e.g. Press et al. 1992).
In the following we adopt
, the smoothed sample
spectrum with a width
that maximises the probability of the KS test described above.
In practice values of
between 30-40 and the number of Fourier
frequencies in the sample spectrum are to be used.
The smoothed sample spectrum
provides the estimate of
, in the sense previously discussed.
Therefore
we adopt the divided spectrum
as the
estimator of the
white noise spectrum of the input linear process. The search for
coherent periodicities in the data thus translates into the problem of
detecting significant peaks in
. This, in turn, requires a detailed
knowledge of the expected pdf of
, especially for high values.
For each Fourier frequency
,
is to be regarded as the
ratio of the random variables
and
.
is distributed like a
pdf rescaled to an expectation
value of
.
By approximating
with
we have:
The distribution of
is in general a suitable linear combination
of the
random variables
used in the smoothing. These, in turn,
are distributed like a rescaled
(cf. eq. 3.8).
For sufficiently high values of
, one can appeal to the
central limit theorem and approximate the distribution of
with a
Gaussian distribution of mean
and variance
(cf. eq.
and
), i.e.
Note that
and
can be
regarded, for any given j, as statistically independent variables
(indeed
is not used in the computation of
).
In this case the pdf of
can be written as
(e.g. Mood, Graybill & Boes 1974)
where we have used the fact that the joint pdf
is given by the product of
and
.
It is clear from the discussion above that except for the first and last
5-6 Fourier frequencies of each spectrum, the approximate pdf derived
above
for
provides accurate results, as long as
.
The technique described was developed in order
to approximate through a suitable smoothing
the ``colored'' noise components from the source variability and recover a
white
noise sample spectrum in which the search for narrow peaks can be carried out
by applying the pdf of the divided spectrum.
Given a sample spectrum
, the divided spectrum
is
calculated for a given smoothing width and its distribution compared to
a
pdf by using a KS test. This is repeated for smoothing
widths ranging from a maximum of
twice the number of Fourier frequencies in the sample spectrum
to a minimum of 30-40, with a spacing
of half an octave. The smoothing width
that is found to produce the
highest KS probability is then adopted for the rest of the analysis. The
divided spectrum
is then searched for significant peaks testifying
to the presence of a periodic modulation. The detection threshold is
determined by the expected distribution of the divided sample spectrum
, which is worked out based on the approximate pdf of
eq. (3.10).
Note
that, unlike the case of a standard search in the presence of
simple white noise,
the detection threshold depends on the trial frequency (cf.
eq.
). By definition the threshold is given by the set of
values
that
will not be exceeded by chance at any of the
frequencies
examined,
with a confidence level C. The (small) probability 1-C that
at least one of the
values of
exceeds the detection
threshold is given by
where
is the chance probability of
exceeding the detection threshold
. By solving the equation
for
, the detection threshold is obtained for each j.
Once a peak
above the detection threshold is found, the
corresponding signal power
is obtained from the
prescription
of Groth (1975) and Vaughan et al. (1994) (see below).
The sinusoidal amplitude, A, is
defined by assuming that the signal is given by
, with
the average count rate
and
the phase. A
is then derived by using standard methods; for binned data we have (see
e.g.
Leahy et al. 1983).
where N is the number of bins in the light curve.
In the absence of a positive detection, the threshold
must be
converted
to an upper limit on the (sinusoidal) signal amplitude. To this aim it is
first necessary to calculate the sensitivity of the search, i.e. the
weakest signal
that will produce, with confidence C, a
value of
exceeding the detection threshold
.
This is done by using the prescription of Groth (1975) and Vaughan et al.
(1994) to calculate the probability distribution of the total power
resulting
from the signal and the (white) noise. Their procedure is adapted
to our case by using
in place of their detection threshold
for the
power.
Strictly speaking the pdf of the noise in the divided spectrum
substantially exceeds the pdf of a simple
distribution for high
values (see Fig. 3.4). However, the difference is only minor for
values of a few (2-3) standard deviations above the average. The pdf
derived by Groth (1975) is therefore expected to apply to the noise in the
divided spectrum, for any reasonable value
of the confidence level C.
needs to be worked out for each frequency j, because of
the j-dependence of
. For the same reason it is not possible to
derive an upper limit which applies to all j's, based on the highest
observed value of
as described by van der Klis (1988)
and
Vaughan et al. (1994).
The set of
obtained in this way are then converted to
the corresponding powers through
and then to the sinusoidal signal amplitude
by using eq. (3.13).
A similar procedure is used to derive confidence intervals
on the power
, and therefore the amplitude A,
of a detected signal (
is used
in place of
in this case).
\2box16.68 Summary -- This newly developed power spectrum analysis technique for the detection of periodic signals in the presence of ``colored" noise components presents the following main advantages: (i) it does not require any reduction of the Fourier frequency resolution; (ii) it can be used also in the presence of the relatively steep red noise components (power law slopes as low as -2) which are commonly found in nature; (iii) it takes into account the statistical uncertainties in the estimator of the continuum power spectrum components. Extensive numerical simulations carried out in order to test the reliability of the technique and define the range of applicability of the adopted approximations, proved that very good results are obtained if the first and the last 5-6 Fourier frequencies of the sample spectrum are excluded from the analysis and the width of the smoothing is larger than 30-40 Fourier frequencies. Potentially significant peaks are selected by using a preliminary detection threshold based on relatively simple statistics (this is not CPU-intensive). The significance of candidate peaks is then reassessed on the basis of a complete application of the technique. Computer programs based on the new technique will be made available to the community through the timing analysis package Xronos (Stella & Angelini 1992a,b).