An Example of Extracting Frequency-domain and Time-domain Statistical Features

Oct. 24, 2022

Introduction

博客Generate Synthetic Signals Using Conditional GAN in MATLAB - What a starry night~中分析了一个在MATLAB中使用CGAN生成pump振动信号的示例。该示例中包含了一段提取振动信号时域特征和频域特征的代码,本文以其中一条振动信号为例详细分析这段代码的功能。


Time-domain Profile

首先,绘制振动信号的时域波形:

1
2
3
4
5
6
7
8
load simulatedDataset.mat

flow1 = flow(:, 1);
flow1 = flow1-mean(flow1);
plot(flow1)
set(gca, "XLim", [0, numel(flow1)-1])
xlabel("Observations")
ylabel("Signal")

image-20221024131228983

注意,在提取频域特征之前,需要信号的偏移,使信号在零值附近波动,上图展示了这一点。


Spectrum , pspectrum Function

之后,提取信号的频域特征,首先需要得到信号的频谱,这里使用的是pspectrum函数:

pspectrum函数官方文档:pspectrum - MathWorks.

1
[flowSpectrum, flowFrequencies] = pspectrum(flow1, 1000, 'FrequencyLimits', [2 250]);

其中:

  • 输入的1000是信号采样率;FrequencyLimits表示Frequency band limits;
  • 输出的flowSpectrum表示信号的功率谱,flowFrequencies表示功率谱所对应的频率;

可以简单看一下频谱的图像:

1
plot(flowFrequencies, flowSpectrum)

image-20221024133757786

但其实,pspectrum函数本身也可以进行绘图:

1
pspectrum(flow1, 1000, 'FrequencyLimits', [2 250])

image-20221024135228834

但是可以看到两者的差别。

之所以差别如此巨大,是因为[flowSpectrum, flowFrequencies] = pspectrum(flow1, 1000, 'FrequencyLimits', [2 250]);所返回的flowSpectrum并不是以dB为单位的,如果想要得到和pspectrum绘制一致的图像,需要使用pow2db函数将功率转换为dB的单位:

1
2
3
4
5
6
7
8
9
10
11
figure('Units', 'pixels', 'Position', [391,284,1105,420])
tiledlayout(1, 2)
nexttile
pspectrum(flow1, 1000, 'FrequencyLimits', [2 250])
nexttile
plot(flowFrequencies, pow2db(flowSpectrum))
set(gca, "XLim", [0, 250])
grid(gca, "on")
box(gca, "on")
xlabel("Frequency")
ylabel("Power Specturm")

image-20221024135406917

两者就是一致的了。但是本示例在提取频域特征的时候,还是使用的原始的单位,并不使用dB单位,即基于该图像进行提取:

image-20221024133757786


Extract Time-domain and Frequency-domain Features

在得到功率谱flowSpectrum和对应的频率flowFrequencies就可以进行特征提取了:

1
ci = extractCI(flow1, flowSpectrum, flowFrequencies);

使用的函数为extractCI函数:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
function ci = extractCI(flow, flowP, flowF)
% Compute signal statistical characteristics.

% Find frequency of peak magnitude in power spectrum
pMax = max(flowP);
fPeak = min(flowF(flowP==pMax));

% Compute power in low-frequency range 10 Hz-20 Hz
fRange = flowF >= 10 & flowF <= 20;
pLow = sum(flowP(fRange));

% Compute power in mid-frequency range 40 Hz-60 Hz
fRange = flowF >= 40 & flowF <= 60;
pMid = sum(flowP(fRange));

% Compute power in high-frequency range >100 Hz
fRange = flowF >= 100;
pHigh = sum(flowP(fRange));

% Find frequency of spectral kurtosis peak
[pKur,fKur] = pkurtosis(flow, 1000);
pKur = fKur(pKur == max(pKur));
pKurtosis =  pKur(1);

% Compute flow cumulative sum range
csFlow = cumsum(flow);
csFlowRange = max(csFlow)-min(csFlow);
qCSRange = csFlowRange;

% Collect features and feature values in cell array.
qMean = mean(flow);
qVar = var(flow);
qSkewness = skewness(flow);
qKurtosis = kurtosis(flow);
qPeak2Peak = peak2peak(flow);
qCrest = peak2rms(flow);
qRMS = rms(flow);
qMAD = mad(flow);

ci = [fPeak, pLow, pMid, pHigh, pKurtosis, ...
    qMean, qVar, qSkewness, qKurtosis, ...
    qPeak2Peak, qCrest, qRMS, qMAD, qCSRange];
end

该函数的输入输出很明晰,不再赘述。

(1) Frequency of peak magnitude in power spectrum

代码片段:

1
2
3
% Find frequency of peak magnitude in power spectrum
pMax = max(flowP);
fPeak = min(flowF(flowP==pMax));

提取频谱中的最大值点的横纵坐标(即频率和幅值):

1
2
3
4
5
6
figure
hold(gca, "on")
grid(gca, "on")
box(gca, "on")
plot(flowF, flowP)
scatter(fPeak, pMax, "filled")

image-20221024141038963

但是也可以看到,最后只将频率fPeak作为特征之一。

(2~4) Power in Low-frequency, mid-frequency, and high-frequency

代码片段:

1
2
3
4
5
6
7
8
9
% Compute power in low-frequency range 10 Hz-20 Hz
fRange = flowF >= 10 & flowF <= 20;
pLow = sum(flowP(fRange));
% Compute power in mid-frequency range 40 Hz-60 Hz
fRange = flowF >= 40 & flowF <= 60;
pMid = sum(flowP(fRange));
% Compute power in high-frequency range >100 Hz
fRange = flowF >= 100;
pHigh = sum(flowP(fRange));

计算低频(10-20 Hz)、中频(40-60 Hz)、高频(>100 Hz)频谱能量和。

(5) Frequency of spectral kurtosis peak, pkurtosis Function

代码片段:

1
2
3
4
% Find frequency of spectral kurtosis peak
[pKur, fKur] = pkurtosis(flow, 1000);
pKur = fKur(pKur == max(pKur));
pKurtosis =  pKur(1);

提取谱峰度的峰频率,使用了MATLAB的pkurtosis函数。

pkurtosis函数官方文档:pkurtosis - MathWorks.

其中:

  • 输入中的1000,为采样率;

  • pKur为谱峰度(Spectral kurtosis, SK),使用的是短时傅里叶变换(short-time Fourier transform, STFT)进行计算:

    image-20221024142817121

  • fKur为frequencies for pKur

之后,将fKur作为特征之一。

(6) Flow cumulative sum range

之后,就是计算信号的时域特征了。首先是信号的累计和范围(cumulative sum range):

1
2
3
4
% Compute flow cumulative sum range
csFlow = cumsum(flow);
csFlowRange = max(csFlow)-min(csFlow);
qCSRange = csFlowRange;

(7-14) Other statistical time-domain features

1
2
3
4
5
6
7
8
qMean = mean(flow);
qVar = var(flow);
qSkewness = skewness(flow);
qKurtosis = kurtosis(flow);
qPeak2Peak = peak2peak(flow);
qCrest = peak2rms(flow);
qRMS = rms(flow);
qMAD = mad(flow);

其中:

  • 1
    2
    3
    4
    
    qMean = mean(flow);
    qVar = var(flow);
    qSkewness = skewness(flow);
    qKurtosis = kurtosis(flow);
    

    计算的是信号的均值、方差、偏度、峰度;

  • 1
    
    qPeak2Peak = peak2peak(flow);
    

    计算的是信号的峰峰值,其实就是Maximum-to-minimum difference;

  • 1
    
    qCrest = peak2rms(flow);
    

    计算的是Peak-magnitude-to-RMS ratio:

    image-20221024144235942

  • 1
    
    qRMS = rms(flow);
    

    计算的是信号的均方根值(Root-mean-square value):

    image-20221024144437125

  • 1
    
    qMAD = mad(flow);
    

    计算的是平均绝对偏差(mean absolute deviation)。