Condition Monitoring and Prognostics based on Vibration Signals in MATLAB
Introduction
本博客中的示例来自MATLAB Predictive Maintenance Toolbox,Condition Monitoring and Prognostics Using Vibration Signals1。该示例使用振动信号的频域特征(mean peak frequency)进行轴承的故障诊断和预测,使用的模型是一个时间序列二阶模型,本质上也是拟合参数的问题,即不断使用新数据进行参数估计,估计完模型参数后进行预测。但实际上,这个例子更像是系统辨识的例子:
This example uses functionality from Signal Processing Toolbox™ and System Identification Toolbox™, and does not require Predictive Maintenance Toolbox™.1
另外需要强调的是,该示例使用的训练数据也是从健康状态到故障状态全过程的数据。
Work flow
Data visualization
pdmBearingConditionMonitoringData.mat
文件中的数据内容:
1
2
3
4
Name Size Bytes Class Attributes
data 600x1 96062400 cell
defectDepthVec 600x1 4800 double
expTime 600x1 4800 double
其中,
- 变量
data
中包含着多个不同health conditions的震动信号片段,每个元素对应1秒的数据,采样率为20kHz; - 变量
defectDepthVec
为相应的故障程度,为逐渐增大的数值; - 变量
expTime
为对应的时间,单位是minutes。
绘制故障程度随时间的变化曲线:
1
2
3
plot(expTime, defectDepthVec);
xlabel('Time (min)');
ylabel('Defect depth (m)');
绘制健康状态的震动信号(第一条信号)和故障状态的震动信号(最后一条信号):
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
time = linspace(0, 1, fs)';
% healthy bearing signal
subplot(2, 1, 1);
plot(time,data{1});
xlabel('Time(s)');
ylabel('Acceleration(m/s^2)');
legend('Healthy bearing signal');
% faulty bearing signal
subplot(2, 1, 2);
plot(time,data{end});
xlabel('Time(s)');
ylabel('Acceleration(m/s^2)');
legend('Faulty bearing signal');
Feature extraction
用于轴承诊断的典型特征包括时域特征(方均根值、峰值、信号峭度等等)或者频域特征(峰频率,平均频率等等)。
在数据探索阶段,可以绘制震动信号的谱,看一看震动信号在时域、频域或者时频域的特点,这有利于探索能够反映故障发展的特征。
首先,计算健康数据的频谱,窗口长度为500,重叠度为90%(即450各数据点),并且设置FFT的点数为512。
1
[~, fvec, tvec ,P0] = spectrogram(data{1}, 500, 450, 512, fs);
其中,P0
为频谱,fvec
为频率向量,tvec
为时间向量。
绘制轴承健康状态下的震动信号的频谱图:
1
2
3
4
5
6
7
clf;
imagesc(tvec, fvec, P0)
xlabel('Time(s)');
ylabel('Frequency(Hz)');
title('Healthy Bearing Signal Spectrogram');
colorbar
axis xy
之后,计算并绘制轴承故障状态下震动信号的频谱图:
1
2
3
4
5
6
7
[~, fvec, tvec, Pfinal] = spectrogram(data{end}, 500, 450, 512, fs);
imagesc(tvec, fvec, Pfinal)
xlabel('Time(s)');
ylabel('Frequency(Hz)');
title('Faulty Bearing Signal Spectrogram');
colorbar
axis xy
可以看到,此时的信号能量集中的更高的频率。
因此,在健康状态下和故障状态下震动信号的频谱是不同的,可以用于故障诊断。在本例中,从频谱中提取mean peak frequencies作为health indicator。
设频谱为$P(t,\omega)$,peak frequency可以表示为:
\[PeakFreq(t)=\arg\max_\omega P(t,\omega)\notag\]则mean peak frequency为peak frequency的均值:
\[meanPeakFreq=\dfrac1T\int_0^TPeakFreq(t)\mathrm{d}t\notag\]计算健康震动信号的mean peak frequency:
1
2
[~, I0] = max(P0); % Find out where the peak frequencies are located.
meanPeakFreq0 = mean(fvec(I0)) % Calculate mean peak frequency.
1
2
meanPeakFreq0 =
666.4602
之后计算故障震动信号的mean peak frequency:
1
2
[~, Ifinal] = max(Pfinal);
meanPeakFreqFinal = mean(fvec(Ifinal))
1
2
meanPeakFreqFinal =
2.8068e+03
可以看到mean peak frequency从大概650Hz增加到了将近3kHz。
之后,可以再探索一下中间阶段震动信号的mean peak frequency:
1
2
3
4
5
6
7
[~, fvec, tvec, Pmiddle] = spectrogram(data{end/2}, 500, 450, 512, fs);
imagesc(tvec, fvec, Pmiddle)
xlabel('Time(s)');
ylabel('Frequency(Hz)');
title('Bearing Signal Spectrogram');
colorbar
axis xy
可以看到,图像中布满了高频噪声。这个现象是由原始震动和由小缺陷引起的震动所共同引起的。为了计算mean peak frequency,需要将数据进行滤波处理,移除这些高频噪声,再次进行可视化:
1
2
3
4
5
6
7
8
9
dataMiddleFilt = medfilt1(data{end/2}, 3);
[~, fvec, tvec, Pmiddle] = spectrogram(dataMiddleFilt, 500, 450, 512, fs);
imagesc(tvec, fvec, Pmiddle)
xlabel('Time(s)');
ylabel('Frequency(Hz)');
title('Filtered Bearing Signal Spectrogram');
colorbar
axis xy
之后,对所有的震动信号进行相同的特征提取步骤:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
% Define a progress bar.
h = waitbar(0, 'Start to extract features');
% Initialize a vector to store the extracted mean peak frequencies.
meanPeakFreq = zeros(numSamples,1);
for k = 1:numSamples
% Get most up-to-date data.
curData = data{k};
% Apply median filter.
curDataFilt = medfilt1(curData,3);
% Calculate spectrogram.
[~,fvec,tvec,P_k] = spectrogram(curDataFilt,500,450,512,fs);
% Calculate peak frequency at each time instance.
[~,I] = max(P_k);
meanPeakFreq(k) = mean(fvec(I));
% Show progress bar indicating how many samples have been processed.
waitbar(k/numSamples,h,'Extracting features');
end
close(h);
并绘制mean peak frequencies随时间变化的趋势:
1
2
3
4
plot(expTime, meanPeakFreq);
xlabel('Time(min)');
ylabel('Mean peak frequency (Hz)');
grid on;
Condition monitoring and prognostics
把mean peak frequency当作time series,我们可以估计一个关于mean peak frequency的时间序列模型,使用这个模型预测未来值。使用前200个mean peak frequency值创建一个初始时间序列模型,之后一旦有有了10个新值,就使用后100个值更新时间序列模型。这种更新时间序列模型的batch mode可以捕捉instantaneous trends。更新的时间序列模型用于预测后面10个步长的数据。
1
2
3
4
tStart = 200; % Start Time
timeSeg = 100; % Length of data for building dynamic model
forecastLen = 10; % Define forecast time horizon
batchSize = 10; % Define batch size for updating the dynamic model
为了进行预测性维修,我们需要一个阈值来决定何时停掉机器。在这个例子中,我们基于仿真得到的轴承健康情况和故障情况的统计数据(保存在pdmBearingConditionMonitoringStatistics.mat
文件中)得到这个阈值。
pdmBearingConditionMonitoringStatistics.mat
文件的数据内容如下所示:
1
2
3
4
5
6
7
Name Size Bytes Class Attributes
dataFaulty 100x1 800 double
dataNormal 100x1 800 double
pFaulty 3001x1 24008 double
pFreq 3001x1 24008 double
pNormal 3001x1 24008 double
绘制健康轴承和故障轴承震动信号的mean peak frequency的概率分布:
1
2
3
4
plot(pFreq, pNormal, 'g--', pFreq, pFaulty, 'r', LineWidth=1.5);
xlabel('Mean peak frequency(Hz)');
ylabel('Probability Distribution');
legend('Normal Bearing','Faulty Bearing');
基于图像,我们将mean peak frequency的阈值设置为2000 Hz,以分辨轴承的正常状态和故障状态,同时最大化轴承的使用。
1
threshold = 2000;
之后,计算采样间隔,并将其单位转换为秒(原本的单位是分钟)。
1
samplingTime = 60*(expTime(2)-expTime(1)); % unit: seconds
之后使用iddata
函数将meanPeakFreq
前tStart
(200个)数据转换为iddata
类型的变量tsFeature
:
1
tsFeature = iddata(meanPeakFreq(1:tStart), [], samplingTime);
iddata
函数2将数据转换为可以在时域和频域内进行system identification(系统辨识)的数据类型。
之后绘制出这200个initial mean peak frequency data:
1
plot(tsFeature.y)
图像表明,初始数据大致是常数加上一定的噪声。这是预期内的,因为在初始时刻轴承是健康的,mean peak frequency不会大幅度地变化。
然后,使用前200个数据去拟合一个second-order state-space model:
1
past_sys = ssest(tsFeature, 2, 'Ts', samplingTime, 'Form', 'canonical')
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
past_sys =
Discrete-time identified state-space model:
x(t+Ts) = A x(t) + K e(t)
y(t) = C x(t) + e(t)
A =
x1 x2
x1 0 1
x2 0.9605 0.03942
C =
x1 x2
y1 1 0
K =
y1
x1 -0.003899
x2 0.003656
Sample time: 600 seconds
Parameterization:
CANONICAL form with indices: 2.
Disturbance component: estimate
Number of free coefficients: 4
Use "idssdata", "getpvec", "getcov" for parameters and their uncertainties.
Status:
Estimated using SSEST on time domain data "tsFeature".
Fit to estimation data: 0.2763% (prediction focus)
FPE: 640, MSE: 602.7
ssest
函数的官方文档见参考3。
可以看到,这个initial estimated dynamic model的拟合效果不好。拟合效果的评判指标是normalized root mean square error (NRMSE)(即上面的Fit to estimation data: 0.2763%(prediction focus)
,计算公式为:
这个数值越大越好。而初始的前200个数据点大致是常数和噪声的叠加,此时$x_{pred}\approx mean(x_{true})$,此时NRMSE接近于0。
为了验证这个模型,绘制autocorrelation of residuals:
1
resid(tsFeature, past_sys)
图像表明,残差是不相关的,得到的模型是有效的。
之后,使用该模型past_sys
预测未来10个点并计算标准差:
1
[yF, ~, ~, yFSD] = forecast(past_sys, tsFeature, forecastLen);
并绘制图像(过去值,预测值,故障阈值,95%置信区间):
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
tHistory = expTime(1:tStart); % The initial 200 time stamps
forecastTimeIdx = (tStart+1):(tStart+forecastLen);
tForecast = expTime(forecastTimeIdx);% 200~210 time stamps
% Plot historical data, forecast value and 95% confidence interval.
plot(tHistory, meanPeakFreq(1:tStart), 'b',...
tForecast, yF.OutputData, 'kx',...
[tHistory; tForecast], threshold*ones(1,length(tHistory)+forecastLen), 'r--',...
tForecast, yF.OutputData+1.96*yFSD, 'g--',...
tForecast, yF.OutputData-1.96*yFSD, 'g--');
ylim([400, 1.1*threshold]);
ylabel('Mean peak frequency (Hz)');
xlabel('Time (min)');
legend({'Past Data', 'Forecast', 'Failure Threshold', '95% C.I'},...
'Location', 'northoutside', 'Orientation', 'horizontal');
grid on
注:这里在计算95%置信区间时使用的方式是直接乘上1.96,这是查正态分布表得到的。这里默认的区间估计的对象是什么?具体是如何查找的?目前还不是很清楚。
之后,随着出现新的观测点(10个)而更新模型参数,并且重新估计预测值,并且创建一个alarm,如果信号或者预测值超过了故障阈值(2000 Hz)就报警:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
figure
ax = subplot(1, 1, 1);
for tCur = tStart:batchSize:numSamples % batchSize = 10;
% latest features into iddata object.
tsFeature = iddata(meanPeakFreq((tCur-timeSeg+1):tCur), [], samplingTime);
% Update system parameters when new data comes in. Use previous model
% parameters as initial guesses.
sys = ssest(tsFeature, past_sys);
past_sys = sys;
% Forecast the output of the updated state-space model. Also compute
% the standard deviation of the forecasted output.
[yF, ~, ~, yFSD] = forecast(sys, tsFeature, forecastLen);
% Find the time corresponding to historical data and forecasted values.
tHistory = expTime(1:tCur);
forecastTimeIdx = (tCur+1):(tCur+forecastLen);
tForecast = expTime(forecastTimeIdx);
cla(ax)
hold(ax, 'on')
% Plot historical data, forecasted mean peak frequency value and 95%
% confidence interval.
plot(ax, tHistory, meanPeakFreq(1:tCur),'b',...
tForecast, yF.OutputData,'kx',...
[tHistory; tForecast], threshold*ones(1,length(tHistory)+forecastLen), 'r--',...
tForecast, yF.OutputData+1.96*yFSD,'g--',...
tForecast, yF.OutputData-1.96*yFSD,'g--');
ylim([400, 1.1*threshold]);
ylabel('Mean peak frequency (Hz)');
xlabel('Time (min)');
legend({'Past Data', 'Forecast', 'Failure Threshold', '95% C.I'},...
'Location','northoutside','Orientation','horizontal');
grid on;
% Display an alarm when actual monitored variables or forecasted values exceed
% failure threshold.
if(any(meanPeakFreq(tCur-batchSize+1:tCur)>threshold))
disp('Monitored variable exceeds failure threshold');
break;
elseif(any(yF.y>threshold))
% Estimate the time when the system will reach failure threshold.
tAlarm = tForecast(find(yF.y>threshold,1));
disp(['Estimated to hit failure threshold in ' num2str(tAlarm-tHistory(end)) ' minutes from now.']);
break;
end
% pause(0.1)
end
此时的时间序列模型:
1
sys
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
sys =
Discrete-time identified state-space model:
x(t+Ts) = A x(t) + K e(t)
y(t) = C x(t) + e(t)
A =
x1 x2
x1 0 1
x2 0.2624 0.746
C =
x1 x2
y1 1 0
K =
y1
x1 0.3902
x2 0.3002
Sample time: 600 seconds
Parameterization:
CANONICAL form with indices: 2.
Disturbance component: estimate
Number of free coefficients: 4
Use "idssdata", "getpvec", "getcov" for parameters and their uncertainties.
Status:
Estimated using SSEST on time domain data "tsFeature".
Fit to estimation data: 92.53% (prediction focus)
FPE: 499.3, MSE: 442.7
可以看到,模型的预测拟合优度达到92.53%,并且正确地捕捉了趋势。用训练集训练出的最终的这个模型,就可以进行实际的预测。
In closing
在这个例子中,时间序列预测模型实际上就是一个second-order state-space model,其中的核心步骤是根据数据使用ssest
函数拟合出这个动态模型的参数,实际上就是拟合出一个二阶函数。
References