Correct baseline of signal with peaks


Yout = msbackadj(X, Intensities)
Yout = msbackadj(X, Intensities, ...'WindowSize', WindowSizeValue, ...)
Yout = msbackadj(X, Intensities, ...'StepSize', StepSizeValue, ...)
Yout = msbackadj(X, Intensities, ...'RegressionMethod', RegressionMethodValue, ...)
Yout = msbackadj(X, Intensities, ...'EstimationMethod', EstimationMethodValue, ...)
Yout = msbackadj(X, Intensities, ...'SmoothMethod', SmoothMethodValue, ...)
Yout = msbackadj(X, Intensities, ...'QuantileValue', QuantileValueValue, ...)
Yout = msbackadj(X, Intensities, ...'PreserveHeights', PreserveHeightsValue, ...)
Yout = msbackadj(X, Intensities, ...'ShowPlot', ShowPlotValue, ...)


X: Vector of separation-unit values for a set of signals with peaks. The number of elements in the vector equals the number of rows in the matrix Intensities. The separation unit can quantify wavelength, frequency, distance, time, or m/z depending on the instrument that generates the signal data.
Intensities: Matrix of intensity values for a set of peaks that share the same separation-unit range. Each row corresponds to a separation-unit value, and each column corresponds to either a set of signals with peaks or a retention time. The number of rows equals the number of elements in vector X.



Use the following syntaxes with data from any separation technique that produces signal data, such as spectroscopy, NMR, electrophoresis, chromatography, or mass spectrometry.

Yout = msbackadj(X, Intensities) adjusts the variable baseline of a raw signal with peaks by performing the following steps:

  1. Estimates the baseline within multiple shifted windows of width 200 separation units

  2. Regresses the varying baseline to the window points using a piecewise cubic (spline-type) approximation; the default regression method is 'pchip'

  3. Adjusts the baseline of the peak signals supplied by Intensities
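The three steps above can be sketched in plain MATLAB. This is only an illustration of the idea under stated assumptions, not the toolbox implementation: the synthetic data, the variable names, the simple sort-based quantile, and the use of interp1 for the regression step are all choices made for this sketch.

```matlab
% Synthetic signal (made-up data): slowly varying baseline plus two peaks
x = (0:1:4000)';
baselineTrue = 50 + 30*exp(-x/1500);
peaks = 200*exp(-((x-1000)/15).^2) + 120*exp(-((x-2800)/25).^2);
y = baselineTrue + peaks;

windowSize = 200;   % window width in separation units (documented default)
stepSize   = 200;   % distance between windows (documented default)
q          = 0.10;  % quantile taken as the baseline estimate in each window

% Step 1: estimate one baseline point per window
centers = (x(1) + windowSize/2 : stepSize : x(end))';
est = zeros(size(centers));
for k = 1:numel(centers)
    inWin = x >= centers(k) - windowSize/2 & x <= centers(k) + windowSize/2;
    v = sort(y(inWin));
    est(k) = v(max(1, ceil(q*numel(v))));   % simple empirical quantile
end

% Step 2: regress the window estimates onto every x value
baselineEst = interp1(centers, est, x, 'pchip', 'extrap');

% Step 3: subtract the estimated baseline from the signal
yAdj = y - baselineEst;
```

Plotting y, baselineEst, and yAdj against x shows how the low quantile tracks the background while largely ignoring the peaks.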

Yout = msbackadj(X, Intensities, ...'PropertyName', PropertyValue, ...) calls msbackadj with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:

Yout = msbackadj(X, Intensities, ...'WindowSize', WindowSizeValue, ...) specifies the width of the shifting window. The default value is 200 (a baseline point is estimated for windows with a width of 200 separation units). WindowSizeValue can also be a function handle. The function is evaluated at the respective X values and returns a variable width for the windows. This option is useful when the resolution of the signal differs across regions.


The result of this algorithm depends on carefully choosing the window size and the step size. Consider the width of your peaks in the signal and the presence of possible drifts. If you have wider peaks toward the end of the signal, you may want to use variable parameters.

Yout = msbackadj(X, Intensities, ...'StepSize', StepSizeValue, ...) specifies the step size for the shifting window, that is, the distance between adjacent windows. The default value is 200 separation units (a baseline point is estimated for windows placed every 200 separation units). StepSizeValue can also be a function handle. The function is evaluated at the respective separation-unit values and returns the distance between adjacent windows.
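As a hedged illustration of variable parameters, the following sketch passes function handles for both 'WindowSize' and 'StepSize'. The synthetic signal, the handle names winFcn and stepFcn, and their coefficients are assumptions chosen for this example; suitable values depend on the peak widths in your own data.

```matlab
% Synthetic data (made up for illustration): peaks widen toward larger x
x = (1:10000)';
y = 20 + 10*exp(-x/3000) ...
    + 100*exp(-((x-2000)/10).^2) + 80*exp(-((x-8000)/60).^2);

% Window width and step both grow with x; the coefficients are arbitrary
winFcn  = @(x) 200 + 0.05 .* x;
stepFcn = @(x) 100 + 0.02 .* x;

% Requires Bioinformatics Toolbox
yAdj = msbackadj(x, y, 'WindowSize', winFcn, 'StepSize', stepFcn);
```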

Yout = msbackadj(X, Intensities, ...'RegressionMethod', RegressionMethodValue, ...) specifies the method used to regress the window-estimated points to a smooth curve. Enter 'pchip' (shape-preserving piecewise cubic interpolation), 'linear' (linear interpolation), or 'spline' (spline interpolation). The default value is 'pchip'.
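To get a feel for how the three regression choices differ, the sketch below applies the same-named interp1 methods to a handful of hypothetical window estimates. It assumes that interp1 is a reasonable stand-in for the regression step; the values in xc and est are invented for illustration.

```matlab
% Hypothetical window centers and baseline estimates
xc  = [100 300 500 700 900]';
est = [12 11 30 10 9]';      % third point is an outlier-like bump
xq  = (100:10:900)';         % points at which the baseline is evaluated

bPchip  = interp1(xc, est, xq, 'pchip');    % shape-preserving piecewise cubic
bLinear = interp1(xc, est, xq, 'linear');   % straight segments between points
bSpline = interp1(xc, est, xq, 'spline');   % cubic spline, may overshoot

plot(xq, [bPchip bLinear bSpline], xc, est, 'ko');
legend('pchip', 'linear', 'spline', 'window estimates');
```

All three curves pass through the window estimates. The 'pchip' and 'linear' curves stay within the range of the estimates, whereas a cubic spline can overshoot around an abrupt bump.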

Yout = msbackadj(X, Intensities, ...'EstimationMethod', EstimationMethodValue, ...) specifies the method for finding the likely baseline value in every window. Enter 'quantile' (uses a quantile of the window samples; the quantile value is 10% unless changed with 'QuantileValue') or 'em' (assumes a doubly stochastic model). With 'em', every sample is an independent and identically distributed (i.i.d.) draw from one of two normally distributed classes (background or peaks). Because the class labels are hidden, the distributions are estimated with an Expectation-Maximization algorithm. The final baseline value is the mean of the background class. The default value is 'quantile'.
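The 'em' idea can be sketched as a two-class Gaussian mixture fitted to the samples of a single window. This is a minimal illustration, not the toolbox code: the sample values, the initialization, the fixed iteration count, and the helper gaussPdf are all assumptions of this sketch.

```matlab
% Made-up samples from one window: mostly background, a few peak samples
rng(0);
v = [10 + randn(180,1); 60 + 5*randn(20,1)];

% Initial guesses: class 1 (background) starts low, class 2 (peaks) starts high
mu    = [min(v); max(v)];
sigma = [std(v); std(v)];
w     = [0.5; 0.5];

% Hypothetical helper: normal density evaluated at t
gaussPdf = @(t, m, s) exp(-0.5*((t - m)./s).^2) ./ (s*sqrt(2*pi));

for iter = 1:100
    % E-step: posterior probability of each class for every sample
    p1 = w(1) * gaussPdf(v, mu(1), sigma(1));
    p2 = w(2) * gaussPdf(v, mu(2), sigma(2));
    r1 = p1 ./ (p1 + p2 + realmin);   % realmin guards against 0/0
    r2 = 1 - r1;

    % M-step: update class weights, means, and standard deviations
    w     = [mean(r1); mean(r2)];
    mu    = [sum(r1.*v)/sum(r1); sum(r2.*v)/sum(r2)];
    sigma = [sqrt(sum(r1.*(v - mu(1)).^2)/sum(r1)); ...
             sqrt(sum(r2.*(v - mu(2)).^2)/sum(r2))];
end

baselineValue = mu(1);   % mean of the background class
```

Because the class labels are never observed, the posterior probabilities r1 and r2 act as soft labels that are refined on every iteration.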

Yout = msbackadj(X, Intensities, ...'SmoothMethod', SmoothMethodValue, ...) specifies the method for smoothing the curve of estimated points and eliminating the effects of possible outliers. Enter 'none', 'lowess' (linear fit), 'loess' (quadratic fit), 'rlowess' (robust linear fit), or 'rloess' (robust quadratic fit). The default value is 'none'.

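Yout = msbackadj(X, Intensities, ...'QuantileValue', QuantileValueValue, ...) specifies the quantile value used to estimate the baseline in each window when the estimation method is 'quantile'. Specify it as a fraction between 0 and 1. The default value is 0.10.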
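Several properties can be combined in one call. The sketch below, run on made-up noisy data, raises the quantile to 0.20 and smooths the window estimates with 'rloess'. The specific values are arbitrary and serve only to show the calling pattern.

```matlab
% Synthetic noisy signal (made up): linear drift, noise, and one peak
rng(1);
x = (1:6000)';
y = 25 + 0.002*x + 2*randn(size(x)) + 150*exp(-((x-3000)/30).^2);

% Requires Bioinformatics Toolbox
yAdj = msbackadj(x, y, ...
    'EstimationMethod', 'quantile', ...
    'QuantileValue',    0.20, ...
    'SmoothMethod',     'rloess');
```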

Yout = msbackadj(X, Intensities, ...'PreserveHeights', PreserveHeightsValue, ...), when PreserveHeightsValue is true, sets the baseline subtraction mode to preserve the height of the tallest peak in the signal. The default value is false and peak heights are not preserved.
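One way to see the effect of 'PreserveHeights' is to compare the tallest peak before and after adjustment. The data below are synthetic and the variable names are chosen for this sketch; no particular output values are implied.

```matlab
% Synthetic signal (made up): constant offset plus two peaks
x = (1:5000)';
y = 30 + 100*exp(-((x-1500)/20).^2) + 60*exp(-((x-3500)/20).^2);

% Requires Bioinformatics Toolbox
yPlain = msbackadj(x, y);                            % default: heights not preserved
yKeep  = msbackadj(x, y, 'PreserveHeights', true);   % tallest peak height preserved

fprintf('Tallest peak: raw %.1f, default %.1f, preserved %.1f\n', ...
    max(y), max(yPlain), max(yKeep));
```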

Yout = msbackadj(X, Intensities, ...'ShowPlot', ShowPlotValue, ...) plots the baseline-estimated points, the regressed baseline, and the original signal. When you call msbackadj without output arguments, the signal is plotted unless ShowPlotValue is false. When ShowPlotValue is true, only the first signal in Intensities is plotted. ShowPlotValue can also contain an index to one of the signals in Intensities.


  1. Load a MAT-file, included with the Bioinformatics Toolbox™ software, that contains some sample data.

     load sample_lo_res

  2. Adjust the baseline for a group of spectra and show only the third spectrum and its estimated background.

     YB = msbackadj(MZ_lo_res,Y_lo_res,'SHOWPLOT',3);

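  3. Plot the estimated baseline for the fourth spectrum in Y_lo_res using an anonymous function to describe an m/z-dependent parameter.

     wf = @(mz) 200 + .001 .* mz;
     % Pass wf as the step size; with no output argument, the result is plotted
     msbackadj(MZ_lo_res,Y_lo_res(:,4),'STEPSIZE',wf);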

Introduced before R2006a