mzapy.peaks

This module defines functions for performing basic signal processing and peak finding.

1D Peak Fitting

Two functions are provided for peak fitting on 1-dimensional data: mzapy.peaks.find_peaks_1d_localmax, which uses a local maximum method, and mzapy.peaks.find_peaks_1d_gauss, which uses a sequential gaussian fitting method. The local maximum method uses the scipy.signal.find_peaks function internally. The gaussian fitting method performs successive rounds of least-squares fitting of a gaussian function to the data, with each round fitted to the residuals of the previous fit. This process continues until a stopping criterion is reached:

  • peak height is lower than a specified absolute or relative threshold

  • a maximum number of peaks have been found
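The fit-and-subtract loop described above can be sketched roughly as follows. This is a minimal illustration, not the mzapy implementation: the names gauss and sequential_gauss_fit, the use of scipy.optimize.curve_fit, fitting over the full x range, the initial guesses and bounds, and combining the two height thresholds by taking the larger one are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import curve_fit


def gauss(x, mean, height, fwhm):
    """Gaussian parameterized by mean, height, and FWHM."""
    return height * np.exp(-4.0 * np.log(2.0) * (x - mean) ** 2 / fwhm ** 2)


def sequential_gauss_fit(x, y, min_rel_height, min_abs_height,
                         fwhm_min, fwhm_max, max_peaks, truncate_resids=True):
    """Hypothetical sketch: fit the tallest remaining peak, subtract it, repeat."""
    x = np.asarray(x, dtype=float)
    resid = np.asarray(y, dtype=float).copy()
    # assumption: the stricter of the absolute and relative thresholds applies
    min_height = max(min_abs_height, min_rel_height * resid.max())
    bounds = ([x.min(), 0.0, fwhm_min], [x.max(), np.inf, fwhm_max])
    peaks = []
    while len(peaks) < max_peaks:  # stopping criterion: maximum number of peaks
        i_max = int(np.argmax(resid))
        if resid[i_max] < min_height:  # stopping criterion: remaining signal too low
            break
        # initial guess: apex position and height, mid-range width
        p0 = (x[i_max], resid[i_max], 0.5 * (fwhm_min + fwhm_max))
        try:
            popt, _ = curve_fit(gauss, x, resid, p0=p0, bounds=bounds)
        except RuntimeError:  # least-squares fit did not converge
            break
        if popt[1] < min_height:  # stopping criterion: fitted peak too low
            break
        peaks.append(popt)
        resid = resid - gauss(x, *popt)  # the next round fits the residuals
        if truncate_resids:
            resid[resid < 0] = 0.0
    peaks = np.array(peaks).reshape(-1, 3)
    return peaks[:, 0], peaks[:, 1], peaks[:, 2]


# synthetic signal with two well-resolved gaussian peaks
x = np.linspace(0.0, 10.0, 501)
y = gauss(x, 3.0, 10.0, 1.0) + gauss(x, 7.0, 4.0, 1.5)
means, heights, fwhms = sequential_gauss_fit(x, y, 0.05, 0.0, 0.2, 3.0, 5)
```

Peaks are returned in the order they were fitted, so the most intense peak comes first.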

The following image is a demonstration of the sequential gaussian fitting method, performed using dense and sparse spectral data:

Graphs of spectral data displaying successive fitting and subtraction of gaussian peaks

Which of the two fitting functions is best to use depends on the type of signal being fit, and both will most often require some trial and error to determine the best parameters for a given application. For very clean signals containing generally well-formed and well-resolved peaks, both methods will produce roughly equal results. If the peaks have clearly defined apexes but have significant shoulders, tailing, or other deviations from ideal peak shape, the localmax method will generally provide better estimates of the mean x value of the peak. However, if the peaks are very broad or saturated, then the sequential gaussian fit method will provide more reliable estimates of the mean x value of the peak. In cases where peaks have a clear multimodal distribution (up to 2 or 3 peaks that are not completely resolved), the sequential gaussian fit method generally does a better job of deconvoluting the unresolved peaks.

The following image shows some examples of the different outcomes of the two peak fitting methods using dense and sparse spectral data:

graphs of dense and sparse spectral data annotated with fitted peaks using localmax or gaussian methods

Module Reference

Interpolation

mzapy.peaks.lerp_1d(x, y, x_min, x_max, density, threshold_y=True)

Performs linear interpolation of arbitrary x, y data, returning new x and y values between x_min and x_max at a specified density (# points / units of x). Any interpolated values < 0 (e.g., when extrapolating beyond the range of input values) are set to 0.

Parameters:
x : list(float)

input x values

y : list(float)

input y values

x_min : float

minimum x value for interpolation

x_max : float

maximum x value for interpolation

density : float

interpolation density (# points / units of x)

threshold_y : bool, default=True

ensure that no y values go below 0

Returns:
x_interp : numpy.ndarray(float)

interpolated x values

y_interp : numpy.ndarray(float)

interpolated y values
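To make the behavior described for lerp_1d concrete, here is a small NumPy-only sketch of linear interpolation onto a uniform grid with thresholding at 0. It is illustrative only: the name lerp_1d_sketch, the way the number of grid points is derived from density, and the extrapolation from the first and last input segments are assumptions, not details taken from mzapy.

```python
import numpy as np


def lerp_1d_sketch(x, y, x_min, x_max, density, threshold_y=True):
    """Hypothetical sketch of linear interpolation onto a uniform grid."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)  # the segment lookup below needs increasing x
    x, y = x[order], y[order]
    # assumption: the grid includes both endpoints
    n_points = int(round((x_max - x_min) * density)) + 1
    x_interp = np.linspace(x_min, x_max, n_points)
    # index of the input segment used for each new x value; the first and last
    # segments are reused outside the input range (linear extrapolation)
    seg = np.clip(np.searchsorted(x, x_interp) - 1, 0, len(x) - 2)
    slope = (y[seg + 1] - y[seg]) / (x[seg + 1] - x[seg])
    y_interp = y[seg] + slope * (x_interp - x[seg])
    if threshold_y:
        y_interp[y_interp < 0] = 0.0  # no interpolated value may go below 0
    return x_interp, y_interp


# three input points, interpolated from 0 to 4 at 2 points per unit of x
xi, yi = lerp_1d_sketch([1.0, 2.0, 3.0], [1.0, 3.0, 2.0], 0.0, 4.0, 2.0)
xi_raw, yi_raw = lerp_1d_sketch([1.0, 2.0, 3.0], [1.0, 3.0, 2.0], 0.0, 4.0, 2.0,
                                threshold_y=False)
```

In this example the extrapolated value at x = 0 is negative before thresholding, which is the case the threshold_y option is meant to handle.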

mzapy.peaks.lerp_2d(x, y, z, x_min, x_max, x_density, y_min, y_max, y_density, threshold_z=True)

Performs linear interpolation of arbitrary x, y, z data, returning new gridded x, y and z values between x_min, x_max, y_min, y_max at a specified density (# points / units of x, y). Any interpolated values < 0 (e.g., when extrapolating beyond the range of input values) are set to 0.

Parameters:
x : list(float)

input x values

y : list(float)

input y values

z : list(float)

input z values

x_min : float

minimum x value for interpolation

x_max : float

maximum x value for interpolation

x_density : float

x interpolation density (# points / units of x)

y_min : float

minimum y value for interpolation

y_max : float

maximum y value for interpolation

y_density : float

y interpolation density (# points / units of y)

threshold_z : bool, default=True

ensure that no z values go below 0

Returns:
x_interp : numpy.ndarray(float)

interpolated x values

y_interp : numpy.ndarray(float)

interpolated y values

zz_interp : numpy.ndarray(float)

interpolated z values, 2D grid data
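The gridded output of lerp_2d can be pictured with the following sketch, which interpolates scattered x, y, z points onto a regular grid. It is only an approximation of the described behavior: the name lerp_2d_sketch, the use of scipy.interpolate.griddata, the (len(x_interp), len(y_interp)) orientation of the grid, and filling points outside the convex hull of the input with 0 (griddata does not extrapolate) are all assumptions.

```python
import numpy as np
from scipy.interpolate import griddata


def lerp_2d_sketch(x, y, z, x_min, x_max, x_density,
                   y_min, y_max, y_density, threshold_z=True):
    """Hypothetical sketch of linear interpolation onto a regular 2D grid."""
    # assumption: each grid axis includes both endpoints
    n_x = int(round((x_max - x_min) * x_density)) + 1
    n_y = int(round((y_max - y_min) * y_density)) + 1
    x_interp = np.linspace(x_min, x_max, n_x)
    y_interp = np.linspace(y_min, y_max, n_y)
    # assumption: first grid axis follows x, second follows y
    xx, yy = np.meshgrid(x_interp, y_interp, indexing="ij")
    zz_interp = griddata((np.asarray(x, dtype=float), np.asarray(y, dtype=float)),
                         np.asarray(z, dtype=float), (xx, yy),
                         method="linear", fill_value=0.0)
    if threshold_z:
        zz_interp[zz_interp < 0] = 0.0  # no interpolated value may go below 0
    return x_interp, y_interp, zz_interp


# scattered samples of the plane z = x - y (negative wherever y > x)
xs = np.array([0.0, 2.0, 0.0, 2.0, 1.0])
ys = np.array([0.0, 0.0, 2.0, 2.0, 1.0])
zs = xs - ys
gx, gy, gzz = lerp_2d_sketch(xs, ys, zs, 0.5, 1.5, 2.0, 0.5, 1.5, 2.0)
_, _, gzz_raw = lerp_2d_sketch(xs, ys, zs, 0.5, 1.5, 2.0, 0.5, 1.5, 2.0,
                               threshold_z=False)
```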

Peak Finding

mzapy.peaks.find_peaks_1d_localmax(x, y, min_rel_height, min_abs_height, fwhm_min, fwhm_max, min_dist)

Finds peaks in x, y data using the local maximum method (scipy.signal.find_peaks). Requires dense x data that are uniformly spaced and monotonically increasing.

Parameters:
x : np.array(float)

x data

y : np.array(float)

y data

min_rel_height : float

minimum height of peaks (relative to max y)

min_abs_height : float

minimum absolute height

fwhm_min : float

minimum peak width (FWHM)

fwhm_max : float

maximum peak width (FWHM)

min_dist : float

minimum distance (in units of x) between consecutive peaks

Returns:
peak_means : numpy.ndarray(float)

mean x values for peaks

peak_heights : numpy.ndarray(float)

heights for peaks

peak_fwhms : numpy.ndarray(float)

widths (FWHM) for peaks
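As a rough illustration of how these parameters could map onto scipy.signal.find_peaks, consider the sketch below. The mapping is an assumption, not the mzapy source: the name localmax_sketch is hypothetical, the two height thresholds are combined by taking the larger one, and x-unit quantities are converted to samples using the uniform spacing of x. Note also that find_peaks measures width at half of the peak prominence, which only matches the FWHM when the peak sits on a baseline near 0.

```python
import numpy as np
from scipy.signal import find_peaks


def localmax_sketch(x, y, min_rel_height, min_abs_height,
                    fwhm_min, fwhm_max, min_dist):
    """Hypothetical sketch of a local maximum peak finder."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x[1] - x[0]  # x must be uniformly spaced and increasing
    # assumption: the stricter of the absolute and relative thresholds applies
    min_height = max(min_abs_height, min_rel_height * y.max())
    idx, props = find_peaks(
        y,
        height=min_height,
        distance=max(1, int(round(min_dist / dx))),  # x units -> samples
        width=(fwhm_min / dx, fwhm_max / dx),        # x units -> samples
        rel_height=0.5,                              # width at half prominence
    )
    return x[idx], props["peak_heights"], props["widths"] * dx


def _gauss(x, mean, height, fwhm):
    """Gaussian used to build a synthetic test signal."""
    return height * np.exp(-4.0 * np.log(2.0) * (x - mean) ** 2 / fwhm ** 2)


x = np.linspace(0.0, 10.0, 1001)
y = _gauss(x, 3.0, 10.0, 1.0) + _gauss(x, 7.0, 4.0, 1.5)
lm_means, lm_heights, lm_fwhms = localmax_sketch(x, y, 0.05, 0.0, 0.2, 3.0, 0.5)
```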

mzapy.peaks.find_peaks_1d_gauss(x, y, min_rel_height, min_abs_height, fwhm_min, fwhm_max, max_peaks, truncate_y_resids)

Finds peaks in x, y data using a successive gaussian function fit method. Requires dense x data that are uniformly spaced and monotonically increasing. The most intense peak in the data is fitted using a gaussian function, then the residuals are fitted for the next peak. This process is repeated until a stopping criterion is reached:

  • peak height is lower than a specified absolute or relative threshold

  • a maximum number of peaks have been found

Parameters:
x : numpy.ndarray(float)

x data

y : numpy.ndarray(float)

y data

min_rel_height : float

minimum height of peaks (relative to max y)

min_abs_height : float

minimum absolute height

fwhm_min : float

minimum peak width (FWHM)

fwhm_max : float

maximum peak width (FWHM)

max_peaks : int

maximum number of peaks to find

truncate_y_resids : bool

ensure that residuals do not have any intensity values below 0

Returns:
peak_means : numpy.ndarray(float)

mean x values for peaks

peak_heights : numpy.ndarray(float)

heights for peaks

peak_fwhms : numpy.ndarray(float)

widths (FWHM) for peaks

Miscellaneous

mzapy.peaks.calc_gauss_psnr(x, y, peak_params)

Computes an estimate of the peak signal to noise ratio (pSNR) by comparing the peak height of a fitted peak (valid only for gaussian peak fitting method) to the RMS of the residuals from the peak fitting. This method of estimating the pSNR is sensitive to the intensity of the peak relative to background noise AND the goodness of fit of the gaussian function. Any deviation of the signal from the ideal gaussian shape and/or presence of multiple peaks will decrease the pSNR, even if the peak height is much greater than the true magnitude of noise.

Parameters:
x : numpy.ndarray(float)

x data

y : numpy.ndarray(float)

y data

peak_params : tuple(float)

gaussian function fit parameters (mean, height, fwhm) defining the peak

Returns:
psnr : float

estimate of the peak signal to noise ratio
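The estimate described above (fitted peak height divided by the RMS of the fit residuals) can be written out in a few lines. This is a sketch under assumptions: the names gauss and gauss_psnr_sketch are hypothetical, and the residuals are taken over the full x range supplied, which may differ from what mzapy does.

```python
import numpy as np


def gauss(x, mean, height, fwhm):
    """Gaussian parameterized by mean, height, and FWHM."""
    return height * np.exp(-4.0 * np.log(2.0) * (x - mean) ** 2 / fwhm ** 2)


def gauss_psnr_sketch(x, y, peak_params):
    """Hypothetical sketch: fitted peak height over RMS of the fit residuals."""
    mean, height, fwhm = peak_params
    resid = np.asarray(y, dtype=float) - gauss(np.asarray(x, dtype=float),
                                               mean, height, fwhm)
    rms = np.sqrt(np.mean(resid ** 2))
    return height / rms


# ideal gaussian plus an alternating +/-0.5 offset, so the residual RMS is 0.5
x = np.linspace(0.0, 10.0, 200)
offset = 0.5 * np.where(np.arange(x.size) % 2 == 0, 1.0, -1.0)
y = gauss(x, 5.0, 10.0, 1.0) + offset
psnr = gauss_psnr_sketch(x, y, (5.0, 10.0, 1.0))
```

Because the residuals include any misfit as well as noise, a second peak or a tailing peak in y raises the RMS and lowers the result, as noted in the description.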

mzapy.peaks.calc_peak_area(peak_height, peak_fwhm)

Computes the peak area from peak height and FWHM, assuming a gaussian peak shape. Also works with values from the localmax method, but may be less accurate.

Parameters:
peak_height : float

peak height

peak_fwhm : float

peak FWHM

Returns:
area : float

peak area
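For reference, the area under a gaussian with a given height and FWHM is height × FWHM × sqrt(π / (4 ln 2)), roughly 1.0645 × height × FWHM. The sketch below applies that closed-form result; the function name gauss_peak_area is hypothetical, and whether mzapy uses exactly this expression is an assumption.

```python
import math


def gauss_peak_area(peak_height, peak_fwhm):
    """Area under a gaussian peak, from its height and FWHM."""
    # integral of h * exp(-4 ln2 (x - mu)^2 / fwhm^2) over all x
    return peak_height * peak_fwhm * math.sqrt(math.pi / (4.0 * math.log(2.0)))


area = gauss_peak_area(2.0, 1.5)
```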