`mzapy.peaks`

This module defines functions for performing basic signal processing and peak finding.

1D Peak Fitting

Two functions are provided for performing peak fitting on 1-dimensional data: mzapy.peaks.find_peaks_1d_localmax and mzapy.peaks.find_peaks_1d_gauss, which use local maximum and sequential gaussian fitting methods, respectively. The local maximum method uses the scipy.signal.find_peaks function internally. The gaussian fitting method performs successive rounds of least-squares fits of a gaussian function to the data, with each round after the first fitting the residuals of the previous fit. This process continues until a stopping criterion is reached:

peak height is lower than a specified absolute or relative threshold
a maximum number of peaks have been found
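
The fit-and-subtract loop described above can be sketched as follows. This is a minimal illustration built on scipy.optimize.curve_fit, not the mzapy implementation: the helper names (gauss, sequential_gauss_fit), the initial FWHM guess, and the way the absolute and relative height thresholds are combined are all assumptions, and the FWHM bounds are left out for brevity.

```python
import numpy as np
from scipy.optimize import curve_fit

# conversion factor between FWHM and the gaussian standard deviation
FWHM_TO_SIGMA = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))


def gauss(x, mean, height, fwhm):
    """Gaussian parameterized by mean, height and FWHM."""
    sigma = fwhm * FWHM_TO_SIGMA
    return height * np.exp(-0.5 * ((x - mean) / sigma) ** 2)


def sequential_gauss_fit(x, y, min_abs_height, min_rel_height, max_peaks,
                         truncate_resids=True):
    """Hypothetical helper: fit one gaussian per round to the residuals."""
    x = np.asarray(x, dtype=float)
    resids = np.asarray(y, dtype=float).copy()
    # assumption: a peak must clear both the absolute and the relative threshold
    height_cutoff = max(min_abs_height, min_rel_height * resids.max())
    peaks = []
    while len(peaks) < max_peaks:            # stop: maximum number of peaks found
        apex = np.argmax(resids)
        if resids[apex] < height_cutoff:     # stop: remaining signal is too small
            break
        # initial guess: apex position, apex height, 5% of the x range as FWHM
        p0 = (x[apex], resids[apex], 0.05 * (x[-1] - x[0]))
        try:
            params, _ = curve_fit(gauss, x, resids, p0=p0)
        except RuntimeError:                 # the fit did not converge
            break
        mean, height, fwhm = params
        peaks.append((mean, height, abs(fwhm)))
        resids = resids - gauss(x, *params)  # the next round fits what is left
        if truncate_resids:
            resids = np.clip(resids, 0.0, None)
    return peaks


# synthetic signal with two well-resolved peaks
x = np.linspace(0.0, 30.0, 601)
y = gauss(x, 10.0, 100.0, 2.0) + gauss(x, 20.0, 40.0, 3.0)
found = sequential_gauss_fit(x, y, min_abs_height=1.0, min_rel_height=0.05,
                             max_peaks=5)
```

Each entry of found is a (mean, height, fwhm) tuple, listed in the order the peaks were fitted, so the most intense peak comes first.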

The following image is a demonstration of the sequential gaussian fitting method, performed using dense and sparse spectral data:

Graphs of spectral data displaying successive fitting and subtraction of gaussian peaks

Which of the two fitting functions is best depends on the type of signal being fit, and both will most often require some trial and error to determine the best parameters for a given application. For very clean signals containing generally well-formed and well-resolved peaks, both methods will produce roughly equal results. If the peaks have clearly defined apexes but have significant shoulders, tailing, or other deviations from ideal peak shape, the localmax method will generally provide better estimates of the mean x value of the peak. However, if the peaks are very broad or saturated, then the sequential gaussian fit method will provide more reliable estimates of the mean x value of the peak. In cases where peaks have a clear multimodal distribution (up to 2 or 3 peaks that are not completely resolved), the sequential gaussian fit method generally does a better job of deconvoluting the unresolved peaks.

The following image shows some examples of the different outcomes of the two peak fitting methods using dense and sparse spectral data:

graphs of dense and sparse spectral data annotated with fitted peaks using localmax or gaussian methods

Module Reference

Interpolation

mzapy.peaks.lerp_1d(x, y, x_min, x_max, density, threshold_y=True)

Performs linear interpolation of arbitrary x, y data, returning new x and y values between x_min and x_max at a specified density (# points / units of x). Any interpolated values < 0 (e.g., when extrapolating beyond the range of input values) are set to 0.

Parameters:

x (list(float)): input x values
y (list(float)): input y values
x_min (float): minimum x value for interpolation
x_max (float): maximum x value for interpolation
density (float): interpolation density (# points / units of x)
threshold_y (bool, default=True): ensure that no y values go below 0

Returns:

x_interp (numpy.ndarray(float)): interpolated x values
y_interp (numpy.ndarray(float)): interpolated y values
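
The behaviour described for lerp_1d can be approximated with scipy.interpolate.interp1d, as in the sketch below. The function name lerp_1d_sketch is hypothetical, and both the way the number of output points is derived from the density and the use of linear extrapolation outside the input range are assumptions rather than confirmed details of mzapy.

```python
import numpy as np
from scipy.interpolate import interp1d


def lerp_1d_sketch(x, y, x_min, x_max, density, threshold_y=True):
    """Hypothetical stand-in for the behaviour described above."""
    # assumption: number of points is the x span times the density, plus the end point
    n_points = int(round((x_max - x_min) * density)) + 1
    x_interp = np.linspace(x_min, x_max, n_points)
    # linear interpolation, extrapolating linearly outside the input range
    f = interp1d(x, y, kind="linear", bounds_error=False, fill_value="extrapolate")
    y_interp = f(x_interp)
    if threshold_y:
        y_interp[y_interp < 0] = 0.0  # clamp negative (extrapolated) values to 0
    return x_interp, y_interp


# the input covers x = 1..3, the request covers x = 0..4
x_new, y_new = lerp_1d_sketch([1.0, 2.0, 3.0], [2.0, 1.0, 0.0], 0.0, 4.0, density=1.0)
```

In this example the extrapolated value at x = 4 would be negative, so it is the one point affected by the thresholding.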

mzapy.peaks.lerp_2d(x, y, z, x_min, x_max, x_density, y_min, y_max, y_density, threshold_z=True)

Performs linear interpolation of arbitrary x, y, z data, returning new gridded x, y and z values between x_min, x_max, y_min, y_max at a specified density (# points / units of x, y). Any interpolated values < 0 (e.g., when extrapolating beyond the range of input values) are set to 0.

Parameters:

x (list(float)): input x values
y (list(float)): input y values
z (list(float)): input z values
x_min (float): minimum x value for interpolation
x_max (float): maximum x value for interpolation
x_density (float): x interpolation density (# points / units of x)
y_min (float): minimum y value for interpolation
y_max (float): maximum y value for interpolation
y_density (float): y interpolation density (# points / units of y)
threshold_z (bool, default=True): ensure that no z values go below 0

Returns:

x_interp (numpy.ndarray(float)): interpolated x values
y_interp (numpy.ndarray(float)): interpolated y values
zz_interp (numpy.ndarray(float)): interpolated z values, 2D grid data
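
A comparable sketch for the 2D case uses scipy.interpolate.griddata to place scattered x, y, z points on a regular grid. The name lerp_2d_sketch is hypothetical. The grid orientation (rows follow x, columns follow y), the point counts, and filling grid points outside the convex hull of the input with 0 are assumptions made for this illustration, not documented mzapy behaviour.

```python
import numpy as np
from scipy.interpolate import griddata


def lerp_2d_sketch(x, y, z, x_min, x_max, x_density, y_min, y_max, y_density,
                   threshold_z=True):
    """Hypothetical stand-in: scattered x, y, z data onto a regular grid."""
    # assumption: point counts are span times density, plus the end point
    nx = int(round((x_max - x_min) * x_density)) + 1
    ny = int(round((y_max - y_min) * y_density)) + 1
    x_interp = np.linspace(x_min, x_max, nx)
    y_interp = np.linspace(y_min, y_max, ny)
    xx, yy = np.meshgrid(x_interp, y_interp, indexing="ij")
    # linear interpolation over a triangulation of the input points;
    # grid points outside the convex hull of the input are filled with 0 here
    zz_interp = griddata((np.asarray(x), np.asarray(y)), np.asarray(z),
                         (xx, yy), method="linear", fill_value=0.0)
    if threshold_z:
        zz_interp[zz_interp < 0] = 0.0
    return x_interp, y_interp, zz_interp


# four corner points sampled from the plane z = x + y
xi, yi, zz = lerp_2d_sketch([0.0, 1.0, 0.0, 1.0], [0.0, 0.0, 1.0, 1.0],
                            [0.0, 1.0, 1.0, 2.0],
                            0.25, 0.75, 4.0, 0.25, 0.75, 4.0)
```

Because the input points lie on a plane, linear interpolation reproduces it exactly, which makes the result easy to check by eye.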

Peak Finding

mzapy.peaks.find_peaks_1d_localmax(x, y, min_rel_height, min_abs_height, fwhm_min, fwhm_max, min_dist)

Finds peaks in x, y data using the local maximum method (scipy.signal.find_peaks). Requires dense x data that is uniformly spaced and monotonically increasing.

Parameters:

x (np.array(float)): x data
y (np.array(float)): y data
min_rel_height (float): minimum height of peaks (relative to max y)
min_abs_height (float): minimum absolute height
fwhm_min (float): minimum peak width (FWHM)
fwhm_max (float): maximum peak width (FWHM)
min_dist (float): minimum distance (in units of x) between consecutive peaks

Returns:

peak_means (numpy.ndarray(float)): mean x values for peaks
peak_heights (numpy.ndarray(float)): heights for peaks
peak_fwhms (numpy.ndarray(float)): widths (FWHM) for peaks
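
To show how these parameters could map onto scipy.signal.find_peaks, here is a hedged sketch. The function name localmax_sketch is hypothetical, and the way the two height thresholds are combined is an assumption. scipy works in units of samples, so widths and distances are converted using the (uniform) x spacing. With rel_height=0.5 the width is measured at half of the peak prominence, which matches the FWHM only when the baseline around the peak is close to zero.

```python
import numpy as np
from scipy.signal import find_peaks


def localmax_sketch(x, y, min_rel_height, min_abs_height, fwhm_min, fwhm_max,
                    min_dist):
    """Hypothetical wrapper mapping x-unit parameters onto find_peaks."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x[1] - x[0]  # uniform spacing assumed; converts x units to samples
    # assumption: a peak must clear both the absolute and the relative threshold
    min_height = max(min_abs_height, min_rel_height * y.max())
    idx, props = find_peaks(
        y,
        height=min_height,
        distance=max(1, int(round(min_dist / dx))),
        width=(fwhm_min / dx, fwhm_max / dx),
        rel_height=0.5,
    )
    return x[idx], props["peak_heights"], props["widths"] * dx


# single gaussian peak: mean 10, height 100, FWHM 2
x = np.linspace(0.0, 20.0, 2001)
sigma = 2.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))
y = 100.0 * np.exp(-0.5 * ((x - 10.0) / sigma) ** 2)
means, heights, fwhms = localmax_sketch(x, y, 0.1, 1.0, 0.5, 5.0, 1.0)
```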

mzapy.peaks.find_peaks_1d_gauss(x, y, min_rel_height, min_abs_height, fwhm_min, fwhm_max, max_peaks, truncate_y_resids)

Finds peaks in x, y data using a successive gaussian function fit method. Requires dense x data that is uniformly spaced and monotonically increasing. The most intense peak in the data is fitted using a gaussian function, then the residuals are fitted for the next peak. This process is repeated until a stopping criterion is reached:

peak height is lower than a specified absolute or relative threshold
a maximum number of peaks have been found

Parameters:

x (numpy.ndarray(float)): x data
y (numpy.ndarray(float)): y data
min_rel_height (float): minimum height of peaks (relative to max y)
min_abs_height (float): minimum absolute height
fwhm_min (float): minimum peak width (FWHM)
fwhm_max (float): maximum peak width (FWHM)
max_peaks (int): maximum number of peaks to find
truncate_y_resids (bool): ensure that residuals do not have any intensity values below 0

Returns:

peak_means (numpy.ndarray(float)): mean x values for peaks
peak_heights (numpy.ndarray(float)): heights for peaks
peak_fwhms (numpy.ndarray(float)): widths (FWHM) for peaks

Miscellaneous

mzapy.peaks.calc_gauss_psnr(x, y, peak_params)

Computes an estimate of the peak signal to noise ratio (pSNR) by comparing the peak height of a fitted peak (valid only for gaussian peak fitting method) to the RMS of the residuals from the peak fitting. This method of estimating the pSNR is sensitive to the intensity of the peak relative to background noise AND the goodness of fit of the gaussian function. Any deviation of the signal from the ideal gaussian shape and/or presence of multiple peaks will decrease the pSNR, even if the peak height is much greater than the true magnitude of noise.

Parameters:

x (numpy.ndarray(float)): x data
y (numpy.ndarray(float)): y data
peak_params (tuple(float)): gaussian function fit parameters (mean, height, fwhm) defining the peak

Returns:

psnr (float): estimate of the peak signal to noise ratio
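
The sensitivity to goodness of fit can be seen in a small sketch of the calculation. The function name gauss_psnr_sketch is hypothetical, and taking the plain ratio of peak height to residual RMS is an assumption: the library may scale the value differently (for example, on a logarithmic scale).

```python
import numpy as np


def _gauss(x, mean, height, fwhm):
    """Gaussian parameterized by mean, height and FWHM."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return height * np.exp(-0.5 * ((x - mean) / sigma) ** 2)


def gauss_psnr_sketch(x, y, peak_params):
    """Hypothetical pSNR estimate: peak height over the RMS of the fit residuals."""
    mean, height, fwhm = peak_params
    resids = np.asarray(y, dtype=float) - _gauss(np.asarray(x, dtype=float),
                                                 mean, height, fwhm)
    rms = np.sqrt(np.mean(resids ** 2))
    return height / rms


rng = np.random.default_rng(0)
x = np.linspace(0.0, 20.0, 1001)
params = (10.0, 100.0, 2.0)                    # mean, height, fwhm
noise = rng.normal(0.0, 1.0, x.size)           # synthetic noise, standard deviation 1

# one peak plus noise: the residuals contain only the noise
psnr_single = gauss_psnr_sketch(x, _gauss(x, *params) + noise, params)

# same peak plus an unfitted neighbour: the residuals now contain the neighbour too
y_double = _gauss(x, *params) + _gauss(x, 14.0, 50.0, 2.0) + noise
psnr_double = gauss_psnr_sketch(x, y_double, params)
```

The noise level is identical in both cases, yet psnr_double comes out far lower than psnr_single because the second, unfitted peak ends up in the residuals.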

mzapy.peaks.calc_peak_area(peak_height, peak_fwhm)

Computes the peak area from peak height and FWHM, assuming a gaussian peak shape (also works with values from the localmax method, but may be less accurate).

Parameters:

peak_height (float): peak height
peak_fwhm (float): peak FWHM

Returns:

area (float): peak area
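
For a gaussian peak the area follows directly from the height and FWHM. The sketch below writes out that standard closed-form relation; the function name gauss_peak_area is hypothetical, and whether mzapy uses exactly this expression is an assumption.

```python
import math


def gauss_peak_area(peak_height, peak_fwhm):
    """Area under a gaussian with the given height and FWHM."""
    # sigma = fwhm / (2 * sqrt(2 * ln 2)) and area = height * sigma * sqrt(2 * pi),
    # which simplifies to height * fwhm * sqrt(pi / (4 * ln 2))
    return peak_height * peak_fwhm * math.sqrt(math.pi / (4.0 * math.log(2.0)))


area = gauss_peak_area(100.0, 2.0)
```

The constant factor is roughly 1.0645, so the area is slightly larger than the product of height and FWHM.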
