mzapy.peaks
This module defines functions for performing basic signal processing and peak finding.
1D Peak Fitting
Two functions are provided for performing peak fitting on 1-dimensional data: mzapy.peaks.find_peaks_1d_localmax and mzapy.peaks.find_peaks_1d_gauss, which use local maximum and sequential gaussian fitting methods, respectively. The local maximum method uses the scipy.signal.find_peaks function internally. The gaussian fitting method performs successive rounds of least-squares fits of a gaussian function to the data, with each round after the first fitting the residuals of the previous fit. This process continues until a stopping criterion is reached:
- peak height is lower than a specified absolute or relative threshold
- a maximum number of peaks have been found
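The sequential fitting loop described above can be sketched roughly as follows. This is a minimal illustration built on scipy.optimize.curve_fit, not the mzapy implementation: the helper names (gauss, sequential_gauss_fit), the initial width guess, and the choice to test the threshold against the largest remaining residual before each fit are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import curve_fit

# conversion factor between FWHM and the gaussian standard deviation
FWHM_TO_SIGMA = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))


def gauss(x, mean, height, fwhm):
    """Gaussian parameterized by mean, height and FWHM (hypothetical helper)."""
    sigma = fwhm * FWHM_TO_SIGMA
    return height * np.exp(-0.5 * ((x - mean) / sigma) ** 2)


def sequential_gauss_fit(x, y, min_rel_height, min_abs_height, max_peaks):
    """Fit gaussians one at a time, each round on the residuals of the last."""
    x = np.asarray(x, dtype=float)
    resid = np.asarray(y, dtype=float).copy()
    y_max = resid.max()
    span = x[-1] - x[0]
    peaks = []
    while len(peaks) < max_peaks:  # stop: maximum number of peaks found
        i = int(np.argmax(resid))
        height = resid[i]
        # stop: largest remaining signal is below the absolute or relative threshold
        if height < min_abs_height or height < min_rel_height * y_max:
            break
        p0 = (x[i], height, 0.1 * span)  # crude initial guess (assumption)
        try:
            params, _ = curve_fit(gauss, x, resid, p0=p0, maxfev=5000)
        except RuntimeError:
            break  # the least-squares fit did not converge
        mean, fit_height, fwhm = params
        peaks.append((mean, fit_height, abs(fwhm)))
        # subtract the fitted peak and keep the residuals non-negative
        resid = np.clip(resid - gauss(x, *params), 0.0, None)
    return peaks


# synthetic signal with two well-resolved peaks
x = np.linspace(0.0, 10.0, 501)
y = gauss(x, 3.0, 10.0, 0.8) + gauss(x, 7.0, 4.0, 1.0)
peaks = sequential_gauss_fit(x, y, min_rel_height=0.05, min_abs_height=0.1, max_peaks=5)
for mean, height, fwhm in peaks:
    print(f"mean={mean:.2f} height={height:.2f} fwhm={fwhm:.2f}")
```

Clipping the residuals at zero in this sketch plays a role similar to what the truncate_y_resids parameter describes, although the exact behavior of that parameter in mzapy is not shown here.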
The following image is a demonstration of the sequential gaussian fitting method, performed using dense and sparse spectral data:
Which of the two fitting functions is best depends on the type of signal being fit, and both will most often require some trial and error to determine the best parameters for a given application. For very clean signals containing generally well-formed and well-resolved peaks, both methods produce roughly equivalent results. If the peaks have clearly defined apexes but have significant shoulders, tailing, or other deviations from ideal peak shape, the localmax method will generally provide better estimates of the mean x value of the peak. However, if the peaks are very broad or saturated, the sequential gaussian fit method will provide more reliable estimates of the mean x value of the peak. In cases where peaks have a clear multimodal distribution (up to 2 or 3 peaks that are not completely resolved), the sequential gaussian fit method generally does a better job of deconvoluting the unresolved peaks.
The following image shows some examples of the different outcomes of the two peak fitting methods using dense and sparse spectral data:
Module Reference
Interpolation
- mzapy.peaks.lerp_1d(x, y, x_min, x_max, density, threshold_y=True)
Performs linear interpolation of arbitrary x, y data, returning new x and y values between x_min and x_max at a specified density (# points / units of x). If threshold_y is True, any interpolated values < 0 (e.g., when extrapolating beyond the range of input values) are set to 0.
- Parameters:
- x
list(float) input x values
- y
list(float) input y values
- x_min
float minimum x value for interpolation
- x_max
float maximum x value for interpolation
- density
float interpolation density (# points / units of x)
- threshold_y
bool, default=True ensure that no y values go below 0
- Returns:
- x_interp
numpy.ndarray(float) interpolated x values
- y_interp
numpy.ndarray(float) interpolated y values
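The behavior described for lerp_1d can be approximated with scipy.interpolate.interp1d, as in the sketch below. This is an assumption-laden illustration rather than the mzapy source: the function name lerp_1d_sketch is hypothetical, and the way the point count is derived from density (including both endpoints) is a guess.

```python
import numpy as np
from scipy.interpolate import interp1d


def lerp_1d_sketch(x, y, x_min, x_max, density, threshold_y=True):
    """Linear interpolation onto an evenly spaced x axis (illustrative only)."""
    # number of points from density (points per unit of x); rounding is an assumption
    n_points = int((x_max - x_min) * density) + 1
    x_interp = np.linspace(x_min, x_max, n_points)
    # linear interpolation, extrapolating linearly outside the input range
    f = interp1d(x, y, kind="linear", fill_value="extrapolate")
    y_interp = f(x_interp)
    if threshold_y:
        # extrapolation can produce negative values; set them to 0
        y_interp[y_interp < 0] = 0.0
    return x_interp, y_interp


x_in = [1.0, 2.0, 3.0]
y_in = [1.0, 3.0, 2.0]
x_interp, y_interp = lerp_1d_sketch(x_in, y_in, 0.0, 4.0, 2.0)
_, y_raw = lerp_1d_sketch(x_in, y_in, 0.0, 4.0, 2.0, threshold_y=False)
```

In this example the left edge of the requested range lies outside the input data, so the unthresholded result (y_raw) dips below zero there while the thresholded result is clamped to 0.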
- mzapy.peaks.lerp_2d(x, y, z, x_min, x_max, x_density, y_min, y_max, y_density, threshold_z=True)
Performs linear interpolation of arbitrary x, y, z data, returning new gridded x, y and z values between x_min, x_max and y_min, y_max at specified densities (# points / units of x, y). If threshold_z is True, any interpolated values < 0 (e.g., when extrapolating beyond the range of input values) are set to 0.
- Parameters:
- x
list(float) input x values
- y
list(float) input y values
- z
list(float) input z values
- x_min
float minimum x value for interpolation
- x_max
float maximum x value for interpolation
- x_density
float x interpolation density (# points / units of x)
- y_min
float minimum y value for interpolation
- y_max
float maximum y value for interpolation
- y_density
float y interpolation density (# points / units of y)
- threshold_z
bool, default=True ensure that no z values go below 0
- Returns:
- x_interp
numpy.ndarray(float) interpolated x values
- y_interp
numpy.ndarray(float) interpolated y values
- zz_interp
numpy.ndarray(float) interpolated z values, 2D grid data
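A comparable 2D gridding step can be sketched with scipy.interpolate.griddata. The function below (lerp_2d_sketch, a hypothetical name) is only an approximation of what lerp_2d is documented to do. Its grid layout (first axis x, second axis y), its point-count rounding, and its use of 0 as the fill value outside the convex hull of the input points are assumptions, and griddata does not extrapolate the way the real function may.

```python
import numpy as np
from scipy.interpolate import griddata


def lerp_2d_sketch(x, y, z, x_min, x_max, x_density,
                   y_min, y_max, y_density, threshold_z=True):
    """Linear interpolation of scattered x, y, z data onto a regular grid."""
    # axis point counts from the densities; rounding is an assumption
    x_interp = np.linspace(x_min, x_max, int((x_max - x_min) * x_density) + 1)
    y_interp = np.linspace(y_min, y_max, int((y_max - y_min) * y_density) + 1)
    # grid with shape (len(x_interp), len(y_interp))
    xx, yy = np.meshgrid(x_interp, y_interp, indexing="ij")
    # points outside the convex hull of the input data get fill_value
    zz_interp = griddata((x, y), z, (xx, yy), method="linear", fill_value=0.0)
    if threshold_z:
        zz_interp[zz_interp < 0] = 0.0
    return x_interp, y_interp, zz_interp


# scattered input: four corners of a square plus its center, with z = x + 2y
x_in = np.array([0.0, 2.0, 0.0, 2.0, 1.0])
y_in = np.array([0.0, 0.0, 2.0, 2.0, 1.0])
z_in = x_in + 2.0 * y_in
xi, yi, zz = lerp_2d_sketch(x_in, y_in, z_in, 0.5, 1.5, 2.0, 0.5, 1.5, 4.0)
```

Because the input z values lie on a plane, linear interpolation reproduces that plane at every grid point inside the input square, which makes the example easy to check by hand.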
Peak Finding
- mzapy.peaks.find_peaks_1d_localmax(x, y, min_rel_height, min_abs_height, fwhm_min, fwhm_max, min_dist)
Finds peaks in x, y data using the local maximum method (scipy.signal.find_peaks). Requires dense x data that is uniformly spaced and monotonically increasing.
- Parameters:
- x
numpy.ndarray(float) x data
- y
numpy.ndarray(float) y data
- min_rel_height
float minimum height of peaks (relative to max y)
- min_abs_height
float minimum absolute height
- fwhm_min
float minimum peak width (FWHM)
- fwhm_max
float maximum peak width (FWHM)
- min_dist
float minimum distance (in units of x) between consecutive peaks
- Returns:
- peak_means
numpy.ndarray(float) mean x values for peaks
- peak_heights
numpy.ndarray(float) heights for peaks
- peak_fwhms
numpy.ndarray(float) widths (FWHM) for peaks
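Since the local maximum method is documented as using scipy.signal.find_peaks, the sketch below shows one plausible way the parameters above could map onto that function. The mapping is an assumption and not taken from the mzapy source: localmax_sketch is a hypothetical name, the height threshold is taken as the larger of the absolute and relative limits, and widths are measured at half prominence, which only approximates FWHM when the baseline is near zero.

```python
import numpy as np
from scipy.signal import find_peaks


def localmax_sketch(x, y, min_rel_height, min_abs_height, fwhm_min, fwhm_max, min_dist):
    """Local-maximum peak finding on uniformly spaced data (illustrative only)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x[1] - x[0]  # uniform spacing assumed, used to convert x units to samples
    height = max(min_abs_height, min_rel_height * y.max())
    idx, props = find_peaks(
        y,
        height=height,
        distance=max(1, int(round(min_dist / dx))),  # must be at least 1 sample
        width=(fwhm_min / dx, fwhm_max / dx),        # width limits in samples
        rel_height=0.5,                              # measure width at half prominence
    )
    return x[idx], props["peak_heights"], props["widths"] * dx


# synthetic signal: two gaussian peaks with FWHM 0.8 and 1.0
x = np.linspace(0.0, 10.0, 1001)
k = 4.0 * np.log(2.0)
y = 10.0 * np.exp(-k * ((x - 3.0) / 0.8) ** 2) + 4.0 * np.exp(-k * ((x - 7.0) / 1.0) ** 2)
means, heights, fwhms = localmax_sketch(x, y, 0.05, 0.5, 0.1, 2.0, 0.5)
```

The conversion by dx is the reason uniformly spaced x data matters here: scipy.signal.find_peaks works in units of samples, so distances and widths given in units of x need a constant sample spacing to be translated.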
- mzapy.peaks.find_peaks_1d_gauss(x, y, min_rel_height, min_abs_height, fwhm_min, fwhm_max, max_peaks, truncate_y_resids)
Finds peaks in x, y data using a successive gaussian function fit method. Requires dense x data that is uniformly spaced and monotonically increasing. The most intense peak in the data is fitted using a gaussian function, then the residuals are fitted for the next peak. This process is repeated until a stopping criterion is reached:
- peak height is lower than a specified absolute or relative threshold
- a maximum number of peaks have been found
- Parameters:
- x
numpy.ndarray(float) x data
- y
numpy.ndarray(float) y data
- min_rel_height
float minimum height of peaks (relative to max y)
- min_abs_height
float minimum absolute height
- fwhm_min
float minimum peak width (FWHM)
- fwhm_max
float maximum peak width (FWHM)
- max_peaks
int maximum number of peaks to find
- truncate_y_resids
bool ensure that residuals do not have any intensity values below 0
- Returns:
- peak_means
numpy.ndarray(float) mean x values for peaks
- peak_heights
numpy.ndarray(float) heights for peaks
- peak_fwhms
numpy.ndarray(float) widths (FWHM) for peaks
Miscellaneous
- mzapy.peaks.calc_gauss_psnr(x, y, peak_params)
Computes an estimate of the peak signal to noise ratio (pSNR) by comparing the peak height of a fitted peak (valid only for gaussian peak fitting method) to the RMS of the residuals from the peak fitting. This method of estimating the pSNR is sensitive to the intensity of the peak relative to background noise AND the goodness of fit of the gaussian function. Any deviation of the signal from the ideal gaussian shape and/or presence of multiple peaks will decrease the pSNR, even if the peak height is much greater than the true magnitude of noise.
- Parameters:
- x
numpy.ndarray(float) x data
- y
numpy.ndarray(float) y data
- peak_params
tuple(float) gaussian function fit parameters (mean, height, fwhm) defining the peak
- Returns:
- psnr
float estimate of the peak signal to noise ratio
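The pSNR estimate described above (fitted peak height compared to the RMS of the fit residuals) can be illustrated with a few lines of NumPy. This is a sketch under assumptions: gauss_psnr_sketch and gauss are hypothetical names, and returning a plain height-to-RMS ratio (rather than, say, a logarithmic value) is a guess about the output scale.

```python
import numpy as np


def gauss(x, mean, height, fwhm):
    """Gaussian parameterized by mean, height and FWHM (hypothetical helper)."""
    return height * np.exp(-4.0 * np.log(2.0) * ((x - mean) / fwhm) ** 2)


def gauss_psnr_sketch(x, y, peak_params):
    """Ratio of the fitted peak height to the RMS of the fit residuals."""
    mean, height, fwhm = peak_params
    resid = y - gauss(x, mean, height, fwhm)
    rms = np.sqrt(np.mean(resid ** 2))
    return height / rms


# a gaussian of height 10 plus deterministic "noise" alternating between +0.1 and -0.1
x = np.linspace(0.0, 10.0, 1000)
noise = np.where(np.arange(x.size) % 2 == 0, 0.1, -0.1)
y = gauss(x, 5.0, 10.0, 1.0) + noise
psnr = gauss_psnr_sketch(x, y, (5.0, 10.0, 1.0))
```

Because the residuals include everything the single gaussian fails to explain, a second unfitted peak or a skewed peak shape inflates the RMS in the same way noise does, which is the sensitivity to goodness of fit noted in the description.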
- mzapy.peaks.calc_peak_area(peak_height, peak_fwhm)
Computes the peak area from peak height and FWHM, assuming a gaussian peak shape (also works with values from the localmax method, but may be less accurate).
- Parameters:
- peak_height
float peak height
- peak_fwhm
float peak FWHM
- Returns:
- area
float peak area
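For a gaussian peak the area follows directly from height and FWHM: with sigma = FWHM / (2 * sqrt(2 * ln 2)), the integral height * sigma * sqrt(2 * pi) simplifies to height * FWHM * sqrt(pi / (4 * ln 2)), or roughly 1.0645 * height * FWHM. The sketch below applies that closed form. The function name gauss_area_sketch is hypothetical, and whether mzapy uses exactly this expression is an assumption.

```python
import math


def gauss_area_sketch(peak_height, peak_fwhm):
    """Area under a gaussian with the given height and FWHM."""
    # sqrt(pi / (4 ln 2)) is approximately 1.0645
    return peak_height * peak_fwhm * math.sqrt(math.pi / (4.0 * math.log(2.0)))


area = gauss_area_sketch(2.0, 1.5)
```

When the height and FWHM come from the localmax method, the same formula still applies, but any tailing or shouldering in the real peak is not reflected in the result.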