Preprocessing Methods Reference

Comprehensive reference for all 40+ preprocessing methods available in the application.

Table of Contents

Baseline Correction
Smoothing and Denoising
Normalization
Derivatives
Feature Engineering
Advanced Methods

Baseline Correction

Remove fluorescence background and baseline drift from spectra.

AsLS (Asymmetric Least Squares)

Full Name: Asymmetric Least Squares Smoothing

Purpose: Remove baseline while preserving peaks using asymmetric weighting

Theory: Fits a smoothed baseline by minimizing:

Σ w_i(y_i - z_i)² + λ Σ (Δ²z_i)²

where w_i are asymmetric weights (lower for peaks, higher for baseline)
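
As a rough illustration of how this objective can be minimized, the sketch below alternates between solving the weighted penalized least-squares problem and updating the weights. It is a minimal NumPy version written for this reference, not the code behind `apply_asls`: the function name `asls_baseline` is hypothetical, the way `tol` is used as a stopping rule is an assumption, and a dense solver is used only for brevity (sparse matrices are the usual choice for long spectra).

```python
import numpy as np


def asls_baseline(y, lam=1e5, p=0.01, max_iter=10, tol=1e-6):
    """Estimate a baseline z for one spectrum y (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-difference operator: (D @ z)[i] = z[i] - 2*z[i+1] + z[i+2]
    D = np.diff(np.eye(n), n=2, axis=0)
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    z = y.copy()
    for _ in range(max_iter):
        z_prev = z
        # Minimize sum(w * (y - z)**2) + lam * sum((D @ z)**2)
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the baseline (peaks) get weight p, the rest 1 - p
        w = np.where(y > z, p, 1.0 - p)
        # Assumed stopping rule: relative change of the baseline below tol
        if np.linalg.norm(z - z_prev) <= tol * max(np.linalg.norm(z_prev), 1e-12):
            break
    return z
```

The corrected spectrum is then `y - asls_baseline(y)`.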

Parameters:

Parameter	Type	Range	Default	Description
`lambda` (λ)	float	1e2 - 1e9	1e5	Smoothness parameter. Higher = smoother baseline
`p`	float	0.001 - 0.1	0.01	Asymmetry parameter. Lower = fit valleys better
`max_iter`	int	5 - 50	10	Maximum iterations for convergence
`tol`	float	1e-6 - 1e-3	1e-6	Convergence tolerance

Parameter Guide:

# Flexible baseline (tracks curved backgrounds; may cut into broad peaks)
lambda = 1e5, p = 0.01

# Standard (balanced)
lambda = 1e6, p = 0.01

# Stiff baseline (protects broad peaks; may leave curved background behind)
lambda = 1e7, p = 0.001

Usage Example:

from functions.preprocess import apply_asls

# Apply AsLS baseline correction
corrected = apply_asls(
    spectra,
    lam=1e5,
    p=0.01,
    max_iter=10
)

Interpretation:

λ (lambda): Controls baseline smoothness
- Too low → Baseline follows the peaks (overfitting; peak intensity is subtracted)
- Too high → Over-smooths (underfitting; may miss real baseline curvature)
- Sweet spot: 1e5 - 1e6 for most Raman spectra

p: Controls asymmetry
- Lower → Treats peaks as outliers (good for sharp peaks)
- Higher → Fits through peaks (use if baseline has structure)

Common Issues:

Peaks removed: λ too low or p too high → Increase λ or reduce p
Baseline remains: λ too high → Reduce λ
Slow convergence: Increase tol or reduce max_iter
Oscillations: p too low → Increase p slightly

When to Use:

✓ General-purpose baseline correction
✓ Fluorescence background
✓ Unknown baseline shape
✗ Not for: Very noisy data (smooth first)

Reference: Eilers & Boelens (2005). “Baseline Correction with Asymmetric Least Squares Smoothing”

AirPLS (Adaptive Iteratively Reweighted Penalized Least Squares)

Full Name: Adaptive Iteratively Reweighted Penalized Least Squares

Purpose: Automatic baseline correction with minimal parameter tuning

Theory: Iteratively fits baseline by:

Penalized least squares fitting
Adaptive weighting based on residuals
Iteration until convergence
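
The three steps above can be sketched as follows. This is a minimal NumPy illustration, not the code behind `apply_airpls`. The weight update (zero weight above the current baseline, exponentially growing weight below it) follows the scheme of Zhang et al. (2010) as it is commonly implemented, and the details should be treated as assumptions. The function name `airpls_baseline` and the `rel_tol` stopping threshold are hypothetical, and `rel_tol` is not necessarily the same quantity as `min_diff`.

```python
import numpy as np


def airpls_baseline(y, lam=1e5, porder=1, max_iter=15, rel_tol=1e-3):
    """Estimate a baseline for one spectrum y (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Difference operator of order `porder` used in the roughness penalty
    D = np.diff(np.eye(n), n=porder, axis=0)
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    z = y.copy()
    for i in range(1, max_iter + 1):
        # Step 1: penalized least-squares fit with the current weights
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        d = y - z
        below = d < 0
        neg_sum = np.abs(d[below].sum())
        # Step 3: stop once little signal remains below the baseline
        if neg_sum <= rel_tol * np.abs(y).sum() or i == max_iter:
            break
        # Step 2: adaptive weights from the residuals
        w = np.zeros(n)  # points above the baseline are treated as peaks
        w[below] = np.exp(i * np.abs(d[below]) / neg_sum)
        # Keep the end points weighted so the linear system stays solvable
        w[0] = w[-1] = np.exp(i * d[below].max() / neg_sum)
    return z
```

Because the weights are recomputed from the residuals on every pass, no asymmetry parameter has to be chosen by hand.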

Parameters:

Parameter	Type	Range	Default	Description
`lambda` (λ)	float	1e2 - 1e7	1e5	Smoothness parameter
`porder`	int	1, 2	1	Difference order for penalty
`max_iter`	int	10 - 100	15	Maximum iterations
`min_diff`	float	1e-6 - 1e-3	1e-5	Convergence criterion

Parameter Guide:

# Fast (fewer iterations)
lambda = 1e4, max_iter = 10

# Balanced (recommended)
lambda = 1e5, max_iter = 15

# Thorough (more iterations)
lambda = 1e6, max_iter = 30

Usage Example:

from functions.preprocess import apply_airpls

corrected = apply_airpls(
    spectra,
    lam=1e5,
    porder=1,
    max_iter=15
)

Interpretation:

Automatic adaptation: Weights adjust to separate peaks from baseline
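No asymmetry parameter: There is no p to tune; the weights are derived from the residuals
porder=1: Penalizes first differences (pulls the baseline toward a constant)
porder=2: Penalizes second differences (pulls the baseline toward a straight line, so sloped backgrounds are followed more easily)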

Common Issues:

Negative baseline: Normal behavior, corrects fluorescence
Slow: Reduce max_iter or increase λ
Incomplete correction: Increase max_iter or reduce λ

When to Use:

✓ Automatic baseline correction
✓ Batch processing
✓ When AsLS parameters unclear
✓ Complex baseline shapes

Reference: Zhang et al. (2010). “Baseline correction using adaptive iteratively reweighted penalized least squares”

Polynomial Baseline

Purpose: Fit and subtract polynomial baseline

Theory: Fits polynomial of degree n:

baseline = a₀ + a₁x + a₂x² + ... + aₙxⁿ
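
A minimal sketch of this fit, assuming an ordinary least-squares fit over all points of a single spectrum. The function name `polynomial_baseline` is hypothetical and the application's `apply_polynomial_baseline` may work differently (for example with iterative or robust fitting). The x axis is rescaled to [-1, 1] before fitting, which keeps higher degrees numerically stable.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def polynomial_baseline(y, x=None, degree=3):
    """Return (corrected, baseline) for one spectrum (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    if x is None:
        x = np.arange(y.size, dtype=float)
    x = np.asarray(x, dtype=float)
    # Map the axis to [-1, 1] so that powers of x stay well conditioned
    xs = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
    coeffs = P.polyfit(xs, y, degree)  # a0, a1, ..., an (lowest order first)
    baseline = P.polyval(xs, coeffs)
    return y - baseline, baseline
```

Because every point contributes equally to the fit, strong peaks pull the polynomial upward, which is why a lower degree or a robust fit is suggested when peaks are affected.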

Parameters:

Parameter	Type	Range	Default	Description
`degree`	int	1 - 10	3	Polynomial degree

Parameter Guide:

degree = 1  # Linear baseline
degree = 2  # Quadratic
degree = 3  # Cubic (most common)
degree = 4  # or 5: higher order (flexible)
# degree > 5: risky (may overfit)

Usage Example:

from functions.preprocess import apply_polynomial_baseline

corrected = apply_polynomial_baseline(
    spectra,
    degree=3
)

Common Issues:

Underfitting: degree too low → Increase degree
Overfitting: degree too high → Reduce degree
Peaks affected: Use robust fitting or lower degree

When to Use:

✓ Simple, smooth baseline
✓ Known polynomial shape
✓ Fast processing needed
✗ Not for: Complex baseline, fluorescence

Whittaker Smoothing

Purpose: Smooth baseline using penalized least squares

Parameters:

Parameter	Type	Range	Default	Description
`lambda` (λ)	float	1e2 - 1e9	1e5	Smoothness parameter
`differences`	int	1, 2, 3	2	Order of differences

Usage Example:

from functions.preprocess import apply_whittaker

corrected = apply_whittaker(
    spectra,
    lam=1e5,
    differences=2
)

When to Use:

✓ Smooth baseline
✓ Preserving peak shapes
✓ Alternative to polynomial

Reference: Eilers (2003). “A Perfect Smoother”

FABC (Fully Automatic Baseline Correction)

Purpose: Automatic baseline correction intended to work without manual tuning (the defaults below can still be overridden)

Parameters:

Parameter	Type	Range	Default	Description
`window_length`	int	100 - 500	200	Window size for local fitting
`iterations`	int	1 - 20	10	Number of correction iterations

Usage Example:

from functions.preprocess.fabc_fixed import apply_fabc

corrected = apply_fabc(
    spectra,
    window_length=200,
    iterations=10
)

When to Use:

✓ No user expertise required
✓ Batch processing
✓ Standardized pipelines
✓ Unknown baseline characteristics

Butterworth High-Pass Filter

Purpose: Remove low-frequency baseline components using frequency domain filtering

Parameters:

Parameter	Type	Range	Default	Description
`cutoff`	float	0.001 - 0.1	0.01	Cutoff frequency (normalized)
`order`	int	1 - 10	4	Filter order

Usage Example:

from functions.preprocess import apply_butterworth_highpass

corrected = apply_butterworth_highpass(
    spectra,
    cutoff=0.01,
    order=4
)

When to Use:

✓ Frequency-domain baseline removal
✓ Uniform low-frequency drift
✗ Not ideal for: Non-uniform baseline

Smoothing and Denoising

Reduce noise while preserving spectral features.

Savitzky-Golay Filter

Purpose: Polynomial smoothing that preserves peak shape and height

Theory: Fits local polynomials using least squares within a moving window
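
To make the local-polynomial idea concrete, the sketch below derives the filter weights from a least-squares fit over the window and applies them by convolution. It is an illustration in plain NumPy with hypothetical names (`savgol_kernel`, `savgol_apply`), not the code behind `apply_savgol`. It assumes unit sample spacing and zero-pads the edges, so the first and last `window_length // 2` points are not reliable.

```python
import math

import numpy as np


def savgol_kernel(window_length, polyorder, deriv=0):
    """Weights that evaluate the fitted polynomial (or its derivative) at the window center."""
    if window_length % 2 == 0 or window_length <= polyorder:
        raise ValueError("window_length must be odd and greater than polyorder")
    if deriv > polyorder:
        raise ValueError("deriv must not exceed polyorder")
    half = window_length // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Columns are offsets**0, offsets**1, ..., offsets**polyorder
    A = np.vander(offsets, polyorder + 1, increasing=True)
    # Row k of the pseudo-inverse maps window samples to the coefficient of offsets**k;
    # the k-th derivative at the center is k! times that coefficient.
    return math.factorial(deriv) * np.linalg.pinv(A)[deriv]


def savgol_apply(y, window_length=11, polyorder=3, deriv=0):
    kernel = savgol_kernel(window_length, polyorder, deriv)
    # np.convolve flips its second argument, so reverse the kernel to correlate
    return np.convolve(np.asarray(y, dtype=float), kernel[::-1], mode="same")
```

Away from the edges, any polynomial of degree up to `polyorder` passes through the filter unchanged, which is the sense in which peak shape is preserved.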

Parameters:

Parameter	Type	Range	Default	Description
`window_length`	int	5 - 51 (odd)	11	Size of smoothing window
`polyorder`	int	2 - 5	3	Polynomial order
`deriv`	int	0 - 2	0	Derivative order (0=smooth, 1=1st deriv, 2=2nd deriv)

Parameter Guide:

# Light smoothing
window_length = 7, polyorder = 3

# Moderate smoothing (recommended)
window_length = 11, polyorder = 3

# Heavy smoothing
window_length = 21, polyorder = 3

# First derivative (removes constant baseline offsets)
window_length = 11, polyorder = 3, deriv = 1

# Peak resolution (2nd derivative)
window_length = 11, polyorder = 3, deriv = 2

Usage Example:

from functions.preprocess import apply_savgol

# Smoothing
smoothed = apply_savgol(
    spectra,
    window_length=11,
    polyorder=3,
    deriv=0
)

# First derivative
derivative = apply_savgol(
    spectra,
    window_length=11,
    polyorder=3,
    deriv=1
)

Interpretation:

window_length: Larger window = more smoothing, but peak broadening
polyorder: Higher order preserves sharp features better
deriv=1: Converts peak maxima to zero-crossings and removes constant baseline offsets
deriv=2: Converts peaks to negative dips, enhances resolution

Common Issues:

Over-smoothing: Window too large → Reduce window
Peak broadening: Window too large → Use window ≤ 11
Noisy derivatives: Increase window_length, or smooth before differentiating
Oscillations: polyorder too high → Reduce to 2 or 3

When to Use:

✓ General smoothing
✓ Peak-preserving smoothing
✓ Derivatives for baseline removal
✓ Most common choice

Constraints:

window_length > polyorder
window_length must be odd

Reference: Savitzky & Golay (1964). “Smoothing and Differentiation of Data by Simplified Least Squares Procedures”

Gaussian Smoothing

Purpose: Strong smoothing using Gaussian kernel

Theory: Convolves spectrum with Gaussian kernel:

K(x) = (1/√(2πσ²)) exp(-x² / (2σ²))
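
A minimal sketch of this convolution for one spectrum, assuming `sigma` is expressed in data points. The names `gaussian_kernel` and `gaussian_smooth`, the truncation at 4σ, and the reflective edge padding are illustrative choices, not necessarily those of `apply_gaussian`. The sampled kernel is renormalized so that its weights sum to 1.

```python
import numpy as np


def gaussian_kernel(sigma, truncate=4.0):
    """Sampled Gaussian weights, normalized to sum to 1."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def gaussian_smooth(y, sigma=2.0):
    y = np.asarray(y, dtype=float)
    k = gaussian_kernel(sigma)
    radius = k.size // 2
    # Reflect the signal at both ends so the output keeps the input length
    padded = np.pad(y, radius, mode="reflect")
    return np.convolve(padded, k, mode="valid")
```

With `sigma=2.0` a single-point spike is reduced to roughly a fifth of its height, which illustrates why peak heights are not preserved.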

Parameters:

Parameter	Type	Range	Default	Description
`sigma`	float	0.5 - 5.0	2.0	Standard deviation of Gaussian kernel

Parameter Guide:

sigma = 1.0  # Light smoothing
sigma = 2.0  # Moderate (default)
sigma = 3.0  # Heavy smoothing
sigma > 4.0  # Very strong (may blur peaks)

Usage Example:

from functions.preprocess import apply_gaussian

smoothed = apply_gaussian(
    spectra,
    sigma=2.0
)

Common Issues:

Peak broadening: sigma too high → Reduce sigma
Insufficient smoothing: sigma too low → Increase sigma

When to Use:

✓ Very noisy data
✓ When peak shape less critical
✓ Visualization
✗ Not for: Quantitative analysis (peak heights change)

Moving Average

Purpose: Simple uniform smoothing

Parameters:

Parameter	Type	Range	Default	Description
`window_size`	int	3 - 21 (odd)	5	Window size

Usage Example:

from functions.preprocess import apply_moving_average

smoothed = apply_moving_average(
    spectra,
    window_size=5
)

When to Use:

✓ Quick smoothing
✓ Preliminary exploration
✗ Not recommended for: Publication (use Savitzky-Golay)

Median Filter

Purpose: Remove spike noise (cosmic rays, detector artifacts)

Theory: Replaces each point with median of surrounding window
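
For illustration, a sliding-window median over one spectrum can be written in a few lines of NumPy. The function name `median_filter_1d` is hypothetical and the edge handling (repeating the first and last value) is an assumption; `apply_median_filter` may treat the edges differently.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter_1d(y, window_size=5):
    """Replace each point by the median of its window (hypothetical helper)."""
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd")
    half = window_size // 2
    # Repeat the edge values so every point has a full window
    padded = np.pad(np.asarray(y, dtype=float), half, mode="edge")
    windows = sliding_window_view(padded, window_size)
    return np.median(windows, axis=-1)


spectrum = [1, 2, 3, 500, 5, 6, 7]  # one cosmic-ray spike at index 3
despiked = median_filter_1d(spectrum, window_size=3)
```

A spike narrower than half the window never becomes the median, so it disappears, while a real peak wider than the window is largely kept.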

Parameters:

Parameter	Type	Range	Default	Description
`window_size`	int	3 - 11 (odd)	5	Window size

Usage Example:

from functions.preprocess import apply_median_filter

despike = apply_median_filter(
    spectra,
    window_size=5
)

Common Issues:

Peak clipping: Window too large → Use window=3 or 5
Spikes remain: Window too small → Increase to 7

When to Use:

✓ Cosmic ray removal
✓ Spike artifacts
✓ Before other preprocessing
✓ CCD detector noise

Best Practice: Apply BEFORE baseline correction and smoothing

Kernel Denoising

Purpose: Advanced denoising using various kernel functions

Available Kernels: Gaussian, Epanechnikov, Tricube, Triweight

Parameters:

Parameter	Type	Range	Default	Description
`kernel`	str	-	'gaussian'	Kernel type
`bandwidth`	float	0.5 - 5.0	1.0	Kernel bandwidth

Usage Example:

from functions.preprocess.kernel_denoise import apply_kernel_denoise

denoised = apply_kernel_denoise(
    spectra,
    kernel='gaussian',
    bandwidth=1.0
)

When to Use:

✓ Alternative to Gaussian smoothing
✓ Experimenting with different kernels

Normalization

Scale spectra to comparable ranges.

Vector Normalization (L2 Norm)

Purpose: Normalize to unit length (most common)

Theory:

normalized = spectrum / ||spectrum||₂
where ||spectrum||₂ = √(Σ x²)
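
As a sketch, the same operation on a 2-D array with one spectrum per row (an assumed layout) looks like this. `vector_normalize` is a hypothetical name, and leaving all-zero spectra unchanged is an illustrative choice.

```python
import numpy as np


def vector_normalize(spectra):
    """Scale every row to unit L2 norm (hypothetical helper)."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    norms = np.linalg.norm(spectra, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing all-zero spectra by zero
    return spectra / norms
```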

Parameters: None

Usage Example:

from functions.preprocess import apply_vector_norm

normalized = apply_vector_norm(spectra)

Effect:

Makes all spectra have same total “energy”
Removes intensity variations
Preserves relative peak ratios

When to Use:

✓ Most common choice
✓ Classification tasks
✓ Removing concentration effects
✓ SVM, neural networks
✓ After baseline correction

Min-Max Normalization

Purpose: Scale to [0, 1] range

Theory:

normalized = (spectrum - min) / (max - min)
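
The sketch below extends the formula to an arbitrary `feature_range`. It assumes that each spectrum (row) is scaled by its own minimum and maximum. Whether `apply_minmax_norm` scales per spectrum or per wavenumber is not stated here, and `minmax_normalize` is a hypothetical name.

```python
import numpy as np


def minmax_normalize(spectra, feature_range=(0.0, 1.0)):
    """Scale every row linearly into feature_range (hypothetical helper)."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    low, high = feature_range
    row_min = spectra.min(axis=1, keepdims=True)
    span = spectra.max(axis=1, keepdims=True) - row_min
    span[span == 0] = 1.0  # flat spectra map to the lower bound
    return low + (spectra - row_min) / span * (high - low)
```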

Parameters:

Parameter	Type	Range	Default	Description
`feature_range`	tuple	-	(0, 1)	Target range (min, max)

Usage Example:

from functions.preprocess import apply_minmax_norm

normalized = apply_minmax_norm(
    spectra,
    feature_range=(0, 1)
)

When to Use:

✓ Neural networks
✓ Visualization
✓ Bounded input required
✗ Not for: Preserving absolute intensities

Area Normalization

Purpose: Normalize by area under curve

Theory:

normalized = spectrum / Σ|spectrum|

Parameters: None

Usage Example:

from functions.preprocess import apply_area_norm

normalized = apply_area_norm(spectra)

When to Use:

✓ Concentration normalization
✓ Comparing relative peak heights
✓ Eliminating total intensity differences
✗ Not for: Absolute quantification

Standard Normal Variate (SNV)

Purpose: Remove multiplicative scatter effects

Theory:

SNV = (spectrum - mean) / std
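
A minimal sketch with one spectrum per row (an assumed layout). The mean and standard deviation are computed per spectrum, not across the dataset. The use of the sample standard deviation (`ddof=1`) is an assumption, and `snv` is a hypothetical name.

```python
import numpy as np


def snv(spectra):
    """Center and scale every row by its own mean and standard deviation."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    std[std == 0] = 1.0  # leave flat spectra at zero instead of dividing by zero
    return (spectra - mean) / std
```

Two spectra that differ only by an offset and a positive scale factor become identical after SNV, which is how the multiplicative effect is removed.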

Parameters: None

Usage Example:

from functions.preprocess import apply_snv

normalized = apply_snv(spectra)

When to Use:

✓ Solid samples with scattering
✓ Particle size variations
✓ NIR spectroscopy (also useful for Raman)
✓ Removing multiplicative effects

Reference: Barnes et al. (1989). “Standard Normal Variate Transformation”

Multiplicative Scatter Correction (MSC)

Purpose: Correct for light scattering variations

Theory: Fits each spectrum to a reference (mean spectrum):

spectrum_i = a + b × reference
corrected = (spectrum_i - a) / b
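
The two lines above amount to one straight-line regression per spectrum. The sketch below assumes one spectrum per row and uses the mean spectrum when no reference is given. `msc` is a hypothetical name and the code is not taken from `apply_msc`.

```python
import numpy as np


def msc(spectra, reference=None):
    """Regress every row on the reference and undo offset a and slope b."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    if reference is None:
        reference = spectra.mean(axis=0)
    reference = np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, spectrum in enumerate(spectra):
        # Least-squares fit of spectrum ≈ a + b * reference
        b, a = np.polyfit(reference, spectrum, 1)
        corrected[i] = (spectrum - a) / b
    return corrected
```

A spectrum that is an exact linear function of the reference is mapped back onto the reference itself.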

Parameters:

Parameter	Type	Default	Description
`reference`	array	None	Reference spectrum (default: mean of all)

Usage Example:

from functions.preprocess import apply_msc

normalized = apply_msc(
    spectra,
    reference=None  # Auto: use mean
)

When to Use:

✓ Diffuse reflectance spectroscopy
✓ Particle size effects
✓ Scattering correction
✓ Quantitative analysis

Requirement: Need reference spectrum (usually mean)

Reference: Geladi et al. (1985). “Linearization and Scatter-Correction for Near-Infrared Reflectance Spectra”

Quantile Normalization

Purpose: Match distributions across spectra

Parameters:

Parameter	Type	Range	Default	Description
`n_quantiles`	int	100 - 1000	1000	Number of quantiles

Usage Example:

from functions.preprocess.advanced_normalization import apply_quantile_norm

normalized = apply_quantile_norm(
    spectra,
    n_quantiles=1000
)

When to Use:

✓ Batch effect correction
✓ Making distributions identical
✗ Less common in Raman

Probabilistic Quotient Normalization (PQN)

Purpose: Dilution correction for metabolomics

Theory:

Calculate reference (median spectrum)
Calculate quotients: spectrum / reference
Normalize by median quotient
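
The three steps above can be sketched as follows, assuming one spectrum per row and positive intensities. `pqn` is a hypothetical name, and skipping reference points equal to zero is an illustrative safeguard; neither is taken from `apply_pqn`.

```python
import numpy as np


def pqn(spectra, reference=None):
    """Divide every row by the median of its quotients against the reference."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    # Step 1: reference spectrum (median across samples by default)
    if reference is None:
        reference = np.median(spectra, axis=0)
    reference = np.asarray(reference, dtype=float)
    # Step 2: point-wise quotients, skipping zero reference values
    valid = reference != 0
    quotients = spectra[:, valid] / reference[valid]
    # Step 3: the median quotient is the estimated dilution factor
    factors = np.median(quotients, axis=1, keepdims=True)
    return spectra / factors
```

Spectra that are pure dilutions of one another collapse onto the same curve after this correction.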

Parameters:

Parameter	Type	Default	Description
`reference`	array	None	Reference spectrum

Usage Example:

from functions.preprocess.advanced_normalization import apply_pqn

normalized = apply_pqn(
    spectra,
    reference=None
)

When to Use:

✓ Metabolomics
✓ Dilution correction
✓ Biological fluids

Reference: Dieterle et al. (2006). “Probabilistic Quotient Normalization”

Rank Transformation

Purpose: Convert to ranks (non-parametric)

Parameters: None

Usage Example:

from functions.preprocess.advanced_normalization import apply_rank_transform

ranked = apply_rank_transform(spectra)

When to Use:

✓ Non-parametric analysis
✓ Outlier-resistant
✓ When distribution doesn’t matter

Derivatives

Calculate spectral derivatives for baseline removal and peak resolution.

First Derivative (Savitzky-Golay)

Purpose: Remove baseline, enhance peak differences

Theory: First derivative of smoothed spectrum using Savitzky-Golay

Parameters: Same as Savitzky-Golay + deriv=1

Usage Example:

from functions.preprocess import apply_savgol

derivative1 = apply_savgol(
    spectra,
    window_length=11,
    polyorder=3,
    deriv=1
)

Effect:

Peaks become positive/negative transitions
Baseline removed (constant → zero)
Peak maxima → zero-crossings

When to Use:

✓ Alternative to baseline correction
✓ Overlapping peak resolution
✓ Chemometric analysis
✗ Amplifies noise (smooth first!)

Second Derivative

Purpose: Sharpen peaks, resolve overlaps

Theory: Second derivative of smoothed spectrum

Parameters: Same as Savitzky-Golay + deriv=2

Usage Example:

from functions.preprocess import apply_savgol

derivative2 = apply_savgol(
    spectra,
    window_length=11,
    polyorder=3,
    deriv=2
)

Effect:

Peaks become negative dips
Sharpens overlapping peaks
Greatly amplifies noise

When to Use:

✓ Overlapping peak resolution
✓ Peak identification
✓ Advanced analysis
⚠️ Warning: Requires good smoothing, amplifies noise significantly

Feature Engineering

Create new features from spectra.

Peak Ratio

Purpose: Calculate ratio between two peaks

Parameters:

Parameter	Type	Description
`peak1_range`	tuple	(start, end) wavenumber for peak 1
`peak2_range`	tuple	(start, end) wavenumber for peak 2
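
One possible reading of these parameters is sketched below: each peak is represented by the maximum intensity inside its wavenumber range, and the ratio is taken per spectrum. This is an assumption. `calculate_peak_ratio` may use peak area or another measure instead, and `peak_ratio` is a hypothetical name.

```python
import numpy as np


def peak_ratio(spectra, wavenumbers, peak1_range, peak2_range):
    """Ratio of maximum intensities in two wavenumber ranges, one value per spectrum."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    wavenumbers = np.asarray(wavenumbers, dtype=float)

    def height(bounds):
        start, end = bounds
        mask = (wavenumbers >= start) & (wavenumbers <= end)
        if not mask.any():
            raise ValueError(f"no data points between {start} and {end}")
        return spectra[:, mask].max(axis=1)

    return height(peak1_range) / height(peak2_range)
```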

Usage Example:

from functions.preprocess.feature_engineering import calculate_peak_ratio

# Amide I / CH2 ratio
ratio = calculate_peak_ratio(
    spectra,
    wavenumbers,
    peak1_range=(1645, 1665),  # Amide I
    peak2_range=(1440, 1460)   # CH2
)

When to Use:

✓ Known biomarker ratios
✓ Creating interpretable features
✓ Reducing dimensionality
✓ Band ratio analysis

Example Ratios:

I₁₆₅₅/I₁₄₄₅ (Amide I / CH₂)
I₁₂₉₀/I₁₂₄₀ (Amide III ratio)
I₁₀₀₀/I₁₆₀₀ (Custom biomarkers)

Wavelet Transform

Purpose: Multi-resolution decomposition for denoising

Parameters:

Parameter	Type	Range	Default	Description
`wavelet`	str	-	'db4'	Wavelet type (db4, sym8, coif5)
`level`	int	1 - 10	4	Decomposition level
`threshold`	str	-	'soft'	Thresholding type (soft, hard)

Usage Example:

from functions.preprocess.feature_engineering import apply_wavelet_transform

denoised = apply_wavelet_transform(
    spectra,
    wavelet='db4',
    level=4,
    threshold='soft'
)

When to Use:

✓ Complex noise patterns
✓ Multi-scale features
✓ Non-stationary noise
✗ Computationally expensive

Reference: Mallat (1989). “A theory for multiresolution signal decomposition: the wavelet representation”

Advanced Methods

Specialized preprocessing methods.

Convolutional Denoising Autoencoder (CDAE)

Purpose: Deep learning-based denoising

Theory: Neural network trained to reconstruct clean spectra from noisy input

Parameters:

Parameter	Type	Default	Description
`model_path`	str	None	Path to trained model
`batch_size`	int	32	Batch size for prediction

Requirements:

PyTorch installed
GPU recommended
Pre-trained model

Usage Example:

from functions.preprocess.deep_learning import apply_cdae

denoised = apply_cdae(
    spectra,
    model_path='models/cdae_raman.pth',
    batch_size=32
)

When to Use:

✓ After training on your data type
✓ Complex noise patterns
✓ Large datasets
✗ Requires: Training data, GPU, expertise

Background Subtraction

Purpose: Subtract background/blank spectrum

Parameters:

Parameter	Type	Description
`background`	array	Background spectrum to subtract

Usage Example:

from functions.preprocess import subtract_background

corrected = subtract_background(
    spectra,
    background=blank_spectrum
)

When to Use:

✓ Measured blank/background available
✓ Removing substrate contribution
✓ Consistent background across samples

Calibration

Purpose: Wavenumber axis calibration

Types:

Linear Shift: Single reference peak
Polynomial: Multiple reference peaks

Parameters:

Parameter	Type	Description
`reference_peaks`	list	Expected peak positions
`measured_peaks`	list	Measured peak positions
`method`	str	'linear' or 'polynomial'
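
The two calibration types can be sketched as below. The interpretation is an assumption: 'linear' is taken to mean a constant shift equal to the mean difference between expected and measured positions, and 'polynomial' a low-degree polynomial mapping measured to expected positions. The name `calibrate_axis` and the degree cap of 2 are hypothetical and not taken from `apply_calibration`.

```python
import numpy as np


def calibrate_axis(wavenumbers, reference_peaks, measured_peaks, method="linear"):
    """Return a corrected wavenumber axis (hypothetical helper)."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    ref = np.asarray(reference_peaks, dtype=float)
    meas = np.asarray(measured_peaks, dtype=float)
    if ref.shape != meas.shape or ref.size == 0:
        raise ValueError("reference_peaks and measured_peaks must match and be non-empty")
    if method == "linear":
        # Constant shift: average offset between expected and measured positions
        return wavenumbers + np.mean(ref - meas)
    if method == "polynomial":
        if ref.size < 2:
            raise ValueError("polynomial calibration needs at least two peaks")
        degree = min(2, ref.size - 1)  # assumed cap on the polynomial degree
        coeffs = np.polyfit(meas, ref, degree)
        return np.polyval(coeffs, wavenumbers)
    raise ValueError(f"unknown method: {method!r}")
```

With a single silicon peak measured at 522.3 cm⁻¹ instead of 520.7 cm⁻¹, the linear variant shifts the whole axis by -1.6 cm⁻¹.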

Usage Example:

from functions.preprocess.calibration import apply_calibration

# Calibrate using silicon peak
calibrated_wn = apply_calibration(
    wavenumbers,
    reference_peaks=[520.7],  # Silicon
    measured_peaks=[522.3],    # Measured
    method='linear'
)

When to Use:

✓ Wavenumber axis errors detected
✓ Instrument drift correction
✓ Before combining datasets

Reference Standards:

Silicon: 520.7 cm⁻¹
Polystyrene: 1001, 1031, 1602 cm⁻¹
Diamond: 1332 cm⁻¹

Method Selection Guide

Decision Matrix

Goal	Recommended Method(s)	Parameters
Remove baseline	AsLS	λ=1e5, p=0.01
	AirPLS	λ=1e5
	Polynomial	degree=3
Reduce noise	Savitzky-Golay	window=11, order=3
	Gaussian	sigma=2.0
	Median (spikes)	window=5
Normalize intensity	Vector (L2)	-
	SNV	-
	Area	-
Remove scatter	MSC	reference=mean
	SNV	-
Baseline alternative	1st Derivative	SavGol deriv=1
Peak resolution	2nd Derivative	SavGol deriv=2
Create features	Peak Ratios	Custom ranges

Common Pipelines

Standard Pipeline:

AsLS (λ=1e5, p=0.01)
Savitzky-Golay (w=11, order=3)
Vector Normalization

High-Noise Pipeline:

Median Filter (w=5)
AirPLS (λ=1e6)
Gaussian (σ=2.0)
SNV

Derivative Pipeline:

AsLS (λ=1e5, p=0.01)
Savitzky-Golay 1st Derivative (w=11, order=3, deriv=1)
Vector Normalization

Quantitative Pipeline:

AsLS (λ=1e6, p=0.001)
MSC (reference=mean)
Savitzky-Golay (w=9, order=3)
Area Normalization

Parameter Constraints

Automatic Validation

All methods include automatic parameter validation:

Type Checking:

# Integer parameters converted if needed
window_length = "11" → 11 (converted)
polyorder = 3.0 → 3 (converted)

# Float parameters validated
lambda = "1e5" → 100000.0 (converted)
p = 0.01 (valid)

Range Validation:

# Values clamped to valid ranges
lambda = 1e10 → 1e9 (max allowed)
window_length = 3 → 5 (min allowed)
p = -0.01 → 0.001 (min allowed)

Logical Validation:

# Constraints enforced
if window_length <= polyorder:
    window_length = polyorder + 2
    
if window_length % 2 == 0:
    window_length += 1  # Make odd
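
Put together, the three validation stages for the Savitzky-Golay parameters might look like the sketch below. The function name `validate_savgol_params` is hypothetical and the bounds are copied from the parameter table above; the application's validator may differ in detail.

```python
def validate_savgol_params(window_length, polyorder,
                           window_bounds=(5, 51), poly_bounds=(2, 5)):
    """Coerce, clamp and reconcile Savitzky-Golay parameters (hypothetical helper)."""
    # Type checking: accept strings and floats such as "11" or 3.0
    window_length = int(float(window_length))
    polyorder = int(float(polyorder))
    # Range validation: clamp to the documented limits
    window_length = min(max(window_length, window_bounds[0]), window_bounds[1])
    polyorder = min(max(polyorder, poly_bounds[0]), poly_bounds[1])
    # Logical validation: window must exceed polyorder and be odd
    if window_length <= polyorder:
        window_length = polyorder + 2
    if window_length % 2 == 0:
        window_length += 1
    return window_length, polyorder
```

For example, `validate_savgol_params("11", 3.0)` returns `(11, 3)`, and `validate_savgol_params(3, 3)` returns `(5, 3)` because the window is first raised to the minimum of 5.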

Best Practices

General Guidelines

Order Matters:
- Correct: Spike removal → Baseline → Smoothing → Normalization
- Wrong: Normalization → Baseline (affects baseline estimation)

Less is More:
- Use minimal necessary steps
- Over-processing loses information
- Validate each step visually

Parameter Tuning:
- Start with defaults
- Adjust based on visual inspection
- Test on representative subset
- Document final parameters

Validation:
- Always preview before applying
- Check multiple spectra
- Compare before/after
- Verify peak preservation

Method-Specific Tips

AsLS:

Start with λ=1e5, adjust if needed
Lower p for sharper peaks
Check that real peaks aren’t removed

Savitzky-Golay:

Window ≤ 11 for most cases
polyorder = 3 is usually optimal
Don’t over-smooth

Derivatives:

Always smooth before derivative
1st derivative: window ≥ 11
2nd derivative: window ≥ 15, very noisy

Normalization:

Apply AFTER baseline and smoothing
Vector norm: most common choice
SNV: for scattering issues

Troubleshooting

Common Issues

Issue	Cause	Solution
Peaks removed	λ too low in AsLS	Increase λ to 1e6-1e7
Baseline remains	λ too high	Reduce λ to 1e4-1e5
Over-smoothed	Window too large	Reduce SavGol window to 7-9
Noisy	Insufficient smoothing	Increase window or sigma
Negative values	Normal after baseline	Use area/vector norm
Spikes remain	Window too small	Use median filter window=5-7
Slow processing	Too many iterations	Reduce max_iter or increase tol

References

Eilers & Boelens (2005): AsLS method
Zhang et al. (2010): AirPLS method
Savitzky & Golay (1964): SavGol filter
Barnes et al. (1989): SNV normalization
Geladi et al. (1985): MSC method
Dieterle et al. (2006): PQN normalization
Eilers (2003): Whittaker smoother
Mallat (1989): Wavelet theory

See References for complete citations.
