# Statistical Analysis Methods Comprehensive reference for hypothesis testing and statistical methods. ## Table of Contents - [T-Tests](#t-tests) - [Mann-Whitney U Test](#mann-whitney-u-test) - [ANOVA](#anova-analysis-of-variance) - [Pairwise Statistical Tests](#pairwise-statistical-tests) - [Correlation Analysis](#correlation-analysis) - [Multiple Testing Correction](#multiple-testing-correction) - [Effect Size Measures](#effect-size-measures) - [Band Ratio Analysis](#band-ratio-analysis) --- ## T-Tests **Purpose**: Compare means of two groups ### Types of T-Tests #### 1. Independent Samples T-Test **Use When**: Comparing two independent groups **Assumptions**: 1. ✓ Continuous data 2. ✓ Independent samples 3. ✓ Normally distributed (each group) 4. ✓ Equal variances (for Student's t-test) **Two Variants**: | Variant | Variance Assumption | When to Use | | -------------------- | ------------------- | ----------------------- | | **Student's t-test** | Equal variances | Classic version | | **Welch's t-test** | Unequal variances | **Recommended default** | **Parameters**: | Parameter | Type | Default | Description | | ------------- | ---- | ----------- | ----------------------------- | | `equal_var` | bool | False | Use equal variance assumption | | `alternative` | str | 'two-sided' | Alternative hypothesis | **Usage Example**: ```python from scipy import stats # Welch's t-test (recommended) t_stat, p_value = stats.ttest_ind( group1_data, group2_data, equal_var=False, # Welch's alternative='two-sided' ) print(f"t-statistic: {t_stat:.3f}") print(f"p-value: {p_value:.4f}") # Interpretation alpha = 0.05 if p_value < alpha: print("Significant difference between groups") else: print("No significant difference") ``` **Alternative Hypotheses**: ```python # Two-sided (default): Groups are different alternative='two-sided' # One-sided: Group 1 > Group 2 alternative='greater' # One-sided: Group 1 < Group 2 alternative='less' ``` **Checking Assumptions**: ```python # 1. Normality test (Shapiro-Wilk) stat1, p1 = stats.shapiro(group1_data) stat2, p2 = stats.shapiro(group2_data) if p1 < 0.05 or p2 < 0.05: print("Warning: Data not normally distributed") print("Consider: Mann-Whitney U test instead") # 2. Equal variance test (Levene's test) stat, p = stats.levene(group1_data, group2_data) if p < 0.05: print("Unequal variances detected") print("Use: Welch's t-test (equal_var=False)") else: print("Equal variances") print("Can use: Student's t-test (equal_var=True)") ``` #### 2. Paired Samples T-Test **Use When**: Same subjects measured twice (before/after) **Example**: Same sample measured at two time points **Usage Example**: ```python from scipy import stats # Paired t-test t_stat, p_value = stats.ttest_rel( before_treatment, after_treatment ) print(f"Paired t-statistic: {t_stat:.3f}") print(f"p-value: {p_value:.4f}") ``` **Assumptions**: 1. ✓ Paired observations (same subjects) 2. ✓ Differences normally distributed 3. ✓ Continuous data #### 3. One-Sample T-Test **Use When**: Compare sample mean to known value **Usage Example**: ```python from scipy import stats # Test if mean differs from theoretical value reference_value = 0 t_stat, p_value = stats.ttest_1samp( data, reference_value ) ``` ### Interpretation **P-Value**: - **p < 0.05**: Significant difference (reject null hypothesis) - **p ≥ 0.05**: No significant difference (fail to reject null) **T-Statistic**: - **Large |t|**: Groups are far apart - **Small |t|**: Groups are similar **Confidence Interval** (95%): ```python from scipy import stats import numpy as np # Calculate 95% CI for mean difference diff = np.mean(group1) - np.mean(group2) se = stats.sem(np.concatenate([group1, group2])) ci = stats.t.interval(0.95, len(group1)+len(group2)-2, loc=diff, scale=se) print(f"Mean difference: {diff:.3f}") print(f"95% CI: [{ci[0]:.3f}, {ci[1]:.3f}]") # If CI doesn't include 0 → significant difference ``` ### Troubleshooting | Issue | Cause | Solution | | -------------------- | --------------------- | ------------------------------------ | | Non-normal data | Assumption violated | Use Mann-Whitney U | | Unequal variances | Assumption violated | Use Welch's t-test (equal_var=False) | | Small sample size | Low power | Increase n or use non-parametric | | Outliers present | Skewed distribution | Remove outliers or use robust test | | Multiple comparisons | Inflated Type I error | Apply Bonferroni correction | ### When to Use **Use T-Test when**: - ✓ Two groups to compare - ✓ Continuous data - ✓ Approximately normal distribution - ✓ Independent observations **Use alternatives when**: - ✗ Non-normal data → Mann-Whitney U - ✗ More than 2 groups → ANOVA (or pairwise tests) - ✗ Paired data → Paired t-test or Wilcoxon signed-rank --- ## Mann-Whitney U Test **Alternative Names**: Wilcoxon rank-sum test **Purpose**: Non-parametric alternative to independent t-test ### Theory **How It Works**: 1. Rank all observations (both groups combined) 2. Sum ranks for each group 3. Test if rank sums differ significantly **Key Advantage**: No normality assumption ### Assumptions 1. ✓ Independent observations 2. ✓ Ordinal or continuous data 3. ✓ Similar distributions (for testing medians) **Does NOT require**: - ✗ Normal distribution - ✗ Equal variances ### Usage Example ```python from scipy import stats # Mann-Whitney U test u_stat, p_value = stats.mannwhitneyu( group1_data, group2_data, alternative='two-sided' ) print(f"U-statistic: {u_stat:.1f}") print(f"p-value: {p_value:.4f}") # Effect size (rank-biserial correlation) n1, n2 = len(group1_data), len(group2_data) r = 1 - (2*u_stat) / (n1 * n2) print(f"Effect size (r): {r:.3f}") ``` ### Interpretation **Null Hypothesis**: Distributions are identical **P-Value**: - **p < 0.05**: Distributions differ significantly - **p ≥ 0.05**: No significant difference **Effect Size (r)**: - **r = 0**: No effect - **r = ±0.1**: Small effect - **r = ±0.3**: Medium effect - **r = ±0.5**: Large effect ### When to Use **Use Mann-Whitney when**: - ✓ Non-normal data - ✓ Ordinal data (e.g., Likert scales) - ✓ Outliers present - ✓ Small sample sizes - ✓ T-test assumptions violated **Use T-Test instead when**: - ✓ Normal data - ✓ Large samples (n > 30 per group) - ✓ Want parametric inference --- ## ANOVA (Analysis of Variance) **Purpose**: Compare means across 3+ groups **In this application**: ANOVA is performed **at each wavenumber** across selected groups (datasets or grouped labels). Because this produces many tests (one per wavenumber), the app provides **multiple-testing correction**. ### One-Way ANOVA **Use When**: One categorical factor, 3+ groups **Assumptions**: 1. ✓ Independent observations 2. ✓ Normal distribution (each group) 3. ✓ Equal variances (homoscedasticity) ### Usage Example ```python from scipy import stats # One-way ANOVA f_stat, p_value = stats.f_oneway( group1_data, group2_data, group3_data ) print(f"F-statistic: {f_stat:.3f}") print(f"p-value: {p_value:.4f}") if p_value < 0.05: print("Significant difference among groups") ``` ### Application Parameters | Parameter | Default | Description | | ------------------------- | -------: | ------------------------------------------------------------------------------ | | `alpha` | 0.05 | Significance level (threshold applied to raw p-values or corrected values) | | `p_adjust` | `fdr_bh` | Multiple-testing correction across wavenumbers: `none`, `fdr_bh`, `bonferroni` | | `post_hoc` | `none` | Optional post-hoc testing for significant wavenumbers: `none`, `tukey` | | `max_posthoc_wavenumbers` | 20 | Cap to keep post-hoc computation responsive (0 disables) | | `show_mean_overlay` | True | Show mean spectra overlay by group | | `highlight_significant` | True | Highlight significant wavenumber regions on the overlay | ### Grouped Mode vs Simple Mode - **Simple mode**: each selected **dataset** is treated as one group. - **Grouped mode**: datasets are mapped into user-defined groups (same grouping shared with the ML page). ANOVA is run across those **group labels**. ### Post-Hoc Tests (Optional) **Why needed**: ANOVA tells you “at least one group differs”, but not which pairs differ. The application can optionally compute **Tukey’s HSD** for a limited number of significant wavenumbers (controlled by `max_posthoc_wavenumbers`). **Why Needed**: ANOVA tells you groups differ, but not which specific pairs are different **Common Post-Hoc Methods**: - **Tukey's HSD**: Controls family-wise error rate (most conservative) - **Bonferroni**: Simple adjustment, divides α by number of comparisons - **Holm-Bonferroni**: Less conservative step-down procedure For two-group comparisons, use the **Pairwise Statistical Tests** feature in the application instead. ### Checking Assumptions ```python # 1. Normality (each group) from scipy import stats for i, group in enumerate([group1, group2, group3]): stat, p = stats.shapiro(group) print(f"Group {i+1} normality: p={p:.4f}") if p < 0.05: print(" → Not normal, consider Mann-Whitney for pairwise") # 2. Homogeneity of variance (Levene's test) stat, p = stats.levene(group1, group2, group3) print(f"\nLevene's test: p={p:.4f}") if p < 0.05: print(" → Unequal variances") print(" → Consider Welch's ANOVA") ``` ### Welch's ANOVA **When**: Unequal variances detected ```python # Welch's ANOVA (unequal variances) # Not directly in scipy, use one-way ANOVA with robust method from scipy import stats # Alternative: Use stats library with robust option # or implement Welch's ANOVA manually ``` ### Interpretation **F-Statistic**: - **Large F**: Groups differ substantially - **Small F**: Groups similar **P-Value**: - **p < 0.05**: At least one group differs - **p ≥ 0.05**: No significant differences **Effect Size (Eta-Squared)**: ```python # Calculate eta-squared (proportion of variance explained) SS_between = sum([len(g) * (np.mean(g) - grand_mean)**2 for g in [group1, group2, group3]]) SS_total = sum([(x - grand_mean)**2 for g in [group1, group2, group3] for x in g]) eta_squared = SS_between / SS_total print(f"η² = {eta_squared:.3f}") # Interpretation: # η² = 0.01: Small effect # η² = 0.06: Medium effect # η² = 0.14: Large effect ``` ### When to Use **Use ANOVA when**: - ✓ 3+ groups to compare - ✓ One categorical factor - ✓ Continuous outcome - ✓ Assumptions met **Use alternatives when**: - ✗ Two groups only → T-test - ✗ Non-normal data → Mann-Whitney U test (for 2 groups) or non-parametric methods - ✗ Unequal variances → Welch's ANOVA - ✗ Repeated measures → Repeated-measures ANOVA **Note**: For exactly 2 groups, use **Pairwise Statistical Tests** (Welch t-test / Mann–Whitney / Wilcoxon) with optional FDR correction. --- ## Pairwise Statistical Tests **Purpose**: Compare two groups at each wavenumber across the spectrum ### Available Tests The application provides three statistical tests for pairwise comparisons: #### 1. Independent T-Test (t_test) **Use When**: Comparing two independent groups, data approximately normal **Assumptions**: 1. ✓ Independent samples 2. ✓ Approximately normal distribution 3. ✓ Continuous data **Parameters**: - `alpha`: Significance level (default: 0.05) - `fdr_correction`: Apply FDR correction for multiple testing (default: True) #### 2. Mann-Whitney U Test (mann_whitney) **Use When**: Non-parametric alternative to t-test **Advantages**: - ✓ No normality assumption - ✓ Robust to outliers - ✓ Works with small samples **Parameters**: - `alpha`: Significance level (default: 0.05) - `fdr_correction`: Apply FDR correction (default: True) #### 3. Wilcoxon Signed-Rank Test (wilcoxon) **Use When**: Comparing paired samples (e.g., before/after measurements) **Assumptions**: 1. ✓ Paired observations 2. ✓ Ordinal or continuous data 3. ✓ Symmetric distribution of differences ### Usage in Application ``` Analysis Page → Statistical Methods → Pairwise Statistical Tests 1. Select exactly 2 datasets to compare 2. Choose test type: t_test, mann_whitney, or wilcoxon 3. Set significance level (α) 4. Enable/disable FDR correction 5. Click "Run Analysis" ``` ### Interpretation **Output Includes**: - **P-value plot**: Shows p-values across all wavenumbers - **Significance markers**: Regions where p < α (after correction) - **Mean overlay**: Mean spectra of both groups - **Summary statistics**: Number of significant wavenumbers **Example Interpretation**: ``` Results: Group A vs Group B - Significant wavenumbers: 145 out of 1400 - FDR-corrected α: 0.05 - Key regions: 1000-1100 cm⁻¹, 1600-1700 cm⁻¹ Interpretation: Groups differ significantly in protein (Amide I, 1650 cm⁻¹) and lipid regions (1000-1100 cm⁻¹). ``` ### Best Practices **✅ Recommended**: - Always use FDR correction when testing across full spectrum - Use Mann-Whitney for non-normal data - Report both p-values and effect sizes - Verify biological relevance of significant regions **❌ Avoid**: - Testing without multiple testing correction - Ignoring assumption violations - Over-interpreting borderline significance - Comparing more than 2 groups (use appropriate multi-group method instead) --- ## Correlation Analysis **Purpose**: Measure association between two variables ### Pearson Correlation **Measures**: Linear relationship strength **Assumptions**: 1. ✓ Continuous data 2. ✓ Linear relationship 3. ✓ Bivariate normal distribution 4. ✓ No outliers ### Usage Example ```python from scipy import stats # Pearson correlation r, p_value = stats.pearsonr(variable1, variable2) print(f"Pearson r: {r:.3f}") print(f"p-value: {p_value:.4f}") # Interpretation if p_value < 0.05: if r > 0: print("Significant positive correlation") else: print("Significant negative correlation") ``` **Interpretation of r**: - **r = 1**: Perfect positive correlation - **r = 0.7-0.9**: Strong correlation - **r = 0.4-0.6**: Moderate correlation - **r = 0.1-0.3**: Weak correlation - **r = 0**: No correlation - **r = -1**: Perfect negative correlation **Coefficient of Determination (R²)**: ```python r_squared = r ** 2 print(f"R² = {r_squared:.3f}") # R² = proportion of variance explained ``` ### Spearman Correlation **Measures**: Monotonic relationship strength (non-parametric) **Advantages**: - ✓ No normality assumption - ✓ Robust to outliers - ✓ Detects non-linear monotonic relationships **Usage Example**: ```python from scipy import stats # Spearman correlation rho, p_value = stats.spearmanr(variable1, variable2) print(f"Spearman ρ: {rho:.3f}") print(f"p-value: {p_value:.4f}") ``` **When to Use**: - ✓ Ordinal data - ✓ Non-linear but monotonic relationship - ✓ Outliers present - ✓ Non-normal data ### Kendall's Tau **Measures**: Ordinal association (non-parametric) **Advantages**: - ✓ Better for small samples - ✓ More robust than Spearman **Usage Example**: ```python from scipy import stats # Kendall's tau tau, p_value = stats.kendalltau(variable1, variable2) print(f"Kendall τ: {tau:.3f}") print(f"p-value: {p_value:.4f}") ``` ### Partial Correlation **Purpose**: Correlation between X and Y, controlling for Z ```python from scipy import stats import numpy as np def partial_correlation(x, y, z): """ Partial correlation between x and y, controlling for z """ # Correlations r_xy = stats.pearsonr(x, y)[0] r_xz = stats.pearsonr(x, z)[0] r_yz = stats.pearsonr(y, z)[0] # Partial correlation numerator = r_xy - (r_xz * r_yz) denominator = np.sqrt((1 - r_xz**2) * (1 - r_yz**2)) r_xy_z = numerator / denominator # Significance test n = len(x) t_stat = r_xy_z * np.sqrt((n - 3) / (1 - r_xy_z**2)) p_value = 2 * (1 - stats.t.cdf(abs(t_stat), n - 3)) return r_xy_z, p_value # Example usage r_partial, p = partial_correlation( peak1_intensity, peak2_intensity, concentration ) print(f"Partial correlation: {r_partial:.3f}, p={p:.4f}") ``` ### Correlation Matrix **Purpose**: Correlations between multiple variables ```python import numpy as np import pandas as pd from scipy import stats # Create correlation matrix data = pd.DataFrame({ 'Peak1': peak1_intensities, 'Peak2': peak2_intensities, 'Peak3': peak3_intensities }) # Pearson correlation matrix corr_matrix = data.corr(method='pearson') print(corr_matrix) # P-values for correlations from scipy.stats import pearsonr n = len(data.columns) p_values = np.zeros((n, n)) for i in range(n): for j in range(n): if i != j: _, p_values[i, j] = pearsonr( data.iloc[:, i], data.iloc[:, j] ) p_df = pd.DataFrame(p_values, columns=data.columns, index=data.columns) print("\nP-values:") print(p_df) # Visualization import seaborn as sns import matplotlib.pyplot as plt plt.figure(figsize=(8, 6)) sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', vmin=-1, vmax=1, center=0, square=True, linewidths=1) plt.title('Correlation Matrix') plt.tight_layout() plt.show() ``` ### Troubleshooting | Issue | Cause | Solution | | -------------------------------------- | -------------------- | ----------------------------------------- | | Low correlation but clear relationship | Non-linear | Use Spearman or visualize | | Significant but r near 0 | Large sample size | Check practical significance | | High correlation, not significant | Small sample | Increase n | | Spurious correlation | Confounding variable | Use partial correlation | | Correlation ≠ Causation | | Design experiment or use causal inference | --- ## Multiple Testing Correction **Problem**: Testing multiple hypotheses inflates Type I error (false positives) **Example**: ``` Test 100 peaks at α=0.05 Expected false positives: 100 × 0.05 = 5 ``` ### Methods #### 1. Bonferroni Correction **Most Conservative**: Divide α by number of tests **Formula**: ``` α_corrected = α / n_tests ``` **Usage**: ```python from statsmodels.stats.multitest import multipletests # Original p-values from multiple tests p_values = [0.001, 0.03, 0.15, 0.008, 0.12] # Bonferroni correction reject, p_corrected, _, _ = multipletests( p_values, alpha=0.05, method='bonferroni' ) for i, (p_orig, p_corr, is_sig) in enumerate(zip(p_values, p_corrected, reject)): print(f"Test {i+1}: p={p_orig:.4f}, p_corr={p_corr:.4f}, sig={is_sig}") ``` **When to Use**: - ✓ Small number of tests (< 20) - ✓ Want to minimize false positives - ✗ Conservative (may miss real effects) #### 2. Holm-Bonferroni Method **Less Conservative**: Step-down Bonferroni **Procedure**: 1. Sort p-values: p₁ ≤ p₂ ≤ ... ≤ pₙ 2. Test p₁ against α/n 3. If reject, test p₂ against α/(n-1) 4. Continue until failure to reject **Usage**: ```python reject, p_corrected, _, _ = multipletests( p_values, alpha=0.05, method='holm' ) ``` **Advantages**: - ✓ More powerful than Bonferroni - ✓ Still controls family-wise error rate #### 3. False Discovery Rate (FDR) - Benjamini-Hochberg **Controls**: Proportion of false positives among rejections **Less Conservative**: Accepts some false positives for more power **Usage**: ```python reject, p_corrected, _, _ = multipletests( p_values, alpha=0.05, method='fdr_bh' # Benjamini-Hochberg ) ``` **When to Use**: - ✓ Large number of tests (e.g., all wavenumbers) - ✓ Exploratory analysis - ✓ Accept controlled false discovery rate (5%) **Interpretation**: ``` FDR = 0.05 means: - 5% of "significant" results may be false positives - More lenient than Bonferroni - Better for discovery ``` #### 4. Permutation Testing **Most Rigorous**: Non-parametric approach **Procedure**: 1. Compute test statistic on real data 2. Randomly permute labels many times (e.g., 10,000) 3. Compute test statistic for each permutation 4. p-value = proportion of permutations ≥ observed statistic **Usage**: ```python import numpy as np from scipy import stats def permutation_test(group1, group2, n_permutations=10000): """ Two-sample permutation test """ # Observed test statistic observed_stat = np.abs(np.mean(group1) - np.mean(group2)) # Combine data combined = np.concatenate([group1, group2]) n1 = len(group1) # Permutations perm_stats = [] for _ in range(n_permutations): np.random.shuffle(combined) perm_group1 = combined[:n1] perm_group2 = combined[n1:] perm_stat = np.abs(np.mean(perm_group1) - np.mean(perm_group2)) perm_stats.append(perm_stat) # P-value p_value = np.mean(np.array(perm_stats) >= observed_stat) return p_value, perm_stats # Example p_perm, perm_dist = permutation_test(group1, group2, n_permutations=10000) print(f"Permutation p-value: {p_perm:.4f}") ``` **Advantages**: - ✓ No distribution assumptions - ✓ Exact p-values - ✓ Flexible (any test statistic) **Disadvantages**: - ✗ Computationally intensive - ✗ Requires sufficient data ### Comparison Table | Method | Conservativeness | Power | Best For | | --------------- | ---------------- | ------ | ------------------------------ | | **Bonferroni** | Very high | Low | Few tests, minimize FP | | **Holm** | High | Medium | Few tests, more power | | **FDR (B-H)** | Medium | High | Many tests, discovery | | **Permutation** | Exact | High | Small datasets, no assumptions | ### Decision Guide ``` Number of tests: │ ├─ Few (< 20) │ ├─ Minimize false positives → Bonferroni │ └─ More power → Holm │ ├─ Many (20-1000) │ ├─ Exploratory → FDR (Benjamini-Hochberg) │ └─ Confirmatory → Holm │ └─ Very many (> 1000, e.g., all wavenumbers) └─ FDR (Benjamini-Hochberg) ``` --- ## Effect Size Measures **Why Important**: Statistical significance ≠ practical significance **Small p-value** with large n → tiny effect can be "significant" **Effect size** → How large is the difference? ### Cohen's d **Purpose**: Standardized mean difference **Formula**: ``` d = (mean₁ - mean₂) / pooled_SD ``` **Calculation**: ```python import numpy as np def cohens_d(group1, group2): """ Calculate Cohen's d effect size """ n1, n2 = len(group1), len(group2) mean1, mean2 = np.mean(group1), np.mean(group2) var1, var2 = np.var(group1, ddof=1), np.var(group2, ddof=1) # Pooled standard deviation pooled_std = np.sqrt(((n1-1)*var1 + (n2-1)*var2) / (n1+n2-2)) # Cohen's d d = (mean1 - mean2) / pooled_std return d # Example d = cohens_d(group1, group2) print(f"Cohen's d: {d:.3f}") ``` **Interpretation**: - **|d| < 0.2**: Negligible - **|d| = 0.2**: Small effect - **|d| = 0.5**: Medium effect - **|d| = 0.8**: Large effect - **|d| > 1.2**: Very large effect ### Eta-Squared (η²) **Purpose**: Proportion of variance explained (ANOVA) **Formula**: ``` η² = SS_between / SS_total ``` **Calculation**: ```python def eta_squared(groups): """ Calculate eta-squared for ANOVA """ # Grand mean grand_mean = np.mean(np.concatenate(groups)) # Between-group sum of squares SS_between = sum([ len(g) * (np.mean(g) - grand_mean)**2 for g in groups ]) # Total sum of squares SS_total = sum([ (x - grand_mean)**2 for g in groups for x in g ]) # Eta-squared eta_sq = SS_between / SS_total return eta_sq # Example eta_sq = eta_squared([group1, group2, group3]) print(f"η²: {eta_sq:.3f}") ``` **Interpretation**: - **η² = 0.01**: Small effect (1% variance explained) - **η² = 0.06**: Medium effect (6%) - **η² = 0.14**: Large effect (14%) ### Omega-Squared (ω²) **Purpose**: Less biased estimate of variance explained **Preferred over η²** for smaller samples **Calculation**: ```python def omega_squared(groups): """ Calculate omega-squared (less biased than eta-squared) """ k = len(groups) # number of groups N = sum([len(g) for g in groups]) # total sample size grand_mean = np.mean(np.concatenate(groups)) SS_between = sum([ len(g) * (np.mean(g) - grand_mean)**2 for g in groups ]) SS_within = sum([ sum((x - np.mean(g))**2 for x in g) for g in groups ]) MS_within = SS_within / (N - k) omega_sq = (SS_between - (k - 1) * MS_within) / (SS_between + SS_within + MS_within) return max(0, omega_sq) # Can't be negative omega_sq = omega_squared([group1, group2, group3]) print(f"ω²: {omega_sq:.3f}") ``` ### Reporting Effect Sizes **Best Practice**: Always report effect size with p-value **Example Report**: ``` "Group A showed significantly higher intensity than Group B (t(58) = 3.42, p = 0.001, d = 0.85), representing a large effect." "ANOVA revealed significant differences among groups (F(2, 87) = 12.3, p < 0.001, η² = 0.22), explaining 22% of variance." ``` --- ## Band Ratio Analysis **Purpose**: Compare relative intensities of specific spectral bands ### Theory **Rationale**: - Dimensionality reduction (1 feature from 2 peaks) - Normalize for concentration/thickness - Create interpretable biomarkers **Common Ratios**: ``` Amide I / CH₂: I₁₆₅₅ / I₁₄₄₅ Protein / Lipid: I₁₆₅₀ / I₁₃₀₀ Amide III ratio: I₁₂₉₀ / I₁₂₄₀ Phosphate / Amide: I₉₆₀ / I₁₆₅₀ ``` ### Calculation ```python def calculate_band_ratio(spectra, wavenumbers, band1_range, band2_range): """ Calculate ratio between two spectral bands Parameters: ----------- spectra : array (n_samples, n_features) wavenumbers : array (n_features,) band1_range : tuple (start, end) wavenumbers band2_range : tuple (start, end) wavenumbers Returns: -------- ratios : array (n_samples,) """ # Find indices for bands idx1 = np.where((wavenumbers >= band1_range[0]) & (wavenumbers <= band1_range[1]))[0] idx2 = np.where((wavenumbers >= band2_range[0]) & (wavenumbers <= band2_range[1]))[0] # Integrate (or max) each band intensity1 = np.trapz(spectra[:, idx1], wavenumbers[idx1], axis=1) intensity2 = np.trapz(spectra[:, idx2], wavenumbers[idx2], axis=1) # Calculate ratio ratios = intensity1 / (intensity2 + 1e-10) # Avoid division by zero return ratios # Example usage ratios = calculate_band_ratio( preprocessed_spectra, wavenumbers, band1_range=(1645, 1665), # Amide I band2_range=(1440, 1460) # CH₂ ) # Statistical comparison from scipy import stats t_stat, p_value = stats.ttest_ind( ratios[labels == 'Control'], ratios[labels == 'Disease'] ) print(f"Band ratio test: t={t_stat:.3f}, p={p_value:.4f}") ``` ### Integration Methods **1. Trapezoidal Integration** (Recommended): ```python intensity = np.trapz(spectrum[idx], wavenumbers[idx]) ``` **2. Maximum Intensity**: ```python intensity = np.max(spectrum[idx]) ``` **3. Area Under Curve**: ```python from sklearn.metrics import auc intensity = auc(wavenumbers[idx], spectrum[idx]) ``` ### Statistical Testing ```python # 1. Compare ratios between groups from scipy import stats group1_ratios = ratios[labels == 'GroupA'] group2_ratios = ratios[labels == 'GroupB'] # T-test t_stat, p_value = stats.ttest_ind(group1_ratios, group2_ratios) # Effect size d = cohens_d(group1_ratios, group2_ratios) print(f"t={t_stat:.3f}, p={p_value:.4f}, d={d:.3f}") # 2. Correlation with clinical variable r, p = stats.pearsonr(ratios, clinical_scores) print(f"Correlation: r={r:.3f}, p={p:.4f}") # 3. ROC analysis (for classification) from sklearn.metrics import roc_auc_score, roc_curve # Binary classification binary_labels = (labels == 'Disease').astype(int) auc_score = roc_auc_score(binary_labels, ratios) fpr, tpr, thresholds = roc_curve(binary_labels, ratios) print(f"AUC: {auc_score:.3f}") # Find optimal threshold optimal_idx = np.argmax(tpr - fpr) optimal_threshold = thresholds[optimal_idx] print(f"Optimal threshold: {optimal_threshold:.3f}") ``` ### Best Practices 1. **Preprocessing**: - Always baseline-correct before ratio calculation - Use same normalization for all spectra - Check for negative values 2. **Band Selection**: - Use literature-validated bands - Avoid overlapping bands - Verify peaks present in your samples 3. **Statistical Reporting**: ``` "The Amide I/CH₂ ratio was significantly elevated in disease samples (mean ± SD: 2.34 ± 0.45) compared to controls (1.87 ± 0.32; t(58) = 4.52, p < 0.001, d = 1.17)." ``` --- ## Best Practices ### General Workflow ``` 1. Check Assumptions ├─ Normality (Shapiro-Wilk) ├─ Equal variance (Levene's) └─ Independence 2. Choose Appropriate Test ├─ Parametric (if assumptions met) └─ Non-parametric (if violated) 3. Apply Test └─ Use two-sided unless justified 4. Multiple Testing Correction ├─ Few tests → Bonferroni/Holm └─ Many tests → FDR 5. Report Effect Size ├─ Cohen's d (t-test) ├─ η² or ω² (ANOVA) └─ r (correlation) 6. Interpret in Context └─ Statistical + Practical significance ``` ### Reporting Checklist ✓ **Test used** and why ✓ **Test statistic** and degrees of freedom ✓ **P-value** (exact or < 0.001) ✓ **Effect size** with interpretation ✓ **Sample sizes** per group ✓ **Descriptive statistics** (mean ± SD) ✓ **Multiple testing correction** method ✓ **Assumptions** checked (state violations if any) ### Common Mistakes to Avoid ❌ **Multiple testing without correction** → Inflated Type I error ❌ **Using parametric tests on non-normal data** → Invalid p-values ❌ **Reporting only p-values** → Missing practical significance ❌ **One-sided tests without justification** → Inflated significance ❌ **Ignoring assumption violations** → Misleading results ❌ **Cherry-picking significant results** → Publication bias --- ## See Also - [Statistical Analysis User Guide](../user-guide/analysis.md#statistical-analysis) - Step-by-step tutorials - [Exploratory Methods](exploratory.md) - PCA, UMAP, clustering - [Machine Learning Methods](machine-learning.md) - Classification algorithms - [Best Practices](../user-guide/best-practices.md) - Analysis strategies --- **Last Updated**: 2026-01-24