Statistical Analysis Methods

Comprehensive reference for hypothesis testing and statistical methods.


T-Tests

Purpose: Compare the means of two groups, or one group's mean against a reference value

Types of T-Tests

1. Independent Samples T-Test

Use When: Comparing two independent groups

Assumptions:

  1. ✓ Continuous data

  2. ✓ Independent samples

  3. ✓ Normally distributed (each group)

  4. ✓ Equal variances (for Student’s t-test)

Two Variants:

| Variant | Variance Assumption | When to Use |
|---|---|---|
| Student’s t-test | Equal variances | Classic version |
| Welch’s t-test | Unequal variances | Recommended default |
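
To make the difference between the two variants concrete, the sketch below computes Welch's t-statistic and the Welch-Satterthwaite degrees of freedom by hand and compares the result with SciPy. The two arrays are made-up illustrative values, not data from the application.

```python
import numpy as np
from scipy import stats

# Hypothetical example data (illustrative only)
group1 = np.array([5.1, 4.9, 5.6, 5.8, 6.0, 5.3])
group2 = np.array([4.2, 4.8, 4.4, 5.0, 4.1, 4.6, 4.3])

n1, n2 = len(group1), len(group2)
v1, v2 = np.var(group1, ddof=1), np.var(group2, ddof=1)

# Welch: each group contributes its own variance to the standard error
se_welch = np.sqrt(v1 / n1 + v2 / n2)
t_welch = (group1.mean() - group2.mean()) / se_welch

# Welch-Satterthwaite approximation for the degrees of freedom
df_welch = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)
p_welch = 2 * stats.t.sf(abs(t_welch), df_welch)

# Reference values from SciPy
t_scipy, p_scipy = stats.ttest_ind(group1, group2, equal_var=False)

print(f"manual: t={t_welch:.3f}, df={df_welch:.1f}, p={p_welch:.4f}")
print(f"scipy:  t={t_scipy:.3f}, p={p_scipy:.4f}")
```

Student's version differs only in pooling the two variances and using n1 + n2 - 2 degrees of freedom, which is why it is sensitive to unequal variances.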

Parameters:

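| Parameter | Type | Default | Description |
|---|---|---|---|
| equal_var | bool | False | Use equal variance assumption; False selects Welch’s t-test (SciPy’s own ttest_ind defaults to True) |
| alternative | str | 'two-sided' | Alternative hypothesis |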

Usage Example:

from scipy import stats

# Welch's t-test (recommended)
t_stat, p_value = stats.ttest_ind(
    group1_data,
    group2_data,
    equal_var=False,  # Welch's
    alternative='two-sided'
)

print(f"t-statistic: {t_stat:.3f}")
print(f"p-value: {p_value:.4f}")

# Interpretation
alpha = 0.05
if p_value < alpha:
    print("Significant difference between groups")
else:
    print("No significant difference")

Alternative Hypotheses:

# Two-sided (default): Groups are different
alternative='two-sided'

# One-sided: Group 1 > Group 2
alternative='greater'

# One-sided: Group 1 < Group 2
alternative='less'

Checking Assumptions:

# 1. Normality test (Shapiro-Wilk)
stat1, p1 = stats.shapiro(group1_data)
stat2, p2 = stats.shapiro(group2_data)

if p1 < 0.05 or p2 < 0.05:
    print("Warning: Data not normally distributed")
    print("Consider: Mann-Whitney U test instead")

# 2. Equal variance test (Levene's test)
stat, p = stats.levene(group1_data, group2_data)

if p < 0.05:
    print("Unequal variances detected")
    print("Use: Welch's t-test (equal_var=False)")
else:
    print("Equal variances")
    print("Can use: Student's t-test (equal_var=True)")

2. Paired Samples T-Test

Use When: Same subjects measured twice (before/after)

Example: Same sample measured at two time points

Usage Example:

from scipy import stats

# Paired t-test
t_stat, p_value = stats.ttest_rel(
    before_treatment,
    after_treatment
)

print(f"Paired t-statistic: {t_stat:.3f}")
print(f"p-value: {p_value:.4f}")

Assumptions:

  1. ✓ Paired observations (same subjects)

  2. ✓ Differences normally distributed

  3. ✓ Continuous data
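
The second assumption refers to the differences because a paired t-test is the same as a one-sample t-test on the per-subject differences. A minimal sketch with made-up before/after values:

```python
import numpy as np
from scipy import stats

# Hypothetical before/after measurements on the same six subjects
before = np.array([10.2, 9.8, 11.1, 10.5, 9.9, 10.7])
after = np.array([10.9, 10.1, 11.8, 10.6, 10.4, 11.5])

# Paired t-test on the two columns
t_paired, p_paired = stats.ttest_rel(before, after)

# Equivalent: one-sample t-test of the differences against 0
diffs = before - after
t_one, p_one = stats.ttest_1samp(diffs, 0.0)

print(f"paired:     t={t_paired:.3f}, p={p_paired:.4f}")
print(f"one-sample: t={t_one:.3f}, p={p_one:.4f}")
```

This is also why normality should be checked on the differences (for example with stats.shapiro(diffs)) rather than on each time point separately.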

3. One-Sample T-Test

Use When: Compare sample mean to known value

Usage Example:

from scipy import stats

# Test if mean differs from theoretical value
reference_value = 0

t_stat, p_value = stats.ttest_1samp(
    data,
    reference_value
)

Interpretation

P-Value:

  • p < 0.05: Significant difference (reject null hypothesis)

  • p ≥ 0.05: No significant difference (fail to reject null)

T-Statistic:

  • Large |t|: The difference between means is large relative to its standard error

  • Small |t|: The difference between means is small relative to its standard error

Confidence Interval (95%):

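from scipy import stats
import numpy as np

# Calculate 95% CI for the mean difference (Welch, unequal variances)
n1, n2 = len(group1), len(group2)
v1, v2 = np.var(group1, ddof=1), np.var(group2, ddof=1)
diff = np.mean(group1) - np.mean(group2)

# Standard error of the difference: each group contributes its own variance
se = np.sqrt(v1 / n1 + v2 / n2)

# Welch-Satterthwaite degrees of freedom
df = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)
ci = stats.t.interval(0.95, df, loc=diff, scale=se)

print(f"Mean difference: {diff:.3f}")
print(f"95% CI: [{ci[0]:.3f}, {ci[1]:.3f}]")

# If the CI excludes 0, the difference is significant at alpha = 0.05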

Troubleshooting

| Issue | Cause | Solution |
|---|---|---|
| Non-normal data | Assumption violated | Use Mann-Whitney U |
| Unequal variances | Assumption violated | Use Welch’s t-test (equal_var=False) |
| Small sample size | Low power | Increase n or use non-parametric |
| Outliers present | Skewed distribution | Remove outliers or use robust test |
| Multiple comparisons | Inflated Type I error | Apply Bonferroni correction |

When to Use

Use T-Test when:

  • ✓ Two groups to compare

  • ✓ Continuous data

  • ✓ Approximately normal distribution

  • ✓ Independent observations

Use alternatives when:

  • ✗ Non-normal data → Mann-Whitney U

  • ✗ More than 2 groups → ANOVA (or pairwise tests)

  • ✗ Paired data → Paired t-test or Wilcoxon signed-rank


Mann-Whitney U Test

Alternative Names: Wilcoxon rank-sum test

Purpose: Non-parametric alternative to independent t-test

Theory

How It Works:

  1. Rank all observations (both groups combined)

  2. Sum ranks for each group

  3. Test if rank sums differ significantly

Key Advantage: No normality assumption
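
The three steps above can be sketched directly with ranks. The data are made-up values without ties; the conversion from rank sum to U (subtracting n(n+1)/2) is the standard definition, and the comparison with SciPy assumes a version that returns the U statistic of the first sample (1.7 or later).

```python
import numpy as np
from scipy import stats

# Hypothetical example data (no ties)
group1 = np.array([1.2, 3.4, 2.2, 5.1, 4.8])
group2 = np.array([0.9, 2.0, 1.5, 3.0, 2.7, 1.1])
n1, n2 = len(group1), len(group2)

# Step 1: rank all observations, both groups combined
ranks = stats.rankdata(np.concatenate([group1, group2]))

# Step 2: sum the ranks for each group
r1 = ranks[:n1].sum()
r2 = ranks[n1:].sum()

# Step 3: convert rank sums to U statistics (u1 + u2 always equals n1 * n2)
u1 = r1 - n1 * (n1 + 1) / 2
u2 = r2 - n2 * (n2 + 1) / 2

u_scipy, p_value = stats.mannwhitneyu(group1, group2, alternative="two-sided")
print(f"manual: U1={u1:.1f}, U2={u2:.1f}")
print(f"scipy:  U={u_scipy:.1f}, p={p_value:.4f}")
```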

Assumptions

  1. ✓ Independent observations

  2. ✓ Ordinal or continuous data

  3. ✓ Similarly shaped distributions (needed only to interpret the result as a difference in medians)

Does NOT require:

  • ✗ Normal distribution

  • ✗ Equal variances

Usage Example

from scipy import stats

# Mann-Whitney U test
u_stat, p_value = stats.mannwhitneyu(
    group1_data,
    group2_data,
    alternative='two-sided'
)

print(f"U-statistic: {u_stat:.1f}")
print(f"p-value: {p_value:.4f}")

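# Effect size (rank-biserial correlation)
# SciPy (>= 1.7) returns U for the first sample, so r > 0 means
# group 1 tends to have larger values than group 2
n1, n2 = len(group1_data), len(group2_data)
r = (2 * u_stat) / (n1 * n2) - 1
print(f"Effect size (r): {r:.3f}")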

Interpretation

Null Hypothesis: Distributions are identical

P-Value:

  • p < 0.05: Distributions differ significantly

  • p ≥ 0.05: No significant difference

Effect Size (r):

  • r = 0: No effect

  • r = ±0.1: Small effect

  • r = ±0.3: Medium effect

  • r = ±0.5: Large effect

When to Use

Use Mann-Whitney when:

  • ✓ Non-normal data

  • ✓ Ordinal data (e.g., Likert scales)

  • ✓ Outliers present

  • ✓ Small sample sizes

  • ✓ T-test assumptions violated

Use T-Test instead when:

  • ✓ Normal data

  • ✓ Large samples (n > 30 per group)

  • ✓ Want parametric inference


ANOVA (Analysis of Variance)

Purpose: Compare means across 3+ groups

In this application: ANOVA is performed at each wavenumber across selected groups (datasets or grouped labels). Because this produces many tests (one per wavenumber), the app provides multiple-testing correction.
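
A minimal sketch of this per-wavenumber scheme, assuming each group is an array of shape (n_samples, n_wavenumbers). The simulated spectra, the variable names, and the benjamini_hochberg helper are illustrative assumptions; the application's actual implementation is not shown in this reference.

```python
import numpy as np
from scipy import stats


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out


rng = np.random.default_rng(0)
n_wavenumbers = 200

# Simulated spectra: rows = samples, columns = wavenumbers
group_a = rng.normal(0.0, 1.0, size=(15, n_wavenumbers))
group_b = rng.normal(0.0, 1.0, size=(15, n_wavenumbers))
group_c = rng.normal(0.0, 1.0, size=(15, n_wavenumbers))
group_c[:, :20] += 3.0  # simulated real difference in the first 20 channels

# One ANOVA per wavenumber (axis=0 runs the test column by column)
f_stats, p_raw = stats.f_oneway(group_a, group_b, group_c, axis=0)

# Correct across wavenumbers, then threshold at alpha
p_adj = benjamini_hochberg(p_raw)
significant = p_adj < 0.05
print(f"Significant wavenumbers: {significant.sum()} of {n_wavenumbers}")
```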

One-Way ANOVA

Use When: One categorical factor, 3+ groups

Assumptions:

  1. ✓ Independent observations

  2. ✓ Normal distribution (each group)

  3. ✓ Equal variances (homoscedasticity)

Usage Example

from scipy import stats

# One-way ANOVA
f_stat, p_value = stats.f_oneway(
    group1_data,
    group2_data,
    group3_data
)

print(f"F-statistic: {f_stat:.3f}")
print(f"p-value: {p_value:.4f}")

if p_value < 0.05:
    print("Significant difference among groups")

Application Parameters

| Parameter | Default | Description |
|---|---|---|
| alpha | 0.05 | Significance level (threshold applied to raw p-values or corrected values) |
| p_adjust | fdr_bh | Multiple-testing correction across wavenumbers: none, fdr_bh, bonferroni |
| post_hoc | none | Optional post-hoc testing for significant wavenumbers: none, tukey |
| max_posthoc_wavenumbers | 20 | Cap to keep post-hoc computation responsive (0 disables) |
| show_mean_overlay | True | Show mean spectra overlay by group |
| highlight_significant | True | Highlight significant wavenumber regions on the overlay |

Grouped Mode vs Simple Mode

  • Simple mode: each selected dataset is treated as one group.

  • Grouped mode: datasets are mapped into user-defined groups (same grouping shared with the ML page). ANOVA is run across those group labels.

Post-Hoc Tests (Optional)

Why needed: ANOVA tells you “at least one group differs”, but not which pairs differ.

The application can optionally compute Tukey’s HSD for a limited number of significant wavenumbers (controlled by max_posthoc_wavenumbers).

Common Post-Hoc Methods:

  • Tukey’s HSD: Controls family-wise error rate across all pairwise comparisons

  • Bonferroni: Simple adjustment, divides α by number of comparisons (most conservative)

  • Holm-Bonferroni: Less conservative step-down procedure

For two-group comparisons, use the Pairwise Statistical Tests feature in the application instead.
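
For a single variable (for example, the intensity at one significant wavenumber), Tukey's HSD can be sketched with scipy.stats.tukey_hsd, which is available in SciPy 1.8 and later. The simulated groups are illustrative, and whether the application calls this particular function is an assumption.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated intensities at one wavenumber for three groups
group1 = rng.normal(10.0, 1.0, size=20)
group2 = rng.normal(10.2, 1.0, size=20)
group3 = rng.normal(12.0, 1.0, size=20)

result = stats.tukey_hsd(group1, group2, group3)

# result.pvalue[i, j] is the adjusted p-value for group i vs group j
for i, j in [(0, 1), (0, 2), (1, 2)]:
    print(f"Group {i + 1} vs Group {j + 1}: p={result.pvalue[i, j]:.4f}")
```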

Checking Assumptions

# 1. Normality (each group)
from scipy import stats

for i, group in enumerate([group1, group2, group3]):
    stat, p = stats.shapiro(group)
    print(f"Group {i+1} normality: p={p:.4f}")
    if p < 0.05:
        print("  → Not normal, consider Kruskal-Wallis test")

# 2. Homogeneity of variance (Levene's test)
stat, p = stats.levene(group1, group2, group3)
print(f"\nLevene's test: p={p:.4f}")
if p < 0.05:
    print("  → Unequal variances")
    print("  → Consider Welch's ANOVA")

Welch’s ANOVA

When: Unequal variances detected

# Welch's ANOVA (unequal variances)
# scipy.stats has no Welch's ANOVA function. Options:
#   - statsmodels.stats.oneway.anova_oneway(..., use_var="unequal")
#   - scipy.stats.alexandergovern (a related heteroscedasticity-robust test)
#   - a manual implementation of Welch's F-test
from scipy import stats
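
A manual implementation might look like the sketch below. It follows the standard textbook form of Welch's F-test (precision weights n/s², a variance-weighted grand mean, and approximate denominator degrees of freedom); welch_anova is a hypothetical helper name, not part of SciPy or the application.

```python
import numpy as np
from scipy import stats


def welch_anova(*groups):
    """Welch's one-way ANOVA; returns (F, df1, df2, p_value)."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    means = np.array([np.mean(g) for g in groups])
    variances = np.array([np.var(g, ddof=1) for g in groups])

    w = n / variances                    # precision weights
    w_sum = w.sum()
    mean_w = np.sum(w * means) / w_sum   # variance-weighted grand mean

    numerator = np.sum(w * (means - mean_w) ** 2) / (k - 1)
    lam = np.sum((1 - w / w_sum) ** 2 / (n - 1))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam

    f_stat = numerator / denominator
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    p_value = stats.f.sf(f_stat, df1, df2)
    return f_stat, df1, df2, p_value


# Example with simulated groups of unequal variance
rng = np.random.default_rng(2)
g1 = rng.normal(5.0, 0.5, size=12)
g2 = rng.normal(5.5, 1.5, size=15)
g3 = rng.normal(6.5, 3.0, size=10)

f_stat, df1, df2, p_value = welch_anova(g1, g2, g3)
print(f"Welch F({df1}, {df2:.1f}) = {f_stat:.3f}, p = {p_value:.4f}")
```

With exactly two groups this reduces to Welch's t-test (F equals t²), which is a convenient sanity check.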

Interpretation

F-Statistic:

  • Large F: Groups differ substantially

  • Small F: Groups similar

P-Value:

  • p < 0.05: At least one group differs

  • p ≥ 0.05: No significant differences

Effect Size (Eta-Squared):

import numpy as np

# Calculate eta-squared (proportion of variance explained)
groups = [group1, group2, group3]
grand_mean = np.mean(np.concatenate(groups))

SS_between = sum(len(g) * (np.mean(g) - grand_mean)**2 for g in groups)
SS_total = sum((x - grand_mean)**2 for g in groups for x in g)

eta_squared = SS_between / SS_total
print(f"η² = {eta_squared:.3f}")

# Interpretation:
# η² = 0.01: Small effect
# η² = 0.06: Medium effect
# η² = 0.14: Large effect

When to Use

Use ANOVA when:

  • ✓ 3+ groups to compare

  • ✓ One categorical factor

  • ✓ Continuous outcome

  • ✓ Assumptions met

Use alternatives when:

  • ✗ Two groups only → T-test

  • ✗ Non-normal data → Kruskal-Wallis test (or Mann-Whitney U test for 2 groups)

  • ✗ Unequal variances → Welch’s ANOVA

  • ✗ Repeated measures → Repeated-measures ANOVA

Note: For exactly 2 groups, use Pairwise Statistical Tests (Welch t-test / Mann–Whitney / Wilcoxon) with optional FDR correction.


Pairwise Statistical Tests

Purpose: Compare two groups at each wavenumber across the spectrum

Available Tests

The application provides three statistical tests for pairwise comparisons:

1. Independent T-Test (t_test)

Use When: Comparing two independent groups, data approximately normal

Assumptions:

  1. ✓ Independent samples

  2. ✓ Approximately normal distribution

  3. ✓ Continuous data

Parameters:

  • alpha: Significance level (default: 0.05)

  • fdr_correction: Apply FDR correction for multiple testing (default: True)

2. Mann-Whitney U Test (mann_whitney)

Use When: Non-parametric alternative to t-test

Advantages:

  • ✓ No normality assumption

  • ✓ Robust to outliers

  • ✓ Works with small samples

Parameters:

  • alpha: Significance level (default: 0.05)

  • fdr_correction: Apply FDR correction (default: True)

3. Wilcoxon Signed-Rank Test (wilcoxon)

Use When: Comparing paired samples (e.g., before/after measurements)

Assumptions:

  1. ✓ Paired observations

  2. ✓ Ordinal or continuous data

  3. ✓ Symmetric distribution of differences
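
Outside the application, the same paired comparison can be sketched with scipy.stats.wilcoxon. The values below are made up and chosen so that no difference is zero or tied.

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements (same eight samples, two conditions)
before = np.array([12.1, 11.4, 13.0, 12.7, 11.9, 12.3, 13.4, 11.8])
after = np.array([12.9, 11.6, 14.1, 12.4, 12.8, 13.0, 13.9, 12.2])

# The test ranks the absolute paired differences and compares the
# rank sums of the positive and negative differences
w_stat, p_value = stats.wilcoxon(before, after, alternative="two-sided")

print(f"W-statistic: {w_stat:.1f}")
print(f"p-value: {p_value:.4f}")
```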

Usage in Application

Analysis Page → Statistical Methods → Pairwise Statistical Tests

1. Select exactly 2 datasets to compare
2. Choose test type: t_test, mann_whitney, or wilcoxon
3. Set significance level (α)
4. Enable/disable FDR correction
5. Click "Run Analysis"

Interpretation

Output Includes:

  • P-value plot: Shows p-values across all wavenumbers

  • Significance markers: Regions where p < α (after correction)

  • Mean overlay: Mean spectra of both groups

  • Summary statistics: Number of significant wavenumbers

Example Interpretation:

Results: Group A vs Group B
- Significant wavenumbers: 145 out of 1400
- FDR-corrected α: 0.05
- Key regions: 1000-1100 cm⁻¹, 1600-1700 cm⁻¹

Interpretation: Groups differ significantly in protein (Amide I, 1650 cm⁻¹)
and lipid regions (1000-1100 cm⁻¹).

Best Practices

✅ Recommended:

  • Always use FDR correction when testing across full spectrum

  • Use Mann-Whitney for non-normal data

  • Report both p-values and effect sizes

  • Verify biological relevance of significant regions

❌ Avoid:

  • Testing without multiple testing correction

  • Ignoring assumption violations

  • Over-interpreting borderline significance

  • Comparing more than 2 groups (use appropriate multi-group method instead)


Correlation Analysis

Purpose: Measure association between two variables

Pearson Correlation

Measures: Linear relationship strength

Assumptions:

  1. ✓ Continuous data

  2. ✓ Linear relationship

  3. ✓ Bivariate normal distribution

  4. ✓ No outliers

Usage Example

from scipy import stats

# Pearson correlation
r, p_value = stats.pearsonr(variable1, variable2)

print(f"Pearson r: {r:.3f}")
print(f"p-value: {p_value:.4f}")

# Interpretation
if p_value < 0.05:
    if r > 0:
        print("Significant positive correlation")
    else:
        print("Significant negative correlation")

Interpretation of r:

  • r = 1: Perfect positive correlation

  • |r| = 0.7-0.9: Strong correlation

  • |r| = 0.4-0.6: Moderate correlation

  • |r| = 0.1-0.3: Weak correlation

  • r = 0: No correlation

  • r = -1: Perfect negative correlation

Coefficient of Determination (R²):

r_squared = r ** 2
print(f"R² = {r_squared:.3f}")
# R² = proportion of variance explained

Spearman Correlation

Measures: Monotonic relationship strength (non-parametric)

Advantages:

  • ✓ No normality assumption

  • ✓ Robust to outliers

  • ✓ Detects non-linear monotonic relationships

Usage Example:

from scipy import stats

# Spearman correlation
rho, p_value = stats.spearmanr(variable1, variable2)

print(f"Spearman ρ: {rho:.3f}")
print(f"p-value: {p_value:.4f}")

When to Use:

  • ✓ Ordinal data

  • ✓ Non-linear but monotonic relationship

  • ✓ Outliers present

  • ✓ Non-normal data
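
A small illustration of the "non-linear but monotonic" case, using synthetic data: y grows exponentially with x, so the ranks agree perfectly while the linear fit is imperfect.

```python
import numpy as np
from scipy import stats

x = np.linspace(0.0, 5.0, 30)
y = np.exp(x)  # strictly increasing but strongly non-linear

r, _ = stats.pearsonr(x, y)
rho, _ = stats.spearmanr(x, y)

print(f"Pearson r:  {r:.3f}")
print(f"Spearman ρ: {rho:.3f}")
```

Spearman's ρ reaches 1 for any strictly increasing relationship, whereas Pearson's r stays below 1 unless the relationship is exactly linear.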

Kendall’s Tau

Measures: Ordinal association (non-parametric)

Advantages:

  • ✓ Better for small samples

  • ✓ More robust than Spearman

Usage Example:

from scipy import stats

# Kendall's tau
tau, p_value = stats.kendalltau(variable1, variable2)

print(f"Kendall τ: {tau:.3f}")
print(f"p-value: {p_value:.4f}")

Partial Correlation

Purpose: Correlation between X and Y, controlling for Z

from scipy import stats
import numpy as np

def partial_correlation(x, y, z):
    """
    Partial correlation between x and y, controlling for z
    """
    # Correlations
    r_xy = stats.pearsonr(x, y)[0]
    r_xz = stats.pearsonr(x, z)[0]
    r_yz = stats.pearsonr(y, z)[0]
    
    # Partial correlation
    numerator = r_xy - (r_xz * r_yz)
    denominator = np.sqrt((1 - r_xz**2) * (1 - r_yz**2))
    r_xy_z = numerator / denominator
    
    # Significance test
    n = len(x)
    t_stat = r_xy_z * np.sqrt((n - 3) / (1 - r_xy_z**2))
    p_value = 2 * (1 - stats.t.cdf(abs(t_stat), n - 3))
    
    return r_xy_z, p_value

# Example usage
r_partial, p = partial_correlation(
    peak1_intensity,
    peak2_intensity,
    concentration
)
print(f"Partial correlation: {r_partial:.3f}, p={p:.4f}")

Correlation Matrix

Purpose: Correlations between multiple variables

import numpy as np
import pandas as pd
from scipy import stats

# Create correlation matrix
data = pd.DataFrame({
    'Peak1': peak1_intensities,
    'Peak2': peak2_intensities,
    'Peak3': peak3_intensities
})

# Pearson correlation matrix
corr_matrix = data.corr(method='pearson')
print(corr_matrix)

# P-values for correlations
from scipy.stats import pearsonr

n = len(data.columns)
p_values = np.zeros((n, n))

for i in range(n):
    for j in range(n):
        if i != j:
            _, p_values[i, j] = pearsonr(
                data.iloc[:, i],
                data.iloc[:, j]
            )

p_df = pd.DataFrame(p_values, 
                    columns=data.columns,
                    index=data.columns)
print("\nP-values:")
print(p_df)

# Visualization
import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(8, 6))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', 
            vmin=-1, vmax=1, center=0,
            square=True, linewidths=1)
plt.title('Correlation Matrix')
plt.tight_layout()
plt.show()

Troubleshooting

| Issue | Cause | Solution |
|---|---|---|
| Low correlation but clear relationship | Non-linear | Use Spearman or visualize |
| Significant but r near 0 | Large sample size | Check practical significance |
| High correlation, not significant | Small sample | Increase n |
| Spurious correlation | Confounding variable | Use partial correlation |
| Correlation ≠ Causation | | Design experiment or use causal inference |


Multiple Testing Correction

Problem: Testing multiple hypotheses inflates Type I error (false positives)

Example:

Test 100 peaks at α=0.05
Expected false positives: 100 × 0.05 = 5
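
The expected count understates how likely at least one false positive is. Under the simplifying assumptions that the 100 tests are independent and every null hypothesis is true (neighbouring spectral peaks are usually correlated, so this is only a rough guide), the family-wise error rate is:

```latex
\mathrm{FWER} = 1 - (1 - \alpha)^{n} = 1 - 0.95^{100} \approx 0.994
```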

Methods

1. Bonferroni Correction

Most Conservative: Divide α by number of tests

Formula:

α_corrected = α / n_tests

Usage:

from statsmodels.stats.multitest import multipletests

# Original p-values from multiple tests
p_values = [0.001, 0.03, 0.15, 0.008, 0.12]

# Bonferroni correction
reject, p_corrected, _, _ = multipletests(
    p_values,
    alpha=0.05,
    method='bonferroni'
)

for i, (p_orig, p_corr, is_sig) in enumerate(zip(p_values, p_corrected, reject)):
    print(f"Test {i+1}: p={p_orig:.4f}, p_corr={p_corr:.4f}, sig={is_sig}")

When to Use:

  • ✓ Small number of tests (< 20)

  • ✓ Want to minimize false positives

  • ✗ Conservative (may miss real effects)

2. Holm-Bonferroni Method

Less Conservative: Step-down Bonferroni

Procedure:

  1. Sort p-values: p₁ ≤ p₂ ≤ … ≤ pₙ

  2. Test p₁ against α/n

  3. If reject, test p₂ against α/(n-1)

  4. Continue until failure to reject

Usage:

reject, p_corrected, _, _ = multipletests(
    p_values,
    alpha=0.05,
    method='holm'
)

Advantages:

  • ✓ More powerful than Bonferroni

  • ✓ Still controls family-wise error rate
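
The step-down procedure described in this section can be sketched by hand. holm_reject is a hypothetical helper written for illustration; in practice multipletests with method='holm' does the same job and also returns adjusted p-values.

```python
import numpy as np


def holm_reject(p_values, alpha=0.05):
    """Step-down Holm procedure; returns a boolean reject mask in input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)  # indices of p-values from smallest to largest
    reject = np.zeros(n, dtype=bool)
    for step, idx in enumerate(order):
        threshold = alpha / (n - step)  # alpha/n, alpha/(n-1), ...
        if p[idx] <= threshold:
            reject[idx] = True
        else:
            break  # first failure: stop and keep all remaining hypotheses
    return reject


p_values = [0.001, 0.03, 0.15, 0.008, 0.12]
print(holm_reject(p_values))
```

For these five p-values the thresholds are 0.01, 0.0125, 0.0167, and so on: 0.001 and 0.008 pass, and 0.03 fails against 0.0167, so the procedure stops there.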

3. False Discovery Rate (FDR) - Benjamini-Hochberg

Controls: Proportion of false positives among rejections

Less Conservative: Accepts some false positives for more power

Usage:

reject, p_corrected, _, _ = multipletests(
    p_values,
    alpha=0.05,
    method='fdr_bh'  # Benjamini-Hochberg
)

When to Use:

  • ✓ Large number of tests (e.g., all wavenumbers)

  • ✓ Exploratory analysis

  • ✓ Accept controlled false discovery rate (5%)

Interpretation:

FDR = 0.05 means:
- On average, about 5% of the results declared "significant" are expected to be false positives
- More lenient than Bonferroni
- Better for discovery

4. Permutation Testing

Most Rigorous: Non-parametric approach

Procedure:

  1. Compute test statistic on real data

  2. Randomly permute labels many times (e.g., 10,000)

  3. Compute test statistic for each permutation

  4. p-value = proportion of permutations ≥ observed statistic

Usage:

import numpy as np
from scipy import stats

def permutation_test(group1, group2, n_permutations=10000):
    """
    Two-sample permutation test
    """
    # Observed test statistic
    observed_stat = np.abs(np.mean(group1) - np.mean(group2))
    
    # Combine data
    combined = np.concatenate([group1, group2])
    n1 = len(group1)
    
    # Permutations
    perm_stats = []
    for _ in range(n_permutations):
        np.random.shuffle(combined)
        perm_group1 = combined[:n1]
        perm_group2 = combined[n1:]
        perm_stat = np.abs(np.mean(perm_group1) - np.mean(perm_group2))
        perm_stats.append(perm_stat)
    
    # P-value (add-one correction so a Monte Carlo estimate is never exactly 0)
    p_value = (np.sum(np.array(perm_stats) >= observed_stat) + 1) / (n_permutations + 1)
    
    return p_value, perm_stats

# Example
p_perm, perm_dist = permutation_test(group1, group2, n_permutations=10000)
print(f"Permutation p-value: {p_perm:.4f}")

Advantages:

  • ✓ No distribution assumptions

  • ✓ Exact p-values when all permutations are enumerated (random permutations give a Monte Carlo approximation)

  • ✓ Flexible (any test statistic)

Disadvantages:

  • ✗ Computationally intensive

  • ✗ Requires sufficient data

Comparison Table

| Method | Conservativeness | Power | Best For |
|---|---|---|---|
| Bonferroni | Very high | Low | Few tests, minimize FP |
| Holm | High | Medium | Few tests, more power |
| FDR (B-H) | Medium | High | Many tests, discovery |
| Permutation | Exact | High | Small datasets, no assumptions |

Decision Guide

Number of tests:
│
├─ Few (< 20)
│  ├─ Minimize false positives → Bonferroni
│  └─ More power → Holm
│
├─ Many (20-1000)
│  ├─ Exploratory → FDR (Benjamini-Hochberg)
│  └─ Confirmatory → Holm
│
└─ Very many (> 1000, e.g., all wavenumbers)
   └─ FDR (Benjamini-Hochberg)
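
The same decision guide can be written as a small helper. choose_correction and its two flags are hypothetical names introduced for illustration; the returned strings match the method names accepted by multipletests.

```python
def choose_correction(n_tests, exploratory=True, prefer_power=True):
    """Map the decision guide onto a multipletests method name."""
    if n_tests < 20:
        # Few tests: Holm for more power, Bonferroni to minimize false positives
        return "holm" if prefer_power else "bonferroni"
    if n_tests <= 1000:
        # Many tests: FDR for exploratory work, Holm for confirmatory work
        return "fdr_bh" if exploratory else "holm"
    # Very many tests (e.g., all wavenumbers)
    return "fdr_bh"


print(choose_correction(5, prefer_power=False))
print(choose_correction(200, exploratory=False))
print(choose_correction(1400))
```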

Effect Size Measures

Why Important: Statistical significance ≠ practical significance

Small p-value with large n → tiny effect can be “significant”
Effect size → How large is the difference?

Cohen’s d

Purpose: Standardized mean difference

Formula:

d = (mean₁ - mean₂) / pooled_SD

Calculation:

import numpy as np

def cohens_d(group1, group2):
    """
    Calculate Cohen's d effect size
    """
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = np.mean(group1), np.mean(group2)
    var1, var2 = np.var(group1, ddof=1), np.var(group2, ddof=1)
    
    # Pooled standard deviation
    pooled_std = np.sqrt(((n1-1)*var1 + (n2-1)*var2) / (n1+n2-2))
    
    # Cohen's d
    d = (mean1 - mean2) / pooled_std
    
    return d

# Example
d = cohens_d(group1, group2)
print(f"Cohen's d: {d:.3f}")

Interpretation:

  • |d| < 0.2: Negligible

  • |d| = 0.2: Small effect

  • |d| = 0.5: Medium effect

  • |d| = 0.8: Large effect

  • |d| > 1.2: Very large effect

Eta-Squared (η²)

Purpose: Proportion of variance explained (ANOVA)

Formula:

η² = SS_between / SS_total

Calculation:

import numpy as np

def eta_squared(groups):
    """
    Calculate eta-squared for ANOVA
    """
    # Grand mean
    grand_mean = np.mean(np.concatenate(groups))
    
    # Between-group sum of squares
    SS_between = sum([
        len(g) * (np.mean(g) - grand_mean)**2 
        for g in groups
    ])
    
    # Total sum of squares
    SS_total = sum([
        (x - grand_mean)**2 
        for g in groups 
        for x in g
    ])
    
    # Eta-squared
    eta_sq = SS_between / SS_total
    
    return eta_sq

# Example
eta_sq = eta_squared([group1, group2, group3])
print(f"η²: {eta_sq:.3f}")

Interpretation:

  • η² = 0.01: Small effect (1% variance explained)

  • η² = 0.06: Medium effect (6%)

  • η² = 0.14: Large effect (14%)

Omega-Squared (ω²)

Purpose: Less biased estimate of variance explained

Preferred over η² for smaller samples

Calculation:

import numpy as np

def omega_squared(groups):
    """
    Calculate omega-squared (less biased than eta-squared)
    """
    k = len(groups)  # number of groups
    N = sum([len(g) for g in groups])  # total sample size
    
    grand_mean = np.mean(np.concatenate(groups))
    
    SS_between = sum([
        len(g) * (np.mean(g) - grand_mean)**2 
        for g in groups
    ])
    
    SS_within = sum([
        sum((x - np.mean(g))**2 for x in g)
        for g in groups
    ])
    
    MS_within = SS_within / (N - k)
    
    omega_sq = (SS_between - (k - 1) * MS_within) / (SS_between + SS_within + MS_within)
    
    return max(0, omega_sq)  # The estimate can go negative; truncate at 0

omega_sq = omega_squared([group1, group2, group3])
print(f"ω²: {omega_sq:.3f}")

Reporting Effect Sizes

Best Practice: Always report effect size with p-value

Example Report:

"Group A showed significantly higher intensity than Group B 
(t(58) = 3.42, p = 0.001, d = 0.85), representing a large effect."

"ANOVA revealed significant differences among groups 
(F(2, 87) = 12.3, p < 0.001, η² = 0.22), explaining 22% of variance."

Band Ratio Analysis

Purpose: Compare relative intensities of specific spectral bands

Theory

Rationale:

  • Dimensionality reduction (1 feature from 2 peaks)

  • Normalize for concentration/thickness

  • Create interpretable biomarkers

Common Ratios:

Amide I / CH₂:     I₁₆₅₅ / I₁₄₄₅
Protein / Lipid:   I₁₆₅₀ / I₁₃₀₀
Amide III ratio:   I₁₂₉₀ / I₁₂₄₀
Phosphate / Amide: I₉₆₀ / I₁₆₅₀

Calculation

import numpy as np

def calculate_band_ratio(spectra, wavenumbers,
                         band1_range, band2_range):
    """
    Calculate ratio between two spectral bands
    
    Parameters:
    -----------
    spectra : array (n_samples, n_features)
    wavenumbers : array (n_features,)
    band1_range : tuple (start, end) wavenumbers
    band2_range : tuple (start, end) wavenumbers
    
    Returns:
    --------
    ratios : array (n_samples,)
    """
    # Find indices for bands
    idx1 = np.where((wavenumbers >= band1_range[0]) & 
                    (wavenumbers <= band1_range[1]))[0]
    idx2 = np.where((wavenumbers >= band2_range[0]) & 
                    (wavenumbers <= band2_range[1]))[0]
    
    # Integrate each band (trapezoidal rule; np.trapz is named np.trapezoid in NumPy >= 2.0)
    intensity1 = np.trapz(spectra[:, idx1], wavenumbers[idx1], axis=1)
    intensity2 = np.trapz(spectra[:, idx2], wavenumbers[idx2], axis=1)
    
    # Calculate ratio
    ratios = intensity1 / (intensity2 + 1e-10)  # Avoid division by zero
    
    return ratios

# Example usage
ratios = calculate_band_ratio(
    preprocessed_spectra,
    wavenumbers,
    band1_range=(1645, 1665),  # Amide I
    band2_range=(1440, 1460)   # CH₂
)

# Statistical comparison (Welch's t-test, the recommended default)
from scipy import stats
t_stat, p_value = stats.ttest_ind(
    ratios[labels == 'Control'],
    ratios[labels == 'Disease'],
    equal_var=False
)

print(f"Band ratio test: t={t_stat:.3f}, p={p_value:.4f}")

Integration Methods

1. Trapezoidal Integration (Recommended):

intensity = np.trapz(spectrum[idx], wavenumbers[idx])

2. Maximum Intensity:

intensity = np.max(spectrum[idx])

3. Area Under Curve (scikit-learn helper; it also uses the trapezoidal rule, so it matches method 1):

from sklearn.metrics import auc
intensity = auc(wavenumbers[idx], spectrum[idx])

Statistical Testing

# 1. Compare ratios between groups
from scipy import stats

group1_ratios = ratios[labels == 'GroupA']
group2_ratios = ratios[labels == 'GroupB']

# T-test (Welch's)
t_stat, p_value = stats.ttest_ind(group1_ratios, group2_ratios, equal_var=False)

# Effect size
d = cohens_d(group1_ratios, group2_ratios)

print(f"t={t_stat:.3f}, p={p_value:.4f}, d={d:.3f}")

# 2. Correlation with clinical variable
r, p = stats.pearsonr(ratios, clinical_scores)
print(f"Correlation: r={r:.3f}, p={p:.4f}")

# 3. ROC analysis (for classification)
from sklearn.metrics import roc_auc_score, roc_curve

# Binary classification
binary_labels = (labels == 'Disease').astype(int)
auc_score = roc_auc_score(binary_labels, ratios)

fpr, tpr, thresholds = roc_curve(binary_labels, ratios)

print(f"AUC: {auc_score:.3f}")

# Find optimal threshold (maximizes Youden's J = TPR - FPR)
optimal_idx = np.argmax(tpr - fpr)
optimal_threshold = thresholds[optimal_idx]
print(f"Optimal threshold: {optimal_threshold:.3f}")

Best Practices

  1. Preprocessing:

    • Always baseline-correct before ratio calculation

    • Use same normalization for all spectra

    • Check for negative values

  2. Band Selection:

    • Use literature-validated bands

    • Avoid overlapping bands

    • Verify peaks present in your samples

  3. Statistical Reporting:

    "The Amide I/CH₂ ratio was significantly elevated in disease 
    samples (mean ± SD: 2.34 ± 0.45) compared to controls 
    (1.87 ± 0.32; t(58) = 4.52, p < 0.001, d = 1.17)."
    

Best Practices

General Workflow

1. Check Assumptions
   ├─ Normality (Shapiro-Wilk)
   ├─ Equal variance (Levene's)
   └─ Independence

2. Choose Appropriate Test
   ├─ Parametric (if assumptions met)
   └─ Non-parametric (if violated)

3. Apply Test
   └─ Use two-sided unless justified

4. Multiple Testing Correction
   ├─ Few tests → Bonferroni/Holm
   └─ Many tests → FDR

5. Report Effect Size
   ├─ Cohen's d (t-test)
   ├─ η² or ω² (ANOVA)
   └─ r (correlation)

6. Interpret in Context
   └─ Statistical + Practical significance
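
For a single two-group comparison, steps 1 to 3 and 5 of this workflow can be sketched as one function. compare_two_groups is a hypothetical helper, and using a Shapiro-Wilk p-value at the same alpha as the only normality criterion is a simplifying assumption. Step 4 applies only when the function is run for many variables.

```python
import numpy as np
from scipy import stats


def compare_two_groups(group1, group2, alpha=0.05):
    """Check normality, pick a test, and report an effect size."""
    g1 = np.asarray(group1, dtype=float)
    g2 = np.asarray(group2, dtype=float)

    # Step 1: normality check (Shapiro-Wilk) on each group
    normal = all(stats.shapiro(g)[1] >= alpha for g in (g1, g2))

    # Steps 2-3: parametric if normality holds, otherwise non-parametric
    if normal:
        test = "welch_t"
        stat, p_value = stats.ttest_ind(g1, g2, equal_var=False)
    else:
        test = "mann_whitney"
        stat, p_value = stats.mannwhitneyu(g1, g2, alternative="two-sided")

    # Step 5: Cohen's d with the pooled standard deviation
    n1, n2 = len(g1), len(g2)
    pooled_sd = np.sqrt(
        ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
    )
    d = (g1.mean() - g2.mean()) / pooled_sd

    return {"test": test, "statistic": float(stat),
            "p_value": float(p_value), "cohens_d": float(d)}


# Example with simulated band ratios for two groups
rng = np.random.default_rng(42)
result = compare_two_groups(rng.normal(1.0, 0.2, 30), rng.normal(1.2, 0.2, 30))
print(result)
```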

Reporting Checklist

  • Test used and why

  • Test statistic and degrees of freedom

  • P-value (exact or < 0.001)

  • Effect size with interpretation

  • Sample sizes per group

  • Descriptive statistics (mean ± SD)

  • Multiple testing correction method

  • Assumptions checked (state violations if any)

Common Mistakes to Avoid

  • Multiple testing without correction → Inflated Type I error

  • Using parametric tests on non-normal data → Invalid p-values

  • Reporting only p-values → Missing practical significance

  • One-sided tests without justification → Inflated significance

  • Ignoring assumption violations → Misleading results

  • Cherry-picking significant results → Publication bias


Last Updated: 2026-01-24