Statistical Analysis Methods
Comprehensive reference for hypothesis testing and statistical methods.
T-Tests
Purpose: Compare means of two groups
Types of T-Tests
1. Independent Samples T-Test
Use When: Comparing two independent groups
Assumptions:
✓ Continuous data
✓ Independent samples
✓ Normally distributed (each group)
✓ Equal variances (for Student’s t-test)
Two Variants:
| Variant | Variance Assumption | When to Use |
|---|---|---|
| Student’s t-test | Equal variances | Classic version |
| Welch’s t-test | Unequal variances | Recommended default |
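To make the difference between the two variants concrete, the sketch below computes both statistics by hand and cross-checks them against SciPy. The sample values are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical example data, chosen only for illustration
group1 = np.array([5.1, 4.9, 5.6, 5.8, 5.2, 5.5])
group2 = np.array([6.3, 7.9, 5.0, 8.4, 6.1, 7.2, 9.0, 5.7])

n1, n2 = len(group1), len(group2)
v1, v2 = np.var(group1, ddof=1), np.var(group2, ddof=1)
mean_diff = group1.mean() - group2.mean()

# Student: pooled variance, n1 + n2 - 2 degrees of freedom
pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_student = mean_diff / np.sqrt(pooled_var * (1 / n1 + 1 / n2))
df_student = n1 + n2 - 2

# Welch: per-group variances, Welch-Satterthwaite degrees of freedom
a, b = v1 / n1, v2 / n2
t_welch = mean_diff / np.sqrt(a + b)
df_welch = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Cross-check both statistics against SciPy
t_student_ref = stats.ttest_ind(group1, group2, equal_var=True)[0]
t_welch_ref = stats.ttest_ind(group1, group2, equal_var=False)[0]

print(f"Student: t={t_student:.3f}, df={df_student}")
print(f"Welch:   t={t_welch:.3f}, df={df_welch:.2f}")
```

Welch's degrees of freedom never exceed the pooled value, which is why it costs little when variances happen to be equal and protects the test when they are not.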
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| equal_var | bool | False | Use equal variance assumption |
| alternative | str | 'two-sided' | Alternative hypothesis |
Usage Example:
from scipy import stats
# Welch's t-test (recommended)
t_stat, p_value = stats.ttest_ind(
    group1_data,
    group2_data,
    equal_var=False,  # Welch's
    alternative='two-sided'
)
print(f"t-statistic: {t_stat:.3f}")
print(f"p-value: {p_value:.4f}")
# Interpretation
alpha = 0.05
if p_value < alpha:
    print("Significant difference between groups")
else:
    print("No significant difference")
Alternative Hypotheses:
# Two-sided (default): Groups are different
alternative='two-sided'
# One-sided: Group 1 > Group 2
alternative='greater'
# One-sided: Group 1 < Group 2
alternative='less'
Checking Assumptions:
# 1. Normality test (Shapiro-Wilk)
stat1, p1 = stats.shapiro(group1_data)
stat2, p2 = stats.shapiro(group2_data)
if p1 < 0.05 or p2 < 0.05:
    print("Warning: Data not normally distributed")
    print("Consider: Mann-Whitney U test instead")
# 2. Equal variance test (Levene's test)
stat, p = stats.levene(group1_data, group2_data)
if p < 0.05:
    print("Unequal variances detected")
    print("Use: Welch's t-test (equal_var=False)")
else:
    print("Equal variances")
    print("Can use: Student's t-test (equal_var=True)")
2. Paired Samples T-Test
Use When: Same subjects measured twice (before/after)
Example: Same sample measured at two time points
Usage Example:
from scipy import stats
# Paired t-test
t_stat, p_value = stats.ttest_rel(
before_treatment,
after_treatment
)
print(f"Paired t-statistic: {t_stat:.3f}")
print(f"p-value: {p_value:.4f}")
Assumptions:
✓ Paired observations (same subjects)
✓ Differences normally distributed
✓ Continuous data
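The "differences normally distributed" assumption follows from how the test works: a paired t-test is a one-sample t-test on the per-subject differences. A minimal sketch with hypothetical before/after values:

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements (same six subjects, two time points)
before = np.array([12.1, 11.4, 13.0, 12.7, 11.9, 12.5])
after = np.array([12.9, 11.8, 13.1, 13.6, 12.4, 13.3])

# Per-subject differences carry all the information the test uses
diffs = after - before
t_manual = diffs.mean() / (diffs.std(ddof=1) / np.sqrt(len(diffs)))

# The paired test and the one-sample test on the differences agree
t_rel, p_rel = stats.ttest_rel(after, before)
t_one, p_one = stats.ttest_1samp(diffs, 0.0)

print(f"manual t={t_manual:.3f}, ttest_rel t={t_rel:.3f}, ttest_1samp t={t_one:.3f}")
```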
3. One-Sample T-Test
Use When: Compare sample mean to known value
Usage Example:
from scipy import stats
# Test if mean differs from theoretical value
reference_value = 0
t_stat, p_value = stats.ttest_1samp(
data,
reference_value
)
Interpretation
P-Value:
p < 0.05: Significant difference (reject null hypothesis)
p ≥ 0.05: No significant difference (fail to reject null)
T-Statistic:
Large |t|: Difference in means is large relative to its standard error
Small |t|: Difference in means is small relative to its standard error
Confidence Interval (95%):
from scipy import stats
import numpy as np
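# Calculate 95% CI for mean difference (pooled-variance, Student's version)
n1, n2 = len(group1), len(group2)
diff = np.mean(group1) - np.mean(group2)
pooled_var = ((n1 - 1) * np.var(group1, ddof=1) +
              (n2 - 1) * np.var(group2, ddof=1)) / (n1 + n2 - 2)
se = np.sqrt(pooled_var * (1 / n1 + 1 / n2))  # standard error of the difference
ci = stats.t.interval(0.95, n1 + n2 - 2,
                      loc=diff, scale=se)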
print(f"Mean difference: {diff:.3f}")
print(f"95% CI: [{ci[0]:.3f}, {ci[1]:.3f}]")
# If CI doesn't include 0 → significant difference
Troubleshooting
| Issue | Cause | Solution |
|---|---|---|
| Non-normal data | Assumption violated | Use Mann-Whitney U |
| Unequal variances | Assumption violated | Use Welch’s t-test (equal_var=False) |
| Small sample size | Low power | Increase n or use non-parametric |
| Outliers present | Skewed distribution | Remove outliers or use robust test |
| Multiple comparisons | Inflated Type I error | Apply Bonferroni correction |
When to Use
Use T-Test when:
✓ Two groups to compare
✓ Continuous data
✓ Approximately normal distribution
✓ Independent observations
Use alternatives when:
✗ Non-normal data → Mann-Whitney U
✗ More than 2 groups → ANOVA (or pairwise tests)
✗ Paired data → Paired t-test or Wilcoxon signed-rank
Mann-Whitney U Test
Alternative Names: Wilcoxon rank-sum test
Purpose: Non-parametric alternative to independent t-test
Theory
How It Works:
1. Rank all observations (both groups combined)
2. Sum ranks for each group
3. Test if rank sums differ significantly
Key Advantage: No normality assumption
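The ranking steps can be sketched by hand on a small hypothetical data set. The U statistic for the first group is its rank sum minus the smallest rank sum it could possibly have, n1(n1+1)/2.

```python
import numpy as np
from scipy import stats

# Hypothetical example data without ties
group1 = np.array([1.1, 2.3, 3.8, 5.0])
group2 = np.array([2.9, 4.4, 6.1, 7.2, 8.0])
n1, n2 = len(group1), len(group2)

# Step 1: rank all observations together (ties would receive average ranks)
ranks = stats.rankdata(np.concatenate([group1, group2]))

# Step 2: sum the ranks belonging to group 1
r1 = ranks[:n1].sum()

# Step 3: convert the rank sum into U statistics
u1 = r1 - n1 * (n1 + 1) / 2   # number of (group1, group2) pairs with group1 larger
u2 = n1 * n2 - u1

print(f"R1 = {r1}, U1 = {u1}, U2 = {u2}")  # R1 = 13.0, U1 = 3.0, U2 = 17.0
```

U1 and U2 always sum to n1 × n2, so either one determines the other.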
Assumptions
✓ Independent observations
✓ Ordinal or continuous data
✓ Similarly shaped distributions (only needed to interpret the result as a difference in medians)
Does NOT require:
✗ Normal distribution
✗ Equal variances
Usage Example
from scipy import stats
# Mann-Whitney U test
u_stat, p_value = stats.mannwhitneyu(
group1_data,
group2_data,
alternative='two-sided'
)
print(f"U-statistic: {u_stat:.1f}")
print(f"p-value: {p_value:.4f}")
# Effect size (rank-biserial correlation)
# Sign convention: u_stat is the U of group1, so r < 0 here means group1 tends to be larger
n1, n2 = len(group1_data), len(group2_data)
r = 1 - (2*u_stat) / (n1 * n2)
print(f"Effect size (r): {r:.3f}")
Interpretation
Null Hypothesis: Distributions are identical
P-Value:
p < 0.05: Distributions differ significantly
p ≥ 0.05: No significant difference
Effect Size (r):
r = 0: No effect
r = ±0.1: Small effect
r = ±0.3: Medium effect
r = ±0.5: Large effect
When to Use
Use Mann-Whitney when:
✓ Non-normal data
✓ Ordinal data (e.g., Likert scales)
✓ Outliers present
✓ Small sample sizes
✓ T-test assumptions violated
Use T-Test instead when:
✓ Normal data
✓ Large samples (n > 30 per group)
✓ Want parametric inference
ANOVA (Analysis of Variance)
Purpose: Compare means across 3+ groups
In this application: ANOVA is performed at each wavenumber across selected groups (datasets or grouped labels). Because this produces many tests (one per wavenumber), the app provides multiple-testing correction.
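A minimal sketch of what "one ANOVA per wavenumber" means, using synthetic spectra. The array shapes, group sizes, and the planted difference are assumptions for illustration, not the application's actual implementation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_per_group, n_wavenumbers = 20, 50

# Hypothetical spectra: rows are samples, columns are wavenumbers
groups = [rng.normal(0.0, 1.0, size=(n_per_group, n_wavenumbers)) for _ in range(3)]

# Plant a large group difference at a single wavenumber index
planted = 10
groups[2][:, planted] += 5.0

# Run one one-way ANOVA per column (wavenumber)
p_values = np.array([
    stats.f_oneway(*(g[:, j] for g in groups)).pvalue
    for j in range(n_wavenumbers)
])

print(f"Smallest p-value at index {np.argmin(p_values)}")
print(f"Raw p < 0.05 at {np.sum(p_values < 0.05)} of {n_wavenumbers} wavenumbers")
```

Because every column yields its own p-value, some of the raw hits are expected by chance alone, which is why a correction across wavenumbers is applied.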
One-Way ANOVA
Use When: One categorical factor, 3+ groups
Assumptions:
✓ Independent observations
✓ Normal distribution (each group)
✓ Equal variances (homoscedasticity)
Usage Example
from scipy import stats
# One-way ANOVA
f_stat, p_value = stats.f_oneway(
group1_data,
group2_data,
group3_data
)
print(f"F-statistic: {f_stat:.3f}")
print(f"p-value: {p_value:.4f}")
if p_value < 0.05:
    print("Significant difference among groups")
Application Parameters
| Parameter | Default | Description |
|---|---|---|
| | 0.05 | Significance level (threshold applied to raw p-values or corrected values) |
| | | Multiple-testing correction across wavenumbers: |
| | | Optional post-hoc testing for significant wavenumbers: |
| max_posthoc_wavenumbers | 20 | Cap to keep post-hoc computation responsive (0 disables) |
| | True | Show mean spectra overlay by group |
| | True | Highlight significant wavenumber regions on the overlay |
Grouped Mode vs Simple Mode
Simple mode: each selected dataset is treated as one group.
Grouped mode: datasets are mapped into user-defined groups (same grouping shared with the ML page). ANOVA is run across those group labels.
Post-Hoc Tests (Optional)
Why needed: ANOVA tells you “at least one group differs”, but not which pairs differ.
The application can optionally compute Tukey’s HSD for a limited number of significant wavenumbers (controlled by max_posthoc_wavenumbers).
Common Post-Hoc Methods:
Tukey’s HSD: Controls family-wise error rate across all pairwise comparisons of group means
Bonferroni: Simple adjustment, divides α by number of comparisons (most conservative)
Holm-Bonferroni: Less conservative step-down procedure
For two-group comparisons, use the Pairwise Statistical Tests feature in the application instead.
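As a simple illustration of post-hoc testing, the sketch below runs all pairwise Welch t-tests and applies a Bonferroni adjustment. This is not Tukey's HSD and not the application's code; the group names and data are hypothetical.

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical groups: only "C" is shifted away from the others
samples = {
    "A": rng.normal(0.0, 1.0, 30),
    "B": rng.normal(0.0, 1.0, 30),
    "C": rng.normal(3.0, 1.0, 30),
}

pairs = list(combinations(samples, 2))  # ("A","B"), ("A","C"), ("B","C")

# Bonferroni-adjusted p-value: raw p times number of comparisons, capped at 1
adjusted = {}
for a, b in pairs:
    p_raw = stats.ttest_ind(samples[a], samples[b], equal_var=False).pvalue
    adjusted[(a, b)] = min(1.0, p_raw * len(pairs))

for (a, b), p_adj in adjusted.items():
    print(f"{a} vs {b}: adjusted p = {p_adj:.4g}")
```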
Checking Assumptions
# 1. Normality (each group)
from scipy import stats
for i, group in enumerate([group1, group2, group3]):
    stat, p = stats.shapiro(group)
    print(f"Group {i+1} normality: p={p:.4f}")
    if p < 0.05:
        print(" → Not normal, consider Mann-Whitney for pairwise")
# 2. Homogeneity of variance (Levene's test)
stat, p = stats.levene(group1, group2, group3)
print(f"\nLevene's test: p={p:.4f}")
if p < 0.05:
    print(" → Unequal variances")
    print(" → Consider Welch's ANOVA")
Welch’s ANOVA
When: Unequal variances detected
# Welch's ANOVA (unequal variances)
# Not directly in scipy, use one-way ANOVA with robust method
from scipy import stats
# Alternative: Use stats library with robust option
# or implement Welch's ANOVA manually
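One way to implement Welch's ANOVA manually is sketched below. The helper name `welch_anova` is hypothetical (it is not a SciPy function), and the example data are made up. Each group is weighted by n/s², and the denominator degrees of freedom are adjusted in the same spirit as Welch's t-test.

```python
import numpy as np
from scipy import stats

def welch_anova(*groups):
    """Welch's one-way ANOVA (hypothetical helper, not part of SciPy)."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    means = np.array([np.mean(g) for g in groups])
    variances = np.array([np.var(g, ddof=1) for g in groups])

    w = n / variances                       # precision weights per group
    w_sum = w.sum()
    weighted_mean = (w * means).sum() / w_sum

    numerator = (w * (means - weighted_mean) ** 2).sum() / (k - 1)
    lam = (((1 - w / w_sum) ** 2) / (n - 1)).sum()
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam

    f_stat = numerator / denominator
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)          # adjusted denominator df
    p_value = stats.f.sf(f_stat, df1, df2)
    return f_stat, df1, df2, p_value

# Hypothetical example data with visibly different spreads
g1 = [4.8, 5.1, 5.5, 4.9, 5.3]
g2 = [6.0, 7.5, 5.2, 8.1, 6.6, 7.0]
g3 = [5.9, 6.1, 6.0, 6.2, 5.8]

f_stat, df1, df2, p_value = welch_anova(g1, g2, g3)
print(f"Welch F({df1}, {df2:.2f}) = {f_stat:.3f}, p = {p_value:.4f}")
```

A useful sanity check: with exactly two groups, this F equals the square of Welch's t statistic and the p-values coincide.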
Interpretation
F-Statistic:
Large F: Groups differ substantially
Small F: Groups similar
P-Value:
p < 0.05: At least one group differs
p ≥ 0.05: No significant differences
Effect Size (Eta-Squared):
# Calculate eta-squared (proportion of variance explained)
import numpy as np
groups = [group1, group2, group3]
grand_mean = np.mean(np.concatenate(groups))  # mean of all observations
SS_between = sum(len(g) * (np.mean(g) - grand_mean)**2
                 for g in groups)
SS_total = sum((x - grand_mean)**2
               for g in groups
               for x in g)
eta_squared = SS_between / SS_total
print(f"η² = {eta_squared:.3f}")
# Interpretation:
# η² = 0.01: Small effect
# η² = 0.06: Medium effect
# η² = 0.14: Large effect
When to Use
Use ANOVA when:
✓ 3+ groups to compare
✓ One categorical factor
✓ Continuous outcome
✓ Assumptions met
Use alternatives when:
✗ Two groups only → T-test
✗ Non-normal data → Mann-Whitney U test (for 2 groups) or non-parametric methods
✗ Unequal variances → Welch’s ANOVA
✗ Repeated measures → Repeated-measures ANOVA
Note: For exactly 2 groups, use Pairwise Statistical Tests (Welch t-test / Mann–Whitney / Wilcoxon) with optional FDR correction.
Pairwise Statistical Tests
Purpose: Compare two groups at each wavenumber across the spectrum
Available Tests
The application provides three statistical tests for pairwise comparisons:
1. Independent T-Test (t_test)
Use When: Comparing two independent groups, data approximately normal
Assumptions:
✓ Independent samples
✓ Approximately normal distribution
✓ Continuous data
Parameters:
alpha: Significance level (default: 0.05)
fdr_correction: Apply FDR correction for multiple testing (default: True)
2. Mann-Whitney U Test (mann_whitney)
Use When: Non-parametric alternative to t-test
Advantages:
✓ No normality assumption
✓ Robust to outliers
✓ Works with small samples
Parameters:
alpha: Significance level (default: 0.05)
fdr_correction: Apply FDR correction (default: True)
3. Wilcoxon Signed-Rank Test (wilcoxon)
Use When: Comparing paired samples (e.g., before/after measurements)
Assumptions:
✓ Paired observations
✓ Ordinal or continuous data
✓ Symmetric distribution of differences
Usage in Application
Analysis Page → Statistical Methods → Pairwise Statistical Tests
1. Select exactly 2 datasets to compare
2. Choose test type: t_test, mann_whitney, or wilcoxon
3. Set significance level (α)
4. Enable/disable FDR correction
5. Click "Run Analysis"
Interpretation
Output Includes:
P-value plot: Shows p-values across all wavenumbers
Significance markers: Regions where p < α (after correction)
Mean overlay: Mean spectra of both groups
Summary statistics: Number of significant wavenumbers
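To show how per-wavenumber decisions can be turned into the "regions" reported above, here is a small sketch that groups consecutive significant wavenumbers into (start, end) ranges. The helper name and the example mask are hypothetical; the application's own logic may differ.

```python
import numpy as np

def significant_regions(wavenumbers, significant):
    """Return (start, end) wavenumber pairs for contiguous runs of True."""
    mask = np.asarray(significant, dtype=bool)
    # Pad with False so runs touching either edge are still detected
    padded = np.concatenate([[False], mask, [False]])
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[::2], changes[1::2] - 1
    return [(float(wavenumbers[s]), float(wavenumbers[e]))
            for s, e in zip(starts, ends)]

# Hypothetical axis and significance mask (True where corrected p < alpha)
wavenumbers = np.arange(1000, 1010)
mask = np.array([0, 1, 1, 1, 0, 0, 1, 1, 0, 1], dtype=bool)

regions = significant_regions(wavenumbers, mask)
print(regions)  # [(1001.0, 1003.0), (1006.0, 1007.0), (1009.0, 1009.0)]
```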
Example Interpretation:
Results: Group A vs Group B
- Significant wavenumbers: 145 out of 1400
- FDR-corrected α: 0.05
- Key regions: 1000-1100 cm⁻¹, 1600-1700 cm⁻¹
Interpretation: Groups differ significantly in protein (Amide I, 1650 cm⁻¹)
and lipid regions (1000-1100 cm⁻¹).
Best Practices
✅ Recommended:
Always use FDR correction when testing across full spectrum
Use Mann-Whitney for non-normal data
Report both p-values and effect sizes
Verify biological relevance of significant regions
❌ Avoid:
Testing without multiple testing correction
Ignoring assumption violations
Over-interpreting borderline significance
Comparing more than 2 groups (use appropriate multi-group method instead)
Correlation Analysis
Purpose: Measure association between two variables
Pearson Correlation
Measures: Linear relationship strength
Assumptions:
✓ Continuous data
✓ Linear relationship
✓ Bivariate normal distribution
✓ No outliers
Usage Example
from scipy import stats
# Pearson correlation
r, p_value = stats.pearsonr(variable1, variable2)
print(f"Pearson r: {r:.3f}")
print(f"p-value: {p_value:.4f}")
# Interpretation
if p_value < 0.05:
    if r > 0:
        print("Significant positive correlation")
    else:
        print("Significant negative correlation")
Interpretation of r:
r = 1: Perfect positive correlation
|r| ≥ 0.7: Strong correlation
|r| = 0.4-0.7: Moderate correlation
|r| = 0.1-0.4: Weak correlation
r = 0: No linear correlation
r = -1: Perfect negative correlation
Coefficient of Determination (R²):
r_squared = r ** 2
print(f"R² = {r_squared:.3f}")
# R² = proportion of variance explained
Spearman Correlation
Measures: Monotonic relationship strength (non-parametric)
Advantages:
✓ No normality assumption
✓ Robust to outliers
✓ Detects non-linear monotonic relationships
Usage Example:
from scipy import stats
# Spearman correlation
rho, p_value = stats.spearmanr(variable1, variable2)
print(f"Spearman ρ: {rho:.3f}")
print(f"p-value: {p_value:.4f}")
When to Use:
✓ Ordinal data
✓ Non-linear but monotonic relationship
✓ Outliers present
✓ Non-normal data
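The "non-linear but monotonic" case can be illustrated with synthetic data: an exponential curve is perfectly monotonic, so Spearman's ρ is 1 while Pearson's r is noticeably lower. The sketch also shows that Spearman's ρ is Pearson's r computed on ranks.

```python
import numpy as np
from scipy import stats

# Hypothetical data: monotonic but strongly non-linear relationship
x = np.linspace(0.0, 5.0, 30)
y = np.exp(x)

r, _ = stats.pearsonr(x, y)
rho, _ = stats.spearmanr(x, y)

# Spearman's rho is Pearson's r applied to the ranks of the data
rho_from_ranks, _ = stats.pearsonr(stats.rankdata(x), stats.rankdata(y))

print(f"Pearson r = {r:.3f}")
print(f"Spearman rho = {rho:.3f} (from ranks: {rho_from_ranks:.3f})")
```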
Kendall’s Tau
Measures: Ordinal association (non-parametric)
Advantages:
✓ Better for small samples
✓ More robust than Spearman
Usage Example:
from scipy import stats
# Kendall's tau
tau, p_value = stats.kendalltau(variable1, variable2)
print(f"Kendall τ: {tau:.3f}")
print(f"p-value: {p_value:.4f}")
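Kendall's τ counts how many pairs of observations are ordered the same way in both variables (concordant) versus oppositely (discordant). A hand count on a tiny hypothetical data set without ties, cross-checked against SciPy:

```python
from itertools import combinations

from scipy import stats

# Hypothetical data without ties
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

concordant = discordant = 0
for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
    direction = (x2 - x1) * (y2 - y1)
    if direction > 0:
        concordant += 1      # both variables move the same way
    elif direction < 0:
        discordant += 1      # variables move in opposite directions

n_pairs = len(x) * (len(x) - 1) // 2
tau_manual = (concordant - discordant) / n_pairs
tau_scipy, _ = stats.kendalltau(x, y)

print(f"concordant={concordant}, discordant={discordant}, tau={tau_manual:.3f}")
# concordant=8, discordant=2, tau=0.600
```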
Partial Correlation
Purpose: Correlation between X and Y, controlling for Z
from scipy import stats
import numpy as np
def partial_correlation(x, y, z):
    """
    Partial correlation between x and y, controlling for z
    """
    # Correlations
    r_xy = stats.pearsonr(x, y)[0]
    r_xz = stats.pearsonr(x, z)[0]
    r_yz = stats.pearsonr(y, z)[0]
    # Partial correlation
    numerator = r_xy - (r_xz * r_yz)
    denominator = np.sqrt((1 - r_xz**2) * (1 - r_yz**2))
    r_xy_z = numerator / denominator
    # Significance test (n - 3 degrees of freedom for one control variable)
    n = len(x)
    t_stat = r_xy_z * np.sqrt((n - 3) / (1 - r_xy_z**2))
    p_value = 2 * (1 - stats.t.cdf(abs(t_stat), n - 3))
    return r_xy_z, p_value

# Example usage
r_partial, p = partial_correlation(
    peak1_intensity,
    peak2_intensity,
    concentration
)
print(f"Partial correlation: {r_partial:.3f}, p={p:.4f}")
Correlation Matrix
Purpose: Correlations between multiple variables
import numpy as np
import pandas as pd
from scipy import stats
# Create correlation matrix
data = pd.DataFrame({
'Peak1': peak1_intensities,
'Peak2': peak2_intensities,
'Peak3': peak3_intensities
})
# Pearson correlation matrix
corr_matrix = data.corr(method='pearson')
print(corr_matrix)
# P-values for correlations (diagonal entries are left at 0)
n = len(data.columns)
p_values = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            _, p_values[i, j] = stats.pearsonr(
                data.iloc[:, i],
                data.iloc[:, j]
            )
p_df = pd.DataFrame(p_values,
                    columns=data.columns,
                    index=data.columns)
print("\nP-values:")
print(p_df)
# Visualization
import seaborn as sns
import matplotlib.pyplot as plt
plt.figure(figsize=(8, 6))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm',
vmin=-1, vmax=1, center=0,
square=True, linewidths=1)
plt.title('Correlation Matrix')
plt.tight_layout()
plt.show()
Troubleshooting
| Issue | Cause | Solution |
|---|---|---|
| Low correlation but clear relationship | Non-linear | Use Spearman or visualize |
| Significant but r near 0 | Large sample size | Check practical significance |
| High correlation, not significant | Small sample | Increase n |
| Spurious correlation | Confounding variable | Use partial correlation |
| Correlation ≠ Causation | | Design experiment or use causal inference |
Multiple Testing Correction
Problem: Testing multiple hypotheses inflates Type I error (false positives)
Example:
Test 100 peaks at α=0.05
Expected false positives: 100 × 0.05 = 5
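The arithmetic above can be extended to the probability of at least one false positive. The sketch assumes the tests are independent and that every null hypothesis is true, which is a simplification for spectral data where neighboring wavenumbers are correlated.

```python
alpha = 0.05
n_tests = 100

# Expected number of false positives when every null hypothesis is true
expected_false_positives = n_tests * alpha

# Probability of at least one false positive across independent tests
family_wise_error = 1 - (1 - alpha) ** n_tests

print(f"Expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one false positive): {family_wise_error:.3f}")
```

With 100 uncorrected tests, at least one false positive is almost guaranteed under these assumptions.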
Methods
1. Bonferroni Correction
Most Conservative: Divide α by number of tests
Formula:
α_corrected = α / n_tests
Usage:
from statsmodels.stats.multitest import multipletests
# Original p-values from multiple tests
p_values = [0.001, 0.03, 0.15, 0.008, 0.12]
# Bonferroni correction
reject, p_corrected, _, _ = multipletests(
p_values,
alpha=0.05,
method='bonferroni'
)
for i, (p_orig, p_corr, is_sig) in enumerate(zip(p_values, p_corrected, reject)):
    print(f"Test {i+1}: p={p_orig:.4f}, p_corr={p_corr:.4f}, sig={is_sig}")
When to Use:
✓ Small number of tests (< 20)
✓ Want to minimize false positives
✗ Conservative (may miss real effects)
2. Holm-Bonferroni Method
Less Conservative: Step-down Bonferroni
Procedure:
1. Sort p-values: p₁ ≤ p₂ ≤ … ≤ pₙ
2. Test p₁ against α/n
3. If rejected, test p₂ against α/(n-1)
4. Continue until the first failure to reject; that hypothesis and all remaining ones are retained
Usage:
reject, p_corrected, _, _ = multipletests(
p_values,
alpha=0.05,
method='holm'
)
Advantages:
✓ More powerful than Bonferroni
✓ Still controls family-wise error rate
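The step-down procedure can be written as adjusted p-values in a few lines of NumPy. This is a sketch for understanding, with a hypothetical helper name; in practice the `multipletests` call shown above does the same job. The input reuses the five p-values from the Bonferroni example.

```python
import numpy as np

def holm_adjust(p_values):
    """Holm step-down adjusted p-values (hypothetical helper)."""
    p = np.asarray(p_values, dtype=float)
    n = len(p)
    order = np.argsort(p)
    # Multiply the sorted p-values by n, n-1, ..., 1
    scaled = p[order] * (n - np.arange(n))
    # Enforce monotonicity (a later p-value cannot be adjusted below an earlier one)
    adjusted_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)
    adjusted = np.empty(n)
    adjusted[order] = adjusted_sorted      # restore the original order
    return adjusted

p_values = [0.001, 0.03, 0.15, 0.008, 0.12]
p_holm = holm_adjust(p_values)
reject = p_holm <= 0.05

print(np.round(p_holm, 4))  # [0.005 0.09  0.24  0.032 0.24 ]
print(reject)               # [ True False False  True False]
```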
3. False Discovery Rate (FDR) - Benjamini-Hochberg
Controls: Expected proportion of false positives among rejections
Less Conservative: Accepts some false positives for more power
Usage:
reject, p_corrected, _, _ = multipletests(
p_values,
alpha=0.05,
method='fdr_bh' # Benjamini-Hochberg
)
When to Use:
✓ Large number of tests (e.g., all wavenumbers)
✓ Exploratory analysis
✓ Accept controlled false discovery rate (5%)
Interpretation:
FDR = 0.05 means:
- On average, about 5% of the results declared "significant" are expected to be false positives
- More lenient than Bonferroni
- Better for discovery
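For intuition, the Benjamini-Hochberg adjustment can be sketched in NumPy and compared with Bonferroni on a small set of p-values. The helper name and the p-values are hypothetical; the `multipletests` call above remains the practical route.

```python
import numpy as np

def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values (hypothetical helper)."""
    p = np.asarray(p_values, dtype=float)
    n = len(p)
    order = np.argsort(p)
    # Scale the i-th smallest p-value by n / i
    scaled = p[order] * n / np.arange(1, n + 1)
    # Step-up: take the running minimum from the largest p-value downwards
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)
    return adjusted

# Hypothetical p-values from five tests
p_values = np.array([0.001, 0.02, 0.15, 0.008, 0.12])

p_bh = bh_adjust(p_values)
p_bonf = np.minimum(p_values * len(p_values), 1.0)

n_bh = int(np.sum(p_bh <= 0.05))
n_bonf = int(np.sum(p_bonf <= 0.05))
print(f"BH rejections: {n_bh}, Bonferroni rejections: {n_bonf}")
# BH rejections: 3, Bonferroni rejections: 2
```

On this input BH keeps the test with p = 0.02 that Bonferroni discards, which is the extra power the method trades for a small expected share of false discoveries.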
4. Permutation Testing
Most Rigorous: Non-parametric approach
Procedure:
1. Compute test statistic on real data
2. Randomly permute labels many times (e.g., 10,000)
3. Compute test statistic for each permutation
4. p-value = proportion of permutations whose statistic is at least as extreme as the observed one
Usage:
import numpy as np
from scipy import stats
def permutation_test(group1, group2, n_permutations=10000):
    """
    Two-sample permutation test
    """
    # Observed test statistic
    observed_stat = np.abs(np.mean(group1) - np.mean(group2))
    # Combine data
    combined = np.concatenate([group1, group2])
    n1 = len(group1)
    # Permutations
    perm_stats = []
    for _ in range(n_permutations):
        np.random.shuffle(combined)
        perm_group1 = combined[:n1]
        perm_group2 = combined[n1:]
        perm_stat = np.abs(np.mean(perm_group1) - np.mean(perm_group2))
        perm_stats.append(perm_stat)
    # P-value (the +1 terms count the observed labelling, so p is never exactly 0)
    n_extreme = np.sum(np.array(perm_stats) >= observed_stat)
    p_value = (n_extreme + 1) / (n_permutations + 1)
    return p_value, perm_stats
# Example
p_perm, perm_dist = permutation_test(group1, group2, n_permutations=10000)
print(f"Permutation p-value: {p_perm:.4f}")
Advantages:
✓ No distribution assumptions
✓ Exact p-values when all permutations are enumerated (random sampling of permutations gives an approximation)
✓ Flexible (any test statistic)
Disadvantages:
✗ Computationally intensive
✗ Requires sufficient data
Comparison Table
| Method | Conservativeness | Power | Best For |
|---|---|---|---|
| Bonferroni | Very high | Low | Few tests, minimize FP |
| Holm | High | Medium | Few tests, more power |
| FDR (B-H) | Medium | High | Many tests, discovery |
| Permutation | Exact | High | Small datasets, no assumptions |
Decision Guide
Number of tests:
│
├─ Few (< 20)
│ ├─ Minimize false positives → Bonferroni
│ └─ More power → Holm
│
├─ Many (20-1000)
│ ├─ Exploratory → FDR (Benjamini-Hochberg)
│ └─ Confirmatory → Holm
│
└─ Very many (> 1000, e.g., all wavenumbers)
└─ FDR (Benjamini-Hochberg)
Effect Size Measures
Why Important: Statistical significance ≠ practical significance
Small p-value with large n → tiny effect can be “significant”
Effect size → How large is the difference?
Cohen’s d
Purpose: Standardized mean difference
Formula:
d = (mean₁ - mean₂) / pooled_SD
Calculation:
import numpy as np
def cohens_d(group1, group2):
    """
    Calculate Cohen's d effect size
    """
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = np.mean(group1), np.mean(group2)
    var1, var2 = np.var(group1, ddof=1), np.var(group2, ddof=1)
    # Pooled standard deviation
    pooled_std = np.sqrt(((n1-1)*var1 + (n2-1)*var2) / (n1+n2-2))
    # Cohen's d
    d = (mean1 - mean2) / pooled_std
    return d
# Example
d = cohens_d(group1, group2)
print(f"Cohen's d: {d:.3f}")
Interpretation:
|d| < 0.2: Negligible
|d| = 0.2: Small effect
|d| = 0.5: Medium effect
|d| = 0.8: Large effect
|d| > 1.2: Very large effect
Eta-Squared (η²)
Purpose: Proportion of variance explained (ANOVA)
Formula:
η² = SS_between / SS_total
Calculation:
import numpy as np

def eta_squared(groups):
    """
    Calculate eta-squared for ANOVA
    """
    # Grand mean
    grand_mean = np.mean(np.concatenate(groups))
    # Between-group sum of squares
    SS_between = sum([
        len(g) * (np.mean(g) - grand_mean)**2
        for g in groups
    ])
    # Total sum of squares
    SS_total = sum([
        (x - grand_mean)**2
        for g in groups
        for x in g
    ])
    # Eta-squared
    eta_sq = SS_between / SS_total
    return eta_sq
# Example
eta_sq = eta_squared([group1, group2, group3])
print(f"η²: {eta_sq:.3f}")
Interpretation:
η² = 0.01: Small effect (1% variance explained)
η² = 0.06: Medium effect (6%)
η² = 0.14: Large effect (14%)
Omega-Squared (ω²)
Purpose: Less biased estimate of variance explained
Preferred over η² for smaller samples
Calculation:
import numpy as np

def omega_squared(groups):
    """
    Calculate omega-squared (less biased than eta-squared)
    """
    k = len(groups)  # number of groups
    N = sum([len(g) for g in groups])  # total sample size
    grand_mean = np.mean(np.concatenate(groups))
    SS_between = sum([
        len(g) * (np.mean(g) - grand_mean)**2
        for g in groups
    ])
    SS_within = sum([
        sum((x - np.mean(g))**2 for x in g)
        for g in groups
    ])
    MS_within = SS_within / (N - k)
    omega_sq = (SS_between - (k - 1) * MS_within) / (SS_between + SS_within + MS_within)
    return max(0, omega_sq)  # The estimate can come out negative; report it as 0
omega_sq = omega_squared([group1, group2, group3])
print(f"ω²: {omega_sq:.3f}")
Reporting Effect Sizes
Best Practice: Always report effect size with p-value
Example Report:
"Group A showed significantly higher intensity than Group B
(t(58) = 3.42, p = 0.001, d = 0.85), representing a large effect."
"ANOVA revealed significant differences among groups
(F(2, 87) = 12.3, p < 0.001, η² = 0.22), explaining 22% of variance."
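For a one-way ANOVA, η² can be recovered from the reported F statistic and its degrees of freedom, which is a quick way to check that a report is internally consistent. The sketch below applies this to the ANOVA example above; the function name is hypothetical.

```python
def eta_squared_from_f(f_stat, df_between, df_within):
    """Recover eta-squared from a one-way ANOVA F statistic."""
    # F = (SS_between / df_between) / (SS_within / df_within)
    # => SS_between / SS_within = F * df_between / df_within
    return (f_stat * df_between) / (f_stat * df_between + df_within)

# Values from the example report: F(2, 87) = 12.3
eta_sq = eta_squared_from_f(12.3, 2, 87)
print(f"η² = {eta_sq:.2f}")  # η² = 0.22
```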
Band Ratio Analysis
Purpose: Compare relative intensities of specific spectral bands
Theory
Rationale:
Dimensionality reduction (1 feature from 2 peaks)
Normalize for concentration/thickness
Create interpretable biomarkers
Common Ratios:
Amide I / CH₂: I₁₆₅₅ / I₁₄₄₅
Protein / Lipid: I₁₆₅₀ / I₁₃₀₀
Amide III ratio: I₁₂₉₀ / I₁₂₄₀
Phosphate / Amide: I₉₆₀ / I₁₆₅₀
Calculation
import numpy as np

def calculate_band_ratio(spectra, wavenumbers,
                         band1_range, band2_range):
    """
    Calculate ratio between two spectral bands
    Parameters:
    -----------
    spectra : array (n_samples, n_features)
    wavenumbers : array (n_features,)
    band1_range : tuple (start, end) wavenumbers
    band2_range : tuple (start, end) wavenumbers
    Returns:
    --------
    ratios : array (n_samples,)
    """
    # Find indices for bands
    idx1 = np.where((wavenumbers >= band1_range[0]) &
                    (wavenumbers <= band1_range[1]))[0]
    idx2 = np.where((wavenumbers >= band2_range[0]) &
                    (wavenumbers <= band2_range[1]))[0]
    # Integrate (or max) each band
    # Note: np.trapz is named np.trapezoid in NumPy 2.0 and later
    intensity1 = np.trapz(spectra[:, idx1], wavenumbers[idx1], axis=1)
    intensity2 = np.trapz(spectra[:, idx2], wavenumbers[idx2], axis=1)
    # Calculate ratio
    ratios = intensity1 / (intensity2 + 1e-10)  # Avoid division by zero
    return ratios
# Example usage
ratios = calculate_band_ratio(
preprocessed_spectra,
wavenumbers,
band1_range=(1645, 1665), # Amide I
band2_range=(1440, 1460) # CH₂
)
# Statistical comparison
from scipy import stats
t_stat, p_value = stats.ttest_ind(
ratios[labels == 'Control'],
ratios[labels == 'Disease']
)
print(f"Band ratio test: t={t_stat:.3f}, p={p_value:.4f}")
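A quick way to sanity-check a band-ratio calculation is to run it on a synthetic spectrum whose answer is known. In the sketch below, two Gaussian bands of equal width are placed at 1655 and 1450 cm⁻¹ with heights 2 and 1, so the integrated ratio over equally wide windows should be 2. The helper names are hypothetical, and `scipy.integrate.trapezoid` is used in place of `np.trapz`.

```python
import numpy as np
from scipy.integrate import trapezoid

wavenumbers = np.arange(1400.0, 1701.0, 1.0)

def gaussian_band(center, height, width=5.0):
    """Synthetic Gaussian band on the shared wavenumber axis."""
    return height * np.exp(-0.5 * ((wavenumbers - center) / width) ** 2)

# Baseline-free synthetic spectrum: the 1655 band is twice as tall as the 1450 band
spectrum = gaussian_band(1655.0, 2.0) + gaussian_band(1450.0, 1.0)

def band_area(low, high):
    """Trapezoidal area of the spectrum between two wavenumbers (inclusive)."""
    mask = (wavenumbers >= low) & (wavenumbers <= high)
    return trapezoid(spectrum[mask], wavenumbers[mask])

ratio = band_area(1645, 1665) / band_area(1440, 1460)
print(f"Band ratio: {ratio:.3f}")  # Band ratio: 2.000
```

Real spectra add baseline and overlapping bands, so the recovered ratio will deviate from the ideal value unless preprocessing removes them.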
Integration Methods
1. Trapezoidal Integration (Recommended):
intensity = np.trapz(spectrum[idx], wavenumbers[idx])
2. Maximum Intensity:
intensity = np.max(spectrum[idx])
3. Area Under Curve (scikit-learn helper; it also uses the trapezoidal rule, so it matches method 1):
from sklearn.metrics import auc
intensity = auc(wavenumbers[idx], spectrum[idx])
Statistical Testing
# 1. Compare ratios between groups
from scipy import stats
group1_ratios = ratios[labels == 'GroupA']
group2_ratios = ratios[labels == 'GroupB']
# T-test
t_stat, p_value = stats.ttest_ind(group1_ratios, group2_ratios)
# Effect size
d = cohens_d(group1_ratios, group2_ratios)
print(f"t={t_stat:.3f}, p={p_value:.4f}, d={d:.3f}")
# 2. Correlation with clinical variable
r, p = stats.pearsonr(ratios, clinical_scores)
print(f"Correlation: r={r:.3f}, p={p:.4f}")
# 3. ROC analysis (for classification)
from sklearn.metrics import roc_auc_score, roc_curve
# Binary classification
binary_labels = (labels == 'Disease').astype(int)
auc_score = roc_auc_score(binary_labels, ratios)
fpr, tpr, thresholds = roc_curve(binary_labels, ratios)
print(f"AUC: {auc_score:.3f}")
# Find optimal threshold
optimal_idx = np.argmax(tpr - fpr)
optimal_threshold = thresholds[optimal_idx]
print(f"Optimal threshold: {optimal_threshold:.3f}")
Best Practices
Preprocessing:
Always baseline-correct before ratio calculation
Use same normalization for all spectra
Check for negative values
Band Selection:
Use literature-validated bands
Avoid overlapping bands
Verify peaks present in your samples
Statistical Reporting:
"The Amide I/CH₂ ratio was significantly elevated in disease samples (mean ± SD: 2.34 ± 0.45) compared to controls (1.87 ± 0.32; t(58) = 4.52, p < 0.001, d = 1.17)."
Best Practices
General Workflow
1. Check Assumptions
├─ Normality (Shapiro-Wilk)
├─ Equal variance (Levene's)
└─ Independence
2. Choose Appropriate Test
├─ Parametric (if assumptions met)
└─ Non-parametric (if violated)
3. Apply Test
└─ Use two-sided unless justified
4. Multiple Testing Correction
├─ Few tests → Bonferroni/Holm
└─ Many tests → FDR
5. Report Effect Size
├─ Cohen's d (t-test)
├─ η² or ω² (ANOVA)
└─ r (correlation)
6. Interpret in Context
└─ Statistical + Practical significance
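Steps 1 and 2 of the workflow can be sketched as a small helper that picks a two-group test from a Shapiro-Wilk pre-check. The function name, the returned labels, and the example data are hypothetical; the sketch defaults to Welch's t-test when normality is not rejected and does not replace visual inspection of the data.

```python
import numpy as np
from scipy import stats

def choose_two_group_test(group1, group2, alpha=0.05):
    """Hypothetical helper: pick a test name from a Shapiro-Wilk pre-check."""
    # Normality is treated as acceptable only if neither group rejects it
    looks_normal = all(stats.shapiro(g)[1] >= alpha for g in (group1, group2))
    return "welch_t_test" if looks_normal else "mann_whitney_u"

# Hypothetical data: evenly spaced normal quantiles vs a strongly right-skewed sample
normal_like = stats.norm.ppf(np.linspace(0.05, 0.95, 20))
skewed = np.exp(np.linspace(0.0, 6.0, 20))

print(choose_two_group_test(normal_like, normal_like + 1.0))
print(choose_two_group_test(skewed, normal_like))
```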
Reporting Checklist
✓ Test used and why
✓ Test statistic and degrees of freedom
✓ P-value (exact or < 0.001)
✓ Effect size with interpretation
✓ Sample sizes per group
✓ Descriptive statistics (mean ± SD)
✓ Multiple testing correction method
✓ Assumptions checked (state violations if any)
Common Mistakes to Avoid
❌ Multiple testing without correction → Inflated Type I error
❌ Using parametric tests on non-normal data → Invalid p-values
❌ Reporting only p-values → Missing practical significance
❌ One-sided tests without justification → Inflated significance
❌ Ignoring assumption violations → Misleading results
❌ Cherry-picking significant results → Publication bias
See Also
Statistical Analysis User Guide - Step-by-step tutorials
Exploratory Methods - PCA, UMAP, clustering
Machine Learning Methods - Classification algorithms
Best Practices - Analysis strategies
Last Updated: 2026-01-24