Quick Start
This quick start guide will help you perform your first complete analysis in 15 minutes.
Prerequisites
Application installed (see Installation Guide)
Sample Raman spectroscopy data (CSV, TXT, ASC/ASCII, or PKL format)
Basic understanding of Raman spectroscopy
Tutorial: Analyzing Blood Plasma Samples
This tutorial demonstrates a complete workflow for comparing healthy vs disease samples.
Step 1: Launch and Create Project (2 minutes)
Launch the application
uv run python main.py    # From source
# OR
# Double-click RamanApp.exe    # Portable/Installer
Create a new project
Click New Project on the Home page
Project Name: Blood Plasma Analysis
Location: Choose a folder (default is fine)
Click Create
Verify project creation
You should see the project name in the title bar
All tabs (Home, Data, Preprocessing, Analysis, ML) should be visible
Step 2: Import Data (3 minutes)
Navigate to Data Package tab
Click the Data Package tab at the top
Import your spectra
Click Import Data button
Select your data files:
CSV: Each column is a spectrum, rows are wavenumbers
TXT: Tab or space-separated values
ASC/ASCII: Text files with wavenumber + intensity columns
PKL: Pickled pandas DataFrame
Click Open
Create groups
Click Create Group in the left panel
Group Name: Healthy
Select spectra from healthy samples
Click Add to Group
Repeat for Disease group
Verify data
Preview pane should show all imported spectra
Check that wavenumber range is correct (typically 400-1800 cm⁻¹)
Verify spectrum count matches your expectations
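The CSV layout and the checks in this step can also be reproduced outside the application. The following is a minimal sketch, not the application's import code: it assumes the first column holds the wavenumber axis, and the inline sample data, the column names, and the `check_spectra` helper are hypothetical.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical file content: first column is the wavenumber axis,
# every other column is one spectrum (names are illustrative only).
csv_text = """wavenumber,healthy_01,healthy_02,disease_01
400.0,10.2,11.0,9.8
1100.0,35.5,36.3,30.1
1800.0,12.9,13.1,12.4
"""

# Use the first column as the index so each remaining column is a spectrum.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)
wavenumbers = df.index.to_numpy(dtype=float)  # shape: (n_points,)
spectra = df.to_numpy(dtype=float).T          # shape: (n_spectra, n_points)


def check_spectra(wavenumbers, spectra, expected_count, wn_min=400.0, wn_max=1800.0):
    """Return a list of problems found; an empty list means every check passed."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    spectra = np.asarray(spectra, dtype=float)
    problems = []
    if spectra.shape[0] != expected_count:
        problems.append(f"expected {expected_count} spectra, found {spectra.shape[0]}")
    if spectra.shape[1] != wavenumbers.size:
        problems.append("spectra and wavenumber axis have different lengths")
    if wavenumbers.min() > wn_min or wavenumbers.max() < wn_max:
        problems.append(
            f"axis covers {wavenumbers.min():.0f}-{wavenumbers.max():.0f} cm^-1, "
            f"expected at least {wn_min:.0f}-{wn_max:.0f} cm^-1"
        )
    if not np.all(np.isfinite(spectra)):
        problems.append("spectra contain NaN or infinite values")
    return problems


print(f"{spectra.shape[0]} spectra with {spectra.shape[1]} points each")
print("Problems:", check_spectra(wavenumbers, spectra, expected_count=3))
```

Adjust `wn_min` and `wn_max` if your instrument covers a different range than the typical 400-1800 cm⁻¹.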
Step 3: Preprocess Data (5 minutes)
Navigate to Preprocessing tab
Add baseline correction
Click ➕ Add Step button
Category: Baseline Correction
Method: AsLS (Asymmetric Least Squares)
Parameters:
Lambda: 1e6 (smoothness)
P: 0.001 (asymmetry)
Preview: Check that fluorescence background is removed
Add smoothing
Click ➕ Add Step
Category: Smoothing
Method: Savitzky-Golay
Parameters:
Window Length: 11 (must be odd)
Polynomial Order: 3
Preview: Check that noise is reduced without losing peaks
Add normalization
Click ➕ Add Step
Category: Normalization
Method: Vector Normalization
Preview: Check that all spectra have similar intensity scales
Apply pipeline
Review the preview of all steps
Click Apply Pipeline button
Output Name: Preprocessed_Spectra
Select All Datasets
Click Confirm
Wait for processing to complete (~10-30 seconds)
Verify results
New dataset Preprocessed_Spectra should appear in Data Package
Inspect spectra visually - should be clean and normalized
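To make the three preprocessing steps concrete, here is a rough sketch of the same pipeline in NumPy and SciPy, using the parameter values from this step. It is not the application's implementation: `asls_baseline` follows the commonly published asymmetric least squares recipe (Eilers and Boelens), "vector normalization" is assumed to mean division by the L2 norm, and the spectrum is synthetic.

```python
import numpy as np
from scipy import sparse
from scipy.signal import savgol_filter
from scipy.sparse.linalg import spsolve


def asls_baseline(y, lam=1e6, p=0.001, n_iter=10):
    """Estimate a smooth baseline with asymmetric least squares."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator; lam penalizes baseline curvature.
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(n_iter):
        W = sparse.diags(w)
        z = spsolve((W + penalty).tocsc(), w * y)
        # Points above the baseline (peaks) get the small weight p,
        # points below it get 1 - p, so the fit hugs the lower envelope.
        w = np.where(y > z, p, 1.0 - p)
    return z


def preprocess(spectrum):
    """Baseline correction, then Savitzky-Golay smoothing, then L2 normalization."""
    corrected = spectrum - asls_baseline(spectrum, lam=1e6, p=0.001)
    smoothed = savgol_filter(corrected, window_length=11, polyorder=3)
    norm = np.linalg.norm(smoothed)
    return smoothed / norm if norm > 0 else smoothed


# Synthetic spectrum: one Raman-like band on a sloping background plus noise.
rng = np.random.default_rng(0)
wavenumbers = np.arange(400.0, 1801.0, 2.0)
peak = 100.0 * np.exp(-0.5 * ((wavenumbers - 1000.0) / 10.0) ** 2)
background = 50.0 + 0.05 * (wavenumbers - 400.0)
raw = peak + background + rng.normal(scale=1.0, size=wavenumbers.size)

processed = preprocess(raw)
print(f"L2 norm after preprocessing: {np.linalg.norm(processed):.3f}")
print(f"Strongest band at {wavenumbers[np.argmax(processed)]:.0f} cm^-1")
```

The order matters in this sketch: normalizing before baseline removal would scale each spectrum by its fluorescence background rather than by its Raman signal.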
Step 4: Exploratory Analysis with PCA (3 minutes)
Navigate to Analysis tab
Select PCA method
In the method list, click PCA (Principal Component Analysis)
Configure parameters
Dataset: Select Preprocessed_Spectra
Number of Components: 3
Scaling Method: StandardScaler (recommended)
Show 95% Confidence Ellipses: ✓ Enable
Show Loadings Plot: ✓ Enable
Run analysis
Click Run Analysis button
Wait for computation (~5-15 seconds)
Interpret results
Scores Plot (PC1 vs PC2):
Do Healthy and Disease groups separate?
Are there any outliers?
Scree Plot:
How much variance do PC1 and PC2 explain?
As a rule of thumb, PC1 and PC2 together should explain more than 60% of the variance
Loadings Plot:
Which wavenumbers (Raman bands) drive the separation?
Match peaks to biochemical assignments
Export results
Click Export Results button
Choose location and filename
Saves figures (PNG) and data (CSV)
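For readers who want to see what the PCA configuration in this step corresponds to in code, the sketch below runs the same settings (StandardScaler, 3 components) with scikit-learn on synthetic spectra. The data, the `band` helper, and the group sizes are invented for illustration, and the application's own PCA code may differ.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
wavenumbers = np.arange(400.0, 1801.0, 2.0)


def band(center, width=15.0):
    """Gaussian band shape on the wavenumber axis (illustrative only)."""
    return np.exp(-0.5 * ((wavenumbers - center) / width) ** 2)


# 20 synthetic spectra per group; the "Disease" group has a stronger 1650 cm^-1 band.
noise = rng.normal(scale=0.01, size=(40, wavenumbers.size))
healthy = 1.0 * band(1650.0) + band(1000.0) + noise[:20]
disease = 1.5 * band(1650.0) + band(1000.0) + noise[20:]
X = np.vstack([healthy, disease])
labels = np.array(["Healthy"] * 20 + ["Disease"] * 20)

# Scale each wavenumber to zero mean and unit variance, then project to 3 PCs.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=3)
scores = pca.fit_transform(X_scaled)

# Scree information: fraction of variance explained by each component.
explained = pca.explained_variance_ratio_
print("Explained variance ratio:", np.round(explained, 3))

# Scores: compare the group means along PC1 to judge separation.
for name in ("Healthy", "Disease"):
    print(name, "mean PC1 score:", round(scores[labels == name, 0].mean(), 2))

# Loadings: the wavenumber with the largest absolute PC1 loading.
top_wavenumber = wavenumbers[np.argmax(np.abs(pca.components_[0]))]
print(f"Largest PC1 loading near {top_wavenumber:.0f} cm^-1")
```

Because StandardScaler gives every wavenumber unit variance, noise-only regions count as much as band regions, so the explained variance on real data depends strongly on the scaling choice.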
Step 5: Statistical Testing (2 minutes)
Select statistical test
In the method list, click Pairwise Statistical Tests
Configure parameters
Dataset: Preprocessed_Spectra
Group 1: Healthy
Group 2: Disease
Test Method: Mann-Whitney U (non-parametric, recommended)
Multiple Testing Correction: FDR (Benjamini-Hochberg)
Significance Level: 0.05
Run test
Click Run Analysis
Results show:
P-value heatmap across wavenumbers
Significant regions highlighted
Effect sizes
Interpret results
Which wavenumber regions show significant differences?
Map significant peaks to biochemical components:
1650 cm⁻¹ → Amide I (proteins)
1440 cm⁻¹ → CH₂ deformation (lipids)
1000 cm⁻¹ → Phenylalanine (aromatic amino acids)
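The test configuration above can be approximated in a few lines of SciPy and NumPy. This is a sketch under stated assumptions rather than the application's code: the spectra are synthetic, `benjamini_hochberg` is a hand-written helper, and the rank-biserial correlation is used as one possible effect size (the guide does not say which one the application reports).

```python
import numpy as np
from scipy.stats import mannwhitneyu


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted


# Synthetic groups: 15 spectra each, differing only in the 1650 cm^-1 band.
rng = np.random.default_rng(0)
wavenumbers = np.arange(400.0, 1801.0, 2.0)
band = np.exp(-0.5 * ((wavenumbers - 1650.0) / 10.0) ** 2)
healthy = 1.0 * band + rng.normal(scale=0.02, size=(15, wavenumbers.size))
disease = 1.5 * band + rng.normal(scale=0.02, size=(15, wavenumbers.size))

# One two-sided Mann-Whitney U test per wavenumber (axis=0 runs down the columns).
result = mannwhitneyu(healthy, disease, axis=0, alternative="two-sided")
p_adjusted = benjamini_hochberg(result.pvalue)
significant = p_adjusted < 0.05

# Rank-biserial correlation: +1 if Healthy is always higher, -1 if always lower.
effect_size = 2.0 * result.statistic / (healthy.shape[0] * disease.shape[0]) - 1.0

# Band assignments taken from the list above.
BAND_ASSIGNMENTS = {
    1650.0: "Amide I (proteins)",
    1440.0: "CH2 deformation (lipids)",
    1000.0: "Phenylalanine (aromatic amino acids)",
}
for center, name in BAND_ASSIGNMENTS.items():
    idx = int(np.argmin(np.abs(wavenumbers - center)))
    print(f"{center:.0f} cm^-1, {name}: "
          f"significant={bool(significant[idx])}, effect size={effect_size[idx]:+.2f}")
```

Correcting across all wavenumbers at once matters here: with several hundred tests, an uncorrected 0.05 threshold would flag dozens of wavenumbers by chance alone.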
Optional: Machine Learning Classification
If you want to build a classification model:
Step 6: Train ML Model (Optional, +10 minutes)
Navigate to Machine Learning tab
Configure dataset
Select Preprocessed_Spectra
Groups: Ensure Healthy and Disease are defined
Choose algorithm
Algorithm: Random Forest (recommended for beginners)
Parameters: Use defaults
Configure validation
Method: GroupKFold (spectra from the same group never appear in both the training and validation folds, which prevents data leakage)
Number of Folds: 5
Test Set Size: 20%
Train model
Click Train Model
Wait for training (~30 seconds to 2 minutes)
Evaluate results
ROC Curve: Check AUC score (>0.90 is excellent)
Confusion Matrix: Check classification accuracy
SHAP Values: Identify most important wavenumbers
Export model
Click Export Model
Save trained model for future use
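The validation setup in this step can be illustrated with scikit-learn. The sketch below is an assumption-laden stand-in, not the application's training code: the data are synthetic, `patient_ids` is a hypothetical grouping variable (several spectra per patient), and impurity-based feature importances are used in place of SHAP values to keep the example dependency-free.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)
wavenumbers = np.arange(400.0, 1801.0, 10.0)

# 30 hypothetical patients with 4 spectra each; 0 = Healthy, 1 = Disease.
n_patients, per_patient = 30, 4
patient_ids = np.repeat(np.arange(n_patients), per_patient)
y = np.repeat(np.arange(n_patients) % 2, per_patient)
band = np.exp(-0.5 * ((wavenumbers - 1650.0) / 20.0) ** 2)
X = np.outer(1.0 + 0.5 * y, band) + rng.normal(scale=0.05, size=(y.size, wavenumbers.size))

# Hold out 20% of the patients (6 of 30) so no patient is in both sets.
test_mask = patient_ids >= 24
X_train, y_train, groups_train = X[~test_mask], y[~test_mask], patient_ids[~test_mask]
X_test, y_test = X[test_mask], y[test_mask]

model = RandomForestClassifier(random_state=0)

# 5-fold GroupKFold: all spectra of a patient stay together in one fold.
cv_accuracy = cross_val_score(
    model, X_train, y_train,
    groups=groups_train, cv=GroupKFold(n_splits=5), scoring="accuracy",
)
print("Cross-validated accuracy per fold:", np.round(cv_accuracy, 2))

# Final fit on the training patients, evaluation on the held-out patients.
model.fit(X_train, y_train)
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
cm = confusion_matrix(y_test, model.predict(X_test))
print(f"Test AUC: {auc:.2f}")
print("Confusion matrix (rows = true class, columns = predicted class):")
print(cm)

# Stand-in for SHAP: the wavenumber with the highest impurity-based importance.
top_wavenumber = wavenumbers[np.argmax(model.feature_importances_)]
print(f"Most important wavenumber: {top_wavenumber:.0f} cm^-1")
```

Splitting by patient rather than by spectrum is the point of the example: replicate spectra from one patient are highly correlated, so a random split would let the model see near-duplicates of its test data.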
Next Steps
Congratulations! You’ve completed your first analysis. Now explore:
Learn More About Methods
Preprocessing Methods - Complete preprocessing reference
PCA Guide - Deep dive into PCA theory and interpretation
Statistical Tests - All available statistical methods
Machine Learning - Complete ML pipeline guide
Advanced Workflows
Multi-Group Comparison - Compare >2 groups
Custom Pipelines - Build complex preprocessing workflows
Batch Processing - Process multiple datasets
Hyperparameter Optimization - Optimize ML models
Best Practices
Data Quality - Ensure clean data
Avoiding Data Leakage - Proper train/test splitting
Publication-Ready Figures - Export high-quality plots
Reproducible Workflows - Document your analysis
Common Issues
Data Import Problems
Issue: “Unable to read file”
Solution:
Check file format (CSV with headers, TXT tab- or space-separated)
Ensure numeric data only (remove text annotations)
Verify that the wavenumber axis is in the first column or row
Issue: “Dimension mismatch”
Solution:
All spectra must have same wavenumber range
Check for missing data points
Ensure consistent sampling intervals
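If your spectra were recorded on slightly different wavenumber axes, one way to satisfy these requirements before import is to interpolate them onto a shared grid. The following is a minimal sketch using linear interpolation over the overlapping range; the `resample_to_grid` helper, the 2 cm⁻¹ step, and the test signals are illustrative assumptions, not part of the application.

```python
import numpy as np


def resample_to_grid(wavenumbers, intensities, grid):
    """Linearly interpolate one spectrum onto a common wavenumber grid."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    order = np.argsort(wavenumbers)  # np.interp needs an increasing x axis
    return np.interp(grid, wavenumbers[order], intensities[order])


# Two hypothetical spectra with different ranges and sampling intervals.
axes = [np.arange(400.0, 1801.0, 2.0), np.arange(410.0, 1791.0, 1.5)]
spectra = [np.sin(axis / 100.0) for axis in axes]

# Restrict the common grid to the range that every spectrum covers,
# so that no values have to be extrapolated.
lo = max(axis.min() for axis in axes)
hi = min(axis.max() for axis in axes)
grid = np.arange(lo, hi + 1e-9, 2.0)

aligned = np.vstack([resample_to_grid(a, s, grid) for a, s in zip(axes, spectra)])
print(f"Common grid: {grid[0]:.0f}-{grid[-1]:.0f} cm^-1, {grid.size} points")
print("Aligned array shape:", aligned.shape)
```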
Preprocessing Errors
Issue: “Baseline correction failed”
Solution:
Try different method (AsLS, AirPLS, Polynomial)
Adjust lambda parameter (increase for smoother baseline)
Check for cosmic rays or spikes in raw data
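A quick way to check raw data for cosmic rays is to compare each point with a running median and flag large deviations. The sketch below is one simple heuristic, not the application's despiking method; the `find_spikes` helper, the kernel size, and the threshold are assumptions you may need to tune for your data.

```python
import numpy as np
from scipy.signal import medfilt


def find_spikes(intensities, kernel_size=5, threshold=8.0):
    """Return indices whose deviation from a running median is unusually large."""
    y = np.asarray(intensities, dtype=float)
    residual = y - medfilt(y, kernel_size=kernel_size)
    # Robust noise estimate: median absolute deviation scaled to a standard deviation.
    mad = np.median(np.abs(residual - np.median(residual)))
    scale = 1.4826 * mad
    if scale == 0:
        return np.array([], dtype=int)
    return np.flatnonzero(np.abs(residual) > threshold * scale)


# Synthetic spectrum: one broad Raman-like band plus noise.
rng = np.random.default_rng(0)
wavenumbers = np.arange(400.0, 1801.0, 2.0)
spectrum = 100.0 * np.exp(-0.5 * ((wavenumbers - 1000.0) / 10.0) ** 2)
spectrum += rng.normal(scale=1.0, size=wavenumbers.size)
spectrum[500] += 500.0  # inject one artificial spike at 1400 cm^-1

spike_indices = find_spikes(spectrum)
print("Suspected spikes at (cm^-1):", wavenumbers[spike_indices])
```

The short median window follows genuine Raman bands, which span many points, but not single-point spikes, which is why the band at 1000 cm⁻¹ is left alone in this example.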
Issue: “Preview is blank”
Solution:
Check that input dataset is selected
Verify preprocessing parameters are valid
Look for error messages in console/log
Analysis Issues
Issue: “Groups don’t separate in PCA”
Solution:
Ensure preprocessing is correct (baseline + normalization)
Check for outliers and remove bad spectra
Try supervised method (PLS-DA) instead of PCA
Consider that groups may actually be similar
Issue: “No significant differences found”
Solution:
Check sample size (n ≥ 5 per group recommended)
Verify groups are correctly assigned
Consider more sensitive statistical tests
Groups may genuinely not differ
Getting Help
If you encounter issues not covered here:
Check documentation: User Guide and Troubleshooting
Search issues: GitHub Issues
Ask community: GitHub Discussions
Report bug: Create new issue with:
Steps to reproduce
Error messages
Sample data (if possible)
Feedback
Help us improve this quick start guide! Submit suggestions via GitHub Issues with the label documentation.