Quick Start

This quick start guide will help you perform your first complete analysis in 15 minutes.

Prerequisites

  • Application installed (see Installation Guide)

  • Sample Raman spectroscopy data (CSV, TXT, ASC/ASCII, or PKL format)

  • Basic understanding of Raman spectroscopy

Tutorial: Analyzing Blood Plasma Samples

This tutorial demonstrates a complete workflow for comparing healthy vs disease samples.

Step 1: Launch and Create Project (2 minutes)

  1. Launch the application

    uv run python main.py  # From source
    # OR
    # Double-click RamanApp.exe  # Portable/Installer
    
  2. Create a new project

    • Click New Project on the Home page

    • Project Name: Blood Plasma Analysis

    • Location: Choose a folder (default is fine)

    • Click Create

  3. Verify project creation

    • You should see the project name in the title bar

    • All tabs (Home, Data, Preprocessing, Analysis, ML) should be visible

Step 2: Import Data (3 minutes)

  1. Navigate to Data Package tab

    • Click the Data Package tab at the top

  2. Import your spectra

    • Click Import Data button

    • Select your data files:

      • CSV: Each column is a spectrum, rows are wavenumbers

      • TXT: Tab or space-separated values

        • ASC/ASCII: Text files with wavenumber + intensity columns

        • PKL: Pickled pandas DataFrame

    • Click Open

  3. Create groups

    • Click Create Group in the left panel

    • Group Name: Healthy

    • Select spectra from healthy samples

    • Click Add to Group

    • Repeat for Disease group

  4. Verify data

    • Preview pane should show all imported spectra

    • Check that wavenumber range is correct (typically 400-1800 cm⁻¹)

    • Verify spectrum count matches your expectations

Step 3: Preprocess Data (5 minutes)

  1. Navigate to Preprocessing tab

  2. Add baseline correction

    • Click ➕ Add Step button

    • Category: Baseline Correction

    • Method: AsLS (Asymmetric Least Squares)

    • Parameters:

      • Lambda: 1e6 (smoothness)

      • P: 0.001 (asymmetry)

    • Preview: Check that fluorescence background is removed

  3. Add smoothing

    • Click ➕ Add Step

    • Category: Smoothing

    • Method: Savitzky-Golay

    • Parameters:

      • Window Length: 11 (must be odd)

      • Polynomial Order: 3

    • Preview: Check that noise is reduced without losing peaks

  4. Add normalization

    • Click ➕ Add Step

    • Category: Normalization

    • Method: Vector Normalization

    • Preview: Check that all spectra have similar intensity scales

  5. Apply pipeline

    • Review the preview of all steps

    • Click Apply Pipeline button

    • Output Name: Preprocessed_Spectra

    • Select All Datasets

    • Click Confirm

    • Wait for processing to complete (~10-30 seconds)

  6. Verify results

    • New dataset Preprocessed_Spectra should appear in Data Package

    • Inspect spectra visually - should be clean and normalized

Step 4: Exploratory Analysis with PCA (3 minutes)

  1. Navigate to Analysis tab

  2. Select PCA method

    • In the method list, click PCA (Principal Component Analysis)

  3. Configure parameters

    • Dataset: Select Preprocessed_Spectra

    • Number of Components: 3

    • Scaling Method: StandardScaler (recommended)

    • Show 95% Confidence Ellipses: ✓ Enable

    • Show Loadings Plot: ✓ Enable

  4. Run analysis

    • Click Run Analysis button

    • Wait for computation (~5-15 seconds)

  5. Interpret results

    • Scores Plot (PC1 vs PC2):

      • Do Healthy and Disease groups separate?

      • Are there any outliers?

    • Scree Plot:

      • How much variance do PC1 and PC2 explain?

      • Typically want >60% for PC1+PC2

    • Loadings Plot:

      • Which wavenumbers (Raman bands) drive the separation?

      • Match peaks to biochemical assignments

  6. Export results

    • Click Export Results button

    • Choose location and filename

    • Saves figures (PNG) and data (CSV)

Step 5: Statistical Testing (2 minutes)

  1. Select statistical test

    • In the method list, click Pairwise Statistical Tests

  2. Configure parameters

    • Dataset: Preprocessed_Spectra

    • Group 1: Healthy

    • Group 2: Disease

    • Test Method: Mann-Whitney U (non-parametric, recommended)

    • Multiple Testing Correction: FDR (Benjamini-Hochberg)

    • Significance Level: 0.05

  3. Run test

    • Click Run Analysis

    • Results show:

      • P-value heatmap across wavenumbers

      • Significant regions highlighted

      • Effect sizes

  4. Interpret results

    • Which wavenumber regions show significant differences?

    • Map significant peaks to biochemical components:

      • 1650 cm⁻¹ → Amide I (proteins)

      • 1440 cm⁻¹ → CH₂ deformation (lipids)

      • 1000 cm⁻¹ → Phenylalanine (aromatic amino acids)

Optional: Machine Learning Classification

If you want to build a classification model:

Step 6: Train ML Model (Optional, +10 minutes)

  1. Navigate to Machine Learning tab

  2. Configure dataset

    • Select Preprocessed_Spectra

    • Groups: Ensure Healthy and Disease are defined

  3. Choose algorithm

    • Algorithm: Random Forest (recommended for beginners)

    • Parameters: Use defaults

  4. Configure validation

    • Method: GroupKFold (prevents data leakage)

    • Number of Folds: 5

    • Test Set Size: 20%

  5. Train model

    • Click Train Model

    • Wait for training (~30 seconds to 2 minutes)

  6. Evaluate results

    • ROC Curve: Check AUC score (>0.90 is excellent)

    • Confusion Matrix: Check classification accuracy

    • SHAP Values: Identify most important wavenumbers

  7. Export model

    • Click Export Model

    • Save trained model for future use

Next Steps

Congratulations! You’ve completed your first analysis. Now explore:

Learn More About Methods

Advanced Workflows

Best Practices

Common Issues

Data Import Problems

Issue: “Unable to read file”
Solution:

  • Check file format (CSV with headers, TXT tab-separated)

  • Ensure numeric data only (remove text annotations)

  • Verify wavenumber range is in first column/row

Issue: “Dimension mismatch”
Solution:

  • All spectra must have same wavenumber range

  • Check for missing data points

  • Ensure consistent sampling intervals

Preprocessing Errors

Issue: “Baseline correction failed”
Solution:

  • Try different method (AsLS, AirPLS, Polynomial)

  • Adjust lambda parameter (increase for smoother baseline)

  • Check for cosmic rays or spikes in raw data

Issue: “Preview is blank”
Solution:

  • Check that input dataset is selected

  • Verify preprocessing parameters are valid

  • Look for error messages in console/log

Analysis Issues

Issue: “Groups don’t separate in PCA”
Solution:

  • Ensure preprocessing is correct (baseline + normalization)

  • Check for outliers and remove bad spectra

  • Try supervised method (PLS-DA) instead of PCA

  • Consider that groups may actually be similar

Issue: “No significant differences found”
Solution:

  • Check sample size (n ≥ 5 per group recommended)

  • Verify groups are correctly assigned

  • Consider more sensitive statistical tests

  • Groups may genuinely not differ

Getting Help

If you encounter issues not covered here:

  1. Check documentation: User Guide and Troubleshooting

  2. Search issues: GitHub Issues

  3. Ask community: GitHub Discussions

  4. Report bug: Create new issue with:

    • Steps to reproduce

    • Error messages

    • Sample data (if possible)

Feedback

Help us improve this quick start guide! Submit suggestions via GitHub Issues with the label documentation.