# Quick Start This quick start guide will help you perform your first complete analysis in **15 minutes**. ## Prerequisites - Application installed (see [Installation Guide](installation.md)) - Sample Raman spectroscopy data (CSV, TXT, ASC/ASCII, or PKL format) - Basic understanding of Raman spectroscopy ## Tutorial: Analyzing Blood Plasma Samples This tutorial demonstrates a complete workflow for comparing healthy vs disease samples. ### Step 1: Launch and Create Project (2 minutes) 1. **Launch the application** ```bash uv run python main.py # From source # OR # Double-click RamanApp.exe # Portable/Installer ``` 2. **Create a new project** - Click **New Project** on the Home page - **Project Name**: `Blood Plasma Analysis` - **Location**: Choose a folder (default is fine) - Click **Create** 3. **Verify project creation** - You should see the project name in the title bar - All tabs (Home, Data, Preprocessing, Analysis, ML) should be visible ### Step 2: Import Data (3 minutes) 1. **Navigate to Data Package tab** - Click the **Data Package** tab at the top 2. **Import your spectra** - Click **Import Data** button - Select your data files: - **CSV**: Each column is a spectrum, rows are wavenumbers - **TXT**: Tab or space-separated values - **ASC/ASCII**: Text files with wavenumber + intensity columns - **PKL**: Pickled pandas DataFrame - Click **Open** 3. **Create groups** - Click **Create Group** in the left panel - **Group Name**: `Healthy` - Select spectra from healthy samples - Click **Add to Group** - Repeat for `Disease` group 4. **Verify data** - Preview pane should show all imported spectra - Check that wavenumber range is correct (typically 400-1800 cm⁻¹) - Verify spectrum count matches your expectations ### Step 3: Preprocess Data (5 minutes) 1. **Navigate to Preprocessing tab** 2. **Add baseline correction** - Click **➕ Add Step** button - **Category**: Baseline Correction - **Method**: AsLS (Asymmetric Least Squares) - **Parameters**: - Lambda: `1e6` (smoothness) - P: `0.001` (asymmetry) - **Preview**: Check that fluorescence background is removed 3. **Add smoothing** - Click **➕ Add Step** - **Category**: Smoothing - **Method**: Savitzky-Golay - **Parameters**: - Window Length: `11` (must be odd) - Polynomial Order: `3` - **Preview**: Check that noise is reduced without losing peaks 4. **Add normalization** - Click **➕ Add Step** - **Category**: Normalization - **Method**: Vector Normalization - **Preview**: Check that all spectra have similar intensity scales 5. **Apply pipeline** - Review the preview of all steps - Click **Apply Pipeline** button - **Output Name**: `Preprocessed_Spectra` - Select **All Datasets** - Click **Confirm** - Wait for processing to complete (~10-30 seconds) 6. **Verify results** - New dataset `Preprocessed_Spectra` should appear in Data Package - Inspect spectra visually - should be clean and normalized ### Step 4: Exploratory Analysis with PCA (3 minutes) 1. **Navigate to Analysis tab** 2. **Select PCA method** - In the method list, click **PCA (Principal Component Analysis)** 3. **Configure parameters** - **Dataset**: Select `Preprocessed_Spectra` - **Number of Components**: `3` - **Scaling Method**: `StandardScaler` (recommended) - **Show 95% Confidence Ellipses**: ✓ Enable - **Show Loadings Plot**: ✓ Enable 4. **Run analysis** - Click **Run Analysis** button - Wait for computation (~5-15 seconds) 5. **Interpret results** - **Scores Plot (PC1 vs PC2)**: - Do Healthy and Disease groups separate? - Are there any outliers? - **Scree Plot**: - How much variance do PC1 and PC2 explain? - Typically want >60% for PC1+PC2 - **Loadings Plot**: - Which wavenumbers (Raman bands) drive the separation? - Match peaks to biochemical assignments 6. **Export results** - Click **Export Results** button - Choose location and filename - Saves figures (PNG) and data (CSV) ### Step 5: Statistical Testing (2 minutes) 1. **Select statistical test** - In the method list, click **Pairwise Statistical Tests** 2. **Configure parameters** - **Dataset**: `Preprocessed_Spectra` - **Group 1**: `Healthy` - **Group 2**: `Disease` - **Test Method**: `Mann-Whitney U` (non-parametric, recommended) - **Multiple Testing Correction**: `FDR (Benjamini-Hochberg)` - **Significance Level**: `0.05` 3. **Run test** - Click **Run Analysis** - Results show: - P-value heatmap across wavenumbers - Significant regions highlighted - Effect sizes 4. **Interpret results** - Which wavenumber regions show significant differences? - Map significant peaks to biochemical components: - 1650 cm⁻¹ → Amide I (proteins) - 1440 cm⁻¹ → CH₂ deformation (lipids) - 1000 cm⁻¹ → Phenylalanine (aromatic amino acids) ## Optional: Machine Learning Classification If you want to build a classification model: ### Step 6: Train ML Model (Optional, +10 minutes) 1. **Navigate to Machine Learning tab** 2. **Configure dataset** - Select `Preprocessed_Spectra` - **Groups**: Ensure `Healthy` and `Disease` are defined 3. **Choose algorithm** - **Algorithm**: Random Forest (recommended for beginners) - **Parameters**: Use defaults 4. **Configure validation** - **Method**: GroupKFold (prevents data leakage) - **Number of Folds**: `5` - **Test Set Size**: `20%` 5. **Train model** - Click **Train Model** - Wait for training (~30 seconds to 2 minutes) 6. **Evaluate results** - **ROC Curve**: Check AUC score (>0.90 is excellent) - **Confusion Matrix**: Check classification accuracy - **SHAP Values**: Identify most important wavenumbers 7. **Export model** - Click **Export Model** - Save trained model for future use ## Next Steps Congratulations! You've completed your first analysis. Now explore: ### Learn More About Methods - [Preprocessing Methods](analysis-methods/preprocessing.md) - Complete preprocessing reference - {ref}`PCA Guide ` - Deep dive into PCA theory and interpretation - [Statistical Tests](analysis-methods/statistical.md) - All available statistical methods - [Machine Learning](analysis-methods/machine-learning.md) - Complete ML pipeline guide ### Advanced Workflows - [Multi-Group Comparison](user-guide/analysis.md) - Compare >2 groups - [Custom Pipelines](user-guide/preprocessing.md) - Build complex preprocessing workflows - [Batch Processing](user-guide/preprocessing.md) - Process multiple datasets - [Hyperparameter Optimization](user-guide/machine-learning.md) - Optimize ML models ### Best Practices - [Data Quality](user-guide/best-practices.md#data-quality) - Ensure clean data - {ref}`Avoiding Data Leakage ` - Proper train/test splitting - [Publication-Ready Figures](user-guide/best-practices.md#figures) - Export high-quality plots - [Reproducible Workflows](user-guide/best-practices.md#reproducibility) - Document your analysis ## Common Issues ### Data Import Problems **Issue**: "Unable to read file" **Solution**: - Check file format (CSV with headers, TXT tab-separated) - Ensure numeric data only (remove text annotations) - Verify wavenumber range is in first column/row **Issue**: "Dimension mismatch" **Solution**: - All spectra must have same wavenumber range - Check for missing data points - Ensure consistent sampling intervals ### Preprocessing Errors **Issue**: "Baseline correction failed" **Solution**: - Try different method (AsLS, AirPLS, Polynomial) - Adjust lambda parameter (increase for smoother baseline) - Check for cosmic rays or spikes in raw data **Issue**: "Preview is blank" **Solution**: - Check that input dataset is selected - Verify preprocessing parameters are valid - Look for error messages in console/log ### Analysis Issues **Issue**: "Groups don't separate in PCA" **Solution**: - Ensure preprocessing is correct (baseline + normalization) - Check for outliers and remove bad spectra - Try supervised method (PLS-DA) instead of PCA - Consider that groups may actually be similar **Issue**: "No significant differences found" **Solution**: - Check sample size (n ≥ 5 per group recommended) - Verify groups are correctly assigned - Consider more sensitive statistical tests - Groups may genuinely not differ ## Getting Help If you encounter issues not covered here: 1. **Check documentation**: [User Guide](user-guide/index.md) and [Troubleshooting](troubleshooting.md) 2. **Search issues**: [GitHub Issues](https://github.com/zerozedsc/Raman-Spectroscopy-Analysis-Application/issues) 3. **Ask community**: [GitHub Discussions](https://github.com/zerozedsc/Raman-Spectroscopy-Analysis-Application/discussions) 4. **Report bug**: Create new issue with: - Steps to reproduce - Error messages - Sample data (if possible) ## Feedback Help us improve this quick start guide! Submit suggestions via [GitHub Issues](https://github.com/zerozedsc/Raman-Spectroscopy-Analysis-Application/issues) with the label `documentation`.