# Pages API

Reference documentation for application page modules.

## Table of Contents

- {ref}`Page Architecture <page-architecture>`
- {ref}`Home Page <home-page>`
- {ref}`Data Package Page <data-package-page>`
- {ref}`Preprocessing Page <preprocessing-page>`
- {ref}`Analysis Page <analysis-page>`
- {ref}`Machine Learning Page <machine-learning-page>`
- {ref}`Workspace Page <workspace-page>`

---

(page-architecture)=
## Page Architecture

All pages inherit from `BasePage` and follow a consistent structure.

### BasePage Class

```python
from PyQt6.QtWidgets import QWidget


class BasePage(QWidget):
    """
    Base class for all application pages.
    Provides common functionality and interface.
    """

    def __init__(self, parent=None):
        super().__init__(parent)
        self.setup_ui()
        self.connect_signals()

    def setup_ui(self):
        """Initialize UI components - Override in subclasses"""
        raise NotImplementedError

    def connect_signals(self):
        """Connect signals and slots - Override in subclasses"""
        pass

    def load_data(self, data: dict):
        """Load data into page - Override if needed"""
        pass

    def reset(self):
        """Reset page to initial state - Override if needed"""
        pass
```

**Subclass Template**:

```python
from PyQt6.QtWidgets import QPushButton, QVBoxLayout

from pages.base_page import BasePage


class MyCustomPage(BasePage):
    def __init__(self, parent=None):
        super().__init__(parent)

    def setup_ui(self):
        # Passing self installs the layout on the page
        layout = QVBoxLayout(self)
        # Add widgets; the button must exist before connect_signals() runs
        self.my_button = QPushButton("Run")
        layout.addWidget(self.my_button)

    def connect_signals(self):
        # Connect button clicks, etc.
        self.my_button.clicked.connect(self.on_button_clicked)

    def on_button_clicked(self):
        # Handle event
        pass
```

---

(home-page)=
## Home Page

### pages/home_page.py

Welcome page with quick start guide and recent projects.

#### HomePage Class

```python
from pages.home_page import HomePage

home = HomePage()
home.show()
```

**Features**:
- Quick start wizard
- Recent projects list
- Sample data links
- Tutorial access
- System status

**UI Components**:

```python
home.welcome_label          # QLabel: Welcome message
home.quick_start_widget     # QWidget: Quick start guide
home.recent_projects_list   # QListWidget: Recent projects
home.sample_data_buttons    # List[QPushButton]: Sample data
home.tutorial_links         # QListWidget: Tutorial links
```

**Methods**:

##### `load_recent_projects()`

Load and display recent projects.

```python
home.load_recent_projects()
```

**Returns**: None

**Updates**: `recent_projects_list` widget

##### `open_recent_project(index: int)`

Open project from recent list.

```python
home.recent_projects_list.itemClicked.connect(
    lambda item: home.open_recent_project(
        home.recent_projects_list.row(item)
    )
)
```

**Parameters**:
- `index` (int): Project index in recent list

**Side Effects**: Opens project, switches to workspace page

##### `load_sample_data(sample_name: str)`

Load built-in sample dataset.

```python
# Load tissue sample data
home.load_sample_data('tissue_samples')

# Load cell culture data
home.load_sample_data('cell_cultures')
```

**Parameters**:
- `sample_name` (str): Sample dataset name

**Available Samples**:
- `'tissue_samples'`: Tissue Raman spectra
- `'cell_cultures'`: Cell culture spectra
- `'pharmaceuticals'`: Drug analysis spectra

##### `show_tutorial(topic: str)`

Open tutorial dialog for specific topic.

```python
home.show_tutorial('preprocessing')
home.show_tutorial('machine_learning')
```

**Parameters**:
- `topic` (str): Tutorial topic

**Topics**:
- `'getting_started'`
- `'preprocessing'`
- `'analysis'`
- `'machine_learning'`

---

(data-package-page)=
## Data Package Page

### pages/data_package_page.py

Data import, organization, and group management.

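
The `get_data()` and `validate_data()` methods documented later in this section describe a dataset dictionary and a list of consistency checks. As a rough illustration only, the sketch below applies three of those checks (equal spectrum length, no missing values, one group label per spectrum) to a dictionary with the same keys. The function name `validate_dataset` is hypothetical, plain lists stand in for NumPy arrays, and the page's actual validation logic may differ.

```python
import math
from typing import Dict, List, Tuple


def validate_dataset(data: Dict) -> Tuple[bool, List[str]]:
    """Return (is_valid, error_messages) for a dataset dictionary.

    Hypothetical helper, not the application's implementation.
    """
    errors: List[str] = []
    wavenumbers = data.get("wavenumbers", [])
    spectra = data.get("spectra", [])
    groups = data.get("groups", [])

    for i, spectrum in enumerate(spectra):
        # Every spectrum must share the wavenumber axis length
        if len(spectrum) != len(wavenumbers):
            errors.append(
                f"Spectrum {i} has {len(spectrum)} points, "
                f"expected {len(wavenumbers)}"
            )
        # Treat None and NaN as missing values
        if any(v is None or math.isnan(v) for v in spectrum):
            errors.append(f"Spectrum {i} contains missing values")

    # One group label per spectrum
    if len(groups) != len(spectra):
        errors.append(f"{len(groups)} group labels for {len(spectra)} spectra")

    return (not errors, errors)


dataset = {
    "wavenumbers": [400.0, 401.0, 402.0],
    "spectra": [[1.0, 2.0, 3.0], [1.5, float("nan")]],
    "groups": ["Control", "Treatment"],
}
is_valid, errors = validate_dataset(dataset)
for error in errors:
    print(f"Validation error: {error}")
```

Returning a `(bool, list)` pair mirrors the return shape documented for `validate_data()`, so the caller can report every problem at once instead of stopping at the first one.
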
#### DataPackagePage Class

```python
from pages.data_package_page import DataPackagePage

data_page = DataPackagePage()
```

**Features**:
- Import CSV/TXT files
- Create and manage groups
- Data validation
- Metadata editing
- Batch operations

**UI Structure**:

![Data Package page (screenshot)](../../assets/screenshots/en/data-package-page.png)

*Figure: Data Package page showing the file list (left) and preview/properties (right)*

High-level layout:
- Import toolbar
- File list (left)
- Preview plot + properties/metadata (right)

**Key Methods**:

##### `import_files(filepaths: List[str])`

Import spectral data files.

```python
# Single file
data_page.import_files(['data/sample1.csv'])

# Multiple files
data_page.import_files([
    'data/sample1.csv',
    'data/sample2.csv',
    'data/sample3.csv'
])
```

**Parameters**:
- `filepaths` (List[str]): List of file paths

**Returns**: dict - Import summary

**Side Effects**:
- Validates data format
- Adds to file list
- Updates preview

**Supported Formats**:
- CSV (comma-separated)
- TXT (tab-separated)
- ASCII with various delimiters

##### `create_group(group_name: str)`

Create new sample group.

```python
data_page.create_group('Control')
data_page.create_group('Treatment_A')
data_page.create_group('Treatment_B')
```

**Parameters**:
- `group_name` (str): Group identifier

**Returns**: None

**Validation**:
- No duplicate names
- Valid characters only
- Not empty

##### `assign_to_group(sample_indices: List[int], group_name: str)`

Assign samples to group.

```python
# Assign samples 0, 1, 2 to Control group
data_page.assign_to_group([0, 1, 2], 'Control')

# Assign selected samples
selected = data_page.get_selected_indices()
data_page.assign_to_group(selected, 'Treatment_A')
```

**Parameters**:
- `sample_indices` (List[int]): Sample indices
- `group_name` (str): Target group name

##### `get_data()`

Get current dataset.

```python
data = data_page.get_data()
# Returns:
# {
#     'wavenumbers': np.array([...]),
#     'spectra': np.array([[...], [...], ...]),
#     'labels': ['Sample1', 'Sample2', ...],
#     'groups': ['Control', 'Control', 'Treatment', ...],
#     'metadata': {...}
# }
```

**Returns**: dict - Complete dataset

##### `validate_data()`

Validate loaded data.

```python
is_valid, errors = data_page.validate_data()

if not is_valid:
    for error in errors:
        print(f"Validation error: {error}")
```

**Returns**: Tuple[bool, List[str]] - (is_valid, error_messages)

**Checks**:
- All spectra same length
- Matching wavenumbers
- No missing values
- Valid group assignments
- Consistent data types

##### `merge_files(file_indices: List[int])`

Merge multiple files into single dataset.

```python
# Merge first 3 files
data_page.merge_files([0, 1, 2])
```

**Parameters**:
- `file_indices` (List[int]): Files to merge

**Validation**: Ensures compatible wavenumber axes

##### `split_by_group()`

Split dataset into separate files by group.

```python
# Creates separate datasets for each group
files = data_page.split_by_group()
# Returns: {'Control': data_control, 'Treatment': data_treatment}
```

**Returns**: Dict[str, dict] - Group name to data mapping

---

(preprocessing-page)=
## Preprocessing Page

### pages/preprocess_page.py

Spectral preprocessing pipeline builder and executor.

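
Conceptually, a pipeline of this kind is an ordered list of `(method, params)` steps applied one after another, with 0-based step indices. The sketch below illustrates that idea under stated assumptions: `PipelineSketch`, the `METHODS` registry, and the toy `offset` step are hypothetical names invented for this example, and the page's real implementation is not shown in this reference.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

Spectrum = List[float]


def vector_norm(spectrum: Spectrum) -> Spectrum:
    """Scale a spectrum to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in spectrum))
    return [v / norm for v in spectrum] if norm else list(spectrum)


def offset(spectrum: Spectrum, value: float = 0.0) -> Spectrum:
    """Toy step: subtract a constant (a stand-in for baseline correction)."""
    return [v - value for v in spectrum]


# Hypothetical registry mapping method names to functions
METHODS: Dict[str, Callable[..., Spectrum]] = {
    "vector_norm": vector_norm,
    "offset": offset,
}


class PipelineSketch:
    """Ordered list of (method, params) steps; not the application's class."""

    def __init__(self) -> None:
        self.steps: List[Tuple[str, dict]] = []

    def add_step(self, method: str, params: Optional[dict] = None) -> int:
        if method not in METHODS:
            raise ValueError(f"Unknown method: {method}")
        self.steps.append((method, dict(params or {})))
        return len(self.steps) - 1  # 0-based index of the new step

    def remove_step(self, index: int) -> None:
        del self.steps[index]

    def reorder(self, old_index: int, new_index: int) -> None:
        # Take the step out, then put it back at its target position
        self.steps.insert(new_index, self.steps.pop(old_index))

    def apply(self, spectrum: Spectrum) -> Spectrum:
        result = list(spectrum)
        for method, params in self.steps:
            result = METHODS[method](result, **params)
        return result


pipeline = PipelineSketch()
pipeline.add_step("vector_norm")
pipeline.add_step("offset", {"value": 1.0})
pipeline.reorder(1, 0)  # run the offset before the normalization
processed = pipeline.apply([4.0, 5.0])  # offset gives [3.0, 4.0], then unit length
```

Because steps run strictly in list order, reordering changes the result: normalizing before subtracting the offset would give a different spectrum than subtracting first.
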
#### PreprocessPage Class

```python
from pages.preprocess_page import PreprocessPage

preprocess = PreprocessPage()
preprocess.load_data(data)
```

**Features**:
- Interactive pipeline builder
- Real-time preview
- Parameter tuning widgets
- Template pipelines
- Batch processing

**UI Structure**:

![Preprocessing page (screenshot)](../../assets/screenshots/en/preprocessing-page.png)

*Figure: Preprocessing page showing the pipeline builder and preview plots*

High-level layout:
- Method selector (tabs)
- Pipeline steps + add/reorder/remove
- Preview plot + parameters panel

**Key Methods**:

##### `add_preprocessing_step(method: str, params: dict = None)`

Add method to pipeline.

```python
# Add baseline correction
preprocess.add_preprocessing_step(
    'asls',
    {'lambda': 1e5, 'p': 0.01}
)

# Add smoothing
preprocess.add_preprocessing_step(
    'savgol',
    {'window_length': 11, 'polyorder': 3}
)

# Add normalization (no params needed)
preprocess.add_preprocessing_step('vector_norm')
```

**Parameters**:
- `method` (str): Method name
- `params` (dict, optional): Method parameters

**Returns**: int - Step index

##### `remove_preprocessing_step(index: int)`

Remove step from pipeline.

```python
# Remove second step
preprocess.remove_preprocessing_step(1)
```

**Parameters**:
- `index` (int): Step index (0-based)

##### `reorder_steps(old_index: int, new_index: int)`

Change step order in pipeline.

```python
# Move step 0 to position 2
preprocess.reorder_steps(0, 2)
```

**Parameters**:
- `old_index` (int): Current position
- `new_index` (int): Target position

##### `update_step_params(index: int, params: dict)`

Update parameters for specific step.

```python
# Update AsLS parameters
preprocess.update_step_params(0, {
    'lambda': 1e6,
    'p': 0.001
})
```

**Parameters**:
- `index` (int): Step index
- `params` (dict): New parameters

##### `preview_pipeline(spectrum_index: int = 0)`

Preview pipeline on single spectrum.

```python
# Preview on first spectrum
preprocess.preview_pipeline(0)

# Preview on selected spectrum
selected_idx = preprocess.get_selected_spectrum()
preprocess.preview_pipeline(selected_idx)
```

**Parameters**:
- `spectrum_index` (int): Spectrum to preview

**Updates**: Preview plot with before/after comparison

##### `apply_pipeline(progress_callback=None)`

Apply pipeline to all spectra.

```python
# Simple application
result = preprocess.apply_pipeline()

# With progress tracking
def on_progress(current, total, message):
    print(f"{current}/{total}: {message}")

result = preprocess.apply_pipeline(progress_callback=on_progress)
# Returns preprocessed data dictionary
```

**Parameters**:
- `progress_callback` (callable, optional): Progress function(current, total, message)

**Returns**: dict - Preprocessed data

##### `load_pipeline_template(template_name: str)`

Load predefined pipeline.

```python
# Standard pipeline
preprocess.load_pipeline_template('standard')

# Minimal pipeline
preprocess.load_pipeline_template('minimal')

# Derivative-based
preprocess.load_pipeline_template('derivative')
```

**Parameters**:
- `template_name` (str): Template identifier

**Templates**:
- `'standard'`: AsLS → SavGol → Vector norm
- `'minimal'`: AirPLS → Vector norm
- `'aggressive'`: AsLS → Median → Gaussian → SNV
- `'derivative'`: AsLS → SavGol 1st derivative → Vector norm
- `'chemometric'`: MSC → SavGol → Mean center
- `'deep_learning'`: AsLS → MinMax → Feature scaling

##### `save_pipeline(name: str, description: str = '')`

Save current pipeline for reuse.

```python
preprocess.save_pipeline(
    'my_custom_pipeline',
    description='Optimized for tissue samples'
)
```

**Parameters**:
- `name` (str): Pipeline name
- `description` (str, optional): Description

**Storage**: `pipelines/` directory as JSON

##### `load_pipeline(name: str)`

Load saved pipeline.

```python
preprocess.load_pipeline('my_custom_pipeline')
```

**Parameters**:
- `name` (str): Pipeline name

##### `get_available_methods()`

Get list of available preprocessing methods.

```python
methods = preprocess.get_available_methods()
# Returns:
# {
#     'baseline': ['asls', 'airpls', 'polynomial', ...],
#     'smoothing': ['savgol', 'gaussian', 'median', ...],
#     'normalization': ['vector_norm', 'snv', 'msc', ...],
#     'derivatives': ['savgol_1st', 'savgol_2nd'],
#     'advanced': ['cdae', 'wavelet', ...]
# }
```

**Returns**: Dict[str, List[str]] - Methods by category

---

(analysis-page)=
## Exploratory Analysis Page

### pages/exploratory_analysis_page.py

Exploratory and statistical analysis interface.

#### AnalysisPage Class

```python
from pages.exploratory_analysis_page import AnalysisPage

analysis = AnalysisPage()
analysis.load_data(preprocessed_data)
```

**Features**:
- PCA, UMAP, t-SNE visualization
- Clustering (hierarchical, K-means, DBSCAN)
- Statistical tests (t-test, ANOVA, correlation)
- Interactive plots
- Results export

**UI Structure**:

![Analysis page (screenshot)](../../assets/screenshots/en/analysis-page.png)

*Figure: Analysis page showing method selection, parameters, and results/plots*

High-level layout:
- Analysis type + method selection (left)
- Parameters + apply actions
- Results display (plots/tables) (right)

**Key Methods**:

##### `apply_pca(n_components: int = 2, **kwargs)`

Perform PCA analysis.

```python
result = analysis.apply_pca(
    n_components=2,
    whiten=False,
    random_state=42
)
# Returns:
# {
#     'scores': np.array([[...], ...]),
#     'loadings': np.array([[...], ...]),
#     'explained_variance': np.array([...]),
#     'explained_variance_ratio': np.array([...])
# }
```

**Parameters**:
- `n_components` (int or float): Number of PCs or variance to retain
- `**kwargs`: Additional `sklearn.decomposition.PCA` parameters

**Returns**: dict - PCA results

**Updates**: Displays scores plot and loadings

##### `apply_umap(n_neighbors: int = 15, min_dist: float = 0.1, **kwargs)`

Perform UMAP dimensionality reduction.

```python
result = analysis.apply_umap(
    n_neighbors=15,
    min_dist=0.1,
    n_components=2,
    random_state=42
)
# Returns:
# {
#     'embedding': np.array([[...], ...])
# }
```

**Parameters**:
- `n_neighbors` (int): Neighborhood size
- `min_dist` (float): Minimum distance
- `**kwargs`: Additional `umap.UMAP` parameters

**Returns**: dict - UMAP results

##### `apply_hierarchical_clustering(linkage: str = 'ward', **kwargs)`

Perform hierarchical clustering.

```python
result = analysis.apply_hierarchical_clustering(
    linkage='ward',
    metric='euclidean',
    n_clusters=3
)
# Returns:
# {
#     'clusters': np.array([0, 0, 1, 1, 2, 2, ...]),
#     'linkage_matrix': np.array([[...], ...]),
#     'cophenetic_corr': 0.85
# }
```

**Parameters**:
- `linkage` (str): Linkage method ('ward', 'complete', 'average', 'single')
- `**kwargs`: Additional parameters

**Returns**: dict - Clustering results

**Updates**: Displays dendrogram

##### `apply_statistical_test(test_type: str, group1: str, group2: str = None, **kwargs)`

Perform statistical test.

```python
# T-test
result = analysis.apply_statistical_test(
    'ttest',
    group1='Control',
    group2='Treatment',
    equal_var=False  # Welch's t-test
)

# ANOVA (multiple groups)
result = analysis.apply_statistical_test(
    'anova',
    group1=None  # Uses all groups
)

# Returns:
# {
#     'statistic': 3.42,
#     'pvalue': 0.001,
#     'significant_features': [100, 234, 567, ...],
#     'effect_sizes': np.array([...])
# }
```

**Parameters**:
- `test_type` (str): Test name ('ttest', 'mannwhitney', 'anova', 'kruskal')
- `group1` (str): First group name; pass `None` to use all groups, as in the ANOVA example
- `group2` (str, optional): Second group name (for pairwise tests)
- `**kwargs`: Test-specific parameters

**Returns**: dict - Test results

##### `calculate_correlation(method: str = 'pearson')`

Calculate correlation matrix.

```python
result = analysis.calculate_correlation(method='pearson')
# Returns:
# {
#     'correlation_matrix': np.array([[1.0, 0.8, ...], ...]),
#     'pvalue_matrix': np.array([[0.0, 0.001, ...], ...])
# }
```

**Parameters**:
- `method` (str): Correlation method ('pearson', 'spearman', 'kendall')

**Returns**: dict - Correlation results

**Updates**: Displays heatmap

##### `export_results(format: str = 'xlsx')`

Export analysis results.

```python
# Export to Excel
analysis.export_results('xlsx')

# Export to CSV
analysis.export_results('csv')

# Export to PDF report
analysis.export_results('pdf')
```

**Parameters**:
- `format` (str): Output format ('xlsx', 'csv', 'pdf', 'html')

**Creates**: File in `results/` directory

---

(machine-learning-page)=
## Modeling & Classification Page

### pages/machine_learning_page.py

Model training, evaluation, and prediction interface.

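
The `cv_strategy='group'` option documented below is described as patient-level cross-validation. Assuming that means all spectra from one patient must land on the same side of every train/test split, the idea can be sketched in plain Python as follows. The function `group_k_fold` and the patient labels are hypothetical; scikit-learn's `GroupKFold` gives the same guarantee, though this reference does not state which implementation the page uses.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def group_k_fold(
    groups: Sequence[Hashable], n_splits: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) with whole groups kept together."""
    by_group: Dict[Hashable, List[int]] = defaultdict(list)
    for index, group in enumerate(groups):
        by_group[group].append(index)
    if n_splits < 2 or n_splits > len(by_group):
        raise ValueError("n_splits must be between 2 and the number of groups")

    # Deal groups to folds, largest first, always into the smallest fold
    folds: List[List[int]] = [[] for _ in range(n_splits)]
    for members in sorted(by_group.values(), key=len, reverse=True):
        min(folds, key=len).extend(members)

    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(groups)) if i not in held_out]
        yield train, sorted(test)


# One patient label per spectrum (illustrative data)
patients = ["P1", "P1", "P2", "P2", "P2", "P3", "P4", "P4"]
splits = list(group_k_fold(patients, n_splits=2))

for train_idx, test_idx in splits:
    train_patients = {patients[i] for i in train_idx}
    test_patients = {patients[i] for i in test_idx}
    # No patient contributes spectra to both sides of a split
    assert train_patients.isdisjoint(test_patients)
```

Keeping groups intact matters when several spectra come from the same source: splitting them across train and test lets the model score well by recognizing the source rather than the class.
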
#### MachineLearningPage Class

```python
from pages.modeling_classification_page import MachineLearningPage

ml = MachineLearningPage()
ml.load_data(preprocessed_data)
```

**Features**:
- Algorithm selection (SVM, RF, XGBoost, LR, MLP)
- Hyperparameter tuning
- Cross-validation
- Model evaluation
- Feature importance
- Model export/import

**UI Structure**:

![Machine Learning page (screenshot)](../../assets/screenshots/en/ml-page.png)

*Figure: Machine Learning page showing configuration controls and results dashboards*

High-level layout:
- Algorithm selection + train/predict controls
- Validation and hyperparameter settings
- Results dashboard (metrics, plots, exports)

**Key Methods**:

##### `train_model(algorithm: str, params: dict = None, cv_strategy: str = 'stratified')`

Train machine learning model.

```python
# Train Random Forest
result = ml.train_model(
    algorithm='random_forest',
    params={
        'n_estimators': 100,
        'max_depth': None,
        'min_samples_split': 2
    },
    cv_strategy='group'  # Patient-level CV
)

# Returns:
# {
#     'model': trained_model,
#     'train_score': 0.98,
#     'test_score': 0.92,
#     'cv_scores': [0.90, 0.93, 0.91, 0.94, 0.89],
#     'confusion_matrix': np.array([[...], ...]),
#     'classification_report': {...}
# }
```

**Parameters**:
- `algorithm` (str): Algorithm name
- `params` (dict, optional): Hyperparameters
- `cv_strategy` (str): CV type ('stratified', 'kfold', 'group')

**Returns**: dict - Training results

**Supported Algorithms**:
- `'svm'`: Support Vector Machine
- `'random_forest'`: Random Forest
- `'xgboost'`: XGBoost
- `'logistic_regression'`: Logistic Regression
- `'mlp'`: Multi-Layer Perceptron

##### `optimize_hyperparameters(algorithm: str, param_grid: dict, method: str = 'grid', **kwargs)`

Hyperparameter optimization.

```python
# Grid search
result = ml.optimize_hyperparameters(
    algorithm='random_forest',
    param_grid={
        'n_estimators': [50, 100, 200],
        'max_depth': [None, 10, 20],
        'min_samples_split': [2, 5, 10]
    },
    method='grid'
)

# Random search (faster)
result = ml.optimize_hyperparameters(
    algorithm='svm',
    param_grid={
        'C': [0.1, 1, 10, 100],
        'gamma': [0.001, 0.01, 0.1, 1],
        'kernel': ['rbf']
    },
    method='random',
    n_iter=10  # Sample 10 of the 16 possible combinations
)

# Returns:
# {
#     'best_params': {'n_estimators': 100, 'max_depth': 20, ...},
#     'best_score': 0.94,
#     'cv_results': {...}
# }
```

**Parameters**:
- `algorithm` (str): Algorithm name
- `param_grid` (dict): Parameter grid
- `method` (str): Search method ('grid', 'random', 'bayesian')
- `**kwargs`: Method-specific parameters (for example, `n_iter` for random search)

**Returns**: dict - Optimization results

##### `evaluate_model(model, X_test, y_test)`

Evaluate trained model.

```python
metrics = ml.evaluate_model(model, X_test, y_test)
# Returns:
# {
#     'accuracy': 0.92,
#     'precision': 0.91,
#     'recall': 0.93,
#     'f1_score': 0.92,
#     'roc_auc': 0.95,
#     'confusion_matrix': np.array([[...], ...]),
#     'classification_report': {...}
# }
```

**Parameters**:
- `model`: Trained model
- `X_test` (np.ndarray): Test features
- `y_test` (np.ndarray): True labels

**Returns**: dict - Evaluation metrics

##### `predict(model, X_new, return_proba: bool = False)`

Make predictions on new data.

```python
# Predict classes
y_pred = ml.predict(model, X_new)

# Predict probabilities
y_proba = ml.predict(model, X_new, return_proba=True)
```

**Parameters**:
- `model`: Trained model
- `X_new` (np.ndarray): New data
- `return_proba` (bool, optional): Return class probabilities instead of class labels

**Returns**: np.ndarray - Predictions or probabilities

##### `get_feature_importance(model, method: str = 'default', **kwargs)`

Extract feature importance.

```python
# Default importance (model-specific)
importance = ml.get_feature_importance(model)

# Permutation importance (model-agnostic)
importance = ml.get_feature_importance(
    model,
    method='permutation',
    X=X_test,
    y=y_test
)

# SHAP values (most detailed)
importance = ml.get_feature_importance(
    model,
    method='shap',
    X=X_background  # Background dataset
)

# Returns:
# {
#     'importance': np.array([...]),
#     'feature_indices': np.array([...]),
#     'wavenumbers': np.array([...])
# }
```

**Parameters**:
- `model`: Trained model
- `method` (str): Importance method ('default', 'permutation', 'shap')
- `**kwargs`: Method-specific parameters

**Returns**: dict - Feature importance

##### `export_model(model, filepath: str, metadata: dict = None)`

Save trained model to disk.

```python
ml.export_model(model, 'models/my_model.pkl')

# With metadata
ml.export_model(
    model,
    'models/my_model.pkl',
    metadata={
        'algorithm': 'random_forest',
        'trained_date': '2026-01-24',
        'preprocessing': 'standard_pipeline',
        'accuracy': 0.92
    }
)
```

**Parameters**:
- `model`: Trained model
- `filepath` (str): Output path
- `metadata` (dict, optional): Model metadata

##### `load_model(filepath: str)`

Load saved model from disk.

```python
model, metadata = ml.load_model('models/my_model.pkl')
print(f"Accuracy: {metadata['accuracy']}")
```

**Parameters**:
- `filepath` (str): Model file path

**Returns**: Tuple[model, dict] - Model and metadata

---

(workspace-page)=
## Workspace Page

### pages/workspace_page.py

Project management and file browser.

#### WorkspacePage Class

```python
from pages.workspace_page import WorkspacePage

workspace = WorkspacePage()
```

**Features**:
- Project file browser
- File operations (rename, delete, move)
- Project settings
- Data summaries
- Export options

**Key Methods**:

##### `set_project_directory(path: str)`

Set current project directory.

```python
workspace.set_project_directory('/path/to/project')
```

**Parameters**:
- `path` (str): Project directory path

##### `get_project_files()`

List all files in project.

```python
files = workspace.get_project_files()
# Returns:
# {
#     'data': ['raw_data.csv', 'preprocessed.csv'],
#     'models': ['model1.pkl', 'model2.pkl'],
#     'results': ['pca_results.xlsx', 'report.pdf'],
#     'pipelines': ['custom_pipeline.json']
# }
```

**Returns**: Dict[str, List[str]] - Files by category

##### `export_project(format: str = 'zip')`

Export entire project.

```python
# Create ZIP archive
workspace.export_project('zip')

# Create folder structure
workspace.export_project('folder')
```

**Parameters**:
- `format` (str): Export format ('zip', 'folder')

---

## Best Practices

### Data Flow Between Pages

```python
# 1. Load data in Data Package page
data_page.import_files(filepaths)
raw_data = data_page.get_data()

# 2. Preprocess
preprocess_page.load_data(raw_data)
preprocessed_data = preprocess_page.apply_pipeline()

# 3. Analyze
analysis_page.load_data(preprocessed_data)
analysis_page.apply_pca(n_components=2)

# 4. Train model
ml_page.load_data(preprocessed_data)
ml_page.train_model('random_forest')
```

### Error Handling in Pages

```python
import logging

logger = logging.getLogger(__name__)


# Method on a page class; DataError is an application-specific
# exception assumed to be defined elsewhere
def on_button_clicked(self):
    try:
        result = self.perform_operation()
        self.show_success_message("Operation completed")
    except DataError as e:
        self.show_error_dialog(f"Data error: {e}")
    except Exception as e:
        logger.exception("Unexpected error")
        self.show_error_dialog(f"An error occurred: {e}")
```

### Progress Reporting

```python
def long_operation(self):
    total_steps = 100
    for i in range(total_steps):
        # Do work
        process_step(i)

        # Update progress
        progress = int((i + 1) / total_steps * 100)
        self.progress_updated.emit(progress, f"Step {i+1}/{total_steps}")
```

---

## See Also

- [Core API](core.md) - Core application modules
- [Components API](components.md) - UI components
- [Functions API](functions.md) - Processing functions
- [User Guide](../user-guide/index.md) - User documentation

---

**Last Updated**: 2026-01-24