Welcome to mlQTL
Overview
mlQTL is a gene-centric machine learning framework for genome-wide QTL detection. It models the relationship between genomic variants and phenotypes at the gene level, capturing nonlinear effects and weak-effect loci. A sliding window strategy aggregates gene-level signals to identify high-confidence QTL regions and prioritize candidate causal variants. mlQTL is released as an open-source Python toolkit for high-throughput, reproducible genetic analysis and molecular breeding research.
Features
- Gene-level QTL detection: Uses SNPs from any genomic region within a gene to model gene-phenotype associations.
- Multiple regression models: Supports Decision Tree, Random Forest, and Support Vector Regression; additional models and encoding schemes can be added.
- Sliding window analysis: Aggregates gene scores into window scores for robust QTL detection.
- SNP prioritization: Feature importance scores quantify contributions of individual SNPs for fine-scale variant prioritization.
- Scalable and efficient: Supports large datasets with multi-process parallelism.
- Flexible workflow: Provides a command-line interface and a Python API with customizable parameters, visualization, and output options.
- Open-source and reproducible: Available on GitHub with example datasets and documentation.
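
To make the sliding window idea concrete, here is a minimal sketch of how per-gene scores could be aggregated into window scores. It is an illustration only and does not use or reproduce the mlQTL API: the function name `sliding_window_scores`, the `window_size` and `step` parameters, the use of a simple mean as the aggregate, and the example scores are all assumptions made for this example.

```python
# Illustrative sketch only: not mlQTL's actual implementation or API.
from statistics import mean


def sliding_window_scores(gene_scores, window_size=3, step=1):
    """Average per-gene scores over consecutive windows of genes.

    gene_scores: list of (gene_id, score) tuples ordered by genomic position.
    Returns a list of (first_gene, last_gene, window_score) tuples.
    """
    if window_size < 1 or step < 1:
        raise ValueError("window_size and step must be positive")
    windows = []
    last_start = len(gene_scores) - window_size
    for start in range(0, max(last_start + 1, 0), step):
        chunk = gene_scores[start:start + window_size]
        window_score = mean(score for _, score in chunk)
        windows.append((chunk[0][0], chunk[-1][0], window_score))
    return windows


# Hypothetical per-gene scores, ordered along one chromosome.
gene_scores = [("g1", 0.10), ("g2", 0.20), ("g3", 0.90), ("g4", 0.80), ("g5", 0.10)]
windows = sliding_window_scores(gene_scores, window_size=2)

# The highest-scoring window marks the candidate QTL region in this toy data.
best = max(windows, key=lambda w: w[2])
print(best[0], best[1], round(best[2], 2))
```

Averaging over neighboring genes dampens isolated high scores from a single gene, so a region stands out only when several adjacent genes carry signal. The aggregate function and window size used by mlQTL itself may differ; consult the project documentation for the actual parameters.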