Welcome to mlQTL

Overview

mlQTL is a gene-centric machine learning framework for genome-wide QTL detection. It models the relationship between genomic variants and phenotypes at the gene level, capturing nonlinear effects and weak-effect loci. A sliding window strategy aggregates gene-level signals to identify high-confidence QTL regions and prioritize candidate causal variants. mlQTL is released as an open-source Python toolkit for high-throughput, reproducible genetic analysis and molecular breeding research.

Features

  • Gene-level QTL detection: Uses SNPs from any genomic region within a gene to model gene-phenotype associations.
  • Multiple regression models: Supports Decision Tree, Random Forest, and Support Vector Regression; custom models and encoding schemes can be added.
  • Sliding window analysis: Aggregates gene scores into window scores for robust QTL detection.
  • SNP prioritization: Feature importance scores quantify contributions of individual SNPs for fine-scale variant prioritization.
  • Scalable and efficient: Supports large datasets with multi-process parallelism.
  • Flexible workflow: Provides a command-line interface and a Python API with customizable parameters, visualization, and output options.
  • Open-source and reproducible: Available on GitHub with example datasets and documentation.
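
To make the sliding window idea concrete, the following is a minimal sketch, not mlQTL's actual implementation or API. It assumes each gene already has a single score (for example, a cross-validated model fit), that genes are ordered by position on one chromosome, that a window covers a fixed number of consecutive genes, and that the window score is the mean of its gene scores. The function name `sliding_window_scores`, its parameters, and the example values are all hypothetical.

```python
from statistics import mean


def sliding_window_scores(gene_scores, window_size=5, step=1):
    """Average per-gene scores over windows of consecutive genes.

    gene_scores: list of (gene_id, score) tuples, ordered by genomic
    position on a single chromosome (an assumption of this sketch).
    Returns a list of (first_gene, last_gene, window_score) tuples.
    """
    if window_size < 1 or step < 1:
        raise ValueError("window_size and step must be positive")

    windows = []
    # Last start index that still yields a full window (0 if too few genes).
    last_start = max(len(gene_scores) - window_size, 0)
    for start in range(0, last_start + 1, step):
        chunk = gene_scores[start:start + window_size]
        if not chunk:
            break  # no genes at all
        score = mean(s for _, s in chunk)
        windows.append((chunk[0][0], chunk[-1][0], score))
    return windows


# Hypothetical gene scores, made up purely for illustration.
genes = [
    ("g1", 0.1), ("g2", 0.2), ("g3", 0.9),
    ("g4", 0.8), ("g5", 0.7), ("g6", 0.1),
]

windows = sliding_window_scores(genes, window_size=3, step=1)
best = max(windows, key=lambda w: w[2])
print(f"Top window: {best[0]}..{best[1]} (score {best[2]:.2f})")
```

Averaging over neighboring genes smooths out isolated high scores, so a window ranks highly only when several adjacent genes carry signal. The real toolkit may define windows and aggregate scores differently; consult its documentation for the actual parameters.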