GAIA
gaia is a Python package for Genomic Analysis of Introgressed Alleles using machine learning.
Currently, it supports two types of models:
- Logistic regression models
- U-Net models
gaia uses established population genetic simulators like msprime for generating training and test data.
It can be applied to detect introgressed fragments or alleles in genomes from various species.
Requirements
gaia works on UNIX/Linux operating systems and has been tested with the following:
- Python 3.9
- Python packages:
  - demes=0.2.3
  - h5py=3.10.0
  - joblib=1.3.2
  - msprime=1.3.1
  - numpy=1.26.4
  - pandas=2.2.1
  - python=3.9.19
  - pyranges=0.0.129
  - pytest=8.1.1
  - scikit-allel=1.3.7
  - scikit-learn=1.4.1.post1
  - scipy=1.12.0
  - tskit=0.5.6
  - pyyaml=6.0.1
  - seriate==1.1.2
  - torch==2.2.0
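To compare an existing environment against the versions listed above, a small check along the following lines may help. This is a sketch using only the Python standard library and is not part of gaia; the `PINNED` dictionary and the `check_versions` helper are hypothetical names introduced for this example. The `python` entry is left out because the interpreter is not a distribution that `importlib.metadata` can look up.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above (interpreter entry omitted).
PINNED = {
    "demes": "0.2.3",
    "h5py": "3.10.0",
    "joblib": "1.3.2",
    "msprime": "1.3.1",
    "numpy": "1.26.4",
    "pandas": "2.2.1",
    "pyranges": "0.0.129",
    "pytest": "8.1.1",
    "scikit-allel": "1.3.7",
    "scikit-learn": "1.4.1.post1",
    "scipy": "1.12.0",
    "tskit": "0.5.6",
    "pyyaml": "6.0.1",
    "seriate": "1.1.2",
    "torch": "2.2.0",
}


def check_versions(pinned):
    """Return {name: (expected, installed)}; installed is None if missing."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (expected, installed)
    return report


if __name__ == "__main__":
    for name, (expected, installed) in check_versions(PINNED).items():
        status = "ok" if installed == expected else "differs"
        print(f"{name}: expected {expected}, found {installed} ({status})")
```

A differing version does not necessarily mean gaia will fail; the list only records the versions it was tested with.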
Installation
Users can install gaia by using the following commands:
git clone https://github.com/xin-huang/gaia
cd gaia
mamba env create -f env.yaml
mamba activate gaia
pip install .
Note that mamba must be installed beforehand, since it is used to create the virtual environment.
Help
To get help information, users can use:
gaia -h
This will display information for three commands:
Command | Description |
---|---|
lr | Use logistic regression models |
unet | Use U-Net models |
eval | Evaluate model performance |
If you need further help, such as reporting a bug or suggesting a feature, please open an issue.