GAIA

gaia is a Python package for Genomic Analysis of Introgressed Alleles using machine learning. Currently, it supports two types of models:

  • Logistic regression models
  • U-Net models

gaia uses established population genetic simulators like msprime for generating training and test data. It can be applied to detect introgressed fragments or alleles in genomes from various species.

Requirements

gaia works on UNIX/Linux operating systems and has been tested with the following:

  • Python 3.9
  • Python packages:

    • demes=0.2.3
    • h5py=3.10.0
    • joblib=1.3.2
    • msprime=1.3.1
    • numpy=1.26.4
    • pandas=2.2.1
    • python=3.9.19
    • pyranges=0.0.129
    • pytest=8.1.1
    • scikit-allel=1.3.7
    • scikit-learn=1.4.1.post1
    • scipy=1.12.0
    • tskit=0.5.6
    • pyyaml=6.0.1
    • seriate==1.1.2
    • torch==2.2.0
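To see whether an existing environment matches the tested versions, a small check along the following lines may help. It is only a sketch using the standard library; the helper name `check_versions` is hypothetical and not part of gaia, and the pins are copied from the list above.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions gaia was tested with, copied from the list above.
TESTED_VERSIONS = {
    "demes": "0.2.3",
    "h5py": "3.10.0",
    "joblib": "1.3.2",
    "msprime": "1.3.1",
    "numpy": "1.26.4",
    "pandas": "2.2.1",
    "pyranges": "0.0.129",
    "pytest": "8.1.1",
    "scikit-allel": "1.3.7",
    "scikit-learn": "1.4.1.post1",
    "scipy": "1.12.0",
    "tskit": "0.5.6",
    "pyyaml": "6.0.1",
    "seriate": "1.1.2",
    "torch": "2.2.0",
}


def check_versions(pins):
    """Return {package: (tested_version, installed_version_or_None, matches)}."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is not installed in this environment
        report[name] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_versions(TESTED_VERSIONS).items():
        status = "ok" if ok else "differs"
        print(f"{name}: tested {wanted}, installed {found} ({status})")
```

A differing version does not necessarily mean gaia will fail; it only means that combination is not the one listed as tested.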

Installation

Users can install gaia by using the following commands:

git clone https://github.com/xin-huang/gaia
cd gaia
mamba env create -f env.yaml
mamba activate gaia
pip install .

These commands require mamba, which must be installed beforehand because it is used to create the virtual environment.

Help

To get help information, users can use:

gaia -h

This will display information for three commands:

Command   Description
lr        Use logistic regression models
unet      Use U-Net models
eval      Evaluate model performance

If you need further help, such as reporting a bug or suggesting a feature, please open an issue.