Get candidate introgressed fragments
Input
To obtain candidate introgressed fragments from sstar, users should provide the output (e.g. test.tract.threshold) from sstar threshold. If no source genome is available, users can use the following command:
sstar tract --threshold test.tract.threshold --output-prefix test
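If a single source genome or source population is available, users can provide the source match rates from sstar matchrate (e.g. test.tract.src1.match.rate). sstar tract then writes the candidate introgressed regions together with their source match rates to a single BED file: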
sstar tract --threshold test.tract.threshold --output-prefix test --match-rate test.tract.src1.match.rate
If source genomes from two different source populations are available, users can provide the output (e.g. test.tract.src1.match.rate and test.tract.src2.match.rate) from sstar matchrate, and use the following command:
sstar tract --threshold test.tract.threshold --output-prefix test --match-rate test.tract.src1.match.rate test.tract.src2.match.rate
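For scripted pipelines, the three invocations above can be assembled from one helper. The following is a minimal sketch, not part of sstar: the function name build_tract_command is hypothetical, it uses only the flags shown above, and it assumes the sstar executable is on the PATH when the command is eventually run.

```python
from typing import List, Sequence


def build_tract_command(
    threshold: str,
    output_prefix: str,
    match_rates: Sequence[str] = (),
) -> List[str]:
    """Build an argument list for `sstar tract` (hypothetical helper)."""
    # The text above describes zero, one, or two match-rate files only.
    if len(match_rates) > 2:
        raise ValueError("expected at most two match-rate files")
    cmd = [
        "sstar", "tract",
        "--threshold", threshold,
        "--output-prefix", output_prefix,
    ]
    # --match-rate is added only when at least one file is given.
    if match_rates:
        cmd += ["--match-rate", *match_rates]
    return cmd


cmd = build_tract_command(
    "test.tract.threshold",
    "test",
    ["test.tract.src1.match.rate", "test.tract.src2.match.rate"],
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to execute the command.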
Output
If no source match-rate file is provided, sstar tract writes one BED file:
test.bed
If one source match-rate file is provided, sstar tract writes one BED file containing the source match rate:
test.bed
If two source match-rate files are provided, sstar tract writes two BED files:
test.src1.bed
test.src2.bed
These two files contain segments assigned to one of the two possible source genomes or source populations.
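A workflow that checks for these files after a run could derive the expected names from the output prefix. The sketch below only restates the naming rules listed above. The function name expected_bed_files is hypothetical and not part of sstar.

```python
from typing import List


def expected_bed_files(output_prefix: str, n_match_rate_files: int) -> List[str]:
    """Return the BED file names described above for a given prefix."""
    # Zero or one match-rate file: a single BED file.
    if n_match_rate_files in (0, 1):
        return [f"{output_prefix}.bed"]
    # Two match-rate files: one BED file per source.
    if n_match_rate_files == 2:
        return [f"{output_prefix}.src1.bed", f"{output_prefix}.src2.bed"]
    raise ValueError("expected zero, one, or two match-rate files")


print(expected_bed_files("test", 2))  # ['test.src1.bed', 'test.src2.bed']
```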
An example line from the output is shown below:
| 1 | 75030000 | 75080000 | tsk_10 | 0.280797 |
- The first column is the name of the chromosome.
- The second column is the start position.
- The third column is the end position.
- The fourth column is the name of the sample.
- The fifth column is the source match rate. This column is included only when source match-rate files are provided.
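The column layout above can be read with the Python standard library. This is a minimal sketch that assumes the BED file is tab-delimited, as is conventional for BED (the example row above is rendered as a table). The names Tract and read_tracts are hypothetical and not part of sstar.

```python
import csv
import io
from typing import Iterator, NamedTuple, Optional


class Tract(NamedTuple):
    chrom: str
    start: int
    end: int
    sample: str
    match_rate: Optional[float]  # None when no match-rate file was provided


def read_tracts(handle) -> Iterator[Tract]:
    """Yield one Tract per non-empty line of a tab-delimited BED file."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        # The fifth column is optional, so fall back to None if it is absent.
        rate = float(row[4]) if len(row) > 4 else None
        yield Tract(row[0], int(row[1]), int(row[2]), row[3], rate)


# The example row from above, written as a tab-delimited line.
example = io.StringIO("1\t75030000\t75080000\ttsk_10\t0.280797\n")
tracts = list(read_tracts(example))
print(tracts[0].end - tracts[0].start)  # 50000
```

To read a real output file, pass an open file object instead, for example open("test.bed", newline="").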
Settings
When two source match-rate files are provided, users can use --diff to set the difference cutoff for assigning tracts to the two sources. The default is 0.
With two source match-rate files, the current implementation computes match_rate_x - match_rate_y, where match_rate_x comes from the first match-rate file and match_rate_y from the second. Rows with match_rate_x - match_rate_y > diff are written to .src1.bed, and rows with match_rate_x - match_rate_y < diff are written to .src2.bed. Both comparisons are strict, so under this rule a row whose difference equals diff exactly meets neither condition.
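The assignment rule can be sketched in a few lines of plain Python. This illustrates the comparison as described here and is not the sstar source code. The function name assign_source is hypothetical, and returning None for a difference exactly equal to diff is an assumption that follows from the two strict inequalities.

```python
from typing import Optional


def assign_source(
    match_rate_x: float,
    match_rate_y: float,
    diff: float = 0.0,
) -> Optional[str]:
    """Return 'src1', 'src2', or None following the rule described above."""
    delta = match_rate_x - match_rate_y
    if delta > diff:
        return "src1"  # row would go to .src1.bed
    if delta < diff:
        return "src2"  # row would go to .src2.bed
    return None  # delta == diff: neither strict condition holds


print(assign_source(0.28, 0.10))            # src1
print(assign_source(0.10, 0.28))            # src2
print(assign_source(0.20, 0.20))            # None
print(assign_source(0.15, 0.10, diff=0.1))  # src2
```

The last call shows a consequence of using a single cutoff: with a positive --diff, a row whose first match rate is only slightly higher than the second is still assigned to the second source.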