Title: | FourGamete Package |
---|---|
Description: | The four-gamete test is based on the infinite-sites model which assumes that the probability of the same mutation occurring twice (recurrent or parallel mutations) and the probability of a mutation back to the original state (reverse mutations) are close to zero. Without these types of mutations, the only explanation for observing the four dilocus genotypes (example below) is recombination (Hudson and Kaplan 1985, Genetics 111:147-164). Thus, the presence of all four gametes is also called phylogenetic incompatibility. |
Authors: | Milton T Drott |
Maintainer: | Milton T Drott <[email protected]> |
License: | GPL-2 |
Version: | 0.1.0 |
Built: | 2024-11-21 02:47:57 UTC |
Source: | https://github.com/aflavus/fourgametep |
The function FGf allows users to quickly assess genotypic data for phylogenetic incompatibility, a sign of recombination, using the four-gamete test.
FGf(x)
x | A properly formatted data frame. For details, see '?FourgameteP'. |
All data output files are organized based on names in the example:
...............Locus1....Locus2
Gamete 1.......A.........A
Gamete 2.......A.........B
Gamete 3.......B.........A
Gamete 4.......B.........B
Columns "Locus1" and "Locus2" in data files indicate which two columns (loci) of your data were being compared.
In data output file "TrueUnique", the column "Locus1A" is the first allele (A) in the "Locus1" column, while "Locus2B" is the second allele (B) in the "Locus2" column. Together they make up the four dilocus genotypes (the four gametes): (Locus1A, Locus2A), (Locus1A, Locus2B), (Locus1B, Locus2A), and (Locus1B, Locus2B).
The "FourGametes?" column indicates whether the program was able to find all four gametes. A value of "TRUE" indicates phylogenetic incompatibility between the tested loci. A value of "FALSE" indicates that only three or fewer gametes were found.
"FinalResults": This file contains all locus pair comparisons and reports whether the program found ANY combination of alleles at a locus pair for which all four gametes were present (TRUE), or whether it found only three or fewer of the gametes (FALSE).
"TrueUnique": This file contains all of the possible combinations of alleles at all locus pairs for which the four gametes can be found (TRUE).
The number of monomorphic loci is stored in the value "MonoLoci" and their identity in "NamesOfMonoLoci".
The total number of comparisons between loci that are not monomorphic is stored in "Comparisons".
The number of these comparisons that were TRUE or FALSE is stored in "ComparisonsTrue" or "ComparisonsFalse", respectively.
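The summary values above can be illustrated with a small sketch. This is not the package's code: the variable names (`n_alleles`, `mono`, `n_mono`, `n_comparisons`) are hypothetical, and it assumes that a 0 (missing data) is ignored when deciding whether a locus is monomorphic and that every pair of polymorphic loci is compared exactly once.

```r
# Toy data: rows are individuals, columns are loci, 0 marks missing data.
dat <- data.frame(L1 = c(1, 1, 2, 2),
                  L2 = c(1, 2, 1, 2),
                  L3 = c(3, 3, 3, 0))

# Count distinct non-missing alleles per locus (assumption: 0 is not an allele).
n_alleles <- vapply(dat, function(col) length(unique(col[col != 0])), integer(1))

# Loci with fewer than two alleles are monomorphic.
mono   <- names(n_alleles)[n_alleles < 2]
n_mono <- length(mono)

# Number of pairwise comparisons among the remaining polymorphic loci.
n_comparisons <- choose(sum(n_alleles >= 2), 2)
```

Here `L3` is the only monomorphic locus, so a single comparison (`L1` against `L2`) remains.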
n = c(2, 3, 5)
c = c(2, 4, 5)
df = data.frame(n, c)
FGf(df)
The four-gamete test is based on the infinite-sites model which assumes that the probability of the same mutation occurring twice (recurrent or parallel mutations) and the probability of a mutation back to the original state (reverse mutations) are close to zero. Without these types of mutations, the only explanation for observing the four dilocus genotypes (example below) is recombination (Hudson and Kaplan 1985, Genetics 111:147-164). Thus, the presence of all four gametes is also called phylogenetic incompatibility.
FourgameteP contains a single function: 'FGf()'.
This function determines whether, across all pairwise comparisons of loci, it is possible to find all four gametes.
Example:
...............Locus1....Locus2
Gamete 1.......A.........A
Gamete 2.......A.........B
Gamete 3.......B.........A
Gamete 4.......B.........B
While the example above indicates two alleles at each of two loci, FourGamete will output all possible allele combinations between two loci.
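The test for a single pair of loci can be sketched as follows. This is a minimal illustration of the idea, not the implementation used by 'FGf()': `four_gametes_pair` is a hypothetical helper, and skipping individuals with a 0 at either locus is an assumption about how missing data would be handled.

```r
# Hypothetical helper (not part of FourgameteP): four-gamete test for one locus pair.
# Returns TRUE if any choice of two alleles at each locus shows all four gametes.
four_gametes_pair <- function(locus1, locus2) {
  keep <- locus1 != 0 & locus2 != 0        # assumption: drop individuals with missing data
  l1 <- locus1[keep]
  l2 <- locus2[keep]
  a1 <- unique(l1)
  a2 <- unique(l2)
  if (length(a1) < 2 || length(a2) < 2) return(FALSE)  # a monomorphic locus cannot pass

  observed <- unique(paste(l1, l2))        # dilocus genotypes actually present

  # Try every pair of alleles at locus 1 against every pair at locus 2.
  for (p in combn(a1, 2, simplify = FALSE)) {
    for (q in combn(a2, 2, simplify = FALSE)) {
      needed <- paste(rep(p, each = 2), rep(q, times = 2))  # AA, AB, BA, BB
      if (all(needed %in% observed)) return(TRUE)
    }
  }
  FALSE
}

four_gametes_pair(c(1, 1, 2, 2), c(1, 2, 1, 2))  # TRUE: all four gametes present
four_gametes_pair(c(1, 1, 2),    c(1, 2, 1))     # FALSE: gamete (2, 2) is absent
```

Looping over allele pairs is what allows loci with more than two alleles to be tested, which matches the note above that all allele combinations between two loci are reported.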
Please make sure that your data fits all of the following requirements:
1) Data must represent alleles at haploid loci (maximum of 26 loci)
2) Rows are individuals and columns are loci
3) Loci must be labeled in the top row with names containing no spaces
4) Do not include metadata columns (e.g., individual names or other
5) Alleles at a locus must be numerical
6) Missing data should be coded "0"
7) Files should be saved as tab-delimited text
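Several of the requirements above can be checked before calling 'FGf()'. The sketch below is an assumption about how one might do that with base R. `check_fg_input` is a hypothetical helper that is not part of the package, and the file is a temporary example written only for illustration.

```r
# Hypothetical pre-check (not part of FourgameteP): returns a character vector
# of problems, which is empty when the data frame passes every check.
check_fg_input <- function(df) {
  stopifnot(is.data.frame(df))
  problems <- character(0)
  if (ncol(df) > 26)
    problems <- c(problems, "more than 26 loci")
  if (any(grepl(" ", names(df), fixed = TRUE)))
    problems <- c(problems, "locus names contain spaces")
  if (!all(vapply(df, is.numeric, logical(1))))
    problems <- c(problems, "non-numeric alleles")
  if (anyNA(df))
    problems <- c(problems, "NA values present; code missing data as 0")
  problems
}

# Write a small tab-delimited example file, then read it back.
path <- tempfile(fileext = ".txt")
writeLines(c("LocusA\tLocusB", "1\t2", "2\t0", "1\t1"), path)

# check.names = FALSE keeps locus names exactly as written in the file.
dat <- read.table(path, header = TRUE, sep = "\t", check.names = FALSE)
check_fg_input(dat)
```

An empty result means none of the checked requirements were violated. Requirements that depend on the biology of the data, such as haploidy, cannot be verified this way.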
Milton T Drott [email protected]