This function will calculate D statistics in blocks of prespecified size for all pairs of populations specified in the poplabels file. Please refer to the Relate documentation for input file formats (https://myersgroup.github.io/relate/).
D_from_Relate(
file_anc,
file_mut,
poplabels,
pop1,
pop2,
pop3,
pop4,
file_map = NULL,
chrs = NULL,
blgsize = NULL,
mu = NULL,
tmin = NULL,
t = NULL,
transitions = NULL,
use_muts = NULL,
minMAF = NULL
)
Filename of anc file. If chrs is specified, this should only be the prefix, resulting in filenames of ${file_anc}_chr${chr}.anc(.gz).
Filename of mut file. If chrs is specified, this should only be the prefix, resulting in filenames of ${file_anc}_chr${chr}.anc(.gz).
Filename of poplabels file
Population 1. Either a population in poplabels or Root.
Population 2. Either a population in poplabels or Root.
Population 3. Either a population in poplabels or Root.
Population 4. Either a population in poplabels or Root.
(Optional) File prefix of recombination map. Not needed if blgsize is given in base-pairs, i.e. blgsize > 100
(Optional) Vector of chromosome IDs
(Optional) SNP block size in Morgan. Default is 0.05 (5 cM). If blgsize is 100 or greater, if will be interpreted as base pair distance rather than centimorgan distance. If blgsize is negative, every tree is its own block.
(Optional) Per base per generation mutation rate to scale f2 values. Default: 1.25e-8
(Optional) Minimum time cutof in generations. Any lineages younger than tmin will be excluded from the analysis. Default: t = 0.
(Optional) Time cutoff in generations. Default: Inf
(Optional) Set this to FALSE to exclude transition SNPs. Only meaningful with use_muts
(Optional) Calculate traditional f2 statistics by only using mutations mapped to Relate trees. Default: false.
(Optional) Minimum frequency cutoff. Default: 1 (i.e. excl singletons)
Data frame accepted by jackknife function.
file_anc <- system.file("sim/msprime_ad0.8_split250_1_chr1.anc.gz", package = "twigstats")
file_mut <- system.file("sim/msprime_ad0.8_split250_1_chr1.mut.gz", package = "twigstats")
poplabels <- system.file("sim/msprime_ad0.8_split250_1.poplabels", package = "twigstats")
file_map <- system.file("sim/genetic_map_combined_b37_chr1.txt.gz", package = "twigstats")
#Calculate D
D_blocks <- D_from_Relate(file_anc, file_mut, poplabels, "P4", "P3", "P1", "P2", file_map)
print(jackknife(D_blocks))
#> val se
#> 1 0.004914178 0.003732302
D_blocks <- D_from_Relate(file_anc, file_mut, poplabels, "P4", "P3", "P1", "PX", file_map)
print(jackknife(D_blocks))
#> val se
#> 1 0.02788034 0.005402604
#Calculate D with time cutoff
D_blocks <- D_from_Relate(file_anc, file_mut, poplabels, "P4", "P3", "P1", "PX", file_map, t = 1000)
print(jackknife(D_blocks))
#> val se
#> 1 0.2185745 0.01000329
#Calculate D with mutations only (and time cutoff)
D_blocks <- D_from_Relate(file_anc, file_mut, poplabels, "P4", "P3", "P1", "PX", file_map, use_muts = T, t = 1000)
print(jackknife(D_blocks))
#> val se
#> 1 0.2196723 0.01072831