This function computes f2s or Fst in blocks along the genome and returns a data frame with columns blockID, pos, pop1, pop2, and the f2 or Fst values.

TwigScan(
  file_anc,
  file_mut,
  poplabels,
  file_map,
  file_out,
  blgsize = 10000,
  t = Inf,
  use_muts = FALSE,
  Fst = FALSE
)

Arguments

file_anc

Filename of anc file. If chrs is specified, this should only be the prefix, resulting in filenames of ${file_anc}_chr${chr}.anc(.gz).

file_mut

Filename of mut file. If chrs is specified, this should only be the prefix, resulting in filenames of ${file_anc}_chr${chr}.anc(.gz).

poplabels

Filename of poplabels file

file_map

File prefix of recombination map.

file_out

File prefix of output files

blgsize

(Optional) SNP block size in Morgan. Default is 0.05 (5 cM). If blgsize is 1 or greater, if will be interpreted as base pair distance rather than centimorgan distance.

t

(Optional) Time cutoff in generations. Default: Inf

use_muts

(Optional) Calculate traditional f2 statistics by only using mutations mapped to Relate trees. Default: False.

Fst

(Optional) If TRUE, compute Fst. Default: FALSE

Value

Returns a data frame with Fst or f2 values.

Examples

#These lines assign file names to variables file_anc, file_mut, poplabels, file_map.
#see https://myersgroup.github.io/relate/getting_started.html#Output for file formats
file_anc  <- system.file("sim/msprime_ad0.8_split250_1_chr1.anc.gz", package = "twigstats")
file_mut  <- system.file("sim/msprime_ad0.8_split250_1_chr1.mut.gz", package = "twigstats")
#see https://myersgroup.github.io/relate/input_data.html for file formats
poplabels <- system.file("sim/msprime_ad0.8_split250_1.poplabels", package = "twigstats")
file_map  <- system.file("sim/genetic_map_combined_b37_chr1.txt.gz", package = "twigstats") #recombination map (three column format)

df <- TwigScan(file_anc = file_anc,
       file_mut = file_mut,
       poplabels = poplabels,
       file_map = file_map,
       file_out = "test",
       blgsize = 10000,  #optional
       use_muts = FALSE, #optional
       t = 1000,         #optional
       Fst = TRUE         #optional
 )

print(head(df))
#> # A tibble: 6 × 8
#> # Groups:   pop1, pop2 [1]
#>   block    pos   val pop1  pop2  cutoff use_muts  zval
#>   <int>  <int> <dbl> <chr> <chr>  <dbl> <lgl>    <dbl>
#> 1     0  50000 0.157 P1    P2      1000 FALSE     3.66
#> 2     1  60000 0.155 P1    P2      1000 FALSE     3.61
#> 3     2  70000 0.140 P1    P2      1000 FALSE     3.19
#> 4     3  80000 0.137 P1    P2      1000 FALSE     3.12
#> 5     4  90000 0.140 P1    P2      1000 FALSE     3.19
#> 6     5 100000 0.142 P1    P2      1000 FALSE     3.26