noFixed to function
to redraw loci that were drawn fixed for a single allele. These loci are
not polymorphic so they would normally not be considered in
fixed_loci to test for fixed loci
coanc_to_kinship to easily obtain kinship matrices from coancestry matrices.
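As a minimal sketch of this conversion, the base R code below assumes the usual relationship in which the two matrices agree off the diagonal and the kinship diagonal equals (1 + coancestry diagonal) / 2. The helper name `coanc_to_kinship_sketch` and the toy values are hypothetical; this is not the package's implementation.

```r
# Hypothetical re-implementation for illustration only (base R, not package code).
coanc_to_kinship_sketch <- function(coancestry) {
    kinship <- coancestry
    # off-diagonal entries are shared; only the diagonal is transformed
    diag(kinship) <- (1 + diag(coancestry)) / 2
    kinship
}

# toy coancestry matrix for two individuals (made-up values)
coancestry <- matrix(c(0.2, 0.1, 0.1, 0.3), nrow = 2)
kinship <- coanc_to_kinship_sketch(coancestry)
```

Only the diagonal changes, so population structure encoded off the diagonal is untouched by the conversion.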
qis now returns a numeric admixture proportions matrix (used to be logical).
sigma = 0 special case.
q1dc now provide more informative out-of-bounds messages when
sigma is missing (and
sigma root finding in
s is provided) is now more robust, explicitly tested at boundaries (min
s > 0 achieved at
sigma = 0 and max
s = 1 achieved at
sigma = Inf).
q1dc (users would never need to set them now that procedure is more robust).
fst_admix (no deprecated version available in this case, to eliminate conflict with
coanc_subpops (if general matrix is accepted),
inbr_subpops (vector or scalar versions required)
sigma = 0 bug in
draw_all_admix (compared to deprecated
rbnpsd, which retains old defaults):
noFixed) is now
wantB) are now
draw_p_subpops now admits scalar inputs
inbr_subpops, while number of loci and number of subpopulations can be provided as additional options.
coanc_subpops) or if the diagonal matrix case is required (specified as vector or scalar
admix_prop_1d_circular) to prevent overlapping individuals on the edges, and to better agree visually with the linear version (
low_mem option could be set but filled slowly by locus only.
draw_all_admix is also now automatically low-memory whenever
want_p_ind = FALSE, and the explicit
low_mem option has also been removed.
draw_p_anc to trigger a symmetric Beta distribution for the ancestral allele frequencies, with the desired shape parameter. The
beta option can also be set on the wrapper function
draw_all_admix. This option allows simulation of a distribution heavier on rare variants (when
beta is much smaller than 1), more similar to real human data.
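To illustrate why a small shape parameter favors rare variants, here is a rough base R sketch that draws directly from a symmetric Beta distribution with `rbeta`. It does not call the package; the locus count, the 5% cutoff, and the helper `frac_rare` are arbitrary choices made for this example.

```r
set.seed(1)
m_loci <- 10000  # number of loci (arbitrary)

# symmetric Beta(beta, beta) draws; beta = 1 is the uniform case
p_uniform <- rbeta(m_loci, shape1 = 1, shape2 = 1)
p_skewed <- rbeta(m_loci, shape1 = 0.1, shape2 = 0.1)

# hypothetical helper: fraction of loci with minor allele frequency below 5%
frac_rare <- function(p) mean(pmin(p, 1 - p) < 0.05)
frac_rare_uniform <- frac_rare(p_uniform)
frac_rare_skewed <- frac_rare(p_skewed)
```

With a shape of 0.1 most of the mass sits near 0 and 1, so the rare-variant fraction is far larger than in the uniform case.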
bias_coeff_admix_fit, which caused it to die if the desired bias coefficient was an extreme value (particularly
1). The error message was:
f() values at end points not of opposite sign. The actual bug was not observed in the regular R build, but rather in a limited precision setting where R was configured with
draw_all_admix, to specify desired ancestral allele frequencies instead of having the code generate it randomly (default).
draw_p_subpops.R, clarifying that input
p_anc can be scalar.
draw_all_admix: when option
p_anc is provided as scalar and
want_p_anc = TRUE, now the return value is always a vector (in this case the input scalar value repeated
m_loci times). The previous behavior was to return
p_anc as scalar if that was the input, which could be problematic for downstream applications.
admix_prop_1d_circular had these changes:
fst now have default values (of
NA, respectively) instead of missing, and these “missing” values can be passed to get the same behavior as if they hadn’t been passed at all.
bias_coeff = 1 (to fix an issue only observed on Apple M1).
admix_prop_indep_subpops: default value for the optional parameter
subpops is now made more clear in arguments definition.
README.md and the vignette, to point to the published method in PLoS Genetics.
draw_p_subpops_tree is the tree version of
coanc_tree calculates the true coancestry matrix corresponding to the subpopulations related by a tree.
draw_all_admix has new argument
tree_subpops that can be used in place of
inbr_subpops (to simulate subpopulation allele frequencies using
coanc_subpops) as input, so they work if they are passed the matrix that
sigma is missing (and therefore fit to a desired
bias_coeff), now additionally return multiplicative
factor used to rescale
It’s Fangorn Forest around here with all the tree updates!
fit_tree for fitting trees to coancestry matrices!
scale_tree to easily scale coancestry trees and check for out-of-bounds values.
tree_additive for calculating “additive” edges for probabilistic edge coancestry trees, and also the reverse function .
coanc_tree, but now it’s renamed, exported, and well documented!
phylo objects passed to these functions:
coanc_tree: edge is a shared covariance value affecting all subpopulations.
draw_p_subpops_tree: if root edge is present, functions warn that it will be ignored.
admix_prop_1d_circular: debugged an edge case where
sigma is small but not zero and numerically-calculated densities all come out to zero in a given row of the
admix_prop_1d_circular infinite values also arise), which used to lead to NAs upon row normalization; now for those rows, the closest ancestry (by coordinate distance) gets assigned the full admixture fraction (just as for independent subpopulations/
sigma = 0).
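The failure mode and the fallback can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: it uses a plain normal density on a line rather than the circular geometry, and the coordinates `x`, `mu` and the value of `sigma` are made up.

```r
# individual and ancestry coordinates on a line (made-up values)
x <- c(0.3, 1.4, 2.0)
mu <- c(0, 1, 2)
sigma <- 1e-3  # small but not zero

# unnormalized weights: density of each individual around each ancestry
dens <- outer(x, mu, function(xi, m) dnorm(xi, mean = m, sd = sigma))
row_totals <- rowSums(dens)

# naive row normalization gives NaN (0 / 0) where every density underflowed
q_naive <- dens / row_totals

# fallback: give the closest ancestry the full admixture fraction
q <- q_naive
for (i in which(row_totals == 0)) {
    q[i, ] <- 0
    q[i, which.min(abs(x[i] - mu))] <- 1
}
```

Rows whose densities did not underflow (here the third individual, which sits exactly on an ancestry) are left to the ordinary normalization.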
admix_prop_1d_circular now copy names from the input
coanc_subpops (vector and matrix versions, only required when fitting
bias_coeff) to the columns of the output
draw_genotypes_admix now copies row and column names from input matrix
p_ind (or rownames from
p_ind and column names from the rownames of
admix_proportions when the latter is provided) to output genotype matrix
draw_p_subpops now copies names from
p_anc to rows, names from
inbr_subpops to columns, when present and of the right dimensions.
draw_p_subpops_tree now copies names from
p_anc to rows. Names from
tree_subpops were already copied to columns before.
fst_admix stop if the column names of
admix_proportions and the names of
draw_all_admix stops if the column names of
admix_proportions and the names of either
admix_proportions is passed, stops if the column names of
make_p_ind_admix stops if the column names of
tree_additive now has option
force, which when
TRUE simply proceeds without stopping if additive edges were already present (in
tree$edge.length.add, which is ignored and overwritten).
New functions and bug fixes dealing with reordering tree edges and tips.
tree_reindex_tips for ensuring that tip order agrees in both the internal labels vector and the edge matrix. Such lack of agreement is generally possible (technically the tree is the same for arbitrary orders of edges in the edge matrix). However, such a disagreement causes visual disagreement in plots (for example, trees are plotted in the order of the edge matrix, whereas coancestry matrices are ordered as in the tip labels vector), which can now be fixed in general.
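A rough sketch of what such a reindexing involves is shown below. It uses a plain list that mimics the `edge` and `tip.label` fields of a phylo object, so it runs in base R alone; the toy tree and all variable names are hypothetical, and the package's actual function may work differently.

```r
# toy 3-tip tree: root is node 4, internal node is 5 (made-up example)
tree <- list(
    edge = matrix(c(4, 5,  5, 2,  5, 1,  4, 3), ncol = 2, byrow = TRUE),
    tip.label = c("A", "B", "C")
)
n_tips <- length(tree$tip.label)

# tips appear in the edge matrix in the order 2, 1, 3 (that is, B, A, C)
is_tip_edge <- tree$edge[, 2] <= n_tips
tips_in_edge_order <- tree$edge[is_tip_edge, 2]

# new_index[old] gives the new tip number of the tip previously numbered `old`
new_index <- integer(n_tips)
new_index[tips_in_edge_order] <- seq_len(n_tips)

# renumber tips in the edge matrix and permute labels to match
tree_fixed <- tree
tree_fixed$edge[is_tip_edge, 2] <- new_index[tree$edge[is_tip_edge, 2]]
tree_fixed$tip.label <- tree$tip.label[tips_in_edge_order]
```

The topology is unchanged; afterwards tips are numbered 1, 2, 3 in edge-matrix order and the labels vector follows that same order.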
tree_reorder for reordering tree edges and tips to agree as much as possible with a desired tip order. The heuristic finds the exact solution if it exists, otherwise returns a reasonable order close to the desired order. Tip order in labels and edge matrix agree (via
fit_tree now outputs trees with tip order that better agrees with the input data, and tip order in labels vector and edge matrix now agree (via
tree_additive. Before this bug fix, some trees could trigger the error message “Error: Node index 6 was not assigned coancestry from root! (unexpected)”, where “6” could be other numbers.
draw_p_subpops_tree. Before this bug fix, some trees could trigger the error message “Error: The root node index in
tree_subpops$edge (9) does not match
k_subpops + 1 (6) where
k_subpops is the number of tips! Is the
tree_subpops object malformed?”, where “9” and “6” could be other numbers. Other possible error messages contain “Parent node index 6 has not yet been processed …” or “Child” instead of “Parent”, where “6” could be other numbers.
fit_tree had related fixes, but overall
fit_tree appears to have had no bugs because users cannot provide trees, and the tree-building algorithm does not produce scrambled edges that would have caused problems.
draw_all_admix have a new parameter
maf_min that, when greater than zero, allows for treating rare variants as fixed. In
draw_all_admix, this now allows for simulating loci with frequency-based ascertainment bias.
draw_all_admix that could cause a “stack overflow” error. The function used to call itself recursively if
require_polymorphic_loci = TRUE, and in cases where there are very rare allele frequencies or high
maf_min the number of recursions could be so large that it triggered this error. Now the function has a
while loop, and does not recurse more than one level at a time; there is no limit to the number of iterations and no errors occur inherently due to large numbers of iterations.
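The iterative pattern described above can be sketched in base R as follows. This is only an illustration of redrawing fixed loci in a loop instead of by recursion; the two helpers, the binomial genotype model, the uniform ancestral frequencies, and the parameter values are all assumptions made for the example.

```r
set.seed(2)
n_ind <- 25      # individuals (arbitrary)
m_loci <- 200    # loci (arbitrary)
maf_min <- 0.05  # loci with sample MAF below this are treated as fixed

# hypothetical helper: draw genotypes (rows = loci) given allele frequencies
draw_genotypes_sketch <- function(p) {
    matrix(rbinom(length(p) * n_ind, size = 2, prob = p), nrow = length(p))
}
# hypothetical helper: flag loci whose sample MAF is below maf_min
is_fixed_sketch <- function(X) {
    p_hat <- rowMeans(X) / 2
    pmin(p_hat, 1 - p_hat) < maf_min
}

p_anc <- runif(m_loci, 0.01, 0.5)
X <- draw_genotypes_sketch(p_anc)
fixed <- is_fixed_sketch(X)
n_rounds <- 0
# redraw only the offending loci until none remain; the stack never deepens
while (any(fixed)) {
    p_anc[fixed] <- runif(sum(fixed), 0.01, 0.5)
    X[fixed, ] <- draw_genotypes_sketch(p_anc[fixed])
    fixed <- is_fixed_sketch(X)
    n_rounds <- n_rounds + 1
}
```

Because each pass only overwrites the flagged rows, memory use stays constant no matter how many rounds are needed.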
fit_tree internally simplified to use
stats::hclust, which also results in a small runtime gain. The new code (when
method = "mcquitty", which is default) gives the same answers as before (in other words, the original algorithm was a special case of hierarchical clustering).
method is passed to
hclust. Although all
hclust methods are allowed, for this application the only ones that make sense are “mcquitty” (WPGMA) and “average” (UPGMA). In internal evaluations, both algorithms had similar accuracy and runtime, but only “mcquitty” exactly recapitulates the original algorithm.
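To make the difference between the two linkage methods concrete, the sketch below runs `stats::hclust` on a made-up distance matrix among four subpopulations. How `fit_tree` derives distances from a coancestry matrix is not described here, so the matrix `d` is purely an assumption for illustration.

```r
# toy symmetric distance matrix among four subpopulations (made-up values)
d <- matrix(0, 4, 4, dimnames = list(LETTERS[1:4], LETTERS[1:4]))
d["A", "B"] <- 1
d["A", "C"] <- 3
d["B", "C"] <- 3
d["A", "D"] <- 8
d["B", "D"] <- 8
d["C", "D"] <- 5
d <- d + t(d)

# WPGMA averages the two merged clusters equally; UPGMA weights by cluster size
fit_wpgma <- stats::hclust(stats::as.dist(d), method = "mcquitty")
fit_upgma <- stats::hclust(stats::as.dist(d), method = "average")

fit_wpgma$height  # 1.0 3.0 6.5
fit_upgma$height  # 1.0 3.0 7.0
```

The two methods agree until a cluster of unequal size is merged: at the last step WPGMA gives (8 + 5) / 2 = 6.5, while UPGMA gives (8 + 8 + 5) / 3 = 7.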
inst/CITATION (missed last time I updated them in other locations).
undiff_af for creating “undifferentiated” allele frequency distributions based on real data but with a lower variance (more concentrated around 0.5) according to a given FST, useful for simulating data trying to match real data.
NEWS.md slightly to improve its automatic parsing.
distr = "auto"cases where mixing variance ended up being smaller than required due to roundoff errors (
alpha is now larger than given in direct formula by
eps = 10 * .Machine$double.eps, which is also a new option.
p_anc_distr for passing custom ancestral allele frequency distributions (as vector or function). This differs from the similar preexisting option
p_anc, which fixed ancestral allele frequencies per locus to those values. These two options behave differently when loci have to be re-drawn due to being fixed or having too-low MAFs: passing
p_anc never changes those values, whereas passing
p_anc_distr results in drawing new values as necessary. The new option is more natural biologically and results in re-drawing fixed loci less often.
kinship_mean, and updated all documentation to reflect the correction that this parameter is the mean kinship and not FST (the complete derivation will appear in a manuscript).
F_max is similarly now
bias_coeff_admix_fit) shared by
admix_prop_1d_circular for edge cases.
Inf, but instead an error was encountered.