Title: | Methods for Adaptive Shrinkage, using Empirical Bayes |
---|---|
Description: | The R package 'ashr' implements an Empirical Bayes approach for large-scale hypothesis testing and false discovery rate (FDR) estimation based on the methods proposed in M. Stephens, 2016, "False discovery rates: a new deal", <DOI:10.1093/biostatistics/kxw041>. These methods can be applied whenever two sets of summary statistics---estimated effects and standard errors---are available, just as 'qvalue' can be applied to previously computed p-values. Two main interfaces are provided: ash(), which is more user-friendly; and ash.workhorse(), which has more options and is geared toward advanced users. The ash() and ash.workhorse() also provides a flexible modeling interface that can accommodate a variety of likelihoods (e.g., normal, Poisson) and mixture priors (e.g., uniform, normal). |
Authors: | Matthew Stephens [aut], Peter Carbonetto [aut, cre], Chaoxing Dai [ctb], David Gerard [aut], Mengyin Lu [aut], Lei Sun [aut], Jason Willwerscheid [aut], Nan Xiao [aut], Mazon Zeng [ctb] |
Maintainer: | Peter Carbonetto <[email protected]> |
License: | GPL (>=3) |
Version: | 2.2-66 |
Built: | 2024-11-11 04:45:33 UTC |
Source: | https://github.com/stephens999/ashr |
Implements Empirical Bayes shrinkage and false discovery rate methods based on unimodal prior distributions.
ash( betahat, sebetahat, mixcompdist = c("uniform", "halfuniform", "normal", "+uniform", "-uniform", "halfnormal"), df = NULL, ... ) ash.workhorse( betahat, sebetahat, method = c("fdr", "shrink"), mixcompdist = c("uniform", "halfuniform", "normal", "+uniform", "-uniform", "halfnormal"), optmethod = c("mixSQP", "mixIP", "cxxMixSquarem", "mixEM", "mixVBEM", "w_mixEM"), df = NULL, nullweight = 10, pointmass = TRUE, prior = c("nullbiased", "uniform", "unit"), mixsd = NULL, gridmult = sqrt(2), outputlevel = 2, g = NULL, fixg = FALSE, mode = 0, alpha = 0, grange = c(-Inf, Inf), control = list(), lik = NULL, weights = NULL, pi_thresh = 1e-10 )
ash( betahat, sebetahat, mixcompdist = c("uniform", "halfuniform", "normal", "+uniform", "-uniform", "halfnormal"), df = NULL, ... ) ash.workhorse( betahat, sebetahat, method = c("fdr", "shrink"), mixcompdist = c("uniform", "halfuniform", "normal", "+uniform", "-uniform", "halfnormal"), optmethod = c("mixSQP", "mixIP", "cxxMixSquarem", "mixEM", "mixVBEM", "w_mixEM"), df = NULL, nullweight = 10, pointmass = TRUE, prior = c("nullbiased", "uniform", "unit"), mixsd = NULL, gridmult = sqrt(2), outputlevel = 2, g = NULL, fixg = FALSE, mode = 0, alpha = 0, grange = c(-Inf, Inf), control = list(), lik = NULL, weights = NULL, pi_thresh = 1e-10 )
betahat |
a p vector of estimates |
sebetahat |
a p vector of corresponding standard errors |
mixcompdist |
distribution of components in mixture used to represent the family G.
Depending on the choice of mixture component, the family G becomes more or less flexible.
Options are:
If you are happy to assume a symmetric distribution for effects, you can use "uniform" or "normal". If you believe your effects may be asymmetric, use "halfuniform" or "halfnormal". If you want to allow only positive/negative effects use "+uniform"/"-uniform". The use of "normal" and "halfnormal" is permitted only if df=NULL. |
df |
appropriate degrees of freedom for (t) distribution of (betahat-beta)/sebetahat; default is NULL which is actually treated as infinity (Gaussian) |
... |
Further arguments of function |
method |
specifies how ash is to be run. Can be "shrinkage" (if main aim is shrinkage) or "fdr" (if main aim is to assess false discovery rate or false sign rate (fsr)). This is simply a convenient way to specify certain combinations of parameters: "shrinkage" sets pointmass=FALSE and prior="uniform"; "fdr" sets pointmass=TRUE and prior="nullbiased". |
optmethod |
The method used to compute maximum-likelihood
estimates of the mixture weights. The default setting,
“mixSQP”, uses the fast sequential quadratric programming
(SQP) method implemented in the mixsqp package. Alternative methods
include the interior-point method implemented in the REBayes
package ( |
nullweight |
scalar, the weight put on the prior under
"nullbiased" specification, see |
pointmass |
Logical, indicating whether to use a point mass at zero as one of components for a mixture distribution. |
prior |
string, or numeric vector indicating Dirichlet prior
on mixture proportions: “nullbiased”,
|
mixsd |
Vector of standard deviations for underlying mixture components. |
gridmult |
the multiplier by which the default grid values for mixsd differ by one another. (Smaller values produce finer grids.) |
outputlevel |
Determines amount of output. There are several numeric options: 0 = just fitted g; 1 = also PosteriorMean and PosteriorSD; 2 = everything usually needed; 3 = also include results of mixture fitting procedure (including matrix of log-likelihoods used to fit mixture). 4 and 5 are reserved for outputting additional data required by the (in-development) flashr package. The user can also specify the output they require in detail (see Examples). |
g |
The prior distribution for beta. Usually this is unspecified (NULL) and estimated from the data. However, it can be used in conjuction with fixg=TRUE to specify the g to use (e.g. useful in simulations to do computations with the "true" g). Or, if g is specified but fixg=FALSE, the g specifies the initial value of g used before optimization, (which also implicitly specifies mixcompdist). |
fixg |
If TRUE, don't estimate g but use the specified g - useful for computations under the "true" g in simulations. |
mode |
either numeric (indicating mode of g) or string "estimate", to indicate mode should be estimated, or a two dimension numeric vector to indicate the interval to be searched for the mode. |
alpha |
Numeric value of alpha parameter in the model. |
grange |
Two dimension numeric vector indicating the left and right limit of g. Default is c(-Inf, Inf). |
control |
A list of control parameters passed to optmethod. |
lik |
Contains details of the likelihood used; for general ash. Currently, the following choices are allowed: normal (see function lik_normal(); binomial likelihood (see function lik_binom); likelihood based on logF error distribution (see function lik_logF); mixture of normals likelihood (see function lik_normalmix); and Poisson likelihood (see function lik_pois). |
weights |
a vector of weights for observations; use with optmethod = "w_mixEM"; this is currently beta-functionality. |
pi_thresh |
a threshold below which to prune out mixture components before computing summaries (speeds up computation since empirically many components are usually assigned negligible weight). The current implementation still returns the full fitted distribution; this only affects the posterior summaries. |
The ash function provides a number of ways to perform Empirical Bayes shrinkage estimation and false discovery rate estimation. The main assumption is that the underlying distribution of effects is unimodal. Novice users are recommended to start with the examples provided below.
In the simplest case the inputs to ash are a vector of estimates (betahat) and their corresponding standard errors (sebetahat), and degrees of freedom (df). The method assumes that for some (unknown) "true" vector of effects beta, the statistic (betahat[j]-beta[j])/sebetahat[j] has a $t$ distribution on $df$ degrees of freedom. (The default of df=NULL assumes a normal distribution instead of a t.)
By default the method estimates the vector beta under the assumption that beta ~ g for a distribution
g in G, where G is some unimodal family of distributions to be specified (see parameter mixcompdist
).
By default is to assume the mode is 0, and this is suitable for settings where you are interested in testing which beta[j]
are non-zero. To estimate the mode see parameter mode
.
As is standard in empirical Bayes methods, the fitting proceeds in two stages: i) estimate g by maximizing a (possibly penalized) likelihood; ii) compute the posterior distribution for each beta[j] | betahat[j],sebetahat[j] using the estimated g as the prior distribution.
A more general case allows that beta[j]/sebetahat[j]^alpha | sebetahat[j] ~ g.
ash returns an object of class
"ash", a
list with some or all of the following elements (determined by
outputlevel)
fitted_g |
fitted mixture |
loglik |
log P(D|fitted_g) |
logLR |
log[P(D|fitted_g)/P(D|beta==0)] |
result |
A dataframe whose columns are: |
A vector of posterior probability that beta is negative.
A vector of posterior probability that beta is positive.
A vector of estimated local false sign rate.
A vector of estimated local false discovery rate.
A vector of q values.
A vector of s values.
A vector consisting the posterior mean of beta from the mixture.
A vector consisting the corresponding posterior standard deviation.
call |
a call in which all of the specified arguments are specified by their full names |
data |
a list containing details of the data and models used (mostly for internal use) |
fit_details |
a list containing results of mixture optimization, and matrix of component log-likelihoods used in this optimization |
ash.workhorse()
: Adaptive Shrinkage with full set of options.
ashci
for computation of credible intervals
after getting the ash object return by ash()
beta = c(rep(0,100),rnorm(100)) sebetahat = abs(rnorm(200,0,1)) betahat = rnorm(200,beta,sebetahat) beta.ash = ash(betahat, sebetahat) names(beta.ash) head(beta.ash$result) # the main dataframe of results head(get_pm(beta.ash)) # get_pm returns posterior mean head(get_lfsr(beta.ash)) # get_lfsr returns the local false sign rate graphics::plot(betahat,get_pm(beta.ash),xlim=c(-4,4),ylim=c(-4,4)) ## Not run: # Why is this example included here? -Peter CIMatrix=ashci(beta.ash,level=0.95) print(CIMatrix) ## End(Not run) # Illustrating the non-zero mode feature. betahat=betahat+5 beta.ash = ash(betahat, sebetahat) graphics::plot(betahat,get_pm(beta.ash)) betan.ash=ash(betahat, sebetahat,mode=5) graphics::plot(betahat,get_pm(betan.ash)) summary(betan.ash) # Running ash with different error models beta.ash1 = ash(betahat, sebetahat, lik = lik_normal()) beta.ash2 = ash(betahat, sebetahat, lik = lik_t(df=4)) e = rnorm(100)+log(rf(100,df1=10,df2=10)) # simulated data with log(F) error e.ash = ash(e,1,lik=lik_logF(df1=10,df2=10)) # Specifying the output beta.ash = ash(betahat, sebetahat, output = c("fitted_g","logLR","lfsr")) #Running ash with a pre-specified g, rather than estimating it beta = c(rep(0,100),rnorm(100)) sebetahat = abs(rnorm(200,0,1)) betahat = rnorm(200,beta,sebetahat) true_g = normalmix(c(0.5,0.5),c(0,0),c(0,1)) # define true g ## Passing this g into ash causes it to i) take the sd and the means ## for each component from this g, and ii) initialize pi to the value ## from this g. beta.ash = ash(betahat, sebetahat,g=true_g,fixg=TRUE) # running with weights beta.ash = ash(betahat, sebetahat, optmethod="w_mixEM", weights = c(rep(0.5,100),rep(1,100))) # Different algorithms can be used to compute maximum-likelihood # estimates of the mixture weights. Here, we illustrate use of the # EM algorithm and the (default) SQP algorithm. set.seed(1) betahat <- c(8.115,9.027,9.289,10.097,9.463) sebeta <- c(0.6157,0.4129,0.3197,0.3920,0.5496) fit.em <- ash(betahat,sebeta,mixcompdist = "normal",optmethod = "mixEM") fit.sqp <- ash(betahat,sebeta,mixcompdist = "normal",optmethod = "mixSQP") range(fit.em$fitted$pi - fit.sqp$fitted$pi)
beta = c(rep(0,100),rnorm(100)) sebetahat = abs(rnorm(200,0,1)) betahat = rnorm(200,beta,sebetahat) beta.ash = ash(betahat, sebetahat) names(beta.ash) head(beta.ash$result) # the main dataframe of results head(get_pm(beta.ash)) # get_pm returns posterior mean head(get_lfsr(beta.ash)) # get_lfsr returns the local false sign rate graphics::plot(betahat,get_pm(beta.ash),xlim=c(-4,4),ylim=c(-4,4)) ## Not run: # Why is this example included here? -Peter CIMatrix=ashci(beta.ash,level=0.95) print(CIMatrix) ## End(Not run) # Illustrating the non-zero mode feature. betahat=betahat+5 beta.ash = ash(betahat, sebetahat) graphics::plot(betahat,get_pm(beta.ash)) betan.ash=ash(betahat, sebetahat,mode=5) graphics::plot(betahat,get_pm(betan.ash)) summary(betan.ash) # Running ash with different error models beta.ash1 = ash(betahat, sebetahat, lik = lik_normal()) beta.ash2 = ash(betahat, sebetahat, lik = lik_t(df=4)) e = rnorm(100)+log(rf(100,df1=10,df2=10)) # simulated data with log(F) error e.ash = ash(e,1,lik=lik_logF(df1=10,df2=10)) # Specifying the output beta.ash = ash(betahat, sebetahat, output = c("fitted_g","logLR","lfsr")) #Running ash with a pre-specified g, rather than estimating it beta = c(rep(0,100),rnorm(100)) sebetahat = abs(rnorm(200,0,1)) betahat = rnorm(200,beta,sebetahat) true_g = normalmix(c(0.5,0.5),c(0,0),c(0,1)) # define true g ## Passing this g into ash causes it to i) take the sd and the means ## for each component from this g, and ii) initialize pi to the value ## from this g. beta.ash = ash(betahat, sebetahat,g=true_g,fixg=TRUE) # running with weights beta.ash = ash(betahat, sebetahat, optmethod="w_mixEM", weights = c(rep(0.5,100),rep(1,100))) # Different algorithms can be used to compute maximum-likelihood # estimates of the mixture weights. Here, we illustrate use of the # EM algorithm and the (default) SQP algorithm. set.seed(1) betahat <- c(8.115,9.027,9.289,10.097,9.463) sebeta <- c(0.6157,0.4129,0.3197,0.3920,0.5496) fit.em <- ash(betahat,sebeta,mixcompdist = "normal",optmethod = "mixEM") fit.sqp <- ash(betahat,sebeta,mixcompdist = "normal",optmethod = "mixSQP") range(fit.em$fitted$pi - fit.sqp$fitted$pi)
Uses Empirical Bayes to fit the model
with
where is a specified link function (either "identity" or "log" are permitted).
ash_pois(y, scale = 1, link = c("identity", "log"), ...)
ash_pois(y, scale = 1, link = c("identity", "log"), ...)
y |
vector of Poisson observations. |
scale |
vector of scale factors for Poisson observations: the model is |
link |
string, either "identity" or "log", indicating the link function. |
... |
other parameters to be passed to ash |
The model is fit in two stages: i) estimate by maximum likelihood (over the set of symmetric
unimodal distributions) to give estimate
;
ii) Compute posterior distributions for
given
.
Note that the link function
affects the prior assumptions (because, e.g., assuming a unimodal prior on
is
different from assuming unimodal on
), but posterior quantities are always computed for the
for
and *not*
.
beta = c(rep(0,50),rexp(50)) y = rpois(100,beta) # simulate Poisson observations y.ash = ash_pois(y,scale=1)
beta = c(rep(0,50),rexp(50)) y = rpois(100,beta) # simulate Poisson observations y.ash = ash_pois(y,scale=1)
Given the ash object returned by the main function ash, this function computes a posterior credible interval (CI) for each observation. The ash object must include a data component to use this function (which it does by default).
ashci( a, level = 0.95, betaindex, lfsr_threshold = 1, tol = 0.001, trace = FALSE )
ashci( a, level = 0.95, betaindex, lfsr_threshold = 1, tol = 0.001, trace = FALSE )
a |
the fitted ash object |
level |
the level for the credible interval, (default=0.95) |
betaindex |
a vector consisting of locations of betahat where you would like to compute the credible interval |
lfsr_threshold |
a scalar, if specified then computes CIs only for observations more significant than that threshold. |
tol |
passed to uniroot; indicates desired accuracy. |
trace |
a logical variable denoting whether some of the intermediate results of iterations should be displayed to the user. Default is FALSE. |
Uses uniroot to find credible interval, one at a time for each observation. The computation cost is linear in number of observations.
A matrix, with 2 columns, ith row giving CI for ith observation
beta = c(rep(0,20),rnorm(20)) sebetahat = abs(rnorm(40,0,1)) betahat = rnorm(40,beta,sebetahat) beta.ash = ash(betahat, sebetahat) CImatrix=ashci(beta.ash,level=0.95) CImatrix1=ashci(beta.ash,level=0.95,betaindex=c(1,2,5)) CImatrix2=ashci(beta.ash,level=0.95,lfsr_threshold=0.1)
beta = c(rep(0,20),rnorm(20)) sebetahat = abs(rnorm(40,0,1)) betahat = rnorm(40,beta,sebetahat) beta.ash = ash(betahat, sebetahat) CImatrix=ashci(beta.ash,level=0.95) CImatrix1=ashci(beta.ash,level=0.95,betaindex=c(1,2,5)) CImatrix2=ashci(beta.ash,level=0.95,lfsr_threshold=0.1)
The main function in the ashr package is ash
, which should be examined for more details. For simplicity only the most commonly-used options are documented under ash
. For expert or interested users the documentation for function ash.workhorse
provides documentation on all implemented options.
Maintainer: Peter Carbonetto [email protected]
Authors:
Matthew Stephens [email protected]
David Gerard
Mengyin Lu
Lei Sun
Jason Willwerscheid
Nan Xiao
Other contributors:
Chaoxing Dai [contributor]
Mazon Zeng [contributor]
Useful links:
Return the log-likelihood of the data for a given g() prior
calc_loglik(g, data)
calc_loglik(g, data)
g |
the fitted g, or an ash object containing g |
data |
a data object, see set_data |
Return the log-likelihood ratio of the data for a given g() prior
calc_logLR(g, data)
calc_logLR(g, data)
g |
the fitted g, or an ash object containing g |
data |
a data object, see set_data |
Generic function of calculating the overall mean of the mixture
calc_mixmean(m)
calc_mixmean(m)
m |
a mixture of k components generated by normalmix() or unimix() or igmix() |
it returns scalar, the mean of the mixture distribution.
Generic function of calculating the overall standard deviation of the mixture
calc_mixsd(m)
calc_mixsd(m)
m |
a mixture of k components generated by normalmix() or unimix() or igmix() |
it returns scalar
Return the log-likelihood of the data betahat, with standard errors betahatsd, under the null that beta==0
calc_null_loglik(data)
calc_null_loglik(data)
data |
a data object; see set_data |
Return the vector of log-likelihoods of the data points under the null
calc_null_vloglik(data)
calc_null_vloglik(data)
data |
a data object; see set_data |
Return the vector of log-likelihoods of the data betahat, with standard errors betahatsd, for a given g() prior on beta, or an ash object containing that
calc_vloglik(g, data)
calc_vloglik(g, data)
g |
the fitted g, or an ash object containing g |
data |
a data object, see set_data |
Return the vector of log-likelihood ratios of the data betahat, with standard errors betahatsd, for a given g() prior on beta, or an ash object containing that, vs the null that g() is point mass on 0
calc_vlogLR(g, data)
calc_vlogLR(g, data)
g |
the fitted g, or an ash object containing g |
data |
a data object, see set_data |
compute cdf of mixture m convoluted with error distribution either normal of sd (s) or student t with df v at locations x
cdf_conv(m, data)
cdf_conv(m, data)
m |
mixture distribution with k components |
data |
details depend on the model |
evaluate cdf of posterior distribution of beta at c. m is the prior on beta, a mixture; c is location of evaluation assumption is betahat | beta ~ t_v(beta,sebetahat)
cdf_post(m, c, data)
cdf_post(m, c, data)
m |
mixture distribution with k components |
c |
a scalar |
data |
details depend on model |
an n vector containing the cdf for beta_i at c
beta = rnorm(100,0,1) betahat= beta+rnorm(100,0,1) sebetahat=rep(1,100) ash.beta = ash(betahat,1,mixcompdist="normal") cdf0 = cdf_post(ash.beta$fitted_g,0,set_data(betahat,sebetahat)) graphics::plot(cdf0,1-get_pp(ash.beta))
beta = rnorm(100,0,1) betahat= beta+rnorm(100,0,1) sebetahat=rep(1,100) ash.beta = ash(betahat,1,mixcompdist="normal") cdf0 = cdf_post(ash.beta$fitted_g,0,set_data(betahat,sebetahat)) graphics::plot(cdf0,1-get_pp(ash.beta))
Computed the cdf of the underlying fitted distribution
cdf.ash(a, x, lower.tail = TRUE)
cdf.ash(a, x, lower.tail = TRUE)
a |
the fitted ash object |
x |
the vector of locations at which cdf is to be computed |
lower.tail |
(default=TRUE) whether to compute the lower or upper tail |
None
Generic function of computing the cdf for each component
comp_cdf(m, y, lower.tail = TRUE)
comp_cdf(m, y, lower.tail = TRUE)
m |
a mixture (eg of type normalmix or unimix) |
y |
locations at which cdf to be computed |
lower.tail |
boolean indicating whether to report lower tail |
it returns a vector of probabilities, with length equals to number of components in m
compute the cdf of data for each component of mixture when convolved with error distribution
comp_cdf_conv(m, data)
comp_cdf_conv(m, data)
m |
mixture distribution with k components |
data |
details depend on the model |
a k by n matrix of cdfs
returns cdf of convolution of each component of a normal mixture with N(0,s^2) at x. Note that convolution of two normals is normal, so it works that way
## S3 method for class 'normalmix' comp_cdf_conv(m, data)
## S3 method for class 'normalmix' comp_cdf_conv(m, data)
m |
mixture distribution with k components |
data |
a list with components x and s to be interpreted as a normally-distributed observation and its standard error |
a k by n matrix
cdf of convolution of each component of a unif mixture
## S3 method for class 'unimix' comp_cdf_conv(m, data)
## S3 method for class 'unimix' comp_cdf_conv(m, data)
m |
a mixture of class unimix |
data |
see set_data() |
a k by n matrix
evaluate cdf of posterior distribution of beta at c. m is the prior on beta, a mixture; c is location of evaluation assumption is betahat | beta ~ t_v(beta,sebetahat)
comp_cdf_post(m, c, data)
comp_cdf_post(m, c, data)
m |
mixture distribution with k components |
c |
a scalar |
data |
details depend on model |
a k by n matrix
beta = rnorm(100,0,1) betahat= beta+rnorm(100,0,1) sebetahat=rep(1,100) ash.beta = ash(betahat,1,mixcompdist="normal") comp_cdf_post(get_fitted_g(ash.beta),0,data=set_data(beta,sebetahat))
beta = rnorm(100,0,1) betahat= beta+rnorm(100,0,1) sebetahat=rep(1,100) ash.beta = ash(betahat,1,mixcompdist="normal") comp_cdf_post(get_fitted_g(ash.beta),0,data=set_data(beta,sebetahat))
Generic function of calculating the component densities of the mixture
comp_dens(m, y, log = FALSE)
comp_dens(m, y, log = FALSE)
m |
mixture of k components generated by normalmix() or unimix() or igmix() |
y |
is an n-vector of location |
log |
whether to use log-scale on densities |
A k by n matrix of densities
compute the density of data for each component of mixture when convolved with error distribution
comp_dens_conv(m, data, ...)
comp_dens_conv(m, data, ...)
m |
mixture distribution with k components |
data |
details depend on the model |
... |
other arguments |
a k by n matrix of densities
returns density of convolution of each component of a normal mixture with N(0,s^2) at x. Note that convolution of two normals is normal, so it works that way
## S3 method for class 'normalmix' comp_dens_conv(m, data, ...)
## S3 method for class 'normalmix' comp_dens_conv(m, data, ...)
m |
mixture distribution with k components |
data |
a list with components x and s to be interpreted as a normally-distributed observation and its standard error |
... |
other arguments (unused) |
a k by n matrix
density of convolution of each component of a unif mixture
## S3 method for class 'unimix' comp_dens_conv(m, data, ...)
## S3 method for class 'unimix' comp_dens_conv(m, data, ...)
m |
a mixture of class unimix |
data |
see set_data() |
... |
other arguments (unused) |
a k by n matrix
Generic function of calculating the first moment of components of the mixture
comp_mean(m)
comp_mean(m)
m |
a mixture of k components generated by normalmix() or unimix() or igmix() |
it returns a vector of means.
returns mean of the normal mixture
## S3 method for class 'normalmix' comp_mean(m)
## S3 method for class 'normalmix' comp_mean(m)
m |
a normal mixture distribution with k components |
a vector of length k
Returns mean of the truncated-normal mixture.
## S3 method for class 'tnormalmix' comp_mean(m)
## S3 method for class 'tnormalmix' comp_mean(m)
m |
A truncated normal mixture distribution with k components. |
A vector of length k.
Generic function of calculating the second moment of components of the mixture
comp_mean2(m)
comp_mean2(m)
m |
a mixture of k components generated by normalmix() or unimix() or igmix() |
it returns a vector of second moments.
output posterior mean for beta for each component of prior mixture m,given data
comp_postmean(m, data)
comp_postmean(m, data)
m |
mixture distribution with k components |
data |
details depend on the model |
output posterior mean-squared value given prior mixture m and data
comp_postmean2(m, data)
comp_postmean2(m, data)
m |
mixture distribution with k components |
data |
details depend on the model |
compute the posterior prob that each observation came from each component of the mixture m,output a k by n vector of probabilities computed by weighting the component densities by pi and then normalizing
comp_postprob(m, data)
comp_postprob(m, data)
m |
mixture distribution with k components |
data |
details depend on the model |
output posterior sd for beta for each component of prior mixture m,given data
comp_postsd(m, data)
comp_postsd(m, data)
m |
mixture distribution with k components |
data |
details depend on the model |
beta = rnorm(100,0,1) betahat= beta+rnorm(100,0,1) ash.beta = ash(betahat,1,mixcompdist="normal") data= set_data(betahat,rep(1,100)) comp_postmean(get_fitted_g(ash.beta),data) comp_postsd(get_fitted_g(ash.beta),data) comp_postprob(get_fitted_g(ash.beta),data)
beta = rnorm(100,0,1) betahat= beta+rnorm(100,0,1) ash.beta = ash(betahat,1,mixcompdist="normal") data= set_data(betahat,rep(1,100)) comp_postmean(get_fitted_g(ash.beta),data) comp_postsd(get_fitted_g(ash.beta),data) comp_postprob(get_fitted_g(ash.beta),data)
Generic function to extract the standard deviations of components of the mixture
comp_sd(m)
comp_sd(m)
m |
a mixture of k components generated by normalmix() or unimix() or igmix() |
it returns a vector of standard deviations
returns sds of the normal mixture
## S3 method for class 'normalmix' comp_sd(m)
## S3 method for class 'normalmix' comp_sd(m)
m |
a normal mixture distribution with k components |
a vector of length k
Returns standard deviations of the truncated normal mixture.
## S3 method for class 'tnormalmix' comp_sd(m)
## S3 method for class 'tnormalmix' comp_sd(m)
m |
A truncated normal mixture distribution with k components. |
A vector of length k.
Function to compute the local false sign rate
compute_lfsr(NegativeProb, ZeroProb)
compute_lfsr(NegativeProb, ZeroProb)
NegativeProb |
A vector of posterior probability that beta is negative. |
ZeroProb |
A vector of posterior probability that beta is zero. |
The local false sign rate.
Explain here what this function does.
cxxMixSquarem(matrix_lik, prior, pi_init, control)
cxxMixSquarem(matrix_lik, prior, pi_init, control)
matrix_lik |
Description of argument goes here. |
prior |
Description of argument goes here. |
pi_init |
Description of argument goes shere. |
control |
Description of argument goes here. |
Find density at y, a generic function
dens(x, y)
dens(x, y)
x |
|
y |
An n-vector of the location. |
compute density of mixture m convoluted with normal of sd (s) or student t with df v at locations x
dens_conv(m, data)
dens_conv(m, data)
m |
mixture distribution with k components |
data |
details depend on the model |
Density function for the log-F distribution with df1
and df2
degrees of freedom (and optional non-centrality parameter ncp
).
dlogf(x, df1, df2, ncp, log = FALSE)
dlogf(x, df1, df2, ncp, log = FALSE)
x |
vector of quantiles |
df1 , df2
|
degrees of freedom |
ncp |
non-centrality parameter. If omitted the central F is assumed. |
log |
logical; if TRUE, probabilities p are given as log(p). |
The density function.
Estimate mixture proportions of a mixture g given noisy (error-prone) data from that mixture.
estimate_mixprop( data, g, prior, optmethod = c("mixSQP", "mixEM", "mixVBEM", "cxxMixSquarem", "mixIP", "w_mixEM"), control, weights = NULL )
estimate_mixprop( data, g, prior, optmethod = c("mixSQP", "mixEM", "mixVBEM", "cxxMixSquarem", "mixIP", "w_mixEM"), control, weights = NULL )
data |
list to be passed to log_comp_dens_conv; details depend on model |
g |
an object representing a mixture distribution (eg normalmix for mixture of normals; unimix for mixture of uniforms). The component parameters of g (eg the means and variances) specify the components whose mixture proportions are to be estimated. The mixture proportions of g are the parameters to be estimated; the values passed in may be used to initialize the optimization (depending on the optmethod used) |
prior |
numeric vector indicating parameters of "Dirichlet prior" on mixture proportions |
optmethod |
name of function to use to do optimization |
control |
list of control parameters to be passed to optmethod, typically affecting things like convergence tolerance |
weights |
vector of weights (for use with w_mixEM; in beta) |
This is used by the ash function. Most users won't need to call this directly, but is exported for use by some other related packages.
list, including the final loglikelihood, the null loglikelihood,
an n by k likelihood matrix with (j,k)th element equal to ,
the fit
and results of optmethod
Produce function to compute expectation of truncated error distribution from log cdf and log pdf (using numerical integration)
gen_etruncFUN(lcdfFUN, lpdfFUN)
gen_etruncFUN(lcdfFUN, lpdfFUN)
lcdfFUN |
the log cdfFUN of the error distribution |
lpdfFUN |
the log pdfFUN of the error distribution |
Return the density of the underlying fitted distribution
get_density(a, x)
get_density(a, x)
a |
the fitted ash object |
x |
the vector of locations at which density is to be computed |
None
These functions simply return elements of an ash object, generally without doing any calculations. (So if the value was not computed during the original call to ash, eg because of how outputlevel was set in the call, then NULL will be returned.) Accessing elements in this way rather than directly from the ash object will help ensure compatability moving forward (e.g. if the internal structure of the ash object changes during software development.)
get_lfsr(x) get_lfdr(a) get_svalue(a) get_qvalue(a) get_pm(a) get_psd(a) get_pp(a) get_np(a) get_loglik(a) get_logLR(a) get_fitted_g(a) get_pi0(a)
get_lfsr(x) get_lfdr(a) get_svalue(a) get_qvalue(a) get_pm(a) get_psd(a) get_pp(a) get_np(a) get_loglik(a) get_logLR(a) get_fitted_g(a) get_pi0(a)
x |
an ash fit (e.g. from running ash) |
a |
an ash fit (e.g. from running ash) |
a vector (ash) of local false sign rates
get_lfsr()
: local false sign rate
get_lfdr()
: local false discovery rate
get_svalue()
: svalue
get_qvalue()
: qvalue
get_pm()
: posterior mean
get_psd()
: posterior standard deviation
get_pp()
: positive probability
get_np()
: negative probability
get_loglik()
: log-likelihood
get_logLR()
: log-likelihood ratio
get_fitted_g()
: fitted g mixture
get_pi0()
: pi0, the proportion of nulls
Returns random samples from the posterior distribution for each observation in an ash object. A matrix is returned, with columns corresponding to observations and rows corresponding to samples.
get_post_sample(a, nsamp)
get_post_sample(a, nsamp)
a |
the fitted ash object |
nsamp |
number of samples to return (for each observation) |
beta = rnorm(100,0,1) betahat= beta+rnorm(100,0,1) ash.beta = ash(betahat,1,mixcompdist="normal") post.beta = get_post_sample(ash.beta,1000)
beta = rnorm(100,0,1) betahat= beta+rnorm(100,0,1) ash.beta = ash(betahat,1,mixcompdist="normal") post.beta = get_post_sample(ash.beta,1000)
Creates an object of class igmix (finite mixture of univariate inverse-gammas)
igmix(pi, alpha, beta)
igmix(pi, alpha, beta)
pi |
vector of mixture proportions |
alpha |
vector of shape parameters |
beta |
vector of rate parameters |
None
an object of class igmix
igmix(c(0.5,0.5),c(1,1),c(1,2))
igmix(c(0.5,0.5),c(1,1),c(1,2))
Creates a likelihood object for ash for use with Binomial error distribution
lik_binom(y, n, link = c("identity", "logit"))
lik_binom(y, n, link = c("identity", "logit"))
y |
Binomial observations |
n |
Binomial number of trials |
link |
Link function. The "identity" link directly puts unimodal prior on Binomial success probabilities p, and "logit" link puts unimodal prior on logit(p). |
Suppose we have Binomial observations y
where .
We either put an unimodal prior g on the success probabilities
(by specifying
link="identity"
) or on the logit success probabilities
(by specifying
link="logit"
). Either way, ASH with this Binomial likelihood function
will compute the posterior mean of the success probabilities .
p = rbeta(100,2,2) # prior mode: 0.5 n = rpois(100,10) y = rbinom(100,n,p) # simulate Binomial observations ash(rep(0,length(y)),1,lik=lik_binom(y,n))
p = rbeta(100,2,2) # prior mode: 0.5 n = rpois(100,10) y = rbinom(100,n,p) # simulate Binomial observations ash(rep(0,length(y)),1,lik=lik_binom(y,n))
Creates a likelihood object for ash for use with logF error distribution
lik_logF(df1, df2)
lik_logF(df1, df2)
df1 |
first degree of freedom parameter of F distribution |
df2 |
second degree of freedom parameter of F distribution |
e = rnorm(100) + log(rf(100,df1=10,df2=10)) # simulate some data with log(F) error ash(e,1,lik=lik_logF(df1=10,df2=10))
e = rnorm(100) + log(rf(100,df1=10,df2=10)) # simulate some data with log(F) error ash(e,1,lik=lik_logF(df1=10,df2=10))
Creates a likelihood object for ash for use with normal error distribution
lik_normal()
lik_normal()
z = rnorm(100) + rnorm(100) # simulate some data with normal error ash(z,1,lik=lik_normal())
z = rnorm(100) + rnorm(100) # simulate some data with normal error ash(z,1,lik=lik_normal())
Creates a likelihood object for ash for use with normal mixture error distribution
lik_normalmix(pilik, sdlik)
lik_normalmix(pilik, sdlik)
pilik |
a k vector of mixture proportions (k is the number of mixture components), or an n*k matrix that the j'th row the is mixture proportions for betahat_j |
sdlik |
a k vector of component-wise standard deviations, or an n*k matrix that the j'th row the is component-wise standard deviations for betahat_j |
e = rnorm(100,0,0.8) e[seq(1,100,by=2)] = rnorm(50,0,1.5) # generate e~0.5*N(0,0.8^2)+0.5*N(0,1.5^2) betahat = rnorm(100)+e ash(betahat, 1, lik=lik_normalmix(c(0.5,0.5),c(0.8,1.5)))
e = rnorm(100,0,0.8) e[seq(1,100,by=2)] = rnorm(50,0,1.5) # generate e~0.5*N(0,0.8^2)+0.5*N(0,1.5^2) betahat = rnorm(100)+e ash(betahat, 1, lik=lik_normalmix(c(0.5,0.5),c(0.8,1.5)))
Creates a likelihood object for ash for use with Poisson error distribution
lik_pois(y, scale = 1, link = c("identity", "log"))
lik_pois(y, scale = 1, link = c("identity", "log"))
y |
Poisson observations. |
scale |
Scale factor for Poisson observations: y~Pois(scale*lambda). |
link |
Link function. The "identity" link directly puts unimodal prior on Poisson intensities lambda, and "log" link puts unimodal prior on log(lambda). |
Suppose we have Poisson observations y
where .
We either put an unimodal prior g on the (scaled) intensities
(by specifying
link="identity"
) or on the log intensities
(by specifying
link="log"
). Either way,
ASH with this Poisson likelihood function will compute the posterior mean of the
intensities .
beta = c(rnorm(100,50,5)) # prior mode: 50 y = rpois(100,beta) # simulate Poisson observations ash(rep(0,length(y)),1,lik=lik_pois(y))
beta = c(rnorm(100,50,5)) # prior mode: 50 y = rpois(100,beta) # simulate Poisson observations ash(rep(0,length(y)),1,lik=lik_pois(y))
Creates a likelihood object for ash for use with t error distribution
lik_t(df)
lik_t(df)
df |
degree of freedom parameter of t distribution |
z = rnorm(100) + rt(100,df=4) # simulate some data with t error ash(z,1,lik=lik_t(df=4))
z = rnorm(100) + rt(100,df=4) # simulate some data with t error ash(z,1,lik=lik_t(df=4))
compute the log density of the components of the mixture m when convoluted with a normal with standard deviation s or a scaled (se) student.t with df v, the density is evaluated at x
log_comp_dens_conv(m, data)
log_comp_dens_conv(m, data)
m |
mixture distribution with k components |
data |
details depend on the model |
a k by n matrix of log densities
returns log-density of convolution of each component of a normal mixture with N(0,s^2) or s*t(v) at x. Note that convolution of two normals is normal, so it works that way
## S3 method for class 'normalmix' log_comp_dens_conv(m, data)
## S3 method for class 'normalmix' log_comp_dens_conv(m, data)
m |
mixture distribution with k components |
data |
a list with components x and s to be interpreted as a normally-distributed observation and its standard error |
a k by n matrix
log density of convolution of each component of a unif mixture
## S3 method for class 'unimix' log_comp_dens_conv(m, data)
## S3 method for class 'unimix' log_comp_dens_conv(m, data)
m |
a mixture of class unimix |
data |
see set_data() |
a k by n matrix of densities
find log likelihood of data using convolution of mixture with error distribution
loglik_conv(m, data)
loglik_conv(m, data)
m |
mixture distribution with k components |
data |
details depend on the model |
The default version of loglik_conv
.
## Default S3 method: loglik_conv(m, data)
## Default S3 method: loglik_conv(m, data)
m |
mixture distribution with k components |
data |
data whose details depend on model |
Returns cdf for a mixture (generic function)
mixcdf(x, y, lower.tail = TRUE)
mixcdf(x, y, lower.tail = TRUE)
x |
a mixture (eg of type normalmix or unimix) |
y |
locations at which cdf to be computed |
lower.tail |
boolean indicating whether to report lower tail |
None
an object of class normalmix
mixcdf(normalmix(c(0.5,0.5),c(0,0),c(1,2)),seq(-4,4,length=100))
mixcdf(normalmix(c(0.5,0.5),c(0,0),c(1,2)),seq(-4,4,length=100))
The default version of mixcdf
.
## Default S3 method: mixcdf(x, y, lower.tail = TRUE)
## Default S3 method: mixcdf(x, y, lower.tail = TRUE)
x |
a mixture (eg of type normalmix or unimix) |
y |
locations at which cdf to be computed |
lower.tail |
boolean indicating whether to report lower tail |
Given the individual component likelihoods for a mixture model, estimates the mixture proportions by an EM algorithm.
mixEM(matrix_lik, prior, pi_init = NULL, control = list())
mixEM(matrix_lik, prior, pi_init = NULL, control = list())
matrix_lik |
a n by k matrix with (j,k)th element equal to |
prior |
a k vector of the parameters of the Dirichlet prior on |
pi_init |
the initial value of |
control |
A list of control parameters for the SQUAREM algorithm, default value is set to be control.default=list(K = 1, method=3, square=TRUE, step.min0=1, step.max0=1, mstep=4, kr=1, objfn.inc=1,tol=1.e-07, maxiter=5000, trace=FALSE). |
Fits a k component mixture model
to independent
and identically distributed data .
Estimates mixture proportions
by maximum likelihood, or by maximum a posteriori (MAP) estimation for a Dirichlet prior on
(if a prior is specified). Uses the SQUAREM package to accelerate convergence of EM. Used by the ash main function; there is no need for a user to call this
function separately, but it is exported for convenience.
A list, including the estimates (pihat), the log likelihood for each interation (B) and a flag to indicate convergence
Given the individual component likelihoods for a mixture model, estimates the mixture proportions.
mixIP(matrix_lik, prior, pi_init = NULL, control = list(), weights = NULL)
mixIP(matrix_lik, prior, pi_init = NULL, control = list(), weights = NULL)
matrix_lik |
a n by k matrix with (j,k)th element equal to |
prior |
a k vector of the parameters of the Dirichlet prior on |
pi_init |
the initial value of |
control |
A list of control parameters to be passed to REBayes::KWDual |
weights |
weights to be assigned to the observations (an n vector) |
Optimizes
subject to pi_k non-negative and sum_k pi_k = 1. Here
is a penalty function h(pi) = sum_k (prior_k-1) log pi_k. Calls REBayes::KWDual in the REBayes package, which is in turn a wrapper to the mosek convex optimization software. So REBayes must be installed to use this. Used by the ash main function; there is no need for a user to call this function separately, but it is exported for convenience.
A list, including the estimates (pihat), the log likelihood for each interation (B) and a flag to indicate convergence
Generic function of calculating the overall second moment of the mixture
mixmean2(m)
mixmean2(m)
m |
a mixture of k components generated by normalmix() or unimix() or igmix() |
it returns scalar
Generic function of extracting the mixture proportions
mixprop(m)
mixprop(m)
m |
a mixture of k components generated by normalmix() or unimix() or igmix() |
it returns a vector of component probabilities, summing up to 1.
Estimate mixture proportions of a mixture model using mix-SQP algorithm.
mixSQP(matrix_lik, prior, pi_init = NULL, control = list(), weights = NULL)
mixSQP(matrix_lik, prior, pi_init = NULL, control = list(), weights = NULL)
matrix_lik |
A matrix containing the conditional likelihood values, possibly normalized. |
prior |
A vector of the parameters of the Dirichlet prior on the mixture weights. |
pi_init |
The initial estimate of the mixture weights. |
control |
A list of settings for the mix-SQP optimization
algorithm; see |
weights |
The weights to be assigned to the observations. Must
be a vector of length equal the number of rows of |
A list object including the estimates (pihat
) and a
flag (control
) indicating convergence success or failure.
Given the individual component likelihoods for a mixture model, estimates the posterior on the mixture proportions by an VBEM algorithm. Used by the ash main function; there is no need for a user to call this function separately, but it is exported for convenience.
mixVBEM(matrix_lik, prior, pi_init = NULL, control = list())
mixVBEM(matrix_lik, prior, pi_init = NULL, control = list())
matrix_lik |
a n by k matrix with (j,k)th element equal to |
prior |
a k vector of the parameters of the Dirichlet prior on |
pi_init |
the initial value of the posterior parameters. If not specified defaults to the prior parameters. |
control |
A list of control parameters for the SQUAREM algorithm, default value is set to be control.default=list(K = 1, method=3, square=TRUE, step.min0=1, step.max0=1, mstep=4, kr=1, objfn.inc=1,tol=1.e-07, maxiter=5000, trace=FALSE). |
Fits a k component mixture model
to independent
and identically distributed data .
Estimates posterior on mixture proportions
by Variational Bayes,
with a Dirichlet prior on
.
Algorithm adapted from Bishop (2009), Pattern Recognition and Machine Learning, Chapter 10.
A list, whose components include point estimates (pihat),
the parameters of the fitted posterior on (pipost),
the bound on the log likelihood for each iteration (B)
and a flag to indicate convergence (converged).
Compute second moment of the truncated Beta.
my_e2truncbeta(a, b, alpha, beta)
my_e2truncbeta(a, b, alpha, beta)
a |
left limit of distribution |
b |
right limit of distribution |
alpha , beta
|
shape parameters of Beta distribution |
Compute second moment of the truncated gamma.
my_e2truncgamma(a, b, shape, rate)
my_e2truncgamma(a, b, shape, rate)
a |
left limit of distribution |
b |
right limit of distribution |
shape |
shape of gamma distribution |
rate |
rate of gamma distribution |
Computes the expected squared values of truncated normal
distributions with parameters a
, b
, mean
, and
sd
. Arguments can be scalars, vectors, or matrices. Arguments of
shorter length will be recycled according to the usual recycling rules,
but a
and b
must have the same length. Missing values are
accepted for all arguments.
my_e2truncnorm(a, b, mean = 0, sd = 1)
my_e2truncnorm(a, b, mean = 0, sd = 1)
a |
The lower limit for the support of the truncated normal. Can be
|
b |
The upper limit for the support. Can be |
mean |
The mean of the untruncated normal. |
sd |
The standard deviation of the untruncated normal. Standard
deviations of zero are interpreted as numerically (rather than exactly)
zero, so that the square of the untruncated mean is returned if it lies
within |
The expected squared values of truncated normal
distributions with parameters a
, b
, mean
, and
sd
. If any of the arguments is a matrix, then a matrix will
be returned.
Compute second moment of the truncated t. Uses results from O'Hagan, Biometrika, 1973
my_e2trunct(a, b, df)
my_e2trunct(a, b, df)
a |
left limit of distribution |
b |
right limit of distribution |
df |
degree of freedom of error distribution |
Compute mean of the truncated Beta.
my_etruncbeta(a, b, alpha, beta)
my_etruncbeta(a, b, alpha, beta)
a |
left limit of distribution |
b |
right limit of distribution |
alpha , beta
|
shape parameters of Beta distribution |
Compute mean of the truncated gamma.
my_etruncgamma(a, b, shape, rate)
my_etruncgamma(a, b, shape, rate)
a |
left limit of distribution |
b |
right limit of distribution |
shape |
shape of gamma distribution |
rate |
rate of gamma distribution |
Compute expectation of truncated log-F distribution.
my_etrunclogf(a, b, df1, df2)
my_etrunclogf(a, b, df1, df2)
a |
Left limit of distribution. |
b |
Right limit of distribution. |
df1 , df2
|
degrees of freedom |
Computes the means of truncated normal distributions with
parameters a
, b
, mean
, and sd
. Arguments
can be scalars, vectors, or matrices. Arguments of shorter length will
be recycled according to the usual recycling rules, but a
and
b
must have the same length. Missing values are accepted for all
arguments.
my_etruncnorm(a, b, mean = 0, sd = 1)
my_etruncnorm(a, b, mean = 0, sd = 1)
a |
The lower limit for the support of the truncated normal. Can be
|
b |
The upper limit for the support. Can be |
mean |
The mean of the untruncated normal. |
sd |
The standard deviation of the untruncated normal. Standard
deviations of zero are interpreted as numerically (rather than exactly)
zero, so that the untruncated mean is returned if it lies within
|
The expected values of truncated normal distributions with
parameters a
, b
, mean
, and sd
. If any of the
arguments is a matrix, then a matrix will be returned.
Compute second moment of the truncated t. Uses results from O'Hagan, Biometrika, 1973
my_etrunct(a, b, df)
my_etrunct(a, b, df)
a |
left limit of distribution |
b |
right limit of distribution |
df |
degree of freedom of error distribution |
Computes the variance of truncated normal distributions with
parameters a
, b
, mean
, and sd
. Arguments can
be scalars, vectors, or matrices. Arguments of shorter length will be
recycled according to the usual recycling rules, but a
and b
must have the same length. Missing values are accepted for all arguments.
my_vtruncnorm(a, b, mean = 0, sd = 1)
my_vtruncnorm(a, b, mean = 0, sd = 1)
a |
The lower limit for the support of the truncated normal. Can be
|
b |
The upper limit for the support. Can be |
mean |
The mean of the untruncated normal. |
sd |
The standard deviation of the untruncated normal. |
The variance of truncated normal distributions with parameters
a
, b
, mean
, and sd
. If any of the arguments
is a matrix, then a matrix will be returned.
ncomp
ncomp(m)
ncomp(m)
m |
a mixture of k components generated by normalmix() or unimix() or igmix() |
The default version of ncomp
.
## Default S3 method: ncomp(m)
## Default S3 method: ncomp(m)
m |
a mixture of k components generated by normalmix() or unimix() or igmix() |
Creates an object of class normalmix (finite mixture of univariate normals)
normalmix(pi, mean, sd)
normalmix(pi, mean, sd)
pi |
vector of mixture proportions |
mean |
vector of means |
sd |
vector of standard deviations |
None
an object of class normalmix
normalmix(c(0.5,0.5),c(0,0),c(1,2))
normalmix(c(0.5,0.5),c(0,0),c(1,2))
“parallel" vector version of cdf_post
where c is a vector, of same length as betahat and sebetahat
pcdf_post(m, c, data)
pcdf_post(m, c, data)
m |
mixture distribution with k components |
c |
a numeric vector with n elements |
data |
depends on context |
an n vector, whose ith element is the cdf for beta_i at c_i
beta = rnorm(100,0,1) betahat= beta+rnorm(100,0,1) sebetahat=rep(1,100) ash.beta = ash(betahat,1,mixcompdist="normal") c = pcdf_post(get_fitted_g(ash.beta),beta,set_data(betahat,sebetahat))
beta = rnorm(100,0,1) betahat= beta+rnorm(100,0,1) sebetahat=rep(1,100) ash.beta = ash(betahat,1,mixcompdist="normal") c = pcdf_post(get_fitted_g(ash.beta),beta,set_data(betahat,sebetahat))
Distribution function for the log-F distribution with df1
and df2
degrees of freedom (and optional non-centrality parameter ncp
).
plogf(q, df1, df2, ncp, lower.tail = TRUE, log.p = FALSE)
plogf(q, df1, df2, ncp, lower.tail = TRUE, log.p = FALSE)
q |
vector of quantiles |
df1 , df2
|
degrees of freedom |
ncp |
non-centrality parameter. If omitted the central F is assumed. |
lower.tail |
logical; if TRUE (default), probabilities are P[X <= x], otherwise, P[X > x]. |
log.p |
logical; if TRUE, probabilities p are given as log(p). |
The distribution function.
Generate several plots to diagnose the fitness of ASH on the data
plot_diagnostic( x, plot.it = TRUE, sebetahat.tol = 0.001, plot.hist, xmin, xmax, breaks = "Sturges", alpha = 0.01, pch = 19, cex = 0.25 )
plot_diagnostic( x, plot.it = TRUE, sebetahat.tol = 0.001, plot.hist, xmin, xmax, breaks = "Sturges", alpha = 0.01, pch = 19, cex = 0.25 )
x |
the fitted ash object |
plot.it |
logical. whether to plot the diagnostic result |
sebetahat.tol |
tolerance to test the equality of betahat |
plot.hist |
logical. whether to plot the histogram of betahat when sebetahat is not constant |
xmin , xmax
|
range of the histogram of betahat to be plotted |
breaks |
histograms parameter (see |
alpha |
error level for the de-trended diagnostic plot |
pch , cex
|
plot parameters for dots |
None.
Plot the cdf of the underlying fitted distribution
## S3 method for class 'ash' plot(x, ..., xmin, xmax)
## S3 method for class 'ash' plot(x, ..., xmin, xmax)
x |
the fitted ash object |
... |
Arguments to be passed to methods,such as graphical parameters (see |
xmin |
xlim lower range, default is the lowest value of betahat |
xmax |
xlim upper range, default is the highest value of betahat |
None
Generic function to extract which components of mixture are point mass on 0
pm_on_zero(m)
pm_on_zero(m)
m |
a mixture of k components generated by normalmix() or unimix() or igmix() |
a boolean vector indicating which components are point mass on 0
returns random samples from the posterior, given a prior distribution m and n observed datapoints.
post_sample(m, data, nsamp)
post_sample(m, data, nsamp)
m |
prior distribution (eg of type normalmix) |
data |
a list with components x and s, each vectors of length n, to be interpreted as a normally-distributed observations and corresponding standard errors |
nsamp |
number of random samples to return for each observation |
exported, but mostly users will want to use 'get_post_sample'
an nsamp by n matrix
returns random samples from the posterior, given a prior distribution m and n observed datapoints.
## S3 method for class 'normalmix' post_sample(m, data, nsamp)
## S3 method for class 'normalmix' post_sample(m, data, nsamp)
m |
mixture distribution with k components |
data |
a list with components x and s to be interpreted as a normally-distributed observation and its standard error |
nsamp |
number of samples to return for each observation |
a nsamp by n matrix
returns random samples from the posterior, given a prior distribution m and n observed datapoints.
## S3 method for class 'unimix' post_sample(m, data, nsamp)
## S3 method for class 'unimix' post_sample(m, data, nsamp)
m |
mixture distribution with k components |
data |
a list with components x and s to be interpreted as a normally-distributed observation and its standard error |
nsamp |
number of samples to return for each observation |
a nsamp by n matrix
Return the posterior on beta given a prior (g) that is
a mixture of normals (class normalmix) and observation
posterior_dist(g, betahat, sebetahat)
posterior_dist(g, betahat, sebetahat)
g |
a normalmix with components indicating the prior; works only if g has means 0 |
betahat |
(n vector of observations) |
sebetahat |
(n vector of standard errors/deviations of observations) |
This can be used to obt
A list, (pi1,mu1,sigma1) whose components are each k by n matrices where k is number of mixture components in g, n is number of observations in betahat
postmean
postmean(m, data)
postmean(m, data)
m |
mixture distribution with k components |
data |
details depend on the model |
output posterior mean-squared value given prior mixture m and data
postmean2(m, data)
postmean2(m, data)
m |
mixture distribution with k components |
data |
details depend on the model |
output posterior sd given prior mixture m and data
postsd(m, data)
postsd(m, data)
m |
mixture distribution with k components |
data |
details depend on the model |
Print the fitted distribution of beta values in the EB hierarchical model
## S3 method for class 'ash' print(x, ...)
## S3 method for class 'ash' print(x, ...)
x |
the fitted ash object |
... |
not used, included for consistency as an S3 generic/method. |
None
prunes out mixture components with low weight
prune(m, thresh = 1e-10)
prune(m, thresh = 1e-10)
m |
What is this argument? |
thresh |
the threshold below which components are removed |
Computes q values from a vector of local fdr estimates
qval.from.lfdr(lfdr)
qval.from.lfdr(lfdr)
lfdr |
a vector of local fdr estimates |
The q value for a given lfdr is an estimate of the (tail) False Discovery Rate for all findings with a smaller lfdr, and is found by the average of the lfdr for all more significant findings. See Storey (2003), Annals of Statistics, for definition of q value.
vector of q values
Takes raw data and sets up data object for use by ash
set_data(betahat, sebetahat, lik = NULL, alpha = 0)
set_data(betahat, sebetahat, lik = NULL, alpha = 0)
betahat |
vector of betahats |
sebetahat |
vector of standard errors |
lik |
a likelihood (see e.g., lik_normal()) |
alpha |
specifies value of alpha to use (model is for betahat/sebetahat^alpha | sebetahat) |
The data object stores both the data, and details of the model to be used for the data. For example, in the generalized version of ash the cdf and pdf of the likelihood are stored here.
data object (list)
Print summary of fitted ash object
## S3 method for class 'ash' summary(object, ...)
## S3 method for class 'ash' summary(object, ...)
object |
the fitted ash object |
... |
not used, included for consistency as an S3 generic/method. |
summary
prints the fitted mixture, the
fitted log likelihood with 10 digits and a flag to indicate
convergence
Creates an object of class tnormalmix (finite mixture of truncated univariate normals).
tnormalmix(pi, mean, sd, a, b)
tnormalmix(pi, mean, sd, a, b)
pi |
Cector of mixture proportions (length k say). |
mean |
Vector of means (length k). |
sd |
Vector of standard deviations (length k). |
a |
Vector of left truncation points of each component (length k). |
b |
Cector of right truncation points of each component (length k). |
An object of class “tnormalmix”.
tnormalmix(c(0.5,0.5),c(0,0),c(1,2),c(-10,0),c(0,10))
tnormalmix(c(0.5,0.5),c(0,0),c(1,2),c(-10,0),c(0,10))
Creates an object of class unimix (finite mixture of univariate uniforms)
unimix(pi, a, b)
unimix(pi, a, b)
pi |
vector of mixture proportions |
a |
vector of left hand ends of uniforms |
b |
vector of right hand ends of uniforms |
None
an object of class unimix
unimix(c(0.5,0.5),c(0,0),c(1,2))
unimix(c(0.5,0.5),c(0,0),c(1,2))
vectorized version of cdf_post
vcdf_post(m, c, data)
vcdf_post(m, c, data)
m |
mixture distribution with k components |
c |
a numeric vector |
data |
depends on context |
an n vector containing the cdf for beta_i at c
beta = rnorm(100,0,1) betahat= beta+rnorm(100,0,1) sebetahat=rep(1,100) ash.beta = ash(betahat,1,mixcompdist="normal") c = vcdf_post(get_fitted_g(ash.beta),seq(-5,5,length=1000),data = set_data(betahat,sebetahat))
beta = rnorm(100,0,1) betahat= beta+rnorm(100,0,1) sebetahat=rep(1,100) ash.beta = ash(betahat,1,mixcompdist="normal") c = vcdf_post(get_fitted_g(ash.beta),seq(-5,5,length=1000),data = set_data(betahat,sebetahat))
Given the individual component likelihoods for a mixture model, and a set of weights, estimates the mixture proportions by an EM algorithm.
w_mixEM(matrix_lik, prior, pi_init = NULL, weights = NULL, control = list())
w_mixEM(matrix_lik, prior, pi_init = NULL, weights = NULL, control = list())
matrix_lik |
a n by k matrix with (j,k)th element equal to |
prior |
a k vector of the parameters of the Dirichlet prior on |
pi_init |
the initial value of |
weights |
an n vector of weights |
control |
A list of control parameters for the SQUAREM algorithm, default value is set to be control.default=list(K = 1, method=3, square=TRUE, step.min0=1, step.max0=1, mstep=4, kr=1, objfn.inc=1,tol=1.e-07, maxiter=5000, trace=FALSE). |
Fits a k component mixture model
to independent
and identically distributed data with weights
.
Estimates mixture proportions
by maximum likelihood, or by maximum a posteriori (MAP) estimation for a Dirichlet prior on
(if a prior is specified). Here the log-likelihood for the weighted data is defined as
. Uses the SQUAREM package to accelerate convergence of EM. Used by the ash main function; there is no need for a user to call this
function separately, but it is exported for convenience.
A list, including the estimates (pihat), the log likelihood for each interation (B) and a flag to indicate convergence