Title: | Simple Bootstrap Routines |
---|---|
Description: | Simple bootstrap routines. |
Authors: | Roger D. Peng <[email protected]> |
Maintainer: | Roger D. Peng <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1-8 |
Built: | 2024-11-08 04:42:41 UTC |
Source: | https://github.com/rdpeng/simpleboot |
Construct a histogram of the bootstrap distribution of a univariate statistic.
## S3 method for class 'simpleboot'
hist(x, do.rug = FALSE, xlab = "Bootstrap samples", main = "", ...)
x |
An object of class |
do.rug |
Should a rug of the bootstrap distribution be plotted under the histogram? |
xlab |
The label for the x-axis. |
main |
The title for the histogram. |
... |
Other arguments passed to |
hist constructs a histogram for the bootstrap distribution of a univariate statistic. It cannot be used with linear model or loess bootstraps. In the histogram, a red dotted line marks the observed value of the statistic.
Nothing is returned.
Roger D. Peng
x <- rnorm(100)

## Bootstrap the 75th percentile
b <- one.boot(x, quantile, R = 1000, probs = 0.75)
hist(b)
Bootstrapping of linear model fits (using lm). Bootstrapping can be done by either resampling rows of the original data frame or resampling residuals from the original model fit.
lm.boot(lm.object, R, rows = TRUE, new.xpts = NULL, ngrid = 100, weights = NULL)
lm.object |
A linear model fit, produced by |
R |
The number of bootstrap replicates to use. |
rows |
Should we resample rows? Setting |
new.xpts |
Values at which you wish to make new predictions. If specified, fitted values from each bootstrap sample will be stored. |
ngrid |
If |
weights |
Resampling weights; a vector of length equal to the number of observations. |
Currently, "lm.simpleboot" objects have a simple print method (which shows the original fit), a summary method and a plot method.
An object of class "lm.simpleboot" (which is a list) containing the elements:
method |
Which method of bootstrapping was used (rows or residuals). |
boot.list |
A list containing values from each of the bootstrap samples. Currently, bootstrapped values are model coefficients, residual sum of squares, R-square, and fitted values for predictions. |
orig.lm |
The original model fit. |
new.xpts |
The locations where predictions were made. |
weights |
The resampling weights. If none were used, this
component is |
Roger D. Peng
The plot.lm.simpleboot method.
data(airquality)
attach(airquality)
set.seed(30)
lmodel <- lm(Ozone ~ Wind)
lboot <- lm.boot(lmodel, R = 1000)
summary(lboot)

## With weighting
w <- runif(nrow(model.frame(lmodel)))
lbootw <- lm.boot(lmodel, R = 1000, weights = w)
summary(lbootw)

## Resample residuals
lboot2 <- lm.boot(lmodel, R = 1000, rows = FALSE)
summary(lboot2)
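The two resampling schemes for linear models (rows versus residuals) can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the helper names boot_rows and boot_resid are hypothetical, and simulated data stand in for a real data set.

```r
set.seed(1)
n <- 50
dat <- data.frame(x = runif(n))
dat$y <- 1 + 2 * dat$x + rnorm(n, sd = 0.3)
fit <- lm(y ~ x, data = dat)

## Hypothetical helper: resample rows (cases) of the data frame and refit
boot_rows <- function(fit, data, R) {
  replicate(R, {
    idx <- sample(nrow(data), replace = TRUE)
    coef(lm(formula(fit), data = data[idx, , drop = FALSE]))
  })
}

## Hypothetical helper: keep the predictors fixed, resample the residuals,
## add them to the original fitted values, and refit
boot_resid <- function(fit, data, R) {
  response <- all.vars(formula(fit))[1]
  replicate(R, {
    data[[response]] <- fitted(fit) + sample(resid(fit), replace = TRUE)
    coef(lm(formula(fit), data = data))
  })
}

## Each result has one row per coefficient and one column per replicate
coef_rows  <- boot_rows(fit, dat, R = 200)
coef_resid <- boot_resid(fit, dat, R = 200)

## Bootstrap standard errors of the coefficients under each scheme
apply(coef_rows, 1, sd)
apply(coef_resid, 1, sd)
```

Resampling rows treats the predictors as random, while resampling residuals keeps the design fixed and relies on the residuals being exchangeable.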
Methods for "lm.simpleboot" class objects.
## S3 method for class 'lm.simpleboot'
summary(object, ...)

## S3 method for class 'summary.lm.simpleboot'
print(x, ...)

## S3 method for class 'lm.simpleboot'
fitted(object, ...)
object |
An object of class |
x |
An object of class |
... |
Other arguments passed to and from other methods. |
print is essentially the same as the usual printing of a linear model fit, except that the bootstrap standard errors are printed for each model coefficient.
fitted returns the fitted values from each bootstrap sample for the predictor values specified by new.xpts in lm.boot (or from the grid if new.xpts was not specified). This is a p x R matrix where p is the number of points where prediction was desired and R is the number of bootstrap samples specified. Using fitted is equivalent to using samples(object, name = "fitted").
summary returns a list containing the original estimated coefficients and their bootstrap standard errors.
Roger D. Peng
lm.boot.
data(airquality)
attach(airquality)
lmodel <- lm(Ozone ~ Wind + Solar.R)
lboot <- lm.boot(lmodel, R = 300)
summary(lboot)
Bootstrapping of loess fits produced by the loess function in the modreg package (now part of stats). Bootstrapping can be done by resampling rows from the original data frame or resampling residuals from the original model fit.
loess.boot(lo.object, R, rows = TRUE, new.xpts = NULL, ngrid = 100, weights = NULL)
lo.object |
A loess fit, produced by |
R |
The number of bootstrap replicates. |
rows |
Should we resample rows? Setting |
new.xpts |
Locations where new predictions are to be made. If
|
ngrid |
Number of grid points to use if |
weights |
Resampling weights; a vector with length equal to the number of observations. |
The user can specify locations for new predictions through new.xpts or an evenly spaced grid will be used. In either case, fitted values at each new location will be stored from each bootstrap sample. These fitted values can be retrieved using either the fitted method or the samples function.
Note that the loess function has many parameters for the user to set that can be difficult to reproduce in the bootstrap setting. Right now, the user can only specify the span argument to loess in the original fit.
An object of class "loess.simpleboot" (which is a list) containing the elements:
method |
Which method of bootstrapping was used (rows or residuals). |
boot.list |
A list containing values from each of the bootstrap samples. Currently, only residual sum of squares and fitted values are stored. |
orig.loess |
The original loess fit. |
new.xpts |
The locations where predictions were made (specified
in the original call to |
Roger D. Peng
set.seed(1234)
x <- runif(100)

## Simple sine function simulation
y <- sin(2*pi*x) + .2 * rnorm(100)
plot(x, y)  ## Sine function with noise
lo <- loess(y ~ x, span = .4)

## Bootstrap with resampling of rows
lo.b <- loess.boot(lo, R = 500)

## Plot original fit with +/- 2 std. errors
plot(lo.b)

## Plot all loess bootstrap fits
plot(lo.b, all.lines = TRUE)

## Bootstrap with resampling residuals
lo.b2 <- loess.boot(lo, R = 500, rows = FALSE)
plot(lo.b2)
Methods for "loess.simpleboot" class objects.
## S3 method for class 'loess.simpleboot'
fitted(object, ...)
object |
An object of class |
... |
Other arguments passed to and from other methods. |
fitted returns an n x R matrix of fitted values where n is the number of new locations at which predictions were made and R is the number of bootstrap replications used in the original loess bootstrap. This is equivalent to calling samples(object, "fitted").
Nothing is returned.
Roger D. Peng
one.boot is used for bootstrapping a univariate statistic for one-sample problems. Examples include the mean, median, etc.
one.boot(data, FUN, R, student = FALSE, M, weights = NULL, ...)
data |
The data. This should be a vector of numbers. |
FUN |
The statistic to be bootstrapped. This can be either a quoted string containing the name of a function or simply the function name. |
R |
The number of bootstrap replicates to use. |
student |
Should we do a studentized bootstrap? This requires a double bootstrap so it might take longer. |
M |
If |
weights |
Resampling weights; a vector of length equal to the number of observations. |
... |
Other (named) arguments that should be passed to |
An object of class "simpleboot", which is almost identical to the regular "boot" object. For example, the boot.ci function can be used on this object.
Roger D. Peng
library(boot)
set.seed(20)
x <- rgamma(100, 1)
b.mean <- one.boot(x, mean, 500)
print(b.mean)
boot.ci(b.mean)  ## No studentized interval here
hist(b.mean)

## Bootstrap with weights
set.seed(10)
w <- runif(100)
bw <- one.boot(x, median, 100, weights = w)
print(bw)

## Studentized
bw.stud <- one.boot(x, median, R = 100, student = TRUE, M = 50, weights = w)
boot.ci(bw.stud, type = "stud")
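To show why a studentized bootstrap needs a double bootstrap, the following base R sketch nests an inner resampling loop inside each outer replicate to estimate that replicate's variance. It is an illustration only and may differ from what one.boot does internally; the inner replicate count is called M here on the assumption that it plays the role of the M argument.

```r
set.seed(20)
x <- rgamma(100, 1)
R <- 200   # outer bootstrap replicates
M <- 50    # inner replicates per outer resample (assumed role of M)

theta_hat <- median(x)

## For each outer resample, record the statistic and an inner-bootstrap
## estimate of its variance
outer_boot <- replicate(R, {
  xs <- sample(x, replace = TRUE)
  inner <- replicate(M, median(sample(xs, replace = TRUE)))
  c(stat = median(xs), var = var(inner))
})

## Standard error of the original estimate, from the outer replicates
se_hat <- sd(outer_boot["stat", ])

## Studentized replicates; drop any with zero estimated variance
keep <- outer_boot["var", ] > 0
t_star <- (outer_boot["stat", keep] - theta_hat) / sqrt(outer_boot["var", keep])

## 95% studentized interval: note the reversed quantiles
q <- unname(quantile(t_star, c(0.975, 0.025)))
ci <- theta_hat - q * se_hat
ci
```

The cost is R * M evaluations of the statistic rather than R, which is why the studentized option can take noticeably longer.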
pairs_boot is used to bootstrap a statistic which operates on two samples and returns a single value. An example of such a statistic is the correlation coefficient (i.e. cor). Resampling is done pairwise, so x and y must have the same length (and be ordered correctly). One can alternatively pass a two-column matrix to x.
pairs_boot(x, y = NULL, FUN, R, student = FALSE, M, weights = NULL, ...)
x |
Either a vector of numbers representing the first sample or a two column matrix containing both samples. |
y |
If NULL it is assumed that |
FUN |
The statistic to bootstrap. If |
R |
The number of bootstrap replicates. |
student |
Should we do a studentized bootstrap? This requires a double bootstrap so it might take longer. |
M |
If |
weights |
Resampling weights. |
... |
Other (named) arguments that should be passed to |
An object of class "simpleboot", which is almost identical to the regular "boot" object. For example, the boot.ci function can be used on this object.
Roger D. Peng
library(boot)
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)
boot.cor <- pairs_boot(x, y, FUN = cor, R = 100)
boot.ci(boot.cor)
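The requirement that x and y be resampled pairwise can be illustrated in base R. This sketch is not the package's code; it only contrasts pairwise resampling (one shared index vector) with independent resampling of each sample, which breaks the dependence the statistic is meant to measure.

```r
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)
R <- 200

## Pairwise: draw one index vector per replicate and apply it to both
## samples, so each (x[i], y[i]) pair stays together
cor_pairs <- replicate(R, {
  idx <- sample(length(x), replace = TRUE)
  cor(x[idx], y[idx])
})

## Independent: resample x and y separately, destroying the pairing
cor_indep <- replicate(R, {
  cor(sample(x, replace = TRUE), sample(y, replace = TRUE))
})

mean(cor_pairs)  # stays near cor(x, y)
mean(cor_indep)  # collapses toward 0
```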
perc can be used to extract percentiles from the sampling distribution of a statistic.
perc(boot.out, p = c(0.025, 0.975))
perc.lm(lm.boot.obj, p)
boot.out |
Output from either |
p |
numeric vector with values in [0, 1]. |
lm.boot.obj |
An object of class |
perc automatically calls perc.lm if boot.out is of the class "lm.simpleboot" so there is no need to use perc.lm separately.
For bootstraps which are not linear model bootstraps, perc returns a vector of percentiles of length length(p). Linear interpolation of percentiles is done if necessary. perc.lm returns a matrix of percentiles of each of the model coefficients. For example, if there are k model coefficients, perc.lm returns a length(p) by k matrix.
Roger D. Peng
x <- rnorm(100)
b <- one.boot(x, median, R = 1000)
perc(b, c(.90, .95, .99))
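The shapes of the two return values can be reproduced with a base R sketch. It uses quantile, whose default rule also interpolates linearly between order statistics, though the package's exact interpolation rule may differ. The simulated data and variable names are assumptions for illustration.

```r
set.seed(42)
x <- rnorm(100)

## Univariate case: a vector of replicates gives a vector of length(p)
med_star <- replicate(1000, median(sample(x, replace = TRUE)))
p <- c(0.025, 0.975)
pct <- quantile(med_star, probs = p)
pct

## Linear model case: a k x R matrix of coefficient replicates gives a
## length(p) x k matrix of percentiles
dat <- data.frame(u = runif(100))
dat$v <- 1 + 2 * dat$u + rnorm(100, sd = 0.3)
coef_star <- replicate(500, {
  idx <- sample(nrow(dat), replace = TRUE)
  coef(lm(v ~ u, data = dat[idx, ]))
})
p3 <- c(0.05, 0.5, 0.95)
pct_coef <- apply(coef_star, 1, quantile, probs = p3)
dim(pct_coef)  # length(p3) rows, one column per coefficient
```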
Plot regression lines with bootstrap standard errors. This method only works for 2-D regression fits.
## S3 method for class 'lm.simpleboot'
plot(x, add = FALSE, ...)
x |
An object of class |
add |
Switch indicating whether the regression line should be added to the current plot. |
... |
Additional arguments passed down to |
This function plots the data and the original regression line fit along with +/- 2 bootstrap standard errors at locations specified by the new.xpts argument to lm.boot (or on an evenly spaced grid).
Nothing is returned.
Roger D. Peng
## None right now
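As a rough picture of what such a plot contains, the following base R sketch computes fitted values on an evenly spaced grid for each bootstrap sample, takes the pointwise standard deviation as the bootstrap standard error, and draws the original line with +/- 2 standard error bands. It is a hand-rolled illustration on simulated data, not the plot method itself.

```r
set.seed(3)
dat <- data.frame(x = runif(60, 0, 10))
dat$y <- 3 - 0.5 * dat$x + rnorm(60)
fit <- lm(y ~ x, data = dat)

## Evenly spaced grid of 100 prediction points over the range of x
grid <- data.frame(x = seq(min(dat$x), max(dat$x), length.out = 100))

## One column of fitted values per bootstrap sample (rows resampled)
R <- 200
fit_star <- replicate(R, {
  idx <- sample(nrow(dat), replace = TRUE)
  predict(lm(y ~ x, data = dat[idx, ]), newdata = grid)
})

## Pointwise bootstrap standard errors and the original fitted line
se <- apply(fit_star, 1, sd)
center <- predict(fit, newdata = grid)

plot(dat$x, dat$y, xlab = "x", ylab = "y")
lines(grid$x, center)
lines(grid$x, center + 2 * se, lty = 2)
lines(grid$x, center - 2 * se, lty = 2)
```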
Plot loess lines with bootstrap standard errors.
## S3 method for class 'loess.simpleboot'
plot(x, add = FALSE, all.lines = FALSE, ...)
x |
An object of class |
add |
Should the loess line be plotted over the current plot? |
all.lines |
Should we plot each of the individual loess lines from the bootstrap samples? |
... |
Other arguments passed to |
plot constructs (and plots) the original loess fit and +/- 2 bootstrap standard errors at locations specified by new.xpts in loess.boot (or on an evenly spaced grid).
Nothing is returned.
Roger D. Peng
## See the help page for `loess.boot' for an example.
Extract sampling distributions of various entities from either a linear model or a loess bootstrap. Entities for linear models are currently model coefficients, residual sum of squares, R-square, and fitted values (given a set of X values in the original bootstrap). For loess, one can extract residual sum of squares and fitted values.
samples(object, name = c("fitted", "coef", "rsquare", "rss"))
object |
The output from either |
name |
The name of the entity to extract. The default is fitted values. |
Either a vector or matrix depending on the entity extracted. For example, when extracting the sampling distributions for linear model coefficients, the return value is a p x R matrix where p is the number of coefficients and R is the number of bootstrap replicates.
Roger D. Peng
data(airquality)
attach(airquality)
lmodel <- lm(Ozone ~ Solar.R + Wind)
lboot <- lm.boot(lmodel, R = 100)

## Get sampling distributions for coefficients
s <- samples(lboot, "coef")

## Histogram for the intercept
hist(s[1,])
two.boot is used to bootstrap the difference between various univariate statistics. An example is the difference of means. Bootstrapping is done by independently resampling from sample1 and sample2.
two.boot(sample1, sample2, FUN, R, student = FALSE, M, weights = NULL, ...)
sample1 |
First sample; a vector of numbers. |
sample2 |
Second sample; a vector of numbers. |
FUN |
The statistic which is applied to each sample. This can be a quoted string or a function name. |
R |
Number of bootstrap replicates. |
student |
Should we do a studentized bootstrap? This requires a double bootstrap so it might take longer. |
M |
If |
weights |
Resampling weights; a list with two components. The
first component of the list is a vector of weights for
|
... |
Other (named) arguments that should be passed to
|
The differences are always taken as FUN(sample1) - FUN(sample2). If you want the difference to be reversed you need to reverse the order of the arguments sample1 and sample2.
An object of class "simpleboot", which is almost identical to the regular "boot" object. For example, the boot.ci function can be used on this object.
Roger D. Peng
library(boot)
set.seed(50)
x <- rnorm(100, 1)  ## Mean 1 normals
y <- rnorm(100, 0)  ## Mean 0 normals
b <- two.boot(x, y, median, R = 100)
hist(b)  ## Histogram of the bootstrap replicates
b <- two.boot(x, y, quantile, R = 100, probs = .75)
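The independent resampling and the sign convention for the difference can be sketched in base R. This is an illustration only, not the package's implementation: each replicate resamples the two samples separately and subtracts the second statistic from the first, so swapping the samples flips the sign.

```r
set.seed(50)
x <- rnorm(100, 1)  # mean 1 normals
y <- rnorm(100, 0)  # mean 0 normals
R <- 200

## Difference taken as FUN(sample1) - FUN(sample2), with each sample
## resampled independently of the other
diff_xy <- replicate(R, {
  median(sample(x, replace = TRUE)) - median(sample(y, replace = TRUE))
})

## Reversing the order of the samples reverses the sign of the difference
diff_yx <- replicate(R, {
  median(sample(y, replace = TRUE)) - median(sample(x, replace = TRUE))
})

mean(diff_xy)  # positive, since x has the larger location
mean(diff_yx)  # negative
```

Because the two samples are resampled independently, they do not need to have the same length, unlike the pairwise scheme used for paired data.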