Here follow some examples of how to add components for computation to a sim_setup. Three points can be accessed with
sim_comp_pop
- add a computation before sampling
sim_comp_sample
- add a computation after sampling
sim_comp_agg
- add a computation after aggregation
library(saeSim)
base_id(2, 3) %>% sim_gen_x() %>% sim_gen_e() %>% sim_gen_ec() %>%
sim_resp_eq(y = 100 + x + e) %>%
sim_comp_pop(comp_var(popMean = mean(y)), by = "idD")
## Source: local data frame [6 x 7]
##
## idD idU x e idC y popMean
## 1 1 1 -2.5058 1.950 FALSE 99.44 100.7
## 2 1 2 0.7346 2.953 FALSE 103.69 100.7
## 3 1 3 -3.3425 2.303 FALSE 98.96 100.7
## 4 2 1 6.3811 -1.222 FALSE 105.16 103.6
## 5 2 2 1.3180 6.047 FALSE 107.37 103.6
## 6 2 3 -3.2819 1.559 FALSE 98.28 103.6
The function comp_var is a wrapper around dplyr::mutate, so you can add simple data manipulations. The argument by is a small extension that lets you define operations within groups identified by a variable in the data.
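As a rough illustration of what such a grouped computation amounts to, the sketch below reproduces a popMean column in base R on a small hand-made data frame. It is not how saeSim implements comp_var; the data frame dat and its values are made up for illustration.

```r
# Toy population: two domains (idD) with three units each; values are made up.
dat <- data.frame(
  idD = rep(1:2, each = 3),
  idU = rep(1:3, times = 2),
  y   = c(99, 104, 100, 105, 107, 97)
)

# Group-wise mean of y, repeated for every unit of the domain. This mirrors
# the idea of comp_var(popMean = mean(y)) combined with by = "idD".
dat$popMean <- ave(dat$y, dat$idD, FUN = mean)
dat
```

Every unit in a domain receives the same value, which is the pattern visible in the popMean column of the output above.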
Some statistics will be of interest frequently, which is why some predefined components are available.
sim_comp_n()
- adds the sample sizes in n
sim_comp_N()
- adds the population sizes in N
sim_comp_popMean()
- adds the population means
sim_comp_popVar()
- adds the population variances
sim_base_lm() %>%
sim_comp_N() %>%
sim_comp_popMean() %>%
sim_comp_popVar() %>%
sim_sample() %>%
sim_comp_n()
## Source: local data frame [6 x 9]
##
## idD idU x e y N popMean popVar n
## 1 1 79 -1.8936 -0.99575 97.11 100 100.7 19.90 5
## 2 1 48 0.6130 1.64335 102.26 100 100.7 19.90 5
## 3 1 72 4.6416 -0.06541 104.58 100 100.7 19.90 5
## 4 1 81 0.1685 5.17994 105.35 100 100.7 19.90 5
## 5 1 96 -1.2800 8.31746 107.04 100 100.7 19.90 5
## 6 2 48 -0.5776 -3.54162 95.88 100 100.2 29.31 5
By adding computation functions you can extend the functionality of a sim_setup to wrap up your whole simulation. This separates the utility of this package from simply generating data. For example, add the linear predictor to the data:
comp_linearPredictor <- function(dat) {
  dat$linearPredictor <- lm(y ~ x, dat) %>% predict
  dat
}
sim_base_lm() %>%
sim_comp_pop(comp_linearPredictor)
## idD idU x e y linearPredictor
## 1 1 1 -4.4516 2.3544 97.90 95.58
## 2 1 2 5.3566 6.4636 111.82 105.27
## 3 1 3 -2.7164 0.1963 97.48 97.29
## 4 1 4 -0.2168 0.2930 100.08 99.76
## 5 1 5 -2.9725 7.3210 104.35 97.04
## 6 1 6 2.1499 -2.1982 99.95 102.10
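Because a component is simply a function that takes a data frame and returns a data frame, it can also be generated by another function. The following sketch uses a hypothetical helper, comp_predict_from, which is not part of saeSim, to build such components for an arbitrary model formula. The toy data are made up.

```r
# Hypothetical helper (not part of saeSim): returns a component function that
# fits a linear model and stores the fitted values in a new column.
comp_predict_from <- function(formula, name = "linearPredictor") {
  force(formula)
  force(name)
  function(dat) {
    fit <- lm(formula, data = dat)
    dat[[name]] <- as.numeric(predict(fit))
    dat
  }
}

comp_lp <- comp_predict_from(y ~ x)

# The component can be tried on any data frame with columns x and y.
set.seed(1)
toy <- data.frame(x = rnorm(10))
toy$y <- 100 + toy$x + rnorm(10)
head(comp_lp(toy))
```

A function built this way could then be passed to sim_comp_pop in the same way as comp_linearPredictor.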
Or, should this be desirable, directly produce a list of lm objects or add them as an attribute to the data. However, the intended way of writing functions is that they return the modified data set, which has class 'data.frame'.
sim_base_lm() %>%
sim_comp_pop(function(dat) lm(y ~ x, dat)) %>%
sim(R = 1)
## [[1]]
##
## Call:
## lm(formula = y ~ x, data = dat)
##
## Coefficients:
## (Intercept) x
## 99.97 0.99
comp_linearModelAsAttr <- function(dat) {
  attr(dat, "linearModel") <- lm(y ~ x, dat)
  dat
}
dat <- sim_base_lm() %>%
sim_comp_pop(comp_linearModelAsAttr) %>%
as.data.frame
attr(dat, "linearModel")
##
## Call:
## lm(formula = y ~ x, data = dat)
##
## Coefficients:
## (Intercept) x
## 100.047 0.986
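Since components are meant to return a data frame, a small guard can make violations visible early. The wrapper below, check_component, is a hypothetical helper and not part of saeSim; it is a sketch that assumes a component takes and returns a single data frame. The toy data are made up.

```r
# Hypothetical helper (not part of saeSim): wraps a component and stops with
# an error if the result is not a data frame.
check_component <- function(fun) {
  force(fun)
  function(dat) {
    out <- fun(dat)
    if (!is.data.frame(out)) {
      stop("component must return a data.frame, got class: ",
           paste(class(out), collapse = "/"))
    }
    out
  }
}

# Keeps the model as an attribute, so the return value is still a data frame.
comp_linearModelAsAttr <- function(dat) {
  attr(dat, "linearModel") <- lm(y ~ x, dat)
  dat
}

toy <- data.frame(x = 1:5, y = c(2.1, 3.9, 6.2, 7.8, 10.1))
res <- check_component(comp_linearModelAsAttr)(toy)
class(attr(res, "linearModel"))
```

Wrapping a component that returns the lm object itself would raise the error instead, which is one way to notice that a function departs from the intended data-frame-in, data-frame-out pattern.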
If you use any kind of sampling, the 'linearPredictor' can be added after sampling. This is where small area models are supposed to be applied.
sim_base_lm() %>%
sim_sample() %>%
sim_comp_sample(comp_linearPredictor)
## idD idU x e y linearPredictor
## 1 1 60 2.09821 -3.8323 98.27 102.03
## 2 1 88 -3.04640 -6.8891 90.06 96.87
## 3 1 58 7.48470 2.9024 110.39 107.44
## 4 1 39 -5.55043 -0.9880 93.46 94.35
## 5 1 35 -0.02147 6.6930 106.67 99.90
## 6 2 93 -0.61504 -0.6941 98.69 99.31
Should you want to apply area level models, use sim_comp_agg instead.
sim_base_lm() %>%
sim_sample() %>%
sim_agg() %>%
sim_comp_agg(comp_linearPredictor)
## Source: local data frame [6 x 5]
##
## idD x e y linearPredictor
## 1 1 0.76379 -1.08301 99.68 100.95
## 2 2 -0.35907 -0.04239 99.60 99.59
## 3 3 2.43399 1.78646 104.22 102.98
## 4 4 0.50121 2.39171 102.89 100.63
## 5 5 0.09757 1.45589 101.55 100.14
## 6 6 1.84827 -2.99871 98.85 102.27
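To make the difference between unit-level and area-level input concrete, the sketch below aggregates a toy sample to domain means in base R and then applies the same kind of linear predictor to the aggregated rows. It assumes that aggregation means taking domain means, which may differ from what sim_agg does in detail; the data are made up and the code does not call saeSim.

```r
set.seed(2)
# Toy sample: 4 domains with 5 sampled units each; values are made up.
smp <- data.frame(idD = rep(1:4, each = 5), x = rnorm(20))
smp$y <- 100 + smp$x + rnorm(20)

# One row per domain, holding the domain means of x and y.
agg <- aggregate(cbind(x, y) ~ idD, data = smp, FUN = mean)

# Same kind of component as above, now fitted on area-level data.
comp_linearPredictor <- function(dat) {
  dat$linearPredictor <- as.numeric(predict(lm(y ~ x, dat)))
  dat
}
comp_linearPredictor(agg)
```

The model is now fitted on one observation per domain rather than one per sampled unit, which is the situation an area-level model works with.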