This vignette takes a more detailed look at the matched.set object, the core object within the package that captures all of the information associated with treated units and their matched control units.
First, we will create a smaller subset of the dem data set, which is included in the package. This is just to make some of our results easier to read. We then use the DisplayTreatment function to get a sense of treatment variation within the subset of data.
library(PanelMatch)
## Registered S3 methods overwritten by 'tibble':
## method from
## format.tbl pillar
## print.tbl pillar
# keep only the first 10 unit ids so the output is easier to read
uid <- unique(dem$wbcode2)[1:10]
subdem <- dem[dem$wbcode2 %in% uid, ]
DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = "dem", data = subdem)
We can use the PanelMatch function with refinement.method set to "none" to obtain a PanelMatch object, from which we will extract a matched.set object. PanelMatch returns an S3 object of the PanelMatch class. These objects are just lists with some additional attributes. Here, we will focus on one element contained within PanelMatch objects: matched.set objects. Within the PanelMatch object, this element is always named att, art, or atc, reflecting the name of the specified qoi. When qoi = "ate", the resulting PanelMatch object contains two matched.set objects, named att and atc, respectively.
PM.results <- PanelMatch(lag = 4, time.id = "year", unit.id = "wbcode2",
                         treatment = "dem", refinement.method = "none",
                         data = subdem, match.missing = TRUE,
                         qoi = "att", outcome.var = "y",
                         lead = 0, forbid.treatment.reversal = FALSE)
#Extract the matched.set object
msets <- PM.results$att
In implementation, the matched.set is just a named list with some added attributes (e.g. lag window size, names of treatment, unit, and time variables) and a structured name scheme. Each entry in the list is a vector containing the unit ids of the control units in a matched set. Each entry also corresponds to a unit id/time pair (the unit id of a treated unit and the time at which treatment occurred). This is reflected in the name of each list element, which follows the scheme [id variable].[time variable].
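The naming scheme can be split back into its two components with base R. The following is a minimal sketch that assumes only the element names printed by names(msets) in this vignette; the set_names and treated objects are illustrative and not part of the package.

```r
# Element names as printed by names(msets) in this vignette (copied by hand).
set_names <- c("4.1992", "4.1997", "6.1973", "6.1983",
               "7.1991", "12.1992", "13.2003", "7.1998")

# Split each name on the literal "." into unit id and time.
parts <- strsplit(set_names, ".", fixed = TRUE)

# Illustrative data frame of treated unit ids and treatment times.
treated <- data.frame(
  unit = as.integer(vapply(parts, `[`, character(1), 1)),
  time = as.integer(vapply(parts, `[`, character(1), 2))
)
treated
```

This relies on unit ids that do not themselves contain a ".", which holds for the integer ids used here.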
Matched set objects are implemented as lists, but the default printing behavior resembles that of a data frame. One can toggle a verbose option on the print method to print as a list and also display a less summarized version of the matched set data.
# observe the naming scheme
names(msets)
## [1] "4.1992" "4.1997" "6.1973" "6.1983" "7.1991" "12.1992" "13.2003"
## [8] "7.1998"
#data frame printing view: useful as a summary view with large data sets
# first column is unit id variable, second is time variable, and
# third is the number of controls in that matched set
print(msets)
## wbcode2 year matched.set.size
## 1 4 1992 2
## 2 4 1997 1
## 3 6 1973 1
## 4 6 1983 2
## 5 7 1991 4
## 6 12 1992 2
## 7 13 2003 2
## 8 7 1998 0
# prints as a list, shows all data at once
print(msets, verbose = TRUE)
## $`4.1992`
## [1] 3 13
## attr(,"treatment.change")
## [1] 1
## attr(,"control.change")
## [1] 0 0
## attr(,"weights")
## 3 13
## 0.5 0.5
##
## $`4.1997`
## [1] 7
## attr(,"treatment.change")
## [1] 1
## attr(,"control.change")
## [1] 0
## attr(,"weights")
## 7
## 1
##
## $`6.1973`
## [1] 13
## attr(,"treatment.change")
## [1] 1
## attr(,"control.change")
## [1] 0
## attr(,"weights")
## 13
## 1
##
## $`6.1983`
## [1] 4 13
## attr(,"treatment.change")
## [1] 1
## attr(,"control.change")
## [1] 0 0
## attr(,"weights")
## 4 13
## 0.5 0.5
##
## $`7.1991`
## [1] 3 4 12 13
## attr(,"treatment.change")
## [1] 1
## attr(,"control.change")
## [1] 0 0 0 0
## attr(,"weights")
## 3 4 12 13
## 0.25 0.25 0.25 0.25
##
## $`12.1992`
## [1] 3 13
## attr(,"treatment.change")
## [1] 1
## attr(,"control.change")
## [1] 0 0
## attr(,"weights")
## 3 13
## 0.5 0.5
##
## $`13.2003`
## [1] 3 12
## attr(,"treatment.change")
## [1] 1
## attr(,"control.change")
## [1] 0 0
## attr(,"weights")
## 3 12
## 0.5 0.5
##
## $`7.1998`
## numeric(0)
## attr(,"treatment.change")
## [1] 1
##
## attr(,"lag")
## [1] 4
## attr(,"t.var")
## [1] "year"
## attr(,"id.var")
## [1] "wbcode2"
## attr(,"treatment.var")
## [1] "dem"
## attr(,"refinement.method")
## [1] "none"
## attr(,"match.missing")
## [1] TRUE
Note that in the verbose print view above, each control unit has an associated weight. These weights are assigned by the refinement process. In this example, no refinement has been applied, so each control unit in a matched set has equal weight. See the Using PanelMatch vignette for more about refinement.
Weights are attributes of each element in the matched.set list. As such, they can also be extracted as follows:
# weights for control units in first matched set
attr(msets[[1]], "weights")
## 3 13
## 0.5 0.5
Note that this returns a vector of weights. The name of each element in the vector corresponds to the control unit that the weight is associated with. Weights are normalized and should always sum to 1 within each matched set.
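The normalization can be checked directly. The sketch below builds a small mock list by hand, mirroring two of the matched sets printed above, and sums the weights attribute of each element; mock_sets and weight_sums are illustrative names, and nothing here calls the package itself.

```r
# Hand-built stand-ins for two matched sets shown in the verbose print view.
mock_sets <- list(
  "4.1992" = structure(c(3, 13),
                       weights = c("3" = 0.5, "13" = 0.5)),
  "7.1991" = structure(c(3, 4, 12, 13),
                       weights = c("3" = 0.25, "4" = 0.25,
                                   "12" = 0.25, "13" = 0.25))
)

# Sum the "weights" attribute within each matched set.
weight_sums <- vapply(mock_sets,
                      function(s) sum(attr(s, "weights")),
                      numeric(1))

# Compare with a tolerance rather than exact equality.
all(abs(weight_sums - 1) < 1e-8)
```

In the verbose output above, the empty matched set (7.1998) is printed without a weights attribute, so a check like this would presumably need to skip empty sets.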
The '[' and '[[' operators are implemented for matched.set objects and should work intuitively.
Using '[' returns a subsetted matched.set object (list). The additional attributes will be copied and transferred as well with the custom operator. Note how, by default, it prints like the full form of the matched.set. Using '[[' will return the unit ids of the control units in the specified matched set.
Since matched.set objects are just lists with attributes, you can expect the '[' and '[[' functions to work similarly to how they would with a list. So, for instance, users can extract information about matched sets using numerical indices or by taking advantage of the naming scheme.
msets[1]
## wbcode2 year matched.set.size
## 1 4 1992 2
#prints the control units in this matched set
msets[[1]]
## [1] 3 13
## attr(,"treatment.change")
## [1] 1
## attr(,"control.change")
## [1] 0 0
## attr(,"weights")
## 3 13
## 0.5 0.5
msets["4.1992"] #equivalent to msets[1]
## wbcode2 year matched.set.size
## 1 4 1992 2
msets[["4.1992"]] #equivalent to msets[[1]]
## [1] 3 13
## attr(,"treatment.change")
## [1] 1
## attr(,"control.change")
## [1] 0 0
## attr(,"weights")
## 3 13
## 0.5 0.5
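To see why a custom '[' method matters, recall that subsetting a plain R list with '[' keeps names but drops other attributes. The sketch below illustrates the general idea with a hypothetical demo.set class; it is not the package's implementation, and the class name and method are made up for illustration.

```r
# A plain list with an extra attribute: base `[` drops "lag".
plain <- structure(list("4.1992" = c(3, 13), "4.1997" = 7), lag = 4)
is.null(attr(plain[1], "lag"))

# Hypothetical S3 class whose `[` method copies the attribute back.
`[.demo.set` <- function(x, i) {
  out <- unclass(x)[i]              # ordinary list subsetting
  attr(out, "lag") <- attr(x, "lag")  # carry the attribute over
  class(out) <- class(x)            # keep the class on the subset
  out
}

demo <- structure(list("4.1992" = c(3, 13), "4.1997" = 7),
                  lag = 4, class = "demo.set")
sub <- demo[1]
attr(sub, "lag")
```

A real implementation would need to carry over every attribute the object stores (for example the unit, time, and treatment variable names), not just one.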
Calling plot on a matched.set object will display a histogram of the sizes of the matched sets. By default, the number of empty matched sets (treated unit/time id pairs with no suitable controls for a match) is noted with a vertical bar at x = 0. One can include empty sets in the histogram by setting the include.empty.sets argument to TRUE.
plot(msets, xlim = c(0, 4))
# Use full data
PM.results.full <- PanelMatch(lag = 4, time.id = "year", unit.id = "wbcode2",
                              treatment = "dem", refinement.method = "none",
                              data = dem, match.missing = TRUE,
                              qoi = "att", outcome.var = "y",
                              lead = 0, forbid.treatment.reversal = FALSE)
# Plot the matched.set object extracted from the full-data results
plot(PM.results.full$att)
The summary function provides a variety of information about the sizes of matched sets, the unit and time ids of treated units, the number of empty sets, and the lag window size. The summary function also has an option to print only the overview data frame. Toggle this by setting verbose = FALSE.
print(summary(msets))
## $overview
## wbcode2 year matched.set.size
## 1 4 1992 2
## 2 4 1997 1
## 3 6 1973 1
## 4 6 1983 2
## 5 7 1991 4
## 6 12 1992 2
## 7 13 2003 2
## 8 7 1998 0
##
## $set.size.summary
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 1.00 2.00 1.75 2.00 4.00
##
## $number.of.treated.units
## [1] 8
##
## $num.units.empty.set
## [1] 1
##
## $lag
## [1] 4
print(summary(msets, verbose = FALSE))
## wbcode2 year matched.set.size
## 1 4 1992 2
## 2 4 1997 1
## 3 6 1973 1
## 4 6 1983 2
## 5 7 1991 4
## 6 12 1992 2
## 7 13 2003 2
## 8 7 1998 0
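The set.size.summary and num.units.empty.set entries shown above follow directly from the matched set sizes. As a rough cross-check, the sketch below recomputes them with base R from the matched.set.size column of the overview, copied by hand; the sizes vector is illustrative and the package may compute these values differently internally.

```r
# matched.set.size column from the overview data frame above.
sizes <- c(2, 1, 1, 2, 4, 2, 2, 0)

# Five-number summary plus mean, as in $set.size.summary.
size_summary <- summary(sizes)
size_summary

# Number of treated observations and number of empty matched sets.
n_treated <- length(sizes)
n_empty <- sum(sizes == 0)
```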
DisplayTreatment function
Passing a matched set (one treated unit and its corresponding set of controls) to the DisplayTreatment function will visually highlight the lag window histories used to create that matched set. There is also an option to only display units from the matched set (and the treated unit), which can be achieved by setting show.set.only to TRUE.
# pass matched.set object using the `[` operator
DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = "dem",
                 data = subdem, matched.set = msets[1])
# only show matched set units
DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = "dem",
                 data = subdem, matched.set = msets[1],
                 show.set.only = TRUE, y.size = 15, x.size = 13)