density now gives warnings when called with weights; avoid this.
Check for tied p-values fails on M1mac.
Fix documentation bug.
Register default methods.
NA in responses when plotting. Reported by Tyson H. Holmes.
Address random CRAN errors (honesty checks).
Update reference output.
Test constparty vignette code in tests, to avoid a NOTE about missing RWeka on Solaris.
New method argument for glmtree. The default is "glm.fit" (as was hard-coded previously), but this can be changed, e.g., to "brglmFit" from brglm2 for bias-reduced estimation of generalized linear models.
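As a rough sketch of how the fitter can be selected, assuming a partykit version in which glmtree accepts the method argument (the simulated data and the names d, x, z, y are made up for illustration):

```r
library("partykit")

## Simulated data (hypothetical): the slope of x flips sign between the levels of z
set.seed(1)
d <- data.frame(x = rnorm(200), z = factor(rep(c("a", "b"), each = 100)))
d$y <- rbinom(200, size = 1, prob = plogis(ifelse(d$z == "a", -1, 1) * d$x))

## Default fitter, spelled out explicitly
tr <- glmtree(y ~ x | z, data = d, family = binomial, method = "glm.fit")
print(tr)

## With brglm2 installed and attached, the bias-reduced fitter could be
## requested instead (not run here):
## tr_br <- glmtree(y ~ x | z, data = d, family = binomial, method = "brglmFit")
```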
Fix LaTeX problem.
Better checks for response classes, fixing a bug reported by John Ogawa.
The "xlevels" attribute for the regressors is preserved in the models fitted within the trees. Thus, when predicting for data whose "xlevels" do not match, an error is now generated (as opposed to a warning and partially incorrect predictions).
Add an experimental implementation of honesty.
maxvar argument to
ctree_control for restricting
the number of split variables to be used in a tree.
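A minimal sketch of the restriction, using the iris data purely for illustration. The helper code that collects the split variables is one possible way to inspect the tree, not part of the change itself:

```r
library("partykit")

## Allow splits in at most one variable
ct <- ctree(Species ~ ., data = iris, control = ctree_control(maxvar = 1))

## Collect the names of the variables used for splitting in the inner nodes
inner <- setdiff(nodeids(ct), nodeids(ct, terminal = TRUE))
split_vars <- unique(unlist(nodeapply(ct, ids = inner,
  FUN = function(node) names(ct$data)[split_node(node)$varid])))
split_vars
```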
all.equal must not check environments.
Non-standard variable names are now handled correctly within
extree_data, prompted by
Deal with non-integer
Handle NAs in
Fix an issue with printing of tied p-values.
Pruning of modelparty objects failed to get the fitted slot right.
In R-devel, c(<factor>) now returns factors, rendering code in .simplify_pred overly pedantic.
NAMESPACE fixes: party is only suggested.
Remove warning about response not being a factor in
predict.cforest. Reported by Stephen Milborrow.
Trying to split in a variable where all observations were missing nevertheless produced a split, as reported by Kevin Ummel.
Update reference output, fix RNGversion.
varimp runs in parallel mode, optionally.
cforest can now be used
to specify a matrix of weights (number of observations
times number of trees) to be used for tree induction (this was
always possible in
party::cforest). This was advertised in
the documentation but actually not implemented so far.
predict did not pay attention to
xlev; this caused
problems when empty factor levels were removed prior to tree fitting.
nodeprune may have got fitted terminal node numbers wrong,
spotted by Jason Parker.
mob() using the
cluster argument with a
variable sometimes led to
NAs in the covariance matrix estimate
if empty categories occurred in subgroups. The problem had been introduced
in version 1.2-0 and has been fixed now.
Methods for the
sctest generic from the strucchange
package are now dynamically registered if strucchange is attached.
Alternatively, the methods can be called directly using their full names
prune.modelparty function is now fully exported but
it is also registered with the
prune generic from rpart.
scale argument for
For simple regression forests, predicting
the conditional mean by nearest neighbor weights with
scale = TRUE is now
equivalent to the aggregation of means. The unscaled version
proposed in <doi:10.1002/sim.1593> can be obtained with
scale = FALSE.
The bug fix for case weights in
mob() in the previous version (1.2-0)
introduced a bug in the handling of proportionality weights. Both cases
are handled correctly now.
glmtree can now handle
caseweights = TRUE correctly
vcov other than the default
glm objects are adjusted by correcting the dispersion
estimate and the degrees of freedom.
lookahead did not work in the presence of missing values.
partykit::ctree did not work when partykit was not
node_inner now allows setting a different
gpar(fontsize = ...)
in the inner nodes compared to the overall tree.
splittest asked for Monte-Carlo p-values, even
when the test statistic was used as criterion.
We welcome Heidi Seibold as co-author!
Internal re-organisation for
ctree by means of new extensible
tree infrastructure (available in
extree_fit). Certain parts of the new infrastructure
are still experimental.
ctree is fully backward
Use libcoin for computing linear test statistics
and p-values for
Use inum for binning (the new
Quadratic test statistics for splitpoint selection are now available for
ctree_control(splitstat = "quadratic").
Maximally selected test statistics for variable selection are now available for
ctree_control(splittest = TRUE).
Missing values can be treated as a separate category, also
for splits in numeric variables in
ctree_control(MIA = TRUE).
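The control settings above can be combined in a single call. The following is only a sketch: it uses the airquality data (which has missing values in Ozone and Solar.R) as an arbitrary example, and leaves out splittest = TRUE because that option is tied to Monte-Carlo p-values:

```r
library("partykit")

## Quadratic statistics for split point selection, and missing values
## treated as a separate category (MIA) when splitting
ctrl <- ctree_control(splitstat = "quadratic", MIA = TRUE)

## Temp has no missing values; Ozone and Solar.R do
ct <- ctree(Temp ~ Ozone + Solar.R + Wind, data = airquality, control = ctrl)
print(ct)
```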
Permutation variable importance, including conditional variable importance, was added to partykit.
offset argument in
get_paths for computing paths to nodes.
node_barplot gained a
that can be used to draw text labels for the
margins used in
plot.party can now
also be set by the user.
Bug fix in
weights are used and
caseweights = TRUE (the default).
The statistics for the parameter instability tests were computed incorrectly and
consequently the selection of splitting variables and also the stopping criterion
Avoid log(p) values of
mob() by replacing weighted averaging
with naive averaging in the response surface regression output in case
the p values are below machine precision.
as.party method for
without any splits only returned a naked
rather than a full
party. This has been corrected
nodeapply did not produce the same results for permutations of
Spotted by Heidi Seibold.
Out-of-bag predictions in
predict.cforest were incorrect.
predict was only considered when
newdata was given. Spotted by Heidi Seibold.
Don't try to search for binary splits in unordered factors with more than 31
levels. This potentially caused an integer overrun in previous versions.
party::ctree() uses an approximation for binary split
searches in unordered factors; thus, using party might be an
Proper support of quasi-families in
NA handling by following the majority was
potentially incorrect in
Minor speed improvements.
Breaking ties before variable selection was suboptimal for very small log-p-values.
Added a new function
palmtree that fits partially
additive (generalized) linear model trees. These employ
model-based recursive partitioning (
mob) based on (generalized)
linear models with some local (i.e., leaf-specific) and
some global (i.e., constant throughout the tree) regression
Splits in ordinal variables are now represented correctly
in the (still internal)
Kaplan-Meier curves in
"constparty" trees were
plotted incorrectly due to a wrong scaling of the
x-axis. Spotted by Peter Calhoun <firstname.lastname@example.org>.
quote(stats::model.frame) instead of
as.party methods for
Weka_tree now have a
data = TRUE argument
so that by default the data is preserved in the
object (instead of an empty model frame).
predict method for
did not work for one-row data frames, fixed now.
just arguments to
for more fine control of x-axis labeling (e.g., with 45 degree
The partykit package has now been published in Journal of Machine Learning Research, 16, 3905-3909. https://jmlr.org/papers/v16/hothorn15a.html
Added support for setting background in panel functions.
as.list() method for
erroneously created an object
thisnode in the calling
environment which is avoided now.
Bug fix in
plot() method for
In the previous partykit version clipping was accidentally
also applied to the axes labels.
plot(..., type = "simple")
did not work correctly whereas
yielded the desired visualization. Now internally
is called also in the former case.
as.simpleparty() method now preserves p-values from
constparty objects (if any).
getCall() method for
predict() method for
offset (if any) was sometimes ignored. It is
now always used in the prediction.
logrank_trafo from coin.
nodeprune(..., ids = 1) did not prune the tree to the root
node. Fixed now.
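A small sketch of pruning to the root node, with a tree on the iris data chosen only for illustration:

```r
library("partykit")

ct <- ctree(Species ~ ., data = iris)

## Prune everything below node 1, i.e., collapse the tree to its root
root <- nodeprune(ct, ids = 1)

## Number of terminal nodes before and after pruning
c(before = width(ct), after = width(root))
```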
na.omit instead of
predict.party now features a new
perm argument for
permuting splits in specific variables (useful for computing
permutation variable importances).
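As an illustration of the idea, the sketch below compares ordinary predictions with predictions where splits in one variable are permuted. The choice of iris and of Petal.Length is arbitrary, and the share of changed predictions is only a crude stand-in for a proper permutation importance:

```r
library("partykit")

ct <- ctree(Species ~ ., data = iris)

## Ordinary predictions
p0 <- predict(ct, newdata = iris)

## Predictions with splits in Petal.Length permuted
set.seed(29)
p1 <- predict(ct, newdata = iris, perm = "Petal.Length")

## Share of observations whose predicted class changed
mean(p0 != p1)
```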
The support for (generalized) linear model trees with just
a constant regressor has been improved. Now
lmtree(y ~ x1 + x2)
is short for
lmtree(y ~ 1 | x1 + x2), analogously for
Plotting now also works properly in this case.
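A quick sketch of the shorthand, using simulated data (the data frame d and its columns are hypothetical). Both calls should describe the same intercept-only model tree:

```r
library("partykit")

## Simulated data: the mean of y shifts with x1, x2 is pure noise
set.seed(1)
d <- data.frame(x1 = runif(200), x2 = runif(200))
d$y <- ifelse(d$x1 > 0.5, 2, 0) + rnorm(200, sd = 0.5)

## Short form and explicit form with a constant regressor
t1 <- lmtree(y ~ x1 + x2, data = d)
t2 <- lmtree(y ~ 1 | x1 + x2, data = d)

## Compare the node-wise intercepts of the two trees
all.equal(as.vector(coef(t1)), as.vector(coef(t2)))
```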
as.party() method for
"rpart" objects did not
work if one of the partitioning variables was a
variable rather than a
"factor". A suitable work-around has
node_barplot() panel function can now also be used
for multivariate responses, e.g., when all responses are numeric and
on the same scale.
The package now also includes a new data set
which is essentially a copy of the
spider data from the package
mvpart that is currently archived on CRAN. The documentation has
been improved somewhat and is likely to change further to explain how
the data has been transformed in De'ath (2002).
The survival tree example for the GBSG2 data was broken due to the response being (incorrectly) also part of the explanatory variables. Fixed by using the latest Formula package (at least version 1.2-1).
Version 1.0-0 published. This version is described in the MLOSS paper accepted for publication by the Journal of Machine Learning Research today.
The unbiased version of
replace = FALSE) is
now the default (as in party).
Register all S3 methods in
mob() interface by a
cluster argument. This can be
a vector (numeric, integer, factor) with cluster IDs that are
then passed on to the 'fit' function (if supported) and used
for clustering the covariance matrix in the parameter stability
glmtree() hence both gained a
argument which is used only for cluster covariances but not
for the model estimation (i.e., corresponding to a working
Optionally, the parameters' variance-covariance matrix in
can now be estimated by the sandwich matrix instead of the default
outer-product-of-gradients (OPG) matrix or the information matrix.
cforest() available with extended
prediction facilities. Both the internal representation and the user interface
are still under development and are likely to change in future versions.
Added multicore support to
If control argument
cores is specified (e.g.,
cores = 4) then the
search for the best variable or split point (often involving numerous model fits in
mob() or resampling in
ctree()) is carried out using
rather than sequential
sapply(). Additionally, other
applyfuns can be provided, e.g., using networks of workstations etc.
Bug fix in
mob() that occurred when regressor variables and
partitioning variables overlapped and were not sorted in the
underlying model frame.
mvpart was archived 2014-12-15.
Fixed an uninitialized memory issue reported by valgrind.
partykit now depends on R version >= 3.1.0 in order to import the
depth() generic from the grid package.
The print methods for
partynode objects with only a root node
were modified. Now, the terminal panel function is also applied
if there is only a root node (while previously it was not).
ctree() now catches
sum(weights) <= 1 situations before they
lead to an error.
Code from suggested packages is included by using
:: syntax as
required by recent R versions.
ctree() can now be a function which will be
updated in every node.
A small demo briefly illustrating some memory and speed properties
has been added. It can be run interactively via
demo("memory-speed", package = "partykit").
Section 3 of the "constparty" vignette now shows how properties of a new tree algorithm can be assessed by using partykit building blocks.
Substantially improved version of partykit. The previously existing functions in the package were tested and enhanced, and new functions and extensive vignettes were added.
Extended and improved introductory documentation. The basic classes
and class constructors
party are introduced in
much more detail now in
vignette("partykit", package = "partykit").
constparty (inheriting from
party) for representing
objects with constant fits in the nodes (along with coercion methods
J48, etc.) is now described in more detail in the new
vignette("constparty", package = "partykit").
The package now includes a reimplementation of the model-based
recursive partitioning algorithm (MOB) using partykit infrastructure.
The generic algorithm, the corresponding convenience interfaces
glmtree() as well as various illustrations and possible
extensions are described in detail in the new
vignette("mob", package = "partykit").
Improved implementation of conditional inference trees (CTree), see
vignette("ctree", package = "partykit") for details.
nodeprune() generic for pruning nodes in all
party trees and
Deal with empty levels in
teststat = "quad"
(bug reported by Wei-Yin Loh <loh_at_stat.wisc.edu>).
predict() method for
type = "prob" now returns
ECDF for numeric responses and
type = "response" the (weighted) mean.
New panel function
node_ecdf() for plotting empirical cumulative
distribution functions in the terminal leaves of
Bug fix in
as.party() method for J48 trees with ordered factors.
Fix C code problems reported by clang under OS X.
node_surv() for plotting survival ctrees. Accompanying
infrastructure for survival trees was enhanced.
ctree() now checks for (and does not allow)
x >= max(x) splits.
Added ipred to the list of suggested packages due to usage of GlaucomaM and GBSG2 data in tests/examples.
node_terminal() panel-generating function is now customizable
by a FUN argument that is passed to
plot() method for
simpleparty object now sets up a formatting
function passed to
formatinfo(), both in
Fixed bug in
pmmlTreeModel() for processing label IDs in splits when
not all levels are present.
Cleaned up unused variables in C code and partial argument matching in R code.
First CRAN release.
vignette("partykit", package = "partykit") for a (somewhat rough)
introduction to the package and its classes/methods.