R packages under analysis were retrieved from CRAN/Bioconductor on 2021-10-28. There are <%=n_cran%> packages from CRAN and <%=n_bioc%> packages from Bioconductor (bioc version 3.14).
For a package P, its dependency packages are listed in the Depends, Imports, LinkingTo, Suggests and Enhances fields in its DESCRIPTION file. We define the following dependency categories for package P:
Strong parent packages: packages listed in the Depends, Imports and LinkingTo fields (packages in the red box in the following diagram). They are also called the strong direct dependency packages of P. Parent packages are enforced to be installed when installing package P.
... Suggests and Enhances are also included (package categories A, B, C and D, plus all packages listed in the box of package P). This simulates the number of strong dependency packages when all Suggests and Enhances packages are moved to Depends/Imports of P.
Next, various measures of heaviness are defined as follows:
... Suggests of P. In other words, the heaviness measures the number of additional, uniquely required packages that A brings to P.
... Suggests, the heaviness of P on its child packages is calculated as $\frac{1}{K_c} \sum_{k=1}^{K_c}(n_{1k} - n_{2k})$. Here the heaviness measures the average number of additional packages P brings to its child packages.
... Suggests. Then we denote $n_{2k}$ as the number of strong dependencies of $B_k$ in the modified dependency graph. The heaviness of P on its downstream packages is calculated as $\frac{1}{K_d} \sum_{k=1}^{K_d}(n_{1k} - n_{2k})$.
The heaviness of a package on all its child packages is the more important measure for developers, since it tells them how many additional dependency packages are expected to be imported when they add a new parent package to their packages.
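The two simplest measures can be sketched on a toy dependency graph in base R. This is only an illustration, not the pkgndep implementation: the package names and the helper functions `strong_deps()`, `heaviness_from_parent()` and `heaviness_on_children()` are hypothetical, and it assumes that $n_1$ counts the strong dependencies of a package and $n_2$ counts them again after the parent in question has been moved to Suggests.

```r
# Toy graph: each package maps to its strong parents
# (Depends/Imports/LinkingTo). All package names are hypothetical.
strong_parents <- list(
  P = c("A", "B"),
  A = c("C", "D"),
  B = "D",
  C = character(0),
  D = character(0)
)

# All strong dependencies of `pkg`, found by following strong parents
# through the graph until no new package appears.
strong_deps <- function(pkg, graph) {
  seen <- character(0)
  queue <- graph[[pkg]]
  while (length(queue) > 0) {
    cur <- queue[1]
    queue <- queue[-1]
    if (!(cur %in% seen)) {
      seen <- c(seen, cur)
      queue <- c(queue, graph[[cur]])
    }
  }
  seen
}

# Heaviness of `parent` on `pkg` (assumed definition): n1 - n2, where n2 is
# counted after removing `parent` from the strong parents of `pkg`,
# which mimics moving it to Suggests.
heaviness_from_parent <- function(parent, pkg, graph) {
  n1 <- length(strong_deps(pkg, graph))
  modified <- graph
  modified[[pkg]] <- setdiff(graph[[pkg]], parent)
  n2 <- length(strong_deps(pkg, modified))
  n1 - n2
}

# Heaviness of `pkg` on its child packages (assumed definition):
# the average of n1k - n2k over all packages that list `pkg` as a strong parent.
heaviness_on_children <- function(pkg, graph) {
  is_child <- vapply(graph, function(parents) pkg %in% parents, logical(1))
  children <- names(graph)[is_child]
  mean(vapply(children,
              function(ch) heaviness_from_parent(pkg, ch, graph),
              numeric(1)))
}

h_A <- heaviness_from_parent("A", "P", strong_parents)
h_B <- heaviness_from_parent("B", "P", strong_parents)
h_D_children <- heaviness_on_children("D", strong_parents)
```

In this toy graph, A contributes itself and C uniquely to P, while D is also reachable through B, so D does not count toward the heaviness of A on P.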
All these measures tend to give high heaviness values when $K$ (i.e. the number of parents, children or downstream packages) is small. Packages with small $K$ are in general of less interest. It is more important to see, e.g., which packages heavily affect many child or downstream packages (i.e. with large $K$). Thus, the original definition of heaviness is adjusted to decrease the heaviness more for smaller $K$. A detailed explanation of the adjusted heaviness can be found in the tab "Heaviness analysis".
The previous definition of heaviness only measures the effect of a single package. Here we define another measure called "co-heaviness from parent package" that measures the number of additional dependency packages simultaneously imported by two parents.
Denote P's two strong parent packages as A and B, denote $S_A$ as the set of reduced dependency packages when only moving A to Suggests of P, denote $S_B$ as the set of reduced dependency packages when only moving B to Suggests of P, and denote $S_{AB}$ as the set of reduced dependency packages when moving A and B together to Suggests of P. The co-heaviness of A and B on P is calculated as $\left| S_{AB} \setminus (S_A \cup S_B) \right|$, where $|X|$ is the number of elements in set $X$ and $X \setminus Y$ is the set of elements in $X$ but not in $Y$.
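As a rough illustration of the co-heaviness formula, the following base R sketch computes $S_A$, $S_B$ and $S_{AB}$ on a toy graph. The package names and the helpers `strong_deps()` and `reduced_set()` are hypothetical and are not part of pkgndep; moving a parent to Suggests is mimicked by removing it from the strong parents of P.

```r
# Toy graph: each package maps to its strong parents. Names are hypothetical.
strong_parents <- list(
  P = c("A", "B"),
  A = c("C", "D"),
  B = "D"
)

# All strong dependencies of `pkg`, found by following strong parents
# through the graph until no new package appears.
strong_deps <- function(pkg, graph) {
  seen <- character(0)
  queue <- graph[[pkg]]
  while (length(queue) > 0) {
    cur <- queue[1]
    queue <- queue[-1]
    if (!(cur %in% seen)) {
      seen <- c(seen, cur)
      queue <- c(queue, graph[[cur]])
    }
  }
  seen
}

# Set of dependency packages that disappear from `pkg` when the parents in
# `moved` are taken out of its strong parents.
reduced_set <- function(moved, pkg, graph) {
  modified <- graph
  modified[[pkg]] <- setdiff(graph[[pkg]], moved)
  setdiff(strong_deps(pkg, graph), strong_deps(pkg, modified))
}

S_A  <- reduced_set("A", "P", strong_parents)
S_B  <- reduced_set("B", "P", strong_parents)
S_AB <- reduced_set(c("A", "B"), "P", strong_parents)

# Co-heaviness: packages that are only removed when A and B are moved together.
co_heaviness <- length(setdiff(S_AB, union(S_A, S_B)))
```

Here D is required by both A and B, so it is removed only when both parents are moved together, and it is the single package counted by the co-heaviness.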
Legends:
High heaviness: packages with adjusted heaviness on child packages higher than <%=CUTOFF$adjusted_heaviness_on_children[2]%>.
Median heaviness: packages with adjusted heaviness on child packages between <%=CUTOFF$adjusted_heaviness_on_children[1]%> and <%=CUTOFF$adjusted_heaviness_on_children[2]%>.
Reducible: packages whose parent's heaviness could be reduced, i.e. only a limited number of functions are imported from the parent.
Columns: Heaviness from parent packages; Heaviness on child/downstream packages
The full table of dependency heaviness analysis can be obtained by df = pkgndep::all_pkg_stat_snapshot().