The {rosv} package serves two core purposes:
Consequently, the functions in {rosv} relate to querying or downloading content from OSV, parsing JSON content, and generating tables and lists of key information on package vulnerabilities. The OSV database is not specific to R-related repositories such as CRAN; users can access information on any ecosystem available in OSV. R users who also dabble in Python can search for package vulnerabilities in the PyPI repository without leaving the R interface.
The following examples outline a range of the package's functionality, but first we must load the package.
library(rosv)
One of the simplest queries takes a package and ecosystem and returns a
TRUE/FALSE response indicating whether that package has ever been listed
with a vulnerability.
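To make the TRUE/FALSE idea concrete, here is a minimal base R sketch of the logic behind such a check. It assumes the public OSV `POST /v1/query` endpoint, which takes a package name and ecosystem and returns a `vulns` array when matches exist (and an empty object otherwise). The helpers `build_query_body()` and `has_vulns()` are hypothetical names used only for illustration; they are not part of {rosv}, and the response below is a hand-written mock rather than live API output.

```r
# Hypothetical helper: the request body the OSV query endpoint expects,
# expressed as an R list (it would be serialised to JSON before sending).
build_query_body <- function(name, ecosystem) {
  list(package = list(name = name, ecosystem = ecosystem))
}

# Hypothetical helper: TRUE if a parsed response lists any vulnerability.
# A response with no matches has no `vulns` element, so length() is 0.
has_vulns <- function(parsed_response) {
  length(parsed_response$vulns) > 0
}

body <- build_query_body("dask", "PyPI")

# Mock parsed responses (assumed shapes, not real API output).
mock_hit  <- list(vulns = list(list(id = "PYSEC-2021-387")))
mock_miss <- list()

has_vulns(mock_hit)   # TRUE
has_vulns(mock_miss)  # FALSE
```

The package's own helper would perform the HTTP request and parsing for you; the sketch only shows why the answer reduces to a single logical value.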
The most basic usage of {rosv} is to pull all versions of an
ecosystem’s packages (e.g. PyPI or CRAN) listed in the OSV database.
This can be achieved using high-level functions such as osv_query()
and create_osv_list().
Lower-level functions return more detail about the API request and
response contained within the R6 object. These are more flexible than
their higher-level alternatives. By default, the results of these
functions are cached. This can be overridden by specifying
cache = FALSE.
The higher-level API query function osv_query() aligns
returned content type and parsing method, making it the preferred choice
for the typical use-case.
# Returns entire response object to parse as you please.
osv_query_1('dask', ecosystem = 'PyPI')
# Returns the vulnerability IDs for packages in list
osv_querybatch('dask', ecosystem = 'PyPI')
# Return vulnerabilities from different ecosystems as vectors
osv_querybatch(c('dask', 'readxl'), ecosystem = c('PyPI', 'CRAN'))
# Fetch details for a single vulnerability ID
osv_vulns('PYSEC-2021-387')
By default, the API helpers (e.g. osv_query() or osv_querybatch())
cache their results using memoise::memoise(). Caching can be
turned off directly using function parameters, or the cache can be
cleared globally using clear_osv_cache(). Caching is the default
behavior to help enforce polite access to the OSV API.
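The effect of memoisation can be seen in a small, offline sketch. It assumes the {memoise} package is installed; `slow_lookup()` and the `calls` counter are hypothetical stand-ins for an API helper and are not {rosv} internals. The point is only that a repeated call with the same arguments is served from the cache instead of triggering a second request.

```r
library(memoise)

# Counter kept in an environment so the function can record its own calls.
calls <- new.env()
calls$n <- 0

# Hypothetical stand-in for a function that would hit the OSV API.
slow_lookup <- function(pkg, ecosystem) {
  calls$n <- calls$n + 1
  paste(ecosystem, pkg, sep = "/")
}

cached_lookup <- memoise(slow_lookup)

cached_lookup("dask", "PyPI")  # runs slow_lookup()
cached_lookup("dask", "PyPI")  # served from the cache
calls$n                        # 1

# Clearing the cache forces the next call to run the function again.
forget(cached_lookup)
cached_lookup("dask", "PyPI")
calls$n                        # 2
```

In {rosv} the equivalent reset is the clear_osv_cache() function mentioned above; `forget()` is shown here only because the sketch memoises its own toy function.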
When using a product such as {miniCRAN} or Posit Package Manager, there may be corporate requirements to limit which packages users can install. Although a whitelist is often recommended, it should either specify the exact versions that are approved or exclude packages with known vulnerabilities. Given the sheer number of packages and versions, this is often difficult. The following method takes a vector of packages (from PyPI) and cross-references it against the OSV database. Packages that are identified are either dropped entirely, or only the specific versions with flagged vulnerabilities are excluded.
# List of packages of interest
python_pkg <- c('dask', 'tensorflow', 'keras')
# Create the xref whitelist
pypi_vul <- create_osv_list(as.data.frame = TRUE)
xref_pkg_list <- create_ppm_xref_whitelist(python_pkg, pypi_vul)
# Output requirements.txt which can be used with PPM product
writeLines(xref_pkg_list, file.path(tempdir(), 'requirements.txt'))
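The cross-referencing step can be illustrated with a rough base R sketch. This is not the implementation of create_ppm_xref_whitelist(); the function `build_requirements()`, the column names `name` and `version`, and every value in the toy table are assumptions invented for illustration (they are not real OSV records). An `NA` version is used here to mean "no specific version listed, so drop the package entirely".

```r
# Toy vulnerability table (made-up data, assumed column names).
vul <- data.frame(
  name    = c("dask", "dask", "keras"),
  version = c("2021.9.0", "2021.9.1", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build pip-style requirement lines from a package
# vector and a vulnerability table.
build_requirements <- function(pkgs, vul) {
  out <- character(0)
  for (pkg in pkgs) {
    hits <- vul[vul$name == pkg, , drop = FALSE]
    if (nrow(hits) == 0) {
      # No known vulnerabilities: allow the package as-is.
      out <- c(out, pkg)
    } else if (anyNA(hits$version)) {
      # Flagged without specific versions: drop the package entirely.
      next
    } else {
      # Exclude only the flagged versions, e.g. "pkg!=1.0,!=1.1".
      exclusions <- paste0("!=", hits$version, collapse = ",")
      out <- c(out, paste0(pkg, exclusions))
    }
  }
  out
}

reqs <- build_requirements(c("dask", "tensorflow", "keras"), vul)
print(reqs)
# "dask!=2021.9.0,!=2021.9.1" "tensorflow"
```

The resulting character vector has one requirement per line once written out with writeLines(), which matches the shape of a requirements.txt file.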