Getting started with the robis package

This package is an R client for the OBIS (Ocean Biodiversity Information System) API. It includes functions for data access, as well as a few helper functions for visualizing occurrence data and extracting nested MeasurementOrFact or DNADerivedData records.

First, load the required packages:

library(robis)
library(dplyr)
library(ggplot2)

Occurrence data

The occurrence() function provides access to raw occurrence data. For example, to fetch all occurrences by scientific name:

occ <- occurrence("Abra aequalis")
occ
ggplot(occ) +
  geom_bar(aes(date_year), stat = "count", width = 1)

Alternatively, occurrences can be fetched by AphiaID (the WoRMS taxon identifier) using the taxonid parameter:

occurrence(taxonid = 293683)

Other parameters include geometry, which accepts polygons in WKT format:

occurrence("Abra alba", geometry = "POLYGON ((2.59689 51.16772, 2.62436 51.14059, 2.76066 51.19225, 2.73216 51.20946, 2.59689 51.16772))")

WKT strings can be created by drawing on a map using the get_geometry() function.
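
For simple rectangular areas, the WKT string can also be assembled in code. The following is a minimal sketch using base R only; bbox_to_wkt() is a hypothetical helper written for this example and is not part of robis, and the coordinates are arbitrary illustration values.

```r
# Build a WKT polygon string from a bounding box (decimal degrees).
# bbox_to_wkt() is a hypothetical helper, not a robis function.
bbox_to_wkt <- function(xmin, ymin, xmax, ymax) {
  stopifnot(xmin < xmax, ymin < ymax)
  # WKT uses "x y" (longitude latitude) order, and the ring must
  # end on the same vertex it started with.
  sprintf(
    "POLYGON ((%s %s, %s %s, %s %s, %s %s, %s %s))",
    xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax, xmin, ymin
  )
}

wkt <- bbox_to_wkt(2.5, 51.1, 2.8, 51.3)
wkt
```

The resulting string can then be passed as the geometry argument, for example occurrence("Abra alba", geometry = wkt).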

A convenience function map_leaflet() is provided to visualize occurrences on an interactive map:

map_leaflet(occurrence("Abra sibogai"))

Checklists

The checklist() function returns all taxa observed for a given set of filters.

cl <- checklist("Semelidae")
cl
ggplot(cl %>% filter(!is.na(genus))) +
  geom_bar(aes(genus)) +
  coord_flip() +
  ylab("species count")

Just like the occurrence() function, checklist() accepts WKT geometries:

checklist(geometry = "POLYGON ((2.59689 51.16772, 2.62436 51.14059, 2.76066 51.19225, 2.73216 51.20946, 2.59689 51.16772))")

MeasurementOrFact records

The package also provides access to MeasurementOrFact records associated with occurrences. When calling occurrence(), MeasurementOrFact records can be included by setting mof = TRUE.

occ <- occurrence("Abra tenuis", mof = TRUE)

MeasurementOrFact records are nested within the occurrence data frame, but the unnest_extension() function extracts them into a flat data frame. Use the fields parameter to specify which occurrence fields should be carried over into the measurements table.

mof <- unnest_extension(occ, extension = "MeasurementOrFact", fields = c("scientificName", "decimalLongitude", "decimalLatitude"))
mof
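
Once the measurements are in a flat table, a quick overview of which measurement types are present can be useful. The sketch below assumes the unnested table carries the Darwin Core columns measurementType and measurementUnit; count_measurements() is a hypothetical helper written for this example, not a robis function.

```r
library(dplyr)

# Count measurement records per type and unit, most frequent first.
# Assumes columns measurementType and measurementUnit exist in the input.
count_measurements <- function(mof) {
  mof %>%
    count(measurementType, measurementUnit, sort = TRUE)
}
```

With the table created above, this would be called as count_measurements(mof).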

Note that the MeasurementOrFact fields can be used as parameters to the occurrence() function. For example, to only get occurrences with associated biomass measurements:

occurrence("Abra tenuis", mof = TRUE, measurementtype = "biomass") %>%
  unnest_extension(extension = "MeasurementOrFact")

DNADerivedData records

Just like MeasurementOrFact records, nested DNADerivedData records can be extracted from the occurrence results.

occ <- occurrence("Prymnesiophyceae", datasetid = "62b97724-da17-4ca7-9b26-b2a22aeaab51", dna = TRUE)
occ
dna <- unnest_extension(occ, extension = "DNADerivedData", fields = c("scientificName"))

dna %>%
  select(scientificName, target_gene, DNA_sequence)