# Getting Started with NEONiso

Rich Fiorella; updated: February 26, 2021

library(NEONiso)

# NEONiso quickstart: an example workflow

The goal of this document is to help NEONiso users up and running with a local copy of calibrated NEON atmospheric isotope data. Currently, NEONiso has mature functions for handling and processing atmospheric CO2 isotope data, and functions to work with water isotope data are earlier in their development. Therefore, this document will focus on the use of the functions for working with atmospheric CO2 data.

This document will cover three important steps of working with NEON isotope data: 1) retrieving data through the NEON API, 2) calibrating the data, and 3) running a set of diagnostic scripts to inspect the data.

## NEON Data Retrieval

Atmospheric isotope data products from NEON are included as part of the eddy covariance bundled data product (ID DP4.00200.001). NEONiso includes a function for downloading and managing a local archive of eddy covariance bundles: manage_local_EC_archive. For example, the following command will download all of the data for the ONAQ site:

manage_local_EC_archive(file_dir = "~/Desktop", get = TRUE, unzip_files = TRUE, sites = "ONAQ")

The sites argument can also be a vector of multiple sites you’d like to request, or “all” to pull data from all of the sites (note this will be >100 GB and growing as of February 2021). The EC bundles are gzipped on the remote server, and this function will also unzip these files after they’ve been downloaded.

## Data calibration

Two methods are available to calibrate NEON Carbon isotope data and they take slightly different approaches: a) the ‘Bowling_2003’ method calibrates 12CO2 and 13CO2 mole fractions independently, while b) the ‘linreg’ method calibrates d13C and CO2 directly without converting to isotopologue mole fractions. The method is specified as an argument to calibrate_carbon_bymonth(). Both methods yield very similar results, but the error and precision estimates are slightly better from the calibrate_carbon_Bowling2003() function (Fiorella et al., 2021; JGR-Biogeosciences) [https://doi.org/10.1029/2020JG005862].

The simplest way to interact with these functions is to place it inside of a for loop (or an apply statement), and call them for each file you would like to calibrate:

for (i in 1:length(fnames.out)) {
calibrate_carbon_bymonth(fnames[i],fnames.out[i],site=site.code[i], method = "Bowling_2003")
}

Two different methods to calibrate are available: a) a ‘gain and offset’ regression using 12CO2 and 13CO2 mole fractions (method = ‘Bowling_2003’, after Bowling et al. 2003) and b) a ‘direct’ linear regression of CO2 and d13C values to generate calibration transfer functions (method = ‘linreg’). The function ‘calibrate_carbon_bymonth()’ assumes you have vectors of: 1) input file names (fnames), 2) output file names (fnames.out), and 3) 4 letter NEON site codes corresponding to each entry in input file names (fnames). An example of code to generate these three vectors is provided below:

data.dir <- '/your/path/here/DP4_00200_001/'

fnames <- list.files(path = data.dir,
pattern = '.h5',
recursive = TRUE,
full.names = TRUE)

# unselect gz files.
fnames <- fnames[!grepl('.gz',fnames)]

fname.byfolder <- strsplit(fnames, split=".", fixed = TRUE)
site.code  <- sapply(fname.byfolder,'[[',3)

# inspect site.code in the environment: is it a vector that with repeated "ONAQ"?

fnames.tmp <- gsub(".h5",".calibrated.h5",fnames)
fnames.spt <- strsplit(fnames.tmp, split = "/")
fnames.out <- sapply(fnames.spt, '[[', 7)

# create new output directory
outpaths   <- paste0(your_path,'/ONAQ/output/')
sapply(unique(outpaths),dir.create,showWarnings=FALSE) # apply function used here to generalize in case you wanted to run all sites

# update fnames.out to include desired output paths.
fnames.out <- paste0(outpaths,"/",fnames.out)