Introduction

The USgas package provides an overview of the demand for natural gas in the US in a time-series format. It includes the following dataset:

* usgas - The monthly consumption of natural gas in the US by end use, at the US aggregate level since 1973 and at the state level since 1989. It includes the following end-use categories:
  - Commercial Consumption
  - Delivered to Consumers
  - Electric Power Consumption
  - Industrial Consumption
  - Lease and Plant Fuel Consumption
  - Pipeline Fuel Consumption
  - Residential Consumption
  - Vehicle Fuel Consumption

The package also includes datasets from the previous release. The us_total, us_monthly, and us_residential datasets can be derived from the usgas dataset. Therefore, they are in the process of deprecation and will be removed in the next release to CRAN.
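As a rough illustration of how such series might be derived from usgas, the sketch below extracts the monthly residential series and sums the U.S. monthly values into annual totals by end use. It is not guaranteed to reproduce the legacy datasets exactly; the object names residential, us_only, and us_annual are hypothetical and exist only for this example.

```r
library(USgas)

data("usgas")

# Monthly residential consumption for all states and the U.S. aggregate
residential <- usgas[which(usgas$process == "Residential Consumption"), ]

# Annual U.S. totals by end use: sum the monthly values within each year.
# Note that an incomplete year yields a partial sum.
us_only <- usgas[which(usgas$state == "U.S."), ]
us_only$year <- as.integer(format(us_only$date, "%Y"))
us_annual <- aggregate(y ~ year + process, data = us_only, FUN = sum)

head(us_annual)
```

The formula interface of aggregate() drops rows with missing values by default, so months without a reported value do not contribute to the annual sums.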

Data source: The US Energy Information Administration API

The usgas dataset

The usgas dataset provides 313 time series of natural gas consumption by end use in the US (aggregate and state level). As the structure below shows, it includes five fields: date, process, state, state_abb, and y:

library(USgas)

data("usgas")

str(usgas)
#> 'data.frame':    92783 obs. of  5 variables:
#>  $ date     : Date, format: "1973-01-01" "1973-01-01" ...
#>  $ process  : chr  "Commercial Consumption" "Residential Consumption" "Commercial Consumption" "Residential Consumption" ...
#>  $ state    : chr  "U.S." "U.S." "U.S." "U.S." ...
#>  $ state_abb: chr  "U.S." "U.S." "U.S." "U.S." ...
#>  $ y        : int  392315 843900 394281 747331 310799 648504 231943 465867 174258 326313 ...
#>  - attr(*, "units")= chr "MMCF"
#>  - attr(*, "product_name")= chr "Natural Gas"
#>  - attr(*, "source")= chr "EIA API: https://www.eia.gov/opendata/browser/natural-gas"

head(usgas)
#>         date                 process state state_abb      y
#> 1 1973-01-01  Commercial Consumption  U.S.      U.S. 392315
#> 2 1973-01-01 Residential Consumption  U.S.      U.S. 843900
#> 3 1973-02-01  Commercial Consumption  U.S.      U.S. 394281
#> 4 1973-02-01 Residential Consumption  U.S.      U.S. 747331
#> 5 1973-03-01  Commercial Consumption  U.S.      U.S. 310799
#> 6 1973-03-01 Residential Consumption  U.S.      U.S. 648504
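The metadata listed in the str() output is stored as attributes on the data.frame and can be read with attr(). The sketch below also counts the distinct process-by-state combinations, which is one plausible way to count the individual series; it assumes each series is identified by those two columns.

```r
library(USgas)

data("usgas")

# Metadata stored as attributes of the data.frame
attr(usgas, "units")
attr(usgas, "product_name")
attr(usgas, "source")

# Number of distinct series, assuming a series is one process within one state
nrow(unique(usgas[, c("process", "state")]))

# Overall date range covered by the dataset
range(usgas$date)
```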

The dataset covers both the state level and the US aggregate level (labeled U.S.):

unique(usgas$state)
#>  [1] "U.S."                 "Oregon"               "Virginia"            
#>  [4] "Rhode Island"         "Arizona"              "Washington"          
#>  [7] "South Dakota"         "New Jersey"           "Florida"             
#> [10] "Alabama"              "Louisiana"            "Illinois"            
#> [13] "Colorado"             "New Hampshire"        "Maine"               
#> [16] "Iowa"                 "Alaska"               "California"          
#> [19] "Michigan"             "West Virginia"        "North Dakota"        
#> [22] "Utah"                 "Pennsylvania"         "Missouri"            
#> [25] "Montana"              "Texas"                "Idaho"               
#> [28] "Delaware"             "South Carolina"       "New Mexico"          
#> [31] "Massachusetts"        "Georgia"              "Arkansas"            
#> [34] "New York"             "Nebraska"             "Tennessee"           
#> [37] "Indiana"              "District of Columbia" "Minnesota"           
#> [40] "Wisconsin"            "Vermont"              "Hawaii"              
#> [43] "Wyoming"              "Maryland"             "Kansas"              
#> [46] "Ohio"                 "Mississippi"          "Nevada"              
#> [49] "North Carolina"       "Oklahoma"             "Kentucky"            
#> [52] "Connecticut"
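To see how the state names map to their abbreviations, and to check the starting dates mentioned in the introduction, something along these lines could be used. The object name state_map is hypothetical.

```r
library(USgas)

data("usgas")

# One row per state name and its abbreviation
state_map <- unique(usgas[, c("state", "state_abb")])
head(state_map)

# Earliest observation at the U.S. aggregate level and at the state level
min(usgas$date[usgas$state == "U.S."])
min(usgas$date[usgas$state != "U.S."])
```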

In the example below, we will subset and plot the consumption by end-use in the US:

us_agg <- usgas[which(usgas$state == "U.S."),]

head(us_agg)
#>         date                 process state state_abb      y
#> 1 1973-01-01  Commercial Consumption  U.S.      U.S. 392315
#> 2 1973-01-01 Residential Consumption  U.S.      U.S. 843900
#> 3 1973-02-01  Commercial Consumption  U.S.      U.S. 394281
#> 4 1973-02-01 Residential Consumption  U.S.      U.S. 747331
#> 5 1973-03-01  Commercial Consumption  U.S.      U.S. 310799
#> 6 1973-03-01 Residential Consumption  U.S.      U.S. 648504

Let’s now use plotly to plot those series:

library(plotly)

plot_ly(data = us_agg,
        x = ~ date,
        y = ~ y,
        color = ~ process,
        type = "scatter",
        mode = "lines") |>
  layout(title = "US Monthly Consumption of Natural Gas by End Use",
         yaxis = list(title = "MMCF"),
         xaxis = list(title = "Source: EIA Website"),
         legend = list(x = 0,
                       y = 0.95))

Similarly, we can subset a couple of states and visualize them. For example, let’s visualize the residential consumption in the New England states. We will start by subsetting the New England states:

ne <- c("Connecticut", "Maine", "Massachusetts",
        "New Hampshire", "Rhode Island", "Vermont")
ne_gas <- usgas[which(usgas$state %in% ne), ]

head(ne_gas)
#>           date                process         state state_abb    y
#> 495 1989-01-01 Commercial Consumption  Rhode Island        RI 1032
#> 505 1989-01-01 Commercial Consumption New Hampshire        NH  842
#> 506 1989-01-01 Commercial Consumption         Maine        ME  229
#> 523 1989-01-01 Commercial Consumption Massachusetts        MA 7394
#> 533 1989-01-01 Commercial Consumption       Vermont        VT  315
#> 544 1989-01-01 Commercial Consumption   Connecticut        CT 3909

Next, let’s use the process column to extract the residential consumption and plot it:

ne_gas[which(ne_gas$process == "Residential Consumption"),] |>
  plot_ly(x = ~ date,
          y = ~ y,
          color = ~ state,
          type = "scatter",
          mode = "lines") |>
  layout(title = "Monthly Residential Consumption of Natural Gas in New England",
         yaxis = list(title = "MMCF"),
         xaxis = list(title = "Source: EIA Website"),
         legend = list(x = 0,
                       y = 1))
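If a wide layout (one column per state) is more convenient, for example for side-by-side comparison, the residential series can be reshaped with base R's reshape() and ordered by date. This is a minimal sketch; the object names ne_res and ne_wide are hypothetical, and it assumes each date and state pair appears only once in the residential subset.

```r
library(USgas)

data("usgas")

ne <- c("Connecticut", "Maine", "Massachusetts",
        "New Hampshire", "Rhode Island", "Vermont")

# Keep only the residential series for the New England states
ne_res <- usgas[which(usgas$state %in% ne &
                        usgas$process == "Residential Consumption"),
                c("date", "state", "y")]

# Long to wide: one row per date, one column per state
ne_wide <- reshape(ne_res,
                   idvar = "date",
                   timevar = "state",
                   direction = "wide")

# Reorder by date and strip the "y." prefix that reshape() adds
ne_wide <- ne_wide[order(ne_wide$date), ]
names(ne_wide) <- sub("^y\\.", "", names(ne_wide))

head(ne_wide)
```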