tidytransit makes it easy to work with transit data by simplifying General Transit Feed Specification (GTFS) data, the standard format for storing transit data, into tidyverse- and sf-friendly dataframes. Use it to map existing stops and routes, calculate transit frequencies, and validate transit feeds.
tidytransit is a fork of gtfsr that is published to CRAN. It adds frequency calculation functions and drops the GTFS-specific interactive cartography features.
This package requires a working installation of sf.
# Once sf is installed, you can install from CRAN with:
install.packages('tidytransit')
# For the development version from GitHub:
# install.packages("devtools")
devtools::install_github("r-transit/tidytransit")
For some users, sf is impractical to install due to system-level dependencies. For these users, trread may work better. It has more limited functionality, but it can read GTFS tables into R.
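If you are not sure which case applies to you, a quick check along these lines may help. This is only a sketch using base R's `requireNamespace()`; the variable name `has_sf` and the messages are illustrative and not part of either package.

```r
# requireNamespace() returns TRUE if the package can be loaded and FALSE
# otherwise, without attaching it to the search path.
has_sf <- requireNamespace("sf", quietly = TRUE)

if (has_sf) {
  message("sf is available, so tidytransit's sf requirement is met.")
} else {
  message("sf is not available. Install its system dependencies first, or consider trread.")
}
```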
This example uses NYC MTA subway schedule data to identify the routes with the shortest median headways. It reads a copy of the feed that ships with the package; the same data can also be pulled directly from the MTA's GTFS URL.
library(tidytransit)
library(dplyr)
# Read in GTFS feed
# Here we use a feed included in the package. You can also read directly from
# the New York City Metropolitan Transportation Authority (MTA) with:
# nyc <- import_gtfs("http://web.mta.info/developers/data/nyct/subway/google_transit.zip")
local_gtfs_path <- system.file("extdata",
                               "google_transit_nyc_subway.zip",
                               package = "tidytransit")
nyc <- import_gtfs(local_gtfs_path,
                   local = TRUE)
## Unzipped the following files to directory '/var/folders/hk/12t7fl6d08s7r494zd9sb8v40000gn/T//RtmpT1HZG2'...
## [1] "agency.txt"
## [2] "calendar_dates.txt"
## [3] "calendar.txt"
## [4] "libloc_185_302b971d5ddbcf93.rds"
## [5] "libloc_213_722ad5f9e07c7fe1.rds"
## [6] "repos_https%3A%2F%2Fcran.rstudio.com%2Fbin%2Fmacosx%2Fel-capitan%2Fcontrib%2F3.5.rds"
## [7] "routes.txt"
## [8] "rs-graphics-911183dd-d5d7-4595-af33-cbd4bab415df"
## [9] "shapes.txt"
## [10] "stop_times.txt"
## [11] "stops.txt"
## [12] "transfers.txt"
## [13] "trips.txt"
## Reading agency_df
## Reading calendar_dates_df
## Reading calendar_df
## Reading routes_df
## Reading shapes_df
## Reading stop_times_df
## Reading stops_df
## Reading transfers_df
## Reading trips_df
## ...done.
## Testing data structure...
## ...passed. Valid GTFS object.
## Converting stops to simple features
## Converting routes to simple features
# Get route frequencies
nyc_route_freqs <- nyc %>%
  get_route_frequency()
# Find routes with shortest median headways
nyc_fastest_routes <- nyc_route_freqs %>%
  filter(median_headways < 25) %>%
  arrange(median_headways)
knitr::kable(head(nyc_fastest_routes))
route_id | median_headways | mean_headways | st_dev_headways | stop_count |
---|---|---|---|---|
GS | 4 | 4 | 0.01 | 4 |
L | 4 | 4 | 0.13 | 48 |
1 | 5 | 5 | 0.14 | 76 |
7 | 5 | 5 | 0.29 | 44 |
6 | 6 | 7 | 2.84 | 76 |
E | 6 | 23 | 53.01 | 48 |
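The E route in the table has a median headway of 6 but a mean of 23 and a standard deviation of 53.01, which suggests a few very long gaps between trains. The base R sketch below illustrates why the median is the more robust summary in that situation. The departure times are made up, the unit (minutes) is an assumption, and this is not necessarily how `get_route_frequency()` computes its columns internally.

```r
# Hypothetical departures at one stop, in minutes after midnight (assumed unit).
# Trains arrive every 5 minutes, except for one 120-minute service gap.
departures <- c(360, 365, 370, 375, 380, 500, 505, 510)

# A headway is the time between consecutive departures.
headways <- diff(departures)

median_headway <- median(headways)  # barely affected by the single long gap
mean_headway   <- mean(headways)    # pulled upward by the single long gap
sd_headway     <- sd(headways)      # large, reflecting the irregular service

print(c(median = median_headway, mean = mean_headway, sd = sd_headway))
```

Here the median stays at the typical 5-minute interval while the mean rises to roughly 21, the same pattern as the E route row.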
You can also identify shortest headways by stop.
nyc_stop_freqs <- nyc %>%
  get_stop_frequency(by_route = FALSE) %>%
  inner_join(nyc$stops_df, by = "stop_id") %>%
  select(stop_name, direction_id, stop_id, headway) %>%
  arrange(headway)
head(nyc_stop_freqs)
## # A tibble: 6 x 4
## # Groups: direction_id, stop_id [6]
## stop_name direction_id stop_id headway
## <chr> <int> <chr> <dbl>
## 1 Times Sq - 42 St 0 902N 3.60
## 2 Grand Central - 42 St 1 901S 3.60
## 3 Times Sq - 42 St 1 902S 3.60
## 4 Grand Central - 42 St 0 901N 3.61
## 5 Mets - Willets Point 0 702N 3.72
## 6 Junction Blvd 0 707N 3.72
Perhaps you want to map subway routes and color-code each route by how often trains come.
When you call import_gtfs, tidytransit attempts to append a simple features data frame for stops and routes to the list of GTFS dataframes.
You can join these simple features dataframes to the frequency calculation dataframes above and then plot them.
routes_sf_frequencies <- nyc$routes_sf %>%
  inner_join(nyc_fastest_routes, by = "route_id") %>%
  select(median_headways,
         mean_headways,
         st_dev_headways,
         stop_count)
plot(routes_sf_frequencies)
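To see how this kind of join behaves without loading a feed, here is a minimal sketch with toy data. The objects `routes_sf` and `freqs` are hypothetical stand-ins for `nyc$routes_sf` and the frequency table, and the route ids and geometries are invented. It assumes sf and dplyr are installed.

```r
library(sf)
library(dplyr)

# Hypothetical route geometries: two straight lines in WGS84.
routes_sf <- st_sf(
  route_id = c("A", "B"),
  geometry = st_sfc(
    st_linestring(rbind(c(0, 0), c(1, 1))),
    st_linestring(rbind(c(0, 1), c(1, 0))),
    crs = 4326
  ),
  stringsAsFactors = FALSE
)

# Hypothetical frequency table. Route "C" has no geometry and route "B"
# has no frequency, so an inner join keeps only route "A".
freqs <- data.frame(
  route_id = c("A", "C"),
  median_headways = c(4, 10),
  stringsAsFactors = FALSE
)

# Putting the sf object on the left keeps the geometry column, so the
# result can still be passed to plot().
joined <- inner_join(routes_sf, freqs, by = "route_id")
print(joined)
```

Because an inner join drops unmatched rows, any route filtered out of the frequency table is also missing from the map.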
Source: Wikimedia, user -stk.