datagovindia is a wrapper around the more than 80,000 APIs of the Government of India’s open data platform, data.gov.in. Here is a short guide to take you through the package. The functionality is centered on three aspects:
The APIs on the portal are scraped every week to update a list of all APIs and the information attached to them, such as sector, source, and field names. The website data.gov.in offers search through string queries and drop-down menus, but these are very limited. The functions in this package allow more robust string-based searches.
A user can search by API title, description, organization type, organization (ministry), sector, and source. There are two types of functions here: the first returns a list of all available unique organization types, organizations (ministries), sectors, and sources; the second lets one “search” by these criteria and more.
Here is a demonstration of the former (showing only the first few values):
```r
library(datagovindia)
library(dplyr)  # provides the %>% pipe

### List of organizations (or ministries)
get_list_of_organizations() %>%
  head()
#> [1] "Ministry of Environment and Forests"
#> [2] "Central Pollution Control Board"
#> [3] "Ministry of Home Affairs"
#> [4] "Department of Home"
#> [5] "Registrar General and Census Commissioner, India"
#> [6] "Ministry of Agriculture and Farmers Welfare"
```
```r
### List of sectors
get_list_of_sectors() %>%
  head()
#> [1] "Industrial Air Pollution" "Census and Surveys"
#> [3] "Census"                   "Statistics"
#> [5] "Agriculture"              "Agricultural Marketing"
```
Once you have an idea of what you want to look for, search queries can be constructed using titles and descriptions as well as the categories explored earlier. Each search returns a data.frame describing the APIs that match the search keywords. Because every result has the same data.frame structure, the results of several search functions can be combined with each other.
index_name | title | description | org_type | org | sector | source | created_date | updated_date |
---|---|---|---|---|---|---|---|---|
583f10fa-a19e-4a08-85f1-69dcf64438f4 | Details of Number of industries inspected and Directions issued under Section 5 of Environment (Protection) Act, 1986 by Central Pollution Control Board (CPCB) since 2016-17 till 14.06.2019 (From: Ministry of Environment, Forest and Climate Change) | Details of Number of industries inspected and Directions issued under Section 5 of Environment (Protection) Act, 1986 by Central Pollution Control Board (CPCB) since 2016-17 till 14.06.2019 (From: Ministry of Environment, Forest and Climate Change) | Central | Rajya Sabha | All | data.gov.in | 2021-03-04T06:52:31Z | 2021-03-12T17:56:27Z |
b8e4ff80-ec3c-439c-aebb-f27eabe410b3 | State/UT-wise Number of Complying and Non-Complying Locations w.r.t. Heavy Metals According Central Pollution Control Board (CPCB) during 2017 (From : Ministry of Environment, Forest and Climate Change) | State/UT-wise Number of Complying and Non-Complying Locations w.r.t. Heavy Metals According Central Pollution Control Board (CPCB) during 2017 (From : Ministry of Environment, Forest and Climate Change) | Central | Rajya Sabha | All | data.gov.in | 2021-03-04T06:37:26Z | 2021-03-04T06:37:26Z |
```r
## Multiple Criteria
dplyr::intersect(
  search_api_by_title(title_contains = "pollution"),
  search_api_by_organization(organization_name_contains = "pollution")
)
```
index_name | title | description | org_type | org | sector | source | created_date | updated_date |
---|---|---|---|---|---|---|---|---|
0579cf1f-7e3b-4b15-b29a-87cf7b7c7a08 | Details of Comprehensive Environmental Pollution Index (CEPI) Scores and Status of Moratorium in Critically Polluted Areas (CPAs) in India | NA | Central | Ministry of Environment and Forests\|Central Pollution Control Board | Industrial Air Pollution\|Water Quality\|Natural Resources\|Environment and Forest | data.gov.in | 2017-06-08T16:36:24Z | 2018-11-30T02:35:16Z |
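
To make the combining step concrete, here is a minimal sketch on a small mock data.frame. The `apis` object and its rows are invented for illustration and are not real portal output; the two `grepl()` filters only stand in for two separate searches and say nothing about how the package matches strings internally. The one assumption that matters is that both results share identical columns, which is what `dplyr::intersect()` requires.

```r
library(dplyr)

# Mock catalogue: hypothetical rows standing in for the package's API list.
apis <- data.frame(
  index_name = c("id-1", "id-2", "id-3"),
  title      = c("Air pollution index", "Census tables", "Water pollution levels"),
  org        = c("Central Pollution Control Board", "Ministry of Home Affairs", "Rajya Sabha"),
  stringsAsFactors = FALSE
)

# Two independent, case-insensitive searches over the same catalogue.
by_title <- dplyr::filter(apis, grepl("pollution", title, ignore.case = TRUE))
by_org   <- dplyr::filter(apis, grepl("pollution", org, ignore.case = TRUE))

# Keep only the rows that appear in both results,
# i.e. the APIs that satisfy both criteria.
both <- dplyr::intersect(by_title, by_org)
both$index_name
```

In this mock data, two rows match on title and one on organization, and only the row matching both survives the intersection. Other set operations, such as `dplyr::union()`, could be used in the same way when either criterion is enough.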
Once you have found the right API for your use, take note of the “index_name” of that API. For example, “0579cf1f-7e3b-4b15-b29a-87cf7b7c7a08” corresponds to the API for “Details of Comprehensive Environmental Pollution Index (CEPI) Scores and Status of Moratorium in Critically Polluted Areas (CPAs) in India”. The index_name is essential both for learning more about the API and for getting data from it.
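
Rather than copying the identifier by hand, it can be pulled out of a search result programmatically. The sketch below uses a mock `results` data.frame with only two of the columns a search returns; the second row and the variable names `results` and `api_index` are made up for illustration.

```r
library(dplyr)

# Mock search result: the first row reuses the index_name and title quoted
# in the text; the second row is hypothetical.
results <- data.frame(
  index_name = c("0579cf1f-7e3b-4b15-b29a-87cf7b7c7a08", "mock-id-2"),
  title = c(
    paste0(
      "Details of Comprehensive Environmental Pollution Index (CEPI) Scores ",
      "and Status of Moratorium in Critically Polluted Areas (CPAs) in India"
    ),
    "A made-up second title"
  ),
  stringsAsFactors = FALSE
)

# Narrow the result to the API of interest, then extract its identifier
# as a plain character vector.
api_index <- results %>%
  filter(grepl("CEPI", title, fixed = TRUE)) %>%
  pull(index_name)

# Guard against zero or ambiguous matches before reusing the identifier.
stopifnot(length(api_index) == 1)
api_index
```

Checking that exactly one row matched keeps a loose search term from silently selecting the wrong API.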