Map new concepts

library(ontologics)
library(dplyr, warn.conflicts = FALSE)
library(knitr) # provides kable(), used below to print tables

When an ontology already exists, it can be used in several ways: by looking up which concepts it contains and how they relate to each other, by adding new concepts and linking them to existing ones, or by extending the harmonised concepts based on external ontologies.

Relate an external dataset to harmonized concepts

When adding (or mapping) new concepts, we first have to define the source of the new concepts.

# already existing ontology for some project about crops
crops <- load_ontology(path = system.file("extdata", "crops.rds", package = "ontologics"))

# where we have to set the external dataset as new source
crops <- new_source(name = "externalDataset",
                    version = "0.0.1",
                    description = "a vocabulary",
                    homepage = "https://www.something.net",
                    license = "CC-BY-4.0",
                    ontology = crops)

# new concepts that occur in the external dataset, which should be harmonised with the ontology
externalConcepts <- c("Wheat", "NUTS", "Avocado")

The new concepts come from different conceptual levels: ‘Wheat’ and ‘Avocado’ are crops themselves, while ‘NUTS’ is an aggregate of various crops (such as walnut, hazelnut, etc.). Let’s first find out whether these concepts are in fact new, that is, whether they are missing from the ontology.

missingConcepts <- get_concept(label = externalConcepts, ontology = crops)
missingConcepts %>% 
  select(1:5) %>% 
  kable()

| id | label | description | class | has_broader |
|:---|:---|:---|:---|:---|
| .02.09 | Wheat | NA | class | .02 |
| .10 | NUTS | NA | group | NA |

This tells us that ‘NUTS’ and ‘Wheat’ are not missing from the ontology. However, ‘Wheat’ is a class rather than a crop, and ‘NUTS’ is not a crop either (it is a group) and has no broader concept. Avocado is missing from the ontology entirely, and Wheat is missing as a crop, so both have to be defined as new concepts.
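
The reasoning above can be sketched in base R without the package. This is only an illustration: the `harmonised` data frame is a hand-written stand-in for the two rows returned above, and the names `notFound` and `wrongLevel` are hypothetical, not part of `ontologics`.

```r
# Toy stand-in for the lookup result shown above (not the real ontology object).
harmonised <- data.frame(
  id    = c(".02.09", ".10"),
  label = c("Wheat", "NUTS"),
  class = c("class", "group"),
  stringsAsFactors = FALSE
)

externalConcepts <- c("Wheat", "NUTS", "Avocado")

# Labels without an exact (case-sensitive) match among the harmonised labels.
notFound <- setdiff(externalConcepts, harmonised$label)

# Labels that do match, but at a level other than "crop".
matched <- harmonised[harmonised$label %in% externalConcepts, ]
wrongLevel <- matched$label[matched$class != "crop"]

print(notFound)
print(wrongLevel)
```

Concepts in either vector need a harmonised counterpart of class crop before they can be treated as crops, which is why both wheat and avocado are created in the next step.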

By studying the ontology (not shown here), we can identify the semantic relation between the new concepts and some of the already harmonised concepts. External concepts can only be mapped to concepts that already exist in the harmonised ontology, so the harmonised counterparts have to be created first. For these new harmonised concepts we choose lower-case labels to distinguish them from the external concepts, and we assign them as narrower concepts to the respective existing (broader) harmonised concepts.

broaderConcepts <- get_concept(label = c("Wheat", "Tropical and subtropical Fruit"), 
                               ontology = crops)

crops <- new_concept(new = c("wheat", "avocado"),
                     broader = broaderConcepts,
                     class = "crop",
                     ontology = crops)

Eventually, all new (external) concepts can be mapped to harmonised concepts. This also applies to ‘NUTS’, even though a concept with that label already exists, because the existing concept ‘NUTS’ is not necessarily the same as the external concept ‘NUTS’; this depends on the respective descriptions of the harmonised and external concepts. When setting a new mapping, the type and the certainty of the match have to be defined. As we have defined the closely related concepts as new concepts, we can choose a close match for all external concepts. Certainty is set to 3, which means the match is clear and assigned according to a given definition.
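
As a rough mental model, a mapping pairs each external label with one harmonised label and records a match type and a certainty. The base R sketch below illustrates that idea only; `build_mapping()` is a hypothetical helper, and the set of match types is an assumption borrowed from the SKOS mapping vocabulary, not a confirmed list of what `new_mapping()` accepts.

```r
externalConcepts <- c("Wheat", "NUTS", "Avocado")
targets <- c("wheat", "NUTS", "avocado")

# Assumption: match types follow the SKOS mapping properties.
validMatches <- c("close", "exact", "broad", "narrow")

# Hypothetical helper: pairs external and harmonised labels by position.
build_mapping <- function(new, target, match, certainty) {
  stopifnot(length(new) == length(target))
  # Recycle a single match type or certainty to all pairs.
  match <- rep_len(match, length(new))
  certainty <- rep_len(certainty, length(new))
  if (!all(match %in% validMatches)) {
    stop("unknown match type")
  }
  data.frame(external = new, harmonised = target,
             match = match, certainty = certainty,
             stringsAsFactors = FALSE)
}

mapping <- build_mapping(externalConcepts, targets,
                         match = "close", certainty = 3)
print(mapping)
```

Because the pairing is positional in this sketch, the order of the external labels has to agree with the order of the targets.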

toMap <- get_concept(label = c("wheat", "NUTS", "avocado"),
                     ontology = crops)

crops <- new_mapping(new = externalConcepts,
                     target = toMap,
                     match = c("close", "close", "close"),
                     source = "externalDataset",
                     certainty = 3,
                     ontology = crops)

Extend the classes

We may also want to (or have to) add new concepts for which no class is defined yet. This can happen when some of a set of new concepts, the rest of which do have a valid class, have to be nested into concepts that are already at the lowest possible hierarchical level. When such concepts are specified with class = NA, class = c(bla, blubb, NA) or class = NULL, they are assigned the class undefined and you are informed about how to proceed.
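
The fallback described above can be pictured with a small base R sketch. It is not the package's implementation: `resolve_classes()` is a hypothetical helper that only mimics the described behaviour, replacing missing classes with "undefined" and raising a warning.

```r
# Hypothetical helper mimicking the described fallback (not ontologics code).
resolve_classes <- function(labels, class = NULL) {
  # Treat class = NULL like a missing class for every concept.
  if (is.null(class)) class <- NA_character_
  class <- rep_len(as.character(class), length(labels))
  missing <- is.na(class)
  if (any(missing)) {
    warning("some new concepts (",
            paste(labels[missing], collapse = ", "),
            ") don't have a class")
    class[missing] <- "undefined"
  }
  data.frame(label = labels, class = class, stringsAsFactors = FALSE)
}

# All classes missing, and a mix of known and missing classes.
res <- suppressWarnings(resolve_classes(c("wheat1", "wheat2"), class = NA))
mixed <- suppressWarnings(
  resolve_classes(c("a", "b", "c"), class = c("crop", "crop", NA))
)
print(res)
print(mixed)
```

Concepts that end up as undefined are the ones to revisit once the missing class has been created.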

broaderConcepts <- get_concept(label = c("wheat", "wheat"),
                               ontology = crops)

# for (some of) these concepts we do not know the class ...
crops <- new_concept(new = c("wheat1", "wheat2"),
                     broader = broaderConcepts,
                     class = NA_character_,
                     ontology = crops)
#> Warning: some new concepts (wheat1, wheat2) don't have a class; please define
#> this with 'new_class()' and re-run 'new_concept()' with these concepts and the
#> new class.

make_tree(label = "Wheat", ontology = crops) %>%
  select(1:5) %>%
  kable()

| id | label | description | class | has_broader |
|:---|:---|:---|:---|:---|
| .02.09 | Wheat | NA | class | .02 |
| .02.09.01 | wheat | NA | crop | .02.09 |
| .02.09.01.01 | wheat1 | NA | undefined | .02.09.01 |
| .02.09.01.02 | wheat2 | NA | undefined | .02.09.01 |

# ... ok, then let's specify that class and re-run new_concept
crops <- new_class(new = "cultivar", target = "crop", 
                   description = "type of plant that people have bred for desired traits", 
                   ontology = crops)

crops <- new_concept(new = c("wheat1", "wheat2"),
                     broader = broaderConcepts,
                     class = "cultivar",
                     ontology = crops)

Now we can check whether the updated ontology is as we’d expect, for example by looking at the tree of the respective items again. We should expect that the new harmonized concepts now appear in the ontology and that they have some link to an external concept defined.

make_tree(label = "Wheat", ontology = crops) %>%
  select(1:5) %>%
  kable()

| id | label | description | class | has_broader |
|:---|:---|:---|:---|:---|
| .02.09 | Wheat | NA | class | .02 |
| .02.09.01 | wheat | NA | crop | .02.09 |
| .02.09.01.03 | wheat1 | NA | cultivar | .02.09.01 |
| .02.09.01.04 | wheat2 | NA | cultivar | .02.09.01 |

make_tree(label = "NUTS", ontology = crops) %>%
  select(1:5) %>%
  kable()

| id | label | description | class | has_broader |
|:---|:---|:---|:---|:---|
| .10 | NUTS | NA | group | NA |
| .10.01 | Treenuts | NA | class | .10 |
| .10.02 | Other nuts | NA | class | .10 |

make_tree(label = "FRUIT", ontology = crops) %>%
  select(1:5) %>%
  kable()

| id | label | description | class | has_broader |
|:---|:---|:---|:---|:---|
| .06 | FRUIT | NA | group | NA |
| .06.01 | Berries | NA | class | .06 |
| .06.02 | Citrus Fruit | NA | class | .06 |
| .06.03 | Grapes | NA | class | .06 |
| .06.04 | Pome Fruit | NA | class | .06 |
| .06.05 | Stone Fruit | NA | class | .06 |
| .06.06 | Tropical and subtropical Fruit | NA | class | .06 |
| .06.06.01 | avocado | NA | crop | .06.06 |
| .06.07 | Other fruit | NA | class | .06 |

# and finally a list of all external concepts
get_concept(external = TRUE, ontology = crops) %>% 
  kable()

| id | label | description | has_source | has_broader |
|:---|:---|:---|:---|:---|
| externalDataset_1 | Wheat | NA | 2 | .02.09 |
| externalDataset_2 | NUTS | NA | 2 | .06.06 |
| externalDataset_3 | Avocado | NA | 2 | NA |