match_01a
|
Detect genus sp. , genus ssp. and
genus spp.
|
first word (“genus”)
|
exact
|
APC accepted taxon concepts, other APC taxon concepts, APNI
|
genus
|
First goal is to align 2-word strings that indicate an unknown species
within a genus (or family)
|
match_01b
|
Detect genus sp. , genus ssp. and
genus spp.
|
first word (“genus”)
|
fuzzy
|
APC accepted taxon concepts
|
genus
|
NA
|
match_01c
|
Detect genus sp. , genus ssp. and
genus spp.
|
first word (“genus”)
|
fuzzy
|
other APC taxon concepts
|
genus
|
NA
|
match_02a
|
Detect family sp. , family ssp. and
family spp.
|
first word (“genus”)
|
exact
|
APC accepted taxon concepts
|
family
|
NA
|
match_03a
|
Detect -- , -- (intergrade taxa) and align to
genus
|
first word (“genus”)
|
exact
|
APC accepted taxon concepts, other APC taxon concepts, APNI
|
genus
|
Next find strings that indicate a name reflects an intergrade between
two taxa. These names can only be aligned to a genus.
|
match_03b
|
Detect -- , -- (intergrade taxa) and align to
genus
|
first word (“genus”)
|
fuzzy
|
APC accepted taxon concepts
|
genus
|
NA
|
match_03c
|
Detect -- , -- (intergrade taxa) and align to
genus
|
first word (“genus”)
|
fuzzy
|
other APC taxon concepts
|
genus
|
NA
|
match_03d
|
Detect -- , -- (intergrade taxa) and align to
genus
|
first word (“genus”)
|
fuzzy
|
APNI
|
genus
|
NA
|
match_03e
|
Detect -- , -- (intergrade taxa), but fail to
align to genus
|
NA
|
no match
|
NA
|
genus
|
NA
|
match_04a
|
Detect \ (indecision between taxa) and align to genus.
|
first word (“genus”)
|
exact
|
APC accepted taxon concepts, other APC taxon concepts, APNI
|
genus
|
Next find strings that indicate a name reflects a data collector’s
indecision about which of two (or more) taxa is the appropriate taxon.
These names can only be aligned to a genus.
|
match_04b
|
Detect \ (indecision between taxa) and align to genus.
|
first word (“genus”)
|
fuzzy
|
APC accepted taxon concepts
|
genus
|
NA
|
match_04c
|
Detect \ (indecision between taxa) and align to genus.
|
first word (“genus”)
|
fuzzy
|
other APC taxon concepts
|
genus
|
NA
|
match_04d
|
Detect \ (indecision between taxa) and align to genus.
|
first word (“genus”)
|
fuzzy
|
APNI
|
genus
|
NA
|
match_04e
|
Detect \ (indecision between taxa), but fail to align to
genus
|
NA
|
no match
|
NA
|
genus
|
NA
|
match_05a
|
Detect scientific names, including authorship
|
original_name
|
exact
|
APC accepted taxon concepts
|
species/infraspecific
|
Check if strings are full scientific names, including authorship.
|
match_05b
|
Detect scientific names, including authorship
|
original_name
|
exact
|
other APC taxon concepts
|
species/infraspecific
|
NA
|
match_06a
|
Detect canonical names, lacking authorship
|
cleaned_name
|
exact
|
APC accepted taxon concepts
|
species/infraspecific
|
Check if strings are taxon names, lacking authorship.
|
match_06b
|
Detect canonical names, lacking authorship
|
cleaned_name
|
exact
|
other APC taxon concepts
|
species/infraspecific
|
NA
|
match_07a
|
Detect canonical names, lacking authorship
|
stripped_name
|
fuzzy
|
APC accepted taxon concepts
|
species/infraspecific
|
NA
|
match_07b
|
Detect canonical names, lacking authorship
|
stripped_name
|
fuzzy
|
other APC taxon concepts
|
species/infraspecific
|
NA
|
match_08a
|
Detect canonical names, lacking authorship
|
cleaned_name
|
exact
|
APNI
|
species/infraspecific
|
NA
|
match_09a
|
Detect aff , affinis (affinity to) and align to
genus
|
first word (“genus”)
|
exact
|
APC accepted taxon concepts, other APC taxon concepts, APNI
|
genus
|
Find strings that indicate a name that indicates an affinity to a
specific taxon, but the name itself is not that taxon. Such names,
unless documented in APC (i.e. matches 6, 7 above) can only be aligned
to genus.
|
match_09b
|
Detect aff , affinis (affinity to) and align to
genus
|
first word (“genus”)
|
fuzzy
|
APC accepted taxon concepts
|
genus
|
NA
|
match_09c
|
Detect aff , affinis (affinity to) and align to
genus
|
first word (“genus”)
|
fuzzy
|
other APC taxon concepts
|
genus
|
NA
|
match_09d
|
Detect aff , affinis (affinity to) and align to
genus
|
first word (“genus”)
|
fuzzy
|
APNI
|
genus
|
NA
|
match_09e
|
Detect aff , affinis (affinity to), but fail to
align to genus
|
NA
|
no match
|
NA
|
genus
|
NA
|
match_10a
|
Detect canonical names, lacking authorship
|
stripped_name
|
imprecise fuzzy
|
APC accepted taxon concepts
|
species/infraspecific
|
Further checks if strings are taxon names, lacking authorship, now with
imprecise fuzzy matching
|
match_10b
|
Detect canonical names, lacking authorship
|
stripped_name
|
imprecise fuzzy
|
other APC taxon concepts
|
species/infraspecific
|
NA
|
match_11a
|
Detect x (hybrid taxon) and align to genus
|
first word (“genus”)
|
exact
|
APC accepted taxon concepts, other APC taxon concepts, APNI
|
genus
|
Find strings that indicate a name that is a hybrid between two taxa.
Such names, unless documented in APC (i.e. matches 6, 7 above) can only
be aligned to genus.
|
match_11b
|
Detect x (hybrid taxon) and align to genus
|
first word (“genus”)
|
fuzzy
|
APC accepted taxon concepts
|
genus
|
NA
|
match_11c
|
Detect x (hybrid taxon) and align to genus
|
first word (“genus”)
|
fuzzy
|
other APC taxon concepts
|
genus
|
NA
|
match_11d
|
Detect x (hybrid taxon) and align to genus
|
first word (“genus”)
|
fuzzy
|
APNI
|
genus
|
NA
|
match_11e
|
Detect x (hybrid taxon), but fail to align to genus
|
NA
|
no match
|
NA
|
genus
|
NA
|
match_12a
|
Detect canonical names, by checking first three words in string
|
trinomial (from stripped_name_2)
|
exact
|
APC accepted taxon concepts
|
species/infraspecific
|
Check if the first three words in the name string match with a taxon
name, allowing notes to be discarded. Also useful for aligning phrase
names.
|
match_12b
|
Detect canonical names, by checking first three words in string
|
trinomial (from stripped_name_2)
|
exact
|
other APC taxon concepts
|
species/infraspecific
|
NA
|
match_13a
|
Detect canonical names, by checking first three words in string
|
trinomial (from stripped_name_2)
|
fuzzy
|
APC accepted taxon concepts
|
species/infraspecific
|
NA
|
match_13b
|
Detect canonical names, by checking first three words in string
|
trinomial (from stripped_name_2)
|
fuzzy
|
other APC taxon concepts
|
species/infraspecific
|
NA
|
match_14a
|
Detect canonical names, by checking first two words in string
|
binomial (from stripped_name_2)
|
exact
|
APC accepted taxon concepts
|
species/infraspecific
|
Check if the first two words in the name string match with a taxon name,
allowing notes and invalid infraspecific names to be discarded. Also
useful for aligning phrase names.
|
match_14b
|
Detect canonical names, by checking first two words in string
|
binomial (from stripped_name_2)
|
exact
|
other APC taxon concepts
|
species/infraspecific
|
NA
|
match_15a
|
Detect canonical names, by checking first two words in string
|
binomial (from stripped_name_2)
|
fuzzy
|
APC accepted taxon concepts
|
species/infraspecific
|
NA
|
match_15b
|
Detect canonical names, by checking first two words in string
|
binomial (from stripped_name_2)
|
fuzzy
|
other APC taxon concepts
|
species/infraspecific
|
NA
|
match_16a
|
Detect canonical names, lacking authorship
|
stripped_name
|
fuzzy
|
APNI
|
species/infraspecific
|
Further checks if strings are APNI taxon names, lacking authorship, now
with fuzzy matching or considering just the first three or two words in
the string.
|
match_17a
|
Detect canonical names, lacking authorship
|
stripped_name
|
imprecise fuzzy
|
APNI
|
species/infraspecific
|
NA
|
match_18a
|
Detect canonical names, by checking first three words in string
|
trinomial (from stripped_name_2)
|
exact
|
APNI
|
species/infraspecific
|
NA
|
match_19a
|
Detect canonical names, by checking first two words in string
|
binomial (from stripped_name_2)
|
exact
|
APNI
|
species/infraspecific
|
NA
|
match_20a
|
Detect an APC-accepted genus, by checking the first word in the string
|
first word (“genus”)
|
exact
|
APC accepted taxon concepts
|
genus
|
Check if the first two word in the name string match with a taxon name,
allowing an alignment to the genus-level or family-level
|
match_20b
|
Detect an APC-known genus, by checking the first word in the string
|
first word (“genus”)
|
exact
|
other APC taxon concepts
|
genus
|
NA
|
match_20c
|
Detect an APC-known genus, by checking the first word in the string
|
first word (“genus”)
|
exact
|
APNI
|
genus
|
NA
|
match_21a
|
Detect an APC-known family, by checking the first word in the string
|
first word (“genus”)
|
exact
|
APC accepted taxon concepts
|
family
|
NA
|
match_22a
|
Detect an APC-accepted genus, by checking the first word in the string
|
first word (“genus”)
|
fuzzy
|
APC accepted taxon concepts
|
genus
|
NA
|
match_22b
|
Detect an APC-known genus, by checking the first word in the string
|
first word (“genus”)
|
fuzzy
|
other APC taxon concepts
|
genus
|
NA
|