translate(expression = "", snames = "")
compute(expression = "", data, separate = FALSE)
findRows(expression = "", ttobj, remainders = FALSE)
These functions interpret an expression written in SOP (sum of products) form, for both crisp and multivalue QCA. The function translate() converts the expression into a standard (canonical) SOP form using a matrix of implicants, while compute() uses translate() to calculate the scores of the expression on a particular dataset.
The function findRows() takes a QCA expression written in SOP form and applies it to a truth table, to find all rows that match the pattern in the expression.
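The row matching described above can be sketched in base R. This is an illustration only, not the code of findRows(): the condition names and their order are assumptions, as is the convention that truth table row i corresponds to the binary number i - 1.

```r
# Hypothetical condition names, assumed to be in truth table column order
conditions <- c("DEV", "URB", "LIT", "IND", "STB")

# Build all 2^5 configurations so that the last condition varies fastest;
# row i then corresponds to the binary number i - 1 (an assumed convention)
grid <- expand.grid(rep(list(0:1), length(conditions)))
grid <- grid[, rev(seq_along(conditions))]
names(grid) <- conditions

# Pattern for DEV*ind*STB: 1 = presence, 0 = absence, -1 = not in the term
pattern <- c(DEV = 1, URB = -1, LIT = -1, IND = 0, STB = 1)

# A row matches when it agrees with the pattern on every constrained condition
constrained <- names(pattern)[pattern >= 0]
matches <- apply(grid[, constrained, drop = FALSE], 1,
                 function(row) all(row == pattern[constrained]))
rows <- unname(which(matches))
print(rows)
```

Under these assumptions the sketch selects rows 18, 22, 26 and 30. Unlike findRows(), it works on the full configuration space and knows nothing about observed cases or remainders.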
For crisp sets notation, upper case letters denote the presence of a causal condition, and lower case letters denote its absence. The tilde is recognized as a negation, even in combination with upper or lower case letters.
Functions similar to translate() and compute() have initially been written by Jirka Lewandowski (2015) but the actual code in these functions has been completely re-written to integrate it with the package QCAGUI, and expanded with more extensive functionality (see details and examples below).
A SOP ("sum of products") is also known as a DNF ("disjunctive normal form"), or in other words a "union of intersections", for example A*D + B*c.
The same expression can be written in multivalue notation: A{1}*D{1} + B{1}*C{0}.
Both types of expressions are valid, and yield the same result on the same dataset.
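A minimal base R sketch of why the two notations agree, assuming the usual set operators: minimum for intersection, maximum for union, and 1 - x for negation. The data frame is hypothetical and the code does not call compute().

```r
# Hypothetical crisp data for four causal conditions
dat <- data.frame(A = c(1, 0, 1, 0),
                  B = c(0, 1, 1, 0),
                  C = c(1, 0, 1, 0),
                  D = c(1, 1, 0, 0))

# Crisp notation: A*D + B*c (lower case c is the absence of C)
crisp <- pmax(pmin(dat$A, dat$D), pmin(dat$B, 1 - dat$C))

# Multivalue notation: A{1}*D{1} + B{1}*C{0}
multi <- pmax(pmin(as.numeric(dat$A == 1), as.numeric(dat$D == 1)),
              pmin(as.numeric(dat$B == 1), as.numeric(dat$C == 0)))

# Both notations give the same scores on the same data
print(identical(crisp, multi))
```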
For multivalue notation, expressions can contain multiple values to translate, separated by commas. If B was a multivalue causal condition, an expression could be: A{1} + B{1,2}*C{0}.
In this example, all values in B equal to either 1 or 2 will be translated to 1, and the rest of the (multi)values will be translated to 0.
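The recoding can be sketched in base R as follows. The vectors are hypothetical, and minimum and maximum are assumed for intersection and union; this is not the code used by compute().

```r
# Hypothetical data: B is a multivalue condition with values 0 to 3
A <- c(0, 0, 1, 0, 1)
B <- c(0, 1, 2, 3, 1)
C <- c(0, 0, 1, 1, 1)

# B{1,2}: values 1 and 2 become 1, every other value becomes 0
B_recoded <- as.numeric(B %in% c(1, 2))

# A{1} + B{1,2}*C{0}
scores <- pmax(as.numeric(A == 1),
               pmin(B_recoded, as.numeric(C == 0)))
print(B_recoded)
print(scores)
```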
In multivalue notation, the names of the causal conditions are expected in upper case, and they are converted to upper case by default.
The function automatically detects the use of tilde "~" as a negation for a particular causal condition. ~A does two things: it identifies the presence of causal condition A (because it was specified as upper case) and it recognizes that it must be negated, because of the tilde. It works even combined with lower case names: ~a, which is interpreted as A.
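The interplay between letter case and the tilde can be illustrated with a small helper. The function interpret_literal() is hypothetical and only covers a single crisp literal; it is a sketch of the rule described above, not part of the package.

```r
# Hypothetical helper: interpret one crisp literal such as "A", "a", "~A", "~a"
interpret_literal <- function(literal) {
  negated <- startsWith(literal, "~")
  name <- sub("^~", "", literal)
  # Upper case means presence (1), lower case means absence (0)
  value <- as.integer(name == toupper(name))
  # A tilde flips the value
  if (negated) value <- 1L - value
  list(condition = toupper(name), value = value)
}

values <- vapply(c("A", "a", "~A", "~a"),
                 function(x) interpret_literal(x)$value,
                 integer(1))
print(values)
```

In this sketch ~a ends up with the value 1 for condition A, which is the double negation described in the text.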
For multivalue notation, a pseudo-standard is applied. For a binary causal condition, A{0} is the negation of A, and ~A{0} can be interpreted as the presence of A. Starting from these two agreed statements, when multiple values are supplied, the pseudo-standard interprets anything that contains a value of 0 as the absence of causal condition: A{0,2} will be translated as 0, and upon recoding in the real data, values 0 and 2 will be recoded to 0 and the rest of the values to 1.
Similarly, negations work with multivalue conditions: ~A{1,2} is interpreted as "all values except 1 and 2 should be translated as 0", whereas ~A{0,2} will be translated as 1, and all other values except 0 and 2 will be recoded to 0.
The use of the product operator * is redundant when the names of the causal conditions are single letters (for example AD + Bc), and is also redundant for multivalue data, where product terms can be separated by using the curly brackets notation.
When causal conditions are binary and their names have multiple letters (for example AA + CC*bb), the use of the product operator * is preferable, but the function manages to translate an expression even without it (AA + CCbb) by searching deep in the space of the conditions' names, at the cost of slowing down for a high number of causal conditions. For this reason, an arbitrary limit of 7 causal conditions is imposed on the expression.
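One way to picture this search is as a backtracking split of a product term into known condition names. The function segmentations() below is a hypothetical sketch of that idea, not the algorithm used by translate(); it ignores letter case and only reports the possible splits.

```r
# Hypothetical helper: list every way to split a term into known names
segmentations <- function(term, snames) {
  if (nchar(term) == 0) return(list(character(0)))
  result <- list()
  for (name in snames) {
    if (startsWith(toupper(term), name)) {
      # Try to split what is left after removing this name
      rest <- substring(term, nchar(name) + 1)
      for (remainder in segmentations(rest, snames)) {
        result[[length(result) + 1]] <- c(name, remainder)
      }
    }
  }
  result
}

# SU overlaps with SUB, but only one split covers the whole term
one_way <- segmentations("SUBER", c("SU", "BER", "SUB", "SET"))

# Two valid splits (SUP + ER and SU + PER): the names clash
two_ways <- segmentations("SUPER", c("SUP", "ER", "SU", "PER", "SUB", "SET"))

print(one_way)
print(length(two_ways))
```

The number of candidate splits grows quickly with the number of names, which is consistent with the slowdown mentioned above.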
0 | absence of a causal condition |
1 | presence of a causal condition |
-1 | causal condition was eliminated |
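As an illustration of these three codes, the sketch below builds such a matrix for a crisp expression that uses the product sign explicitly. The function sketch_translate() is hypothetical and far simpler than translate(): it assumes single tildes, explicit * signs, distinct product terms, and condition names supplied as a character vector.

```r
# Hypothetical, simplified translator for crisp expressions with explicit "*"
sketch_translate <- function(expression, snames) {
  terms <- trimws(strsplit(expression, "+", fixed = TRUE)[[1]])
  # Start with -1 everywhere: the condition is not part of the term
  mat <- matrix(-1L, nrow = length(terms), ncol = length(snames),
                dimnames = list(terms, snames))
  for (term in terms) {
    for (literal in strsplit(term, "*", fixed = TRUE)[[1]]) {
      negated <- startsWith(literal, "~")
      name <- sub("^~", "", literal)
      # Upper case is presence (1), lower case is absence (0)
      value <- as.integer(name == toupper(name))
      if (negated) value <- 1L - value
      mat[term, toupper(name)] <- value
    }
  }
  mat
}

m <- sketch_translate("A + b*C", c("A", "B", "C"))
print(m)
```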
Jirka Lewandowski (2015) QCAtools: Helper functions for QCA in R. R package version 0.1
translate("A + B*C")
    A B C
A   1
B*C   1 1

# same thing in multivalue notation
translate("A{1} + B{1}*C{1}")
         A B C
A{1}     1
B{1}C{1}   1 1

# using upper/lower letters
translate("A + b*C")
    A B C
A   1
b*C   0 1

# the negation with tilde is recognised
translate("~A + b*C")
    A B C
~A  0
b*C   0 1

# even in combination of upper/lower letters
translate("~A + ~b*C")
     A B C
~A   0
~b*C   1 1

# and even for multivalue variables
translate("~A{1} + ~B{0}*C{1}")
          A B C
~A{1}     0
~B{0}C{1}   1 1

# in multivalue notation, the product sign * is redundant
translate("C{1} + T{2} + T{1}V{0} + C{0}")
         C T V
C{1}     1
T{2}       1
T{1}V{0}   1 0
C{0}     0

# multiple values can be specified
translate("C{1} + T{1,2} + T{1}V{0} + C{0}")
         C T V
C{1}     1
T{1,2}     1
T{1}V{0}   1 0
C{0}     0

# or even negated
translate("C{1} + ~T{1,2} + T{1}V{0} + C{0}")
         C T V
C{1}     1
~T{1,2}    0
T{1}V{0}   1 0
C{0}     0

# if the expression does not contain the product sign *
# snames are required to complete the translation
translate("AB + cD", "A, B, C, D")
   A B C D
AB 1 1
cD     0 1

# snames are not required
translate("PER*FECT + str*ing")
         FECT ING PER STR
PER*FECT 1        1
str*ing       0       0

# snames are required
translate("PERFECT + string", "PER, FECT, STR, ING")
        PER FECT STR ING
PERFECT 1   1
string           0   0

# it works even with overlapping columns
# SU overlaps with SUB in SUBER, but the result is still correct
translate("SUBER + subset", "SU, BER, SUB, SET")
       SU BER SUB SET
SUBER  1  1
subset        0   0

# error because combinations of condition names clash (not run)
translate("SUPER + subset", "SUP, ER, SU, PER, SUB, SET")

# to print _all_ codes from the standard output matrix
(obj <- translate("A + b*C"))
    A B C
A   1
b*C   0 1

print(obj, original = TRUE) # also prints the -1 code
     A  B  C
A    1 -1 -1
b*C -1  0  1

# for compute()
data(CVF)
compute("natpride + GEOCON", data = CVF)
 [1] 0.95 0.35 0.35 0.78 0.40 0.78 0.78 0.78 0.78 0.17 0.78 0.35 0.95 0.95 0.71
[16] 0.95 0.78 0.35 0.95 0.49 0.95 0.95 0.95 0.95 0.95 0.95 0.95 0.95 0.95

data(CVF)
compute("natpride + GEOCON", data = CVF, separate = TRUE)
   natpride GEOCON
1      0.23   0.95
2      0.12   0.35
3      0.09   0.35
4      0.20   0.78
5      0.40   0.35
6      0.34   0.78
7      0.04   0.78
8      0.13   0.78
9      0.30   0.78
10     0.17   0.05
11     0.46   0.78
12     0.07   0.35
13     0.14   0.95
14     0.25   0.95
15     0.71   0.35
16     0.75   0.95
17     0.13   0.78
18     0.05   0.35
19     0.40   0.95
20     0.49   0.35
21     0.38   0.95
22     0.62   0.95
23     0.14   0.95
24     0.25   0.95
25     0.12   0.95
26     0.94   0.95
27     0.66   0.95
28     0.57   0.95
29     0.59   0.95

# for findRows()
data(LC)
ttLC <- truthTable(LC, "SURV")
findRows("DEV*ind*STB", ttLC)
[1] 18 22 26 30

findRows("DEV*ind*STB", ttLC, remainders = TRUE)
[1] 18 26 30