Old 1) How to use Iota1

Florian Berding and Julia Pargmann

2024-01-25

Important Note

Please note that Iota1 is outdated. Please use the new version, Iota2. Central definitions of Iota have changed in the new version. Please refer to the Get started vignette.

This vignette will not be updated to reflect future developments.

1 Introduction

Reliability is a central characteristic of any assessment instrument and describes the extent to which the instrument produces error-free data (Schreier, 2012). In terms of content analysis, Krippendorff (2019) suggests replicability as a fundamental reliability concept, which is also referred to as inter-coder reliability. It describes the degree to which “a process can be reproduced by different analysts, working under varying conditions, at different locations, or using different but functionally equivalent measuring instruments” (Krippendorff, 2019).

The package iotarelr provides an environment for estimating the degree of inter-coder reliability based on the Iota Reliability Concept developed by Berding et al. (2022). The concept provides one of the first measures for characterizing the degree of reliability for a complete scale and for every single category. Most of the older measures are limited to information on the complete scale.

The suggested concept is applicable to any kind of content analysis that uses a coding scheme with nominal or ordinal data, regardless of the kind of coders (human or artificial intelligence), the number of coders, and the number of categories. The following introduction shows how to use Iota1.

2 Example for using iotarelr in practice

2.1 Estimating the values

At the beginning, data generated by at least two coders is needed. Let us assume that four coders analyzed 20 textual fragments with a coding scheme consisting of three categories A, B, and C.

library(iotarelr)
coder_1<-c("A","A","C","C","B","A","A","B","C","A","B","C","A","B","A","B","C","A","A","C")
coder_2<-c("A","A","C","B","B","A","A","B","C","A","A","C","A","C","A","B","A","A","A","C")
coder_3<-c("A","A","C","C","B","A","A","B","C","A","B","C","A","B","A","B","C","A","A","C")
coder_4<-c("A","C","C","C","B","A","A","B","C","A","B","A","A","B","A","C","B","A","A","C")

# Combine the codings into a matrix: rows = coding units, columns = coders
coded_data<-cbind(coder_1,coder_2,coder_3,coder_4)

In this example, the codings are saved as characters. The package also supports categories stored as integers. The only important aspect is that the rows must contain the coding units (e.g., the textual fragments) and the columns must represent the different coders. In the next step, iota and its elements are estimated via the function compute_iota1().

results<-compute_iota1(data=coded_data)
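Since integer categories are supported as well, the same analysis can be run on a recoded matrix. A minimal sketch; the mapping A→1, B→2, C→3 and the recoding via match() are only for illustration, and results_int is a hypothetical name:

# Recode the character matrix to integers (A -> 1, B -> 2, C -> 3)
coded_data_int<-matrix(match(coded_data,c("A","B","C")),ncol=ncol(coded_data))
results_int<-compute_iota1(data=coded_data_int)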

2.2 The alpha values

results$alpha
#>              A        B         C
#> [1,] 0.5647321 0.140625 0.2020089

The component $alpha stores the values for the chance-corrected alpha reliabilities. These values describe the extent to which the true category of a coding unit is recovered. They range from 0 to 1. A value of 0 indicates the absence of reliability; that is, the assignment of the true category equals a random selection between the categories. A value of 1 indicates that the true category is always recovered.

In the current example, the chance-corrected alpha value for category A is relatively high. This means that, with a high probability, the coding scheme leads coders to assign category A to a coding unit if the true category of that unit is A. In contrast, the values for the other two categories are very low. This indicates that a coding unit whose true category is B or C is often assigned to one of the other categories. That is, the coding scheme does not ensure that coders discover the true category if it is B or C.
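Single values can be picked out of the component by category name, assuming the columns are named as in the printed output above:

results$alpha[1,"A"]
#> [1] 0.5647321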

2.3 The beta values

To provide more detailed insight into the coding scheme, the beta values account for the errors that occur when the true category is not discovered. That is, the beta values describe the extent to which a category is influenced by errors made in other categories. For example, if the true category of a coding unit is A and a coder does not assign A to that coding unit, the data for categories B and C are biased: they comprise a coding unit that should not be part of them.

The chance-corrected values for the beta reliabilities are stored in $beta and range between 0 and 1. A value of 0 indicates that the beta reliability equals the beta reliability in the case of complete guessing. A value of 1 indicates the absence of any beta errors.

results$beta
#>             A         B         C
#> [1,] 0.746875 0.6835938 0.6203125

In the current example, the chance-corrected beta reliabilities are relatively high. This means that the different categories are not strongly influenced by errors in the other categories.
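To inspect both error types side by side, the two components can be stacked into one table; a small sketch in base R:

# Stack the chance-corrected alpha and beta reliabilities for comparison
rbind(alpha=results$alpha,beta=results$beta)
#>               A         B         C
#> alpha 0.5647321 0.1406250 0.2020089
#> beta  0.7468750 0.6835938 0.6203125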

2.4 The iota values

Iota values summarize the different types of errors for each category by averaging the chance-corrected reliabilities. They are stored in $iota.

results$iota
#>              A         B         C
#> [1,] 0.6558036 0.4121094 0.4111607

Iota ranges between 0 and 1. A value of 0 indicates that the quality of the codings of a category equals random guessing; codings of this category are not reliable. A value of 1 indicates perfect reliability of a category. That is, the true category of a coding unit is recovered if the true category is the category under investigation, and errors made when coding units of other categories do not influence the data of the category under investigation. In the current example, category A is quite reliable while the other categories are not.
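Since iota averages the two chance-corrected reliabilities, the reported values can be reproduced directly from the components shown above:

# Iota is the mean of the chance-corrected alpha and beta reliabilities
(results$alpha+results$beta)/2
#>              A         B         C
#> [1,] 0.6558036 0.4121094 0.4111607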

2.5 Assignment-Error Matrix (AEM)

The assignment-error matrix combines the alpha and beta values and provides the most detailed description of a coding scheme. It is based on the raw estimates without any chance-correction.

results$assignment_error_matrix
#>           A         B         C
#> A 0.4285714 0.4545455 0.5454545
#> B 0.4000000 0.8461538 0.6000000
#> C 0.4444444 0.5555556 0.7857143

The AEM has to be read row by row: the rows represent the true category of a coding unit, and the columns represent the assigned categories. The values on the diagonal represent the alpha-errors of the categories, that is, the probability of not assigning the true category to a coding unit. The other cells describe the probability of choosing a specific category under the condition that the true category is not recovered.

In the example, the alpha-error of category A is about .43, meaning that a coding unit of category A is coded as another category in about 43% of the cases. In other words, the probability of assigning the true category to a coding unit of category A is about 57%. The second and third cells in the first row thus mean: if the true category of a coding unit belonging to category A is not recovered, about 45% of the codings are assigned to category B and about 55% to category C. Thus, category C suffers more from errors made with coding units truly belonging to category A than category B does.

The alpha-error of category B is about .85. Thus, in about 85% of the cases, coding units truly belonging to category B are assigned to another category. In other words, the probability of recovering the true category is about 15% if the true category of the coding unit is B. When this error occurs, 40% of the cases are assigned to category A and 60% to category C. Thus, the data of category C suffers more from errors made on coding units truly belonging to category B than the data of category A does.

The alpha-error of category C is about .79. Thus, in about 79% of the cases, a coding unit truly belonging to category C is assigned to category A or B. When this error occurs, 44% of the coding units are treated as category A and 56% as category B. In consequence, category B suffers more from errors made with coding units of category C than category A does. In other words, the data for category B is more strongly biased by errors on category C than the data for category A.
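Both readings can be made explicit with two quick checks on the matrix; a small sketch in base R (aem is a hypothetical helper variable):

aem<-results$assignment_error_matrix
# Diagonal: alpha-errors, i.e., the probability of missing the true category
diag(aem)
#>         A         B         C
#> 0.4285714 0.8461538 0.7857143
# Off-diagonal cells of each row are conditional probabilities and sum to 1
rowSums(aem)-diag(aem)
#> A B C
#> 1 1 1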

2.6 Scale level

The measures described above provide detailed insights into the reliability of every single category, which is a new feature for content analysis and very helpful for constructing a coding scheme or for evaluating the data of empirical studies. In many applications, however, the values have to be summarized into a single value representing the quality of the complete coding scheme. In the Iota Concept, this is done by averaging the iota values of all categories. The value is accessible via $average_iota.

results$average_iota
#> [1] 0.4930246

In the current example, average iota is about .49. At the moment, only a rule of thumb for ordinal data exists. According to Berding et al. (2022), an average iota of at least .474 is necessary for an acceptable level of reliability on the scale level. For a “good” reliability, average iota should be at least .601.
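Since average iota is the mean of the category-level iota values, the reported value and the rule-of-thumb check can be reproduced directly:

# Average iota is the mean of the category-level iota values
mean(results$iota)
#> [1] 0.4930246
# Check against the cut-offs suggested by Berding et al. (2022)
mean(results$iota)>=0.474
#> [1] TRUE
mean(results$iota)>=0.601
#> [1] FALSE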

References

Berding, F., Riebenbauer, E., Stütz, S., Jahncke, H., Slopinski, A., & Rebmann, K. (2022). Performance and configuration of artificial intelligence in educational settings: Introducing a new reliability concept based on content analysis. Frontiers in Education, 7.

Krippendorff, K. (2019). Content analysis: An introduction to its methodology (4th ed.). SAGE.

Schreier, M. (2012). Qualitative content analysis in practice. SAGE.