General

The aim of the package listArray is to create a data object that looks like an array but behaves like a list, so that elements can be stored and retrieved with array-style indices such as l[1,-2].

Additionally, any R object should be possible as an index, so that an expression such as l[pi] works as well.

The package hash does something similar, but its keys follow the naming convention for list elements and it does not allow array indices.

Note that neither speed nor memory efficiency is a priority of this implementation.

Generate a listArray from a vector, matrix or array

A listArray is a list (or environment), therefore list (or environment) operations can be applied:

l <- listArray(1)
length(l)
#> [1] 1
names(l)
#> [1] "58,0a,00,00,00,03,00,03,06,03,00,03,05,00,00,00,00,05,55,54,46,2d,38,00,00,00,13,00,00,00,01,00,00,00,0e,00,00,00,01,3f,f0,00,00,00,00,00,00"
l[[1]]
#> [1] 1
class(l)
#> [1] "listArray" "list"

Thus, every R object that is used in an index is translated to a unique name. The original indices can be obtained as strings by:

keys(l)
#> [1] "1"
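The long name printed by names(l) has the shape of comma-separated hex bytes of a serialized index. As a rough sketch in base R, assuming a name is built that way (sketch_key is a hypothetical helper, not the package's key function), any index can be turned into a single string as follows. The exact bytes depend on the R version, so no output is shown.

```r
# Hypothetical helper (not the package's key()): serialize the index list
# and join the resulting raw bytes as two-digit hex strings.
sketch_key <- function(...) {
  bytes <- serialize(list(...), connection = NULL)
  paste(as.character(bytes), collapse = ",")
}

k_pi      <- sketch_key(pi)
anotherpi <- pi
k_again   <- sketch_key(anotherpi)  # same value, hence the same name
k_pair    <- sketch_key(1, -2)      # several indices still give one name

identical(k_pi, k_again)
```

Because the string is derived from the value of the index and not from the variable holding it, two variables with the same content address the same element.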

Unnamed or named vectors

To create a listArray from a vector, use the listArray function.

# unnamed vectors
v <- 1:5
l <- listArray(v)
keys(l)
#> [1] "1" "2" "3" "4" "5"
#
l <- listArray(letters[1:5])
l[1]
#> [1] "a"
# named vector
v <- 1:5
names(v) <- letters[1:5]
l <- listArray(v)
l["a"]
#> [1] 1

Matrix

For matrices or arrays:

m <- matrix(1:9, 3, 3)
l <- listArray(m)
l[2,3]  # should be 8
#> [1] 8

Names or numbers as indices

Since l[1] and l["A"] address different elements of a listArray, you have to decide for named vectors, matrices or arrays whether the names or the numbers are used as indices. The default is to use names if available.

m <- matrix(1:4, 2, 2)
colnames(m) <- LETTERS[1:2]
l <- listArray(m)
keys(l)
#> [1] "1, \"A\"" "2, \"A\"" "1, \"B\"" "2, \"B\""

With use.names=FALSE you can force numerical indices to be used in all cases:

m <- matrix(1:4, 2, 2)
colnames(m) <- LETTERS[1:2]
l <- listArray(m, use.names=FALSE)
keys(l)
#> [1] "1, 1" "2, 1" "1, 2" "2, 2"
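To make the choice between names and numbers concrete, here is a minimal base-R sketch of how the index combinations of a matrix or array could be enumerated. The function index_grid and its layout are assumptions for illustration only; they are not taken from the package.

```r
# Hypothetical helper: one row per element, one column per dimension.
# A dimension uses its dimnames if use.names is TRUE and names exist,
# otherwise it uses the positions 1, 2, ...
index_grid <- function(x, use.names = TRUE) {
  dims <- dim(x)
  dn   <- dimnames(x)
  per_dim <- lapply(seq_along(dims), function(d) {
    if (use.names && !is.null(dn[[d]])) dn[[d]] else seq_len(dims[d])
  })
  # expand.grid varies the first dimension fastest, as R stores arrays
  expand.grid(per_dim, stringsAsFactors = FALSE)
}

m <- matrix(1:4, 2, 2)
colnames(m) <- LETTERS[1:2]
by_name   <- index_grid(m)                    # rows by number, columns by name
by_number <- index_grid(m, use.names = FALSE) # numbers everywhere
```

In this sketch the rows keep numerical indices because the matrix has no row names, which mirrors the mixed keys shown above.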

Ignore entries in a vector, matrix or array

Sometimes you may not want to store certain elements of a vector, matrix or array; just think of sparse objects.

m <- diag(3)
l <- listArray(m, ignore=0)
keys(l)
#> [1] "1, 1" "2, 2" "3, 3"

The parameter ignore can be either a set of values to exclude or a function that takes a vector and returns a logical vector, where TRUE means the value is excluded and FALSE means it is included.

nozeroes <- function(v) { v==0 }
#
m <- diag(3)
l <- listArray(m, ignore=nozeroes)
keys(l)
#> [1] "1, 1" "2, 2" "3, 3"
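The two forms of ignore can be handled by one small test. The following base-R sketch assumes a hypothetical helper is_ignored, which is not part of the package, and shows that both forms keep only the diagonal of diag(3).

```r
# Hypothetical helper: TRUE marks entries that would be excluded.
is_ignored <- function(v, ignore) {
  if (is.function(ignore)) ignore(v) else v %in% ignore
}

m <- diag(3)
v <- as.vector(m)

# ignore given as a set of values
keep_values <- arrayInd(which(!is_ignored(v, 0)), dim(m))
# ignore given as a function
keep_fun    <- arrayInd(which(!is_ignored(v, function(v) v == 0)), dim(m))

keep_values  # one row per kept element: (row, column)
```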

List or environment as base?

Instead of a list, an environment can be used as the base, which might be of interest for package developers.

e <- listArray(env=TRUE)
class(e)
#> [1] "listArray"   "environment"
e[1] <- "hello world"
e[1]
#> [1] "hello world"
ls(e)
#> [1] "58,0a,00,00,00,03,00,03,06,03,00,03,05,00,00,00,00,05,55,54,46,2d,38,00,00,00,13,00,00,00,01,00,00,00,0e,00,00,00,01,3f,f0,00,00,00,00,00,00"
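One practical difference between the two bases, stated here as general R behaviour and not as a claim about the package's motivation, is that a list is copied when it is modified inside a function, whereas an environment is modified in place. A minimal base-R sketch with a hypothetical helper put:

```r
# Hypothetical helper: store a value under a name in a list or environment.
put <- function(store, name, value) {
  store[[name]] <- value
  invisible(store)
}

lst <- list()
env <- new.env()

put(lst, "k", "hello world")  # modifies a local copy only
put(env, "k", "hello world")  # modifies the environment itself

length(lst)  # the caller's list is unchanged
ls(env)      # the environment now holds the entry
```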

Access a listArray object

Extract operators

You simply use the [ operator to access listArray elements.

l <- listArray()
l[0] <- 1
l[0]
#> [1] 1
l[pi] <- pi
l[pi]
#> [1] 3.141593
anotherpi <- pi
l[anotherpi]
#> [1] 3.141593
l[1,-2] <- 3
l[1,-2]
#> [1] 3

Difference between listArray and vector

A listArray treats every index as a single key, even if the index is a vector. The following works for vectors:

m <- 1:5
m[1:2]
#> [1] 1 2

But l[1:2] returns NULL since the index 1:2 does not exist.

l <- listArray(m)
l[1:2]
#> NULL
keys(l)
#> [1] "1" "2" "3" "4" "5"

Difference between listArray and matrix/array

Similarly, for matrices and arrays:

m <- matrix(1:4, 2, 2)
m[1,]
#> [1] 1 3
l <- listArray(m)
l[1,] # will even throw an error
#> Error in key(...): invalid index?

Normalization and the options listArray.XXX

To ensure that, for example, l[1:3] and l[c(1,2,3)] address the same element, as we would expect, some kind of normalization is needed. Since 1:3 and c(1,2,3) are different R objects, a normalization is done internally.

identical(1:2, c(1,2)) # delivers FALSE!
#> [1] FALSE
# but
m <- matrix(1:9, 3, 3)
m[1:2,2]
#> [1] 4 5
m[c(1,2),2]
#> [1] 4 5

There are two problems:

  1. 1:3 is of class integer whereas c(1,2,3) is of class numeric and
  2. 1:3 is a compact sequence in R whereas c(1,2,3) is a full vector

Therefore, normalization currently consists of

  1. converting everything of class integer in the indices to numeric and
  2. expanding compact sequences like 1:3 to full vectors, e.g. c(1,2,3).

The normalization steps can be switched off by setting the options listArray.expand and listArray.int2num to FALSE.

l <- listArray()
l[1:3] <- 1
l[c(1,2,3)]
#> [1] 1
options(listArray.expand=FALSE) # now 1:3 != c(1,2,3)
l <- listArray()
l[1:3] <- 1
l[c(1,2,3)]
#> NULL

By default, listArray.expand and listArray.int2num are not set, which is interpreted as listArray.expand=TRUE and listArray.int2num=TRUE.
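Treating an unset option as TRUE is a common base-R pattern. The following sketch shows one way to read such an option; option_on is a hypothetical helper, and whether the package reads its options exactly like this is an assumption.

```r
# Hypothetical helper: an option that is not set counts as TRUE.
option_on <- function(name) isTRUE(getOption(name, default = TRUE))

old <- options(listArray.expand = NULL)        # make sure the option is unset
unset_value <- option_on("listArray.expand")   # read as TRUE

options(listArray.expand = FALSE)
off_value <- option_on("listArray.expand")     # read as FALSE

options(old)                                   # restore the previous setting
```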

The workhorse function key

The main function to create a string from a set of R objects is key. A call l[...] is internally translated to l[[key(...)]]. Thus, you could also use key on its own with an ordinary list instead of a listArray object.
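The translation from l[...] to l[[key(...)]] can be imitated with two S3 methods. The sketch below uses a made-up class sketchArray and a simplified stand-in sketch_key without any normalization; both names are hypothetical and the code is not the package's implementation.

```r
# Simplified stand-in for key(): hex bytes of the serialized index list.
sketch_key <- function(...) {
  paste(as.character(serialize(list(...), connection = NULL)), collapse = ",")
}

# Constructor for an empty object of the hypothetical class.
sketchArray <- function() structure(list(), class = c("sketchArray", "list"))

# Reading: look up the element stored under the generated name.
`[.sketchArray` <- function(x, ...) {
  unclass(x)[[sketch_key(...)]]
}

# Writing: store the value under the generated name, keep the class.
`[<-.sketchArray` <- function(x, ..., value) {
  cls <- class(x)
  x <- unclass(x)
  x[[sketch_key(...)]] <- value
  structure(x, class = cls)
}

s <- sketchArray()
s[1, -2] <- 3
s[pi]    <- "pi"
s[1, -2]
```

An index that was never assigned yields NULL, because the underlying list has no element with that name.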

The normalization is done via

  1. rapply(l, expand, classes=c("numeric", "integer"), how="replace") with expand <- function(x) { unserialize(serialize(x, connection=NULL, version=2)) } and
  2. rapply(l, as.numeric, classes="integer", how="replace").
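The two rapply calls above can be tried out in base R. In the sketch below, expand is taken from the text, while normalize_index and its arguments do_expand and int2num are hypothetical names that merely mimic the two options.

```r
# From the text: a round trip through serialization format version 2
# turns a compact sequence such as 1:3 into an ordinary vector.
expand <- function(x) {
  unserialize(serialize(x, connection = NULL, version = 2))
}

# Hypothetical wrapper applying both normalization steps to an index list.
normalize_index <- function(index, do_expand = TRUE, int2num = TRUE) {
  if (do_expand) {
    index <- rapply(index, expand, classes = c("numeric", "integer"),
                    how = "replace")
  }
  if (int2num) {
    index <- rapply(index, as.numeric, classes = "integer", how = "replace")
  }
  index
}

a <- normalize_index(list(1:3))
b <- normalize_index(list(c(1, 2, 3)))

identical(a, b)
same_bytes <- identical(serialize(a, NULL), serialize(b, NULL))
```

After both steps the two indices are ordinary double vectors, so a name derived from their serialized bytes is the same for 1:3 and c(1,2,3).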

In the future, further normalization steps beyond the two above might become necessary.

Acknowledgments

Thanks to Henrik Bengtsson and Duncan Murdoch, who gave me hints on how to normalize a compact sequence in R without writing C++ code.