Strings

Markus Wamser

2018-01-13

A few examples of the string helper functions in Wmisc, compared with substr and other base R functions.

First, load and attach the package. The microbenchmark package, needed for the benchmarks, is attached later if it is installed.

library(Wmisc)

A sample string from http://slipsum.com/.

s <- "You think water moves fast? You should see ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."

Head and Tail

Demo

strHead(s)
## [1] "Y"
strHeadLower(s)
## [1] "y"
strTail(s)
## [1] "ou think water moves fast? You should see ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."

Comparison with built-in substr

substr(s,1,1)
## [1] "Y"
tolower(substr(s,1,1))
## [1] "y"
substring(s,2)
## [1] "ou think water moves fast? You should see ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."
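
The correspondence shown above can be written down as thin base R wrappers. This is only a sketch: the names `head_base`, `head_lower_base`, and `tail_base` are hypothetical and are not part of Wmisc, whose actual implementation may differ.

```r
# Hypothetical base R counterparts; not the Wmisc implementation.
head_base <- function(x) substr(x, 1, 1)                 # first character
head_lower_base <- function(x) tolower(substr(x, 1, 1))  # first character, lower-cased
tail_base <- function(x) substring(x, 2)                 # everything after the first character

x <- "You think water moves fast?"
head_base(x)
head_lower_base(x)
tail_base(x)

# Head and tail together reconstruct the input.
identical(paste0(head_base(x), tail_base(x)), x)
```

Splitting at the first character is lossless, so pasting the two parts back together gives the original string.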

Benchmark

Benchmark results vary considerably with the version of R used. From R 3.3 onward, the built-in functions should be preferred.

if (requireNamespace("microbenchmark", quietly = TRUE)) { # only benchmark if microbenchmark is installed
  library(microbenchmark)

  # Only the result of the last call is returned by the block and printed.
  microbenchmark(substr(s,1,1),strHead(s),times=100000)
  microbenchmark(tolower(substr(s,1,1)),strHeadLower(s),times=100000)
  microbenchmark(substring(s,2),strTail(s),times=100000)
} else {
  print("The benchmarks were skipped as package microbenchmark is not installed.")
}
## Warning: package 'microbenchmark' was built under R version 3.4.3
## Unit: microseconds
##             expr   min    lq     mean median    uq      max neval
##  substring(s, 2) 4.333 4.727 5.226814  5.121 5.514   90.584 1e+05
##       strTail(s) 5.120 5.908 7.284357  5.909 6.302 5777.985 1e+05
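
When microbenchmark is not installed, a rough comparison is still possible with `system.time` from base R. The following is a minimal sketch under that assumption: `time_reps` is a hypothetical helper, the string is a short stand-in, and the timings are far coarser than those of microbenchmark, so no results are shown.

```r
# Hypothetical helper: total elapsed seconds for `reps` calls of `f`.
time_reps <- function(f, reps = 10000L) {
  timing <- system.time(for (i in seq_len(reps)) f())
  timing[["elapsed"]]
}

x <- "You think water moves fast? You should see ice."
elapsed_substr <- time_reps(function() substr(x, 1, 1))
elapsed_substring <- time_reps(function() substring(x, 2))

# Named vector of total elapsed seconds per expression.
c(substr = elapsed_substr, substring = elapsed_substring)
```

The helper times a whole loop, not individual calls, because a single call is too fast for the resolution of `system.time`.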

Take and Drop

substr(s,1,42)
## [1] "You think water moves fast? You should see"
strTake(s,42)
## [1] "You think water moves fast? You should see"
substring(s,43)
## [1] " ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."
strDrop(s,42)
## [1] " ice. It moves like it has a mind. Like it knows it killed the world once and got a taste for murder. After the avalanche, it took us a week to climb out. Now, I don't know exactly when we turned on each other, but I know that seven of us survived the slide... and only five made it out. Now we took an oath, that I'm breaking now. We said we'd say it was the snow that killed the other two, but it wasn't. Nature is lethal but it doesn't hold a candle to man."
if (requireNamespace("microbenchmark", quietly = TRUE)) { # only benchmark if microbenchmark is installed
  # Only the result of the last call is returned by the block and printed.
  microbenchmark(substr(s,1,42),strTake(s,42),times=100000)
  microbenchmark(substring(s,43),strDrop(s,42),times=100000)
}
## Unit: microseconds
##              expr   min    lq     mean median    uq        max neval
##  substring(s, 43) 4.333 4.727 5.647717  5.121 5.515   4641.765 1e+05
##    strDrop(s, 42) 5.514 6.302 9.015308  6.303 6.696 116355.227 1e+05
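
The pairs compared in this section differ by an index shift: `strTake(s, 42)` matches `substr(s, 1, 42)`, while `strDrop(s, 42)` matches `substring(s, 43)`. A minimal base R sketch of that relationship follows. The names `take_base` and `drop_base` are hypothetical and are not part of Wmisc.

```r
# Hypothetical base R counterparts; not the Wmisc implementation.
take_base <- function(x, n) substr(x, 1, n)      # first n characters
drop_base <- function(x, n) substring(x, n + 1)  # everything after the first n characters

x <- "You think water moves fast? You should see ice."
n <- 9
take_base(x, n)
drop_base(x, n)

# Taking and dropping the same n characters partitions the input.
identical(paste0(take_base(x, n), drop_base(x, n)), x)
```

Dropping `n` characters starts the result at position `n + 1`, which is why the built-in call above uses 43 where the helper uses 42.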