{mosaicCalc} Quick Reference

Installing {mosaicCalc}

First, install R and RStudio according to the instructions given by your instructor. Opening R or RStudio will present you with an “R console.”

Then, you need to install the {mosaicCalc} software and a lot of related software. Do this from your R console:

install.packages(c("mosaic", "ggformula", "mosaicCalc"))

This will take some minutes, but the process is entirely automatic.

You only need to install the packages once. If you switch to another computer, however, you may need to install the packages on that computer.
Starting an R session

When you sit down to work with {mosaicCalc}, you need to have an “R session” started. This is typically done by clicking on the RStudio icon.

Alternatively, you may still have an R session open from your previous activities.

Before you can use {mosaicCalc}, you need to tell R to access the software. This is done with

library(mosaicCalc)

If you skip this step, you will see an error message like the following as soon as you use a {mosaicCalc} function:

Error in makeFun() : could not find function "makeFun"
To fix this, go back and run the library(mosaicCalc) command again.
Assignment: Giving names to values
Function application: Evaluating a function on an input
Writing mathematical formulas
Tilde expressions

A tilde expression is a special feature of R that lets you write down a mathematical expression without having the expression evaluated. We use tilde expressions to construct our own mathematical functions and for a handful of other, related purposes such as plotting.
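
R's own name for a tilde expression is a "formula". As a minimal sketch in base R (no {mosaicCalc} needed), the following shows that R stores the expression rather than evaluating it. The names a, b, x, and y are placeholders chosen for illustration only.

```r
# Writing a tilde expression does not evaluate it, so x, a, and b
# do not need to exist yet.
expr <- y ~ a*x + b

class(expr)      # "formula": R's name for a tilde expression
all.vars(expr)   # the names that appear in it: "y" "a" "x" "b"

# The right-hand side is stored as unevaluated code ...
rhs <- expr[[3]]
# ... which can be evaluated later, once values are supplied.
eval(rhs, list(a = 2, x = 3, b = 1))   # 2*3 + 1 = 7
```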

Constructing your own mathematical functions
Graphing functions with a single input
Drawing contour plots

In MOSAIC Calculus, we generally use a contour-plot format to display graphically a function with two inputs. This is done in much the same way as slice_plot().
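
As a sketch of the parallel with slice_plot(), the commands below draw a contour plot of a made-up function of two inputs and, for comparison, a slice plot of a function of one input. The function x*sin(y) and the bounds are arbitrary choices for illustration. The tilde expression names both inputs, joined by &, and bounds() gets one entry per input.

```r
library(mosaicCalc)

# A function of two inputs, drawn over the square -3 <= x, y <= 3
p2 <- contour_plot(x*sin(y) ~ x & y, bounds(x = c(-3, 3), y = c(-3, 3)))
p2

# For comparison: one input, one bounds() entry
p1 <- slice_plot(sin(y) ~ y, bounds(y = c(-3, 3)))
p1
```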

Evaluating mathematical functions with two inputs

When you use makeFun() to define a function of two inputs, the created function will take two arguments, one for each of the inputs. Those arguments are named, and it is a good idea to use the names explicitly when you evaluate the function.

h <- makeFun(x + 15*y - x*y ~ .)
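
A short sketch of evaluating h() with named arguments, assuming your installed {mosaicCalc} accepts the `~ .` shorthand used in the definition above. The input values 2 and 3 are arbitrary.

```r
library(mosaicCalc)

# Same definition as above: a function of two inputs, x and y
h <- makeFun(x + 15*y - x*y ~ .)

# Naming the arguments makes the call independent of argument order
h(x = 2, y = 3)   # 2 + 15*3 - 2*3 = 41
h(y = 3, x = 2)   # same inputs, same result
```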
Layering graphics

Sometimes you may want to compare two or more functions in the same graphics frame. You do this by drawing the individual functions in the usual way, but connecting the commands with a pipe, signified by the token %>%
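
For instance, the following sketch overlays two slice plots. The functions and the color are arbitrary choices for illustration. The first command sets the bounds, and the piped command adds a second curve to the same frame.

```r
library(mosaicCalc)

# First layer sets the bounds; the piped layer is drawn in the same frame
p <- slice_plot(sin(x) ~ x, bounds(x = 0:10)) %>%
  slice_plot(cos(x) ~ x, color = "magenta")
p
```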

Data frames

Essentially all of the data we use in this course is arranged as data frames. (This is also true in statistics and data science generally.) A data frame is a rectangular array. Each column is called a variable, each row is a case. For instance, in a data frame about people, each row might be an individual person. The different variables record different aspects of that person: height, age, sex, state of residence, etc. All of the entries within a column must be the same kind of thing: a number for height, a postal abbreviation for state, and so on. There are three kinds of things you will do with data frames:

  1. Look at them to orient yourself or browse.
  2. Access a variable.
  3. Wrangle them, for example extracting a subset of cases or combining data from two different data frames. Data wrangling is an important skill, but it isn’t the topic of this course. So when you need to wrangle data, we’ll tell you how.

For this course, almost all data will be provided by giving you the names of data frames. Sometimes the name will be simple and in the usual form, e.g. EbolaAll. Other times, the name will be preceded by information about where the computer should look for the data, e.g. palmerpenguins::penguins.

Some simple, helpful commands for orienting yourself to a data frame:

  - names(EbolaAll) tells you the names of the columns in the data frame.
  - head(EbolaAll) shows the first several rows.
  - DT::datatable(EbolaAll) will show the entire data frame interactively, allowing you to page through the data.
  - help(EbolaAll) displays documentation about the data frame.
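
To try these orienting commands without any course data, here is a minimal sketch using a tiny made-up data frame. The data frame Heights and its values are hypothetical and are not part of {mosaicCalc}.

```r
# A small made-up data frame (hypothetical, for illustration only)
Heights <- data.frame(
  name   = c("Ana", "Ben", "Cy"),
  height = c(162, 180, 175),     # a number in every row
  state  = c("MN", "CO", "NY")   # a postal abbreviation in every row
)

names(Heights)     # the variable names: "name" "height" "state"
head(Heights, 2)   # the first two rows (cases)
nrow(Heights)      # the number of cases: 3
Heights$height     # access one variable as a vector
```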

Once you know the basic facts of a data frame—what the variable names are and what kind of thing each variable records—you are ready to use the data in your work. Two common tasks are (1) to plot one variable versus another and (2) to use the variables in some calculation, such as fitting a model. Both tasks use much the same syntax based on tilde expressions.

For instance:

  1. To plot the number of Ebola cases in Guinea versus date: gf_point(Gcases ~ Date, data = EbolaAll). (See the “plotting data” section of this document.) The variable names (Gcases and Date) are used in the tilde expression, while the name of the data frame is specified in another argument, the named argument data=.
  2. For model-fitting, see below.
Plotting data

There is one type of data graphic that we will be using in this course: the point plot (also known as a “scatter plot”). There are, of course, many other kinds of statistical graphics such as density plots, jittered plots, etc. which you will learn about in a statistics course, but we will not use them here.

A point plot always involves two variables from a data frame. To illustrate, consider the palmerpenguins::penguins data frame, where each case is an individual penguin.

knitr::kable(palmerpenguins::penguins)
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
Adelie Torgersen 39.1 18.7 181 3750 male 2007
Adelie Torgersen 39.5 17.4 186 3800 female 2007
Adelie Torgersen 40.3 18.0 195 3250 female 2007
Adelie Torgersen NA NA NA NA NA 2007
Adelie Torgersen 36.7 19.3 193 3450 female 2007
Adelie Torgersen 39.3 20.6 190 3650 male 2007
Adelie Torgersen 38.9 17.8 181 3625 female 2007
Adelie Torgersen 39.2 19.6 195 4675 male 2007
Adelie Torgersen 34.1 18.1 193 3475 NA 2007
Adelie Torgersen 42.0 20.2 190 4250 NA 2007
Adelie Torgersen 37.8 17.1 186 3300 NA 2007
Adelie Torgersen 37.8 17.3 180 3700 NA 2007
Adelie Torgersen 41.1 17.6 182 3200 female 2007
Adelie Torgersen 38.6 21.2 191 3800 male 2007
Adelie Torgersen 34.6 21.1 198 4400 male 2007
Adelie Torgersen 36.6 17.8 185 3700 female 2007
Adelie Torgersen 38.7 19.0 195 3450 female 2007
Adelie Torgersen 42.5 20.7 197 4500 male 2007
Adelie Torgersen 34.4 18.4 184 3325 female 2007
Adelie Torgersen 46.0 21.5 194 4200 male 2007
Adelie Biscoe 37.8 18.3 174 3400 female 2007
Adelie Biscoe 37.7 18.7 180 3600 male 2007
Adelie Biscoe 35.9 19.2 189 3800 female 2007
Adelie Biscoe 38.2 18.1 185 3950 male 2007
Adelie Biscoe 38.8 17.2 180 3800 male 2007
Adelie Biscoe 35.3 18.9 187 3800 female 2007
Adelie Biscoe 40.6 18.6 183 3550 male 2007
Adelie Biscoe 40.5 17.9 187 3200 female 2007
Adelie Biscoe 37.9 18.6 172 3150 female 2007
Adelie Biscoe 40.5 18.9 180 3950 male 2007
Adelie Dream 39.5 16.7 178 3250 female 2007
Adelie Dream 37.2 18.1 178 3900 male 2007
Adelie Dream 39.5 17.8 188 3300 female 2007
Adelie Dream 40.9 18.9 184 3900 male 2007
Adelie Dream 36.4 17.0 195 3325 female 2007
Adelie Dream 39.2 21.1 196 4150 male 2007
Adelie Dream 38.8 20.0 190 3950 male 2007
Adelie Dream 42.2 18.5 180 3550 female 2007
Adelie Dream 37.6 19.3 181 3300 female 2007
Adelie Dream 39.8 19.1 184 4650 male 2007
Adelie Dream 36.5 18.0 182 3150 female 2007
Adelie Dream 40.8 18.4 195 3900 male 2007
Adelie Dream 36.0 18.5 186 3100 female 2007
Adelie Dream 44.1 19.7 196 4400 male 2007
Adelie Dream 37.0 16.9 185 3000 female 2007
Adelie Dream 39.6 18.8 190 4600 male 2007
Adelie Dream 41.1 19.0 182 3425 male 2007
Adelie Dream 37.5 18.9 179 2975 NA 2007
Adelie Dream 36.0 17.9 190 3450 female 2007
Adelie Dream 42.3 21.2 191 4150 male 2007
Adelie Biscoe 39.6 17.7 186 3500 female 2008
Adelie Biscoe 40.1 18.9 188 4300 male 2008
Adelie Biscoe 35.0 17.9 190 3450 female 2008
Adelie Biscoe 42.0 19.5 200 4050 male 2008
Adelie Biscoe 34.5 18.1 187 2900 female 2008
Adelie Biscoe 41.4 18.6 191 3700 male 2008
Adelie Biscoe 39.0 17.5 186 3550 female 2008
Adelie Biscoe 40.6 18.8 193 3800 male 2008
Adelie Biscoe 36.5 16.6 181 2850 female 2008
Adelie Biscoe 37.6 19.1 194 3750 male 2008
Adelie Biscoe 35.7 16.9 185 3150 female 2008
Adelie Biscoe 41.3 21.1 195 4400 male 2008
Adelie Biscoe 37.6 17.0 185 3600 female 2008
Adelie Biscoe 41.1 18.2 192 4050 male 2008
Adelie Biscoe 36.4 17.1 184 2850 female 2008
Adelie Biscoe 41.6 18.0 192 3950 male 2008
Adelie Biscoe 35.5 16.2 195 3350 female 2008
Adelie Biscoe 41.1 19.1 188 4100 male 2008
Adelie Torgersen 35.9 16.6 190 3050 female 2008
Adelie Torgersen 41.8 19.4 198 4450 male 2008
Adelie Torgersen 33.5 19.0 190 3600 female 2008
Adelie Torgersen 39.7 18.4 190 3900 male 2008
Adelie Torgersen 39.6 17.2 196 3550 female 2008
Adelie Torgersen 45.8 18.9 197 4150 male 2008
Adelie Torgersen 35.5 17.5 190 3700 female 2008
Adelie Torgersen 42.8 18.5 195 4250 male 2008
Adelie Torgersen 40.9 16.8 191 3700 female 2008
Adelie Torgersen 37.2 19.4 184 3900 male 2008
Adelie Torgersen 36.2 16.1 187 3550 female 2008
Adelie Torgersen 42.1 19.1 195 4000 male 2008
Adelie Torgersen 34.6 17.2 189 3200 female 2008
Adelie Torgersen 42.9 17.6 196 4700 male 2008
Adelie Torgersen 36.7 18.8 187 3800 female 2008
Adelie Torgersen 35.1 19.4 193 4200 male 2008
Adelie Dream 37.3 17.8 191 3350 female 2008
Adelie Dream 41.3 20.3 194 3550 male 2008
Adelie Dream 36.3 19.5 190 3800 male 2008
Adelie Dream 36.9 18.6 189 3500 female 2008
Adelie Dream 38.3 19.2 189 3950 male 2008
Adelie Dream 38.9 18.8 190 3600 female 2008
Adelie Dream 35.7 18.0 202 3550 female 2008
Adelie Dream 41.1 18.1 205 4300 male 2008
Adelie Dream 34.0 17.1 185 3400 female 2008
Adelie Dream 39.6 18.1 186 4450 male 2008
Adelie Dream 36.2 17.3 187 3300 female 2008
Adelie Dream 40.8 18.9 208 4300 male 2008
Adelie Dream 38.1 18.6 190 3700 female 2008
Adelie Dream 40.3 18.5 196 4350 male 2008
Adelie Dream 33.1 16.1 178 2900 female 2008
Adelie Dream 43.2 18.5 192 4100 male 2008
Adelie Biscoe 35.0 17.9 192 3725 female 2009
Adelie Biscoe 41.0 20.0 203 4725 male 2009
Adelie Biscoe 37.7 16.0 183 3075 female 2009
Adelie Biscoe 37.8 20.0 190 4250 male 2009
Adelie Biscoe 37.9 18.6 193 2925 female 2009
Adelie Biscoe 39.7 18.9 184 3550 male 2009
Adelie Biscoe 38.6 17.2 199 3750 female 2009
Adelie Biscoe 38.2 20.0 190 3900 male 2009
Adelie Biscoe 38.1 17.0 181 3175 female 2009
Adelie Biscoe 43.2 19.0 197 4775 male 2009
Adelie Biscoe 38.1 16.5 198 3825 female 2009
Adelie Biscoe 45.6 20.3 191 4600 male 2009
Adelie Biscoe 39.7 17.7 193 3200 female 2009
Adelie Biscoe 42.2 19.5 197 4275 male 2009
Adelie Biscoe 39.6 20.7 191 3900 female 2009
Adelie Biscoe 42.7 18.3 196 4075 male 2009
Adelie Torgersen 38.6 17.0 188 2900 female 2009
Adelie Torgersen 37.3 20.5 199 3775 male 2009
Adelie Torgersen 35.7 17.0 189 3350 female 2009
Adelie Torgersen 41.1 18.6 189 3325 male 2009
Adelie Torgersen 36.2 17.2 187 3150 female 2009
Adelie Torgersen 37.7 19.8 198 3500 male 2009
Adelie Torgersen 40.2 17.0 176 3450 female 2009
Adelie Torgersen 41.4 18.5 202 3875 male 2009
Adelie Torgersen 35.2 15.9 186 3050 female 2009
Adelie Torgersen 40.6 19.0 199 4000 male 2009
Adelie Torgersen 38.8 17.6 191 3275 female 2009
Adelie Torgersen 41.5 18.3 195 4300 male 2009
Adelie Torgersen 39.0 17.1 191 3050 female 2009
Adelie Torgersen 44.1 18.0 210 4000 male 2009
Adelie Torgersen 38.5 17.9 190 3325 female 2009
Adelie Torgersen 43.1 19.2 197 3500 male 2009
Adelie Dream 36.8 18.5 193 3500 female 2009
Adelie Dream 37.5 18.5 199 4475 male 2009
Adelie Dream 38.1 17.6 187 3425 female 2009
Adelie Dream 41.1 17.5 190 3900 male 2009
Adelie Dream 35.6 17.5 191 3175 female 2009
Adelie Dream 40.2 20.1 200 3975 male 2009
Adelie Dream 37.0 16.5 185 3400 female 2009
Adelie Dream 39.7 17.9 193 4250 male 2009
Adelie Dream 40.2 17.1 193 3400 female 2009
Adelie Dream 40.6 17.2 187 3475 male 2009
Adelie Dream 32.1 15.5 188 3050 female 2009
Adelie Dream 40.7 17.0 190 3725 male 2009
Adelie Dream 37.3 16.8 192 3000 female 2009
Adelie Dream 39.0 18.7 185 3650 male 2009
Adelie Dream 39.2 18.6 190 4250 male 2009
Adelie Dream 36.6 18.4 184 3475 female 2009
Adelie Dream 36.0 17.8 195 3450 female 2009
Adelie Dream 37.8 18.1 193 3750 male 2009
Adelie Dream 36.0 17.1 187 3700 female 2009
Adelie Dream 41.5 18.5 201 4000 male 2009
Gentoo Biscoe 46.1 13.2 211 4500 female 2007
Gentoo Biscoe 50.0 16.3 230 5700 male 2007
Gentoo Biscoe 48.7 14.1 210 4450 female 2007
Gentoo Biscoe 50.0 15.2 218 5700 male 2007
Gentoo Biscoe 47.6 14.5 215 5400 male 2007
Gentoo Biscoe 46.5 13.5 210 4550 female 2007
Gentoo Biscoe 45.4 14.6 211 4800 female 2007
Gentoo Biscoe 46.7 15.3 219 5200 male 2007
Gentoo Biscoe 43.3 13.4 209 4400 female 2007
Gentoo Biscoe 46.8 15.4 215 5150 male 2007
Gentoo Biscoe 40.9 13.7 214 4650 female 2007
Gentoo Biscoe 49.0 16.1 216 5550 male 2007
Gentoo Biscoe 45.5 13.7 214 4650 female 2007
Gentoo Biscoe 48.4 14.6 213 5850 male 2007
Gentoo Biscoe 45.8 14.6 210 4200 female 2007
Gentoo Biscoe 49.3 15.7 217 5850 male 2007
Gentoo Biscoe 42.0 13.5 210 4150 female 2007
Gentoo Biscoe 49.2 15.2 221 6300 male 2007
Gentoo Biscoe 46.2 14.5 209 4800 female 2007
Gentoo Biscoe 48.7 15.1 222 5350 male 2007
Gentoo Biscoe 50.2 14.3 218 5700 male 2007
Gentoo Biscoe 45.1 14.5 215 5000 female 2007
Gentoo Biscoe 46.5 14.5 213 4400 female 2007
Gentoo Biscoe 46.3 15.8 215 5050 male 2007
Gentoo Biscoe 42.9 13.1 215 5000 female 2007
Gentoo Biscoe 46.1 15.1 215 5100 male 2007
Gentoo Biscoe 44.5 14.3 216 4100 NA 2007
Gentoo Biscoe 47.8 15.0 215 5650 male 2007
Gentoo Biscoe 48.2 14.3 210 4600 female 2007
Gentoo Biscoe 50.0 15.3 220 5550 male 2007
Gentoo Biscoe 47.3 15.3 222 5250 male 2007
Gentoo Biscoe 42.8 14.2 209 4700 female 2007
Gentoo Biscoe 45.1 14.5 207 5050 female 2007
Gentoo Biscoe 59.6 17.0 230 6050 male 2007
Gentoo Biscoe 49.1 14.8 220 5150 female 2008
Gentoo Biscoe 48.4 16.3 220 5400 male 2008
Gentoo Biscoe 42.6 13.7 213 4950 female 2008
Gentoo Biscoe 44.4 17.3 219 5250 male 2008
Gentoo Biscoe 44.0 13.6 208 4350 female 2008
Gentoo Biscoe 48.7 15.7 208 5350 male 2008
Gentoo Biscoe 42.7 13.7 208 3950 female 2008
Gentoo Biscoe 49.6 16.0 225 5700 male 2008
Gentoo Biscoe 45.3 13.7 210 4300 female 2008
Gentoo Biscoe 49.6 15.0 216 4750 male 2008
Gentoo Biscoe 50.5 15.9 222 5550 male 2008
Gentoo Biscoe 43.6 13.9 217 4900 female 2008
Gentoo Biscoe 45.5 13.9 210 4200 female 2008
Gentoo Biscoe 50.5 15.9 225 5400 male 2008
Gentoo Biscoe 44.9 13.3 213 5100 female 2008
Gentoo Biscoe 45.2 15.8 215 5300 male 2008
Gentoo Biscoe 46.6 14.2 210 4850 female 2008
Gentoo Biscoe 48.5 14.1 220 5300 male 2008
Gentoo Biscoe 45.1 14.4 210 4400 female 2008
Gentoo Biscoe 50.1 15.0 225 5000 male 2008
Gentoo Biscoe 46.5 14.4 217 4900 female 2008
Gentoo Biscoe 45.0 15.4 220 5050 male 2008
Gentoo Biscoe 43.8 13.9 208 4300 female 2008
Gentoo Biscoe 45.5 15.0 220 5000 male 2008
Gentoo Biscoe 43.2 14.5 208 4450 female 2008
Gentoo Biscoe 50.4 15.3 224 5550 male 2008
Gentoo Biscoe 45.3 13.8 208 4200 female 2008
Gentoo Biscoe 46.2 14.9 221 5300 male 2008
Gentoo Biscoe 45.7 13.9 214 4400 female 2008
Gentoo Biscoe 54.3 15.7 231 5650 male 2008
Gentoo Biscoe 45.8 14.2 219 4700 female 2008
Gentoo Biscoe 49.8 16.8 230 5700 male 2008
Gentoo Biscoe 46.2 14.4 214 4650 NA 2008
Gentoo Biscoe 49.5 16.2 229 5800 male 2008
Gentoo Biscoe 43.5 14.2 220 4700 female 2008
Gentoo Biscoe 50.7 15.0 223 5550 male 2008
Gentoo Biscoe 47.7 15.0 216 4750 female 2008
Gentoo Biscoe 46.4 15.6 221 5000 male 2008
Gentoo Biscoe 48.2 15.6 221 5100 male 2008
Gentoo Biscoe 46.5 14.8 217 5200 female 2008
Gentoo Biscoe 46.4 15.0 216 4700 female 2008
Gentoo Biscoe 48.6 16.0 230 5800 male 2008
Gentoo Biscoe 47.5 14.2 209 4600 female 2008
Gentoo Biscoe 51.1 16.3 220 6000 male 2008
Gentoo Biscoe 45.2 13.8 215 4750 female 2008
Gentoo Biscoe 45.2 16.4 223 5950 male 2008
Gentoo Biscoe 49.1 14.5 212 4625 female 2009
Gentoo Biscoe 52.5 15.6 221 5450 male 2009
Gentoo Biscoe 47.4 14.6 212 4725 female 2009
Gentoo Biscoe 50.0 15.9 224 5350 male 2009
Gentoo Biscoe 44.9 13.8 212 4750 female 2009
Gentoo Biscoe 50.8 17.3 228 5600 male 2009
Gentoo Biscoe 43.4 14.4 218 4600 female 2009
Gentoo Biscoe 51.3 14.2 218 5300 male 2009
Gentoo Biscoe 47.5 14.0 212 4875 female 2009
Gentoo Biscoe 52.1 17.0 230 5550 male 2009
Gentoo Biscoe 47.5 15.0 218 4950 female 2009
Gentoo Biscoe 52.2 17.1 228 5400 male 2009
Gentoo Biscoe 45.5 14.5 212 4750 female 2009
Gentoo Biscoe 49.5 16.1 224 5650 male 2009
Gentoo Biscoe 44.5 14.7 214 4850 female 2009
Gentoo Biscoe 50.8 15.7 226 5200 male 2009
Gentoo Biscoe 49.4 15.8 216 4925 male 2009
Gentoo Biscoe 46.9 14.6 222 4875 female 2009
Gentoo Biscoe 48.4 14.4 203 4625 female 2009
Gentoo Biscoe 51.1 16.5 225 5250 male 2009
Gentoo Biscoe 48.5 15.0 219 4850 female 2009
Gentoo Biscoe 55.9 17.0 228 5600 male 2009
Gentoo Biscoe 47.2 15.5 215 4975 female 2009
Gentoo Biscoe 49.1 15.0 228 5500 male 2009
Gentoo Biscoe 47.3 13.8 216 4725 NA 2009
Gentoo Biscoe 46.8 16.1 215 5500 male 2009
Gentoo Biscoe 41.7 14.7 210 4700 female 2009
Gentoo Biscoe 53.4 15.8 219 5500 male 2009
Gentoo Biscoe 43.3 14.0 208 4575 female 2009
Gentoo Biscoe 48.1 15.1 209 5500 male 2009
Gentoo Biscoe 50.5 15.2 216 5000 female 2009
Gentoo Biscoe 49.8 15.9 229 5950 male 2009
Gentoo Biscoe 43.5 15.2 213 4650 female 2009
Gentoo Biscoe 51.5 16.3 230 5500 male 2009
Gentoo Biscoe 46.2 14.1 217 4375 female 2009
Gentoo Biscoe 55.1 16.0 230 5850 male 2009
Gentoo Biscoe 44.5 15.7 217 4875 NA 2009
Gentoo Biscoe 48.8 16.2 222 6000 male 2009
Gentoo Biscoe 47.2 13.7 214 4925 female 2009
Gentoo Biscoe NA NA NA NA NA 2009
Gentoo Biscoe 46.8 14.3 215 4850 female 2009
Gentoo Biscoe 50.4 15.7 222 5750 male 2009
Gentoo Biscoe 45.2 14.8 212 5200 female 2009
Gentoo Biscoe 49.9 16.1 213 5400 male 2009
Chinstrap Dream 46.5 17.9 192 3500 female 2007
Chinstrap Dream 50.0 19.5 196 3900 male 2007
Chinstrap Dream 51.3 19.2 193 3650 male 2007
Chinstrap Dream 45.4 18.7 188 3525 female 2007
Chinstrap Dream 52.7 19.8 197 3725 male 2007
Chinstrap Dream 45.2 17.8 198 3950 female 2007
Chinstrap Dream 46.1 18.2 178 3250 female 2007
Chinstrap Dream 51.3 18.2 197 3750 male 2007
Chinstrap Dream 46.0 18.9 195 4150 female 2007
Chinstrap Dream 51.3 19.9 198 3700 male 2007
Chinstrap Dream 46.6 17.8 193 3800 female 2007
Chinstrap Dream 51.7 20.3 194 3775 male 2007
Chinstrap Dream 47.0 17.3 185 3700 female 2007
Chinstrap Dream 52.0 18.1 201 4050 male 2007
Chinstrap Dream 45.9 17.1 190 3575 female 2007
Chinstrap Dream 50.5 19.6 201 4050 male 2007
Chinstrap Dream 50.3 20.0 197 3300 male 2007
Chinstrap Dream 58.0 17.8 181 3700 female 2007
Chinstrap Dream 46.4 18.6 190 3450 female 2007
Chinstrap Dream 49.2 18.2 195 4400 male 2007
Chinstrap Dream 42.4 17.3 181 3600 female 2007
Chinstrap Dream 48.5 17.5 191 3400 male 2007
Chinstrap Dream 43.2 16.6 187 2900 female 2007
Chinstrap Dream 50.6 19.4 193 3800 male 2007
Chinstrap Dream 46.7 17.9 195 3300 female 2007
Chinstrap Dream 52.0 19.0 197 4150 male 2007
Chinstrap Dream 50.5 18.4 200 3400 female 2008
Chinstrap Dream 49.5 19.0 200 3800 male 2008
Chinstrap Dream 46.4 17.8 191 3700 female 2008
Chinstrap Dream 52.8 20.0 205 4550 male 2008
Chinstrap Dream 40.9 16.6 187 3200 female 2008
Chinstrap Dream 54.2 20.8 201 4300 male 2008
Chinstrap Dream 42.5 16.7 187 3350 female 2008
Chinstrap Dream 51.0 18.8 203 4100 male 2008
Chinstrap Dream 49.7 18.6 195 3600 male 2008
Chinstrap Dream 47.5 16.8 199 3900 female 2008
Chinstrap Dream 47.6 18.3 195 3850 female 2008
Chinstrap Dream 52.0 20.7 210 4800 male 2008
Chinstrap Dream 46.9 16.6 192 2700 female 2008
Chinstrap Dream 53.5 19.9 205 4500 male 2008
Chinstrap Dream 49.0 19.5 210 3950 male 2008
Chinstrap Dream 46.2 17.5 187 3650 female 2008
Chinstrap Dream 50.9 19.1 196 3550 male 2008
Chinstrap Dream 45.5 17.0 196 3500 female 2008
Chinstrap Dream 50.9 17.9 196 3675 female 2009
Chinstrap Dream 50.8 18.5 201 4450 male 2009
Chinstrap Dream 50.1 17.9 190 3400 female 2009
Chinstrap Dream 49.0 19.6 212 4300 male 2009
Chinstrap Dream 51.5 18.7 187 3250 male 2009
Chinstrap Dream 49.8 17.3 198 3675 female 2009
Chinstrap Dream 48.1 16.4 199 3325 female 2009
Chinstrap Dream 51.4 19.0 201 3950 male 2009
Chinstrap Dream 45.7 17.3 193 3600 female 2009
Chinstrap Dream 50.7 19.7 203 4050 male 2009
Chinstrap Dream 42.5 17.3 187 3350 female 2009
Chinstrap Dream 52.2 18.8 197 3450 male 2009
Chinstrap Dream 45.2 16.6 191 3250 female 2009
Chinstrap Dream 49.3 19.9 203 4050 male 2009
Chinstrap Dream 50.2 18.8 202 3800 male 2009
Chinstrap Dream 45.6 19.4 194 3525 female 2009
Chinstrap Dream 51.9 19.5 206 3950 male 2009
Chinstrap Dream 46.8 16.5 189 3650 female 2009
Chinstrap Dream 45.7 17.0 195 3650 female 2009
Chinstrap Dream 55.8 19.8 207 4000 male 2009
Chinstrap Dream 43.5 18.1 202 3400 female 2009
Chinstrap Dream 49.6 18.2 193 3775 male 2009
Chinstrap Dream 50.8 19.0 210 4100 male 2009
Chinstrap Dream 50.2 18.7 198 3775 female 2009

Suppose we want to look at flipper length versus body mass. The appropriate command is:

gf_point(flipper_length_mm ~ body_mass_g, data = palmerpenguins::penguins)
#> Warning: Removed 2 rows containing missing values (`geom_point()`).

Each row in the data frame generates one dot in the plot: the \(y\) and \(x\) coordinates of the dot are set by the value of the variables named in the tilde expression. \(y\) is always the first name, on the left-hand side of the tilde.

There is a wide variety of ways to customize the plot: size, shape, transparency of the dots, etc. A statistics course can show you why you would want to use these modalities. For our purposes in this course, there are just three sorts of customizations:

  1. Rather than a linear axis (the default), show the data using a log axis. The penguin data is not suitable for an example, because the linear axes work perfectly well. (All penguins are approximately the same size.) So, as an example, we’ll use a very small data set from Robert Boyle’s experiments around 1660 on the relationship between the pressure and volume of a gas (at constant temperature).
# linear axes: the default
gf_point(pressure ~ volume, data = Boyle) 


# Log-log axes
gf_point(pressure ~ volume, data = Boyle) %>%
  gf_refine(scale_x_log10(), scale_y_log10())

To make a semi-log plot, keep the scale_y_log10() argument to gf_refine() and leave out the scale_x_log10() argument.

Note: Log-log or semi-log axes are a valuable way to present data to a human reader, because the value of the variables for any point can be read directly from the axes. But sometimes your purpose is not to present data but to estimate a parameter in a power-law or exponential relationship. For the purpose of estimating parameters, it is better to make an ordinary plot with linear axes, but to plot the log of the variable(s) rather than the variables themselves. For instance:

gf_point(log(pressure) ~ log(volume), data = Boyle)
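
To see why the logged variables are convenient for estimating parameters, here is a sketch in base R. It uses made-up data that follow \(y = 3 x^{-1}\) exactly, and it uses lm() (ordinary straight-line fitting) rather than any {mosaicCalc} function. The data frame Toy and the names a_hat and n_hat are hypothetical. For a power law \(y = a x^n\), taking logs gives \(\ln(y) = \ln(a) + n \ln(x)\), so the slope of the straight line is the exponent.

```r
# Synthetic data that follow y = 3 * x^(-1) exactly (made-up values)
Toy <- data.frame(x = c(1, 2, 4, 8, 16))
Toy$y <- 3 * Toy$x^(-1)

# On log-log scales a power law is a straight line:
#   log(y) = log(a) + n * log(x)
fit <- lm(log(y) ~ log(x), data = Toy)

n_hat <- unname(coef(fit)[2])        # slope estimates the exponent n
a_hat <- exp(unname(coef(fit)[1]))   # exp(intercept) estimates a
c(a = a_hat, n = n_hat)              # recovers a = 3 and n = -1
```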

  2. Use color to display a third variable in a scatter plot. An example with the penguin data will suffice.
gf_point(flipper_length_mm ~ body_mass_g, 
         data = palmerpenguins::penguins,
         color = ~ sex)
#> Warning: Removed 2 rows containing missing values (`geom_point()`).

Notice the tilde before the variable name in the color= argument. You can also set the color to a fixed value, e.g. color="magenta".

  3. Set the limits of an axis. Sometimes the upper and lower bounds on the axes selected by R are inappropriate for your purpose. When this is the case, you can set the limits yourself by piping the graphic to the gf_lims() function. For instance, the penguin graph above shows that females have somewhat smaller flipper length than males, and somewhat smaller weights as well. But because zero was not used as the lower limit of the axes, the plot overstates the sex differences. Including zero as an axis limit is often appropriate.
gf_point(flipper_length_mm ~ body_mass_g, 
         data = palmerpenguins::penguins,
         color = ~ sex) %>%
  gf_lims(y=c(0,235), x=c(0,6500))
#> Warning: Removed 2 rows containing missing values (`geom_point()`).

Fitting functions to data

Finding parameters to match a function to data can be a matter of trial and error. The fitModel() R/mosaic function can be used to polish a preliminary fit.

For instance, as every chemistry student knows, Boyle’s Law states that at constant temperature, pressure and volume are inversely related: \(P = a V^{-1}\). Let’s see how well Boyle’s data matches his law by fitting a general power-law form \(P = a V^{n}\) to his data.

At the core of fitModel() is a tilde expression that specifies the name of the function output (on the left side of the tilde) and a functional form written in terms of the names of the inputs to the function. Parameters are written using names.

mod <- fitModel(pressure ~ a*volume^n, data = Boyle)

The object created by fitModel() is a function of the variables on the right-hand side of the tilde expression, just volume here. You can use that mathematical function like any other that you create with makeFun() or the like. For instance:

gf_point(pressure ~ volume, data = Boyle) %>%
  slice_plot(mod(volume) ~ volume, color="magenta")

But unlike other functions, you can interrogate the functions produced by fitModel() to see the numerical values of the parameters. This is done with the coef() function (as in “coefficients”).

coef(mod)
#>            a            n 
#> 1370.8544316   -0.9931137
If Boyle had been able to measure and control the temperature in his experiments, he would have found that the parameter a is proportional to temperature.
Default values for parameters in functions

Suppose you were creating a function to represent the distance travelled by an object in free fall as a function of time from some initial point in time \(t_0\). The mathematical relationship is \[\text{dist}(t) \equiv v_0 \left[\strut t - t_0\right] + \frac{1}{2} g \left[\strut t - t_0\right]^2\]

To evaluate this function, you need to know the three parameters \(v_0, t_0,\) and \(g\). On Earth, \(g=-9.8\) meters/sec\(^2\), but you may not know \(v_0\) or \(t_0\) until it comes time to use the function in some context. It’s tempting to put in numerical values for the parameters to make the function easy to use. For instance, if the object starts from rest at time \(t=0\), it would be tempting to make the R definition something like:

dist <- makeFun(0*(t-0) - 9.8*(t-0)^2 / 2 ~ .)

or even

dist <- makeFun(-9.8*t^2 / 2 ~ .)

A better practice is to create the function with the parameter names shown explicitly:

dist <- makeFun(v0 * (t - t0) + g*(t-t0)^2/2 ~ .)

Unfortunately, the resulting function will have four arguments which have to be specified every time you use it. A nice compromise is to assign default values for the parameters. For instance \(t_0 = 0\) and \(v_0 = 0\) and \(g = -9.8\) meters/sec\(^2\) would be sensible.

You specify the default parameters by adding them as additional arguments after the tilde expression.

dist <- makeFun(v0*(t-t0) + g*(t-t0)^2 / 2 ~ ., g=-9.8, v0=0, t0=0)

This way, you can use the function simply when the default parameters are appropriate, or modify them as needed. For instance, the free-fall distance over two seconds:

on_Earth <- dist(2)
on_Mars  <- dist(2, g=-3.7)
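
For readers who know a little R, default parameters in makeFun() behave much like default arguments in an ordinary R function. The hand-written function below is only a rough stand-in for illustration. The name dist_by_hand is hypothetical, and the argument order that makeFun() actually produces may differ.

```r
# A hand-written R function with default arguments; a rough stand-in
# for what makeFun(..., g=-9.8, v0=0, t0=0) produces
dist_by_hand <- function(t, g = -9.8, v0 = 0, t0 = 0) {
  v0 * (t - t0) + g * (t - t0)^2 / 2
}

dist_by_hand(2)             # defaults: -9.8 * 2^2 / 2 = -19.6
dist_by_hand(2, g = -3.7)   # Mars-like gravity: -7.4
dist_by_hand(2, v0 = 5)     # initial velocity 5: 10 - 19.6 = -9.6
```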
Calculating derivatives

“Calculating a derivative” means to find the function that is the derivative of a specified function. In R/mosaic this can be done in exactly the same way that makeFun() works. For instance:

df <- D(exp(t) * cos(t) ~ t)
df
#> function (t) 
#> cos(t) * exp(t) - sin(t) * exp(t)
slice_plot(df(t) ~ t, bounds(t=0:10))

The name on the right-hand side of the tilde expression will be the “with respect to” variable. If you want a second derivative, use an expression like t + t on the right-hand side. You can also calculate mixed partial derivatives with right-hand side expressions like t & x.

R/mosaic knows how to use both numerical and symbolic methods. To force the use of numerical methods, use numD() instead of D().
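
One way to build confidence in a computed derivative is to compare it with a formula worked out by hand. The sketch below does this at the arbitrary input \(t = 1\) for the function used above. The hand-derived formulas are \(\partial_t [e^t \cos(t)] = e^t(\cos(t) - \sin(t))\) and \(\partial_{tt} [e^t \cos(t)] = -2 e^t \sin(t)\). The names d1 and d2 are chosen for illustration.

```r
library(mosaicCalc)

d1 <- D(exp(t) * cos(t) ~ t)       # first derivative
d2 <- D(exp(t) * cos(t) ~ t & t)   # second derivative

# Each pair of values should agree (up to rounding)
c(computed = d1(1), by_hand = exp(1) * (cos(1) - sin(1)))
c(computed = d2(1), by_hand = -2 * exp(1) * sin(1))
```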

Calculating anti-derivatives

Anti-derivatives can be computed with antiD(). For example:

antiD(sin(omega*t) ~ t)
#> function (t, C = 0, omega) 
#> C - cos(t * omega)/omega

R/mosaic knows only a few symbolic anti-derivatives. When it doesn’t know the symbolic form, the function produced by antiD() will use numerical methods.
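
A common use of an anti-derivative is to compute a definite integral as a difference of two evaluations. As a sketch, with \(\omega = 1\) and the limits 0 and \(\pi\) chosen for illustration, \(\int_0^\pi \sin(t)\,dt = 2\). The name anti_f is hypothetical.

```r
library(mosaicCalc)

anti_f <- antiD(sin(omega * t) ~ t)

# Definite integral of sin(t) from 0 to pi, as a difference of
# anti-derivative values: (-cos(pi)) - (-cos(0)) = 2
anti_f(pi, omega = 1) - anti_f(0, omega = 1)
```

The constant of integration C cancels in the subtraction, so its default value of 0 does not matter here.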

Solving (zero-finding)

“Solving” means to find an input \(x^\star\) that will generate an output of \(v\) from \(f(x)\). That is, the answer \(x^\star\) will give \(f(x^\star) = v\). There may be no solution, one solution, or several or many solutions.

In R/mosaic, solving is implemented as the Zeros() function. Zeros() looks for solutions where \(f(x^\star) = 0\), but any solution problem, regardless of \(v\), can be placed in this form.

Zeros() is used in the same way as many other R/mosaic functions: the first argument is a tilde expression, the second is a bounds() specification. The output will be a data frame with the solutions found. For instance, here we find the inputs to \(\sin(x)\) that produce an output of 0.5. Notice that we have created a new function (\(\sin(x) - 0.5\)) whose output will be zero at the solutions we seek.

solutions <- Zeros(sin(x) - 0.5 ~ x, bounds(x=0:10))
slice_plot(sin(x) ~ x, bounds(x=0:10)) %>%
  gf_point(0.5 ~ x, data = solutions)

solutions
#> # A tibble: 4 × 2
#>       x      .output.
#>   <dbl>         <dbl>
#> 1 0.524 -0.0000000658
#> 2 2.62  -0.00000731  
#> 3 6.81   0.00000389  
#> 4 8.90   0.00000192
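
As a cross-check that does not rely on {mosaicCalc}, the same rearrangement (\(f(x) = v\) becomes \(f(x) - v = 0\)) can be handed to uniroot(), base R's one-dimensional root finder. This is a different tool from Zeros(): it returns a single solution and needs an interval on which the function changes sign. The name g and the interval are choices made for this sketch.

```r
# Solve sin(x) = 0.5 by finding a zero of sin(x) - 0.5
g <- function(x) sin(x) - 0.5

# uniroot() needs an interval on which g changes sign:
# g(0) = -0.5 < 0 and g(1) is about 0.34 > 0
sol <- uniroot(g, interval = c(0, 1))
sol$root   # close to pi/6, about 0.524
```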
Optimization (with derivatives)

In optimization, you have an objective function \(f(x)\) and you seek the argmin(s) or argmax(es). The textbook introduces optimization using a technique involving differentiating and solving, that is, finding \(x^\star\) such that \(\partial_x f(x^\star) = 0\). We’ll illustrate with a made-up function:

f <- rfun(~ x, seed=943) # a random function
slice_plot(f(x) ~ x, bounds(x=-5:5))

It’s so easy to spot the argmins and argmaxes from a graph that it’s reasonable to wonder why we need derivatives and solving to do it. The answer is that we are helping you establish a conceptual foundation for more interesting optimization problems where you can’t just look at a graph. So let’s step through the problem formally using derivatives.

  1. Construct the derivative of the objective function with respect to its argument.
df <- D(f(x) ~ x)

If there were symbolic parameters in your definition of \(f()\), you would need to resolve them at this point, assigning the parameters definite numerical values.

  2. Find the zeros of the derivative function:

dzeros <- Zeros(df(x) ~ x, bounds(x=-5:5))
dzeros
#> # A tibble: 5 × 2
#>         x     .output.
#>     <dbl>        <dbl>
#> 1 -4.14   -0.000000647
#> 2 -2.29    0.00000685 
#> 3 -0.0973  0.00000215 
#> 4  1.43   -0.000000893
#> 5  3.85   -0.000000254

The output of Zeros() has two columns. The first lists the solutions; the second gives the value of the function at those solutions. Because the function output is always very close to zero, the second column carries no real information. In particular, nothing in it tells you whether a given \(x\) is an argmax or an argmin. Of course, you could figure this out by referring to the graph of \(f(x)\) itself, but let’s be a little more formal by extracting this information from dzeros, f(), and the second derivative of f(). To do this, we introduce a new function, mutate(), that operates on data frames. We’ll also need the second derivative \(\partial_{xx}f()\) to establish whether the function is concave up or down at the solutions found.

ddf <- D(f(x) ~ x & x)

Here’s where mutate() comes in, allowing us to apply the functions f() and ddf() to the x values in the solution. The first command might look strange. It means, “Take the old data frame dzeros, add some new columns to it using mutate(), and then call the new data frame by the old name dzeros.”

dzeros <- dzeros %>%
  mutate(val = f(x), convexity = sign(ddf(x)))
dzeros
#> # A tibble: 5 × 4
#>         x     .output.    val convexity
#>     <dbl>        <dbl>  <dbl>     <dbl>
#> 1 -4.14   -0.000000647  2.22         -1
#> 2 -2.29    0.00000685   0.412         1
#> 3 -0.0973  0.00000215   9.25         -1
#> 4  1.43   -0.000000893  5.47          1
#> 5  3.85   -0.000000254 10.9          -1

From the sign of the convexity at the solutions \(x\), you can tell immediately which rows correspond to an argmin (positive convexity) and which to an argmax (negative convexity).
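The sign rule above can also be written out as an explicit label. The following is a minimal sketch, not part of the original workflow: the function g() is a hypothetical stand-in chosen because its critical points (\(x = -1\) and \(x = 1\)) are easy to verify by hand, and the column name kind is likewise an assumption. It loads {dplyr} explicitly for mutate() and case_when().

```r
library(mosaicCalc)
library(dplyr)

# Hypothetical objective function with known critical points at x = -1 and x = 1
g   <- makeFun(x^3 - 3*x ~ x)
dg  <- D(g(x) ~ x)       # first derivative
ddg <- D(g(x) ~ x & x)   # second derivative

# Find where the derivative is zero, then label each solution
crit <- Zeros(dg(x) ~ x, bounds(x=-3:3)) %>%
  mutate(val  = g(x),
         kind = case_when(ddg(x) > 0 ~ "argmin",
                          ddg(x) < 0 ~ "argmax",
                          TRUE       ~ "inconclusive"))
crit
```

The "inconclusive" label covers the case where the second derivative is exactly zero, where the sign test by itself does not settle whether the point is an argmin, an argmax, or neither.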

For fun, we can graph our findings on top of the function \(f()\) to confirm that we have things right:

slice_plot(f(x) ~ x, bounds(x=-5:5)) %>%
  gf_point(val ~ x, data = dzeros, color= ~ convexity)
Optimization (generally)

Optimization is such an important procedure that clever algorithms to handle all sorts of difficulties have been invented. This is an extensive topic in its own right, but for simplicity, R/mosaic provides a function argM() that works with functions with multiple inputs, pretty much automatically. It’s called argM() because it returns both argmins and argmaxes, along with some supplemental information.

argM(f(x) ~ x, bounds(x=-5:5))
#> # A tibble: 2 × 3
#>       x .output. concavity
#>   <dbl>    <dbl>     <dbl>
#> 1 -2.29    0.412         1
#> 2  3.85   10.9          -1

argM() will not necessarily find all the local argmins and argmaxes: here it reports only two of the five found above with Zeros(). On the other hand, it also works for functions with multiple inputs.

f2 <- makeFun(x*sin(sqrt(3+x))*(1+cos(y))-y ~ .)
solns <- argM(f2(x, y) ~ x & y, bounds(x=c(-3,3), y=c(-3,3)))
solns
#> # A tibble: 1 × 3
#>       x     y .output.
#>   <dbl> <dbl>    <dbl>
#> 1 -2.21 0.622    -3.73
contour_plot(f2(x,y) ~ x & y, bounds(x=c(-3,3), y=c(-3,3))) %>%
  gf_point(y ~ x, data = solns, color="red")
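One way to sanity-check a solution reported for a two-input function is to confirm that the gradient is close to zero there. The sketch below uses only base R and is independent of {mosaicCalc}: f2_plain() is the same formula as f2() rewritten as an ordinary R function, and num_grad() is a hypothetical helper (with an assumed step size h) that estimates the partial derivatives by central differences.

```r
# Same formula as f2() in the text, written as a plain R function
f2_plain <- function(x, y) x * sin(sqrt(3 + x)) * (1 + cos(y)) - y

# Hypothetical helper: central-difference estimate of the gradient at (x, y).
# The step size h is an assumption, not a tuned value.
num_grad <- function(fun, x, y, h = 1e-5) {
  c(dx = (fun(x + h, y) - fun(x - h, y)) / (2 * h),
    dy = (fun(x, y + h) - fun(x, y - h)) / (2 * h))
}

# The point reported by argM(), rounded as printed in the output above
grad <- num_grad(f2_plain, x = -2.21, y = 0.622)
grad                     # both components should be close to zero
f2_plain(-2.21, 0.622)   # should be close to the reported -3.73
```

Because the coordinates are rounded to three significant digits, the gradient components come out small rather than exactly zero.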
Vectors and matrices To be released.
Finding solutions to the target problem To be released.
Solving differential equations To be released.

Examples of commands

Basic modeling functions
identity_fun  <- makeFun(x ~ x)
constant_fun  <- makeFun(1 ~ x)
straight_line <- makeFun(m*x + b ~ x, b=0, m=1)
exponential   <- makeFun(exp(k * x) ~ x, k=1)
power_law     <- makeFun(x^p ~ x, p = 1/2)
sinusoid      <- makeFun(sin(2*pi*(t-t0)/P) ~ t, P=2, t0=0)
logarithm     <- makeFun(log(x, base=exp(1)) ~ x)
gaussian      <- makeFun(dnorm(x, mean, sd) ~ x, mean=0, sd=1)
sigmoid       <- makeFun(pnorm(x, mean, sd) ~ x, mean=0, sd=1)

identity_fun(3)
#> [1] 3
constant_fun(3)
#> [1] 1
power_law(3)
#> [1] 1.732051
Assembling functions
# Linear combination (example)
f <- makeFun(a0 + a1*exp(k * x) ~ ., a0=30, a1=150, k=-0.5)
# Product (example)
g <- makeFun(dnorm(x, mean=0, sd=3) * sin(2*pi*t/P) ~ ., P=3)
# Composition (example)
h <- makeFun(exp(sin(2*pi*t/P)) ~ ., P = 3)
Graphing functions
slice_plot(dnorm(x, mean=1, sd=2) ~ x, 
           bounds(x=-5:5))
contour_plot(dnorm(x, mean=1, sd=2) * pnorm(y, mean=-3, sd=1) ~ x + y,
             bounds(x=-5:5, y=-5:5))
Calculus operations
f <- makeFun(exp(-0.5*x) * sin(2*pi*x/3) ~ .)
df <- D(f(x) ~ x)
slice_plot(df(x) ~ x, bounds(x=-5:5)) %>%
  slice_plot(f(x) ~ x, bounds(x=-5:5), color="orange3")

f <- makeFun(dnorm(x, mean=1, sd=2) ~ .)
F <- antiD(f(x) ~ x)
slice_plot(F(x) ~ x, bounds(x=-5:5)) %>%
  slice_plot(f(x) ~ x, color="orange3")

# Set "constant of integration"
slice_plot(F(x) ~ x, bounds(x=-5:5)) %>%
  slice_plot(f(x) ~ x, color="orange3") %>%
  slice_plot(F(x, C=0.25) ~ x, color="green")

# Definite integral
F(5) - F(-5)
#> [1] 0.9759

# Zeros of a function
f <- makeFun(exp(sin(2*pi*x/3)) - 0.5 ~ .)
Zeros <- findZeros(f(x) ~ x, near=0, within=5)
Zeros
#>         x
#> 1 -4.1343
#> 2 -3.3656
#> 3 -1.1344
#> 4 -0.3657
#> 5  1.8656
#> 6  2.6343
#> 7  4.8657
slice_plot(f(x) ~ x, bounds(x=-5:5)) %>%
  gf_hline(yintercept=0, color="orange3") %>%
  gf_vline(xintercept= ~ x, color="dodgerblue", data=Zeros)
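
The definite integral and the zeros computed above can be cross-checked without {mosaicCalc}. This is a sketch in base R only: pnorm() is the exact anti-derivative of dnorm(), integrate() is a general numerical integrator, and f_plain() is a hypothetical plain-R copy of the function whose zeros were sought.

```r
# Definite integral of the normal density (mean 1, sd 2) over [-5, 5]
exact <- pnorm(5, mean = 1, sd = 2) - pnorm(-5, mean = 1, sd = 2)
area  <- integrate(dnorm, lower = -5, upper = 5, mean = 1, sd = 2)
exact        # about 0.9759, matching F(5) - F(-5)
area$value   # numerical quadrature should agree closely

# Plain-R copy of the function passed to findZeros()
f_plain <- function(x) exp(sin(2 * pi * x / 3)) - 0.5

# The zeros as printed above; evaluating the function there should give ~0
reported <- c(-4.1343, -3.3656, -1.1344, -0.3657, 1.8656, 2.6343, 4.8657)
worst <- max(abs(f_plain(reported)))
worst
```

The zeros repeat with period 3, as expected for a function built from sin(2*pi*x/3).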
Fitting functions to data

Use fitModel() to fit a function of one variable to data.

gf_point(temp ~ time, data = CoolingWater)

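# Eyeball the half-life at about 25. The model uses exp(-k*time),
# so a decaying curve needs a positive starting guess for k.
k0 <- log(2)/25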
mod <- fitModel(temp ~ A + B*exp(-k*time), data=CoolingWater,
                start=list(k=k0))
Plot <- gf_point(temp ~ time, data = CoolingWater) %>%
  slice_plot(mod(time) ~ time, color="dodgerblue", alpha=0.25, size=2) 
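
The starting guess for k comes from the half-life: if the exponential part of the model falls to half its size after a time \(t_{1/2}\), then \(e^{-k t_{1/2}} = 1/2\), so \(k = \ln(2)/t_{1/2}\). The sketch below illustrates this with base R's nls() as a stand-in for fitModel(), on simulated data. The data frame sim, its parameter values, and the starting values for A and B are all hypothetical and are not taken from CoolingWater.

```r
set.seed(1)

# Simulated cooling curve (hypothetical values, NOT the CoolingWater data):
# baseline 25, initial excess 60, true decay rate 0.03
time <- seq(0, 200, by = 2)
temp <- 25 + 60 * exp(-0.03 * time) + rnorm(length(time), sd = 0.5)
sim  <- data.frame(time = time, temp = temp)

# Starting guess for k from an eyeballed half-life of 25
k0 <- log(2) / 25

# Fit the same model form used in the text
fit <- nls(temp ~ A + B * exp(-k * time), data = sim,
           start = list(A = 20, B = 50, k = k0))
coef(fit)   # the estimate of k should land near the true 0.03
```

A starting value within a factor of two or so of the true rate is typically enough for this kind of fit to converge.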
Linear and quadratic approximations

Approximate a function \(f(x)\) around a selected point \(x=x_0\).

f <- makeFun(exp(-0.5*x)*sin(2*pi*x/3) ~ .)

df <- D(f(x) ~ x)
center_fun_on <- 0.9
ddf <- D(df(x) ~ x) # alternatively, D(f(x) ~ x + x)
lin_approx <- makeFun(f(x0) + df(x0)*(x-x0) ~ x, x0 = center_fun_on)
quad_approx <- makeFun(lin_approx(x) + 0.5*ddf(x0)*(x-x0)^2 ~ x, x0 = center_fun_on)
slice_plot(f(x) ~ x, bounds(x=c(0, 1.5)), size=2) %>%
  slice_plot(lin_approx(x) ~ x, color="blue") %>%
  slice_plot(quad_approx(x) ~ x, bounds(x=c(0, 1.5)), color="orange") %>%
  gf_vline(xintercept = center_fun_on, alpha=0.2, color="yellow", size=3)
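
To see numerically how much the quadratic term helps, the two approximations can be compared with the function at a few points near \(x_0\). This is a sketch in base R only, using stats::D() for symbolic derivatives in place of the {mosaicCalc} D(). The helper at() and the test points x_test are assumptions introduced for illustration.

```r
# The same function as in the text, as an unevaluated expression
f_expr   <- quote(exp(-0.5 * x) * sin(2 * pi * x / 3))
df_expr  <- stats::D(f_expr, "x")    # first derivative (symbolic)
ddf_expr <- stats::D(df_expr, "x")   # second derivative (symbolic)

# Hypothetical helper: evaluate an expression at given x values
at <- function(expr, x) eval(expr, list(x = x))

x0   <- 0.9
lin  <- function(x) at(f_expr, x0) + at(df_expr, x0) * (x - x0)
quad <- function(x) lin(x) + 0.5 * at(ddf_expr, x0) * (x - x0)^2

# Absolute error of each approximation at a few points to the right of x0
x_test <- c(0.95, 1.0, 1.2)
errs <- data.frame(x          = x_test,
                   lin_error  = abs(at(f_expr, x_test) - lin(x_test)),
                   quad_error = abs(at(f_expr, x_test) - quad(x_test)))
errs
```

Both errors grow with distance from \(x_0\), but the quadratic error stays smaller at each of these points.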