problem-10.1

We begin by loading the MASS package, which contains the Cars93 data set. The variable names can be listed with names(Cars93).
> library(MASS)                 # loads the package containing Cars93

For the model of highway mileage by horsepower, we expect a negative correlation. A scatterplot confirms this. We then fit the regression line and predict the highway mileage for a car with 225 horsepower.
> plot(MPG.highway ~ Horsepower, data = Cars93)
> res = lm(MPG.highway ~ Horsepower, data = Cars93)
> res

Call:
lm(formula = MPG.highway ~ Horsepower, data = Cars93)

Coefficients:
(Intercept)   Horsepower
     38.150       -0.063

> predict(res, newdata=data.frame(Horsepower=225))
[1] 23.97
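
As a quick check on the fit above, the following sketch computes the sample correlation and reproduces the prediction by hand from the fitted coefficients. It is not part of the original solution, and the names res_hp, r_hp, by_hand, and by_predict are introduced here only for illustration.

```r
library(MASS)  # provides the Cars93 data set

# Same model as above, stored under an illustrative name
res_hp <- lm(MPG.highway ~ Horsepower, data = Cars93)

# The sample correlation has the same sign as the fitted slope
r_hp <- cor(Cars93$Horsepower, Cars93$MPG.highway)

# Prediction by hand: intercept + slope * 225
b <- coef(res_hp)
by_hand <- unname(b["(Intercept)"] + b["Horsepower"] * 225)

# Prediction through predict(), as in the transcript
by_predict <- unname(predict(res_hp, newdata = data.frame(Horsepower = 225)))

print(c(correlation = r_hp, by_hand = by_hand, by_predict = by_predict))
```

The two predictions should agree, since predict() simply evaluates the fitted line at the new value.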
    
Highway mileage should be negatively correlated with automobile weight as well. Again we confirm this with a scatterplot and make the requested predictions.
> f = MPG.highway ~ Weight
> plot(f, data=Cars93)
> res = lm(f, data=Cars93)
> res

Call:
lm(formula = f, data = Cars93)

Coefficients:
(Intercept)       Weight
   51.60137     -0.00733

> predict(res, newdata=data.frame(Weight=c(2524, 6400)))
     1      2
33.108  4.708

The prediction for the MINI Cooper may be close, but there is no reason to expect the prediction for the HUMMER to be close: its weight of 6400 pounds lies outside the range of weights in the data, so that prediction is an extrapolation.
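
One way to see the extrapolation issue is to compare the new weights with the range of weights actually observed. The sketch below is illustrative only. The labels MINI and HUMMER are attached to the two weights in the order the text discusses them, and the names new_weights, w_range, and inside are ours.

```r
library(MASS)  # provides the Cars93 data set

# The two weights used in the predictions above, labeled for readability
new_weights <- c(MINI = 2524, HUMMER = 6400)

# Smallest and largest weights observed in the data
w_range <- range(Cars93$Weight)

# TRUE when a new weight falls within the observed range
inside <- new_weights >= w_range[1] & new_weights <= w_range[2]

print(w_range)
print(inside)
```

A value flagged FALSE here is being predicted from a line fitted only to lighter cars, which is why that prediction deserves little trust.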

The variable Min.Price records the price of the stripped-down version of the car, and Max.Price records the price of the fully equipped version. We'd expect Max.Price to be roughly a fixed amount more than Min.Price, as the differences (the cost of leather seats, a bigger engine, perhaps) are roughly the same for each car. If so, the slope of the regression line would be close to 1. Checking, we have:
> f = Max.Price ~ Min.Price
> plot(f, data=Cars93)
> res = lm(f,data=Cars93)
> abline(res)
> res

Call:
lm(formula = f, data = Cars93)

Coefficients:
(Intercept)    Min.Price
       2.31         1.14

The slope of 1.14 suggests that add-ons may cost more for more expensive cars, but in this case it appears to be due to one large outlier, as the slope from a robust regression (rlm() in the MASS package) is much closer to 1:
> rlm(f, data=Cars93)
Call:
rlm(formula = f, data = Cars93)
Converged in 7 iterations

Coefficients:
(Intercept)   Min.Price
      3.609       1.029

Degrees of freedom: 93 total; 91 residual
Scale estimate: 3.18
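
To look at the outlier more directly, one option is to find the observation with the largest least-squares residual and refit without it. This is a sketch under our own choices rather than part of the original solution: "largest absolute residual" is just one simple way to pick out the point, and the names fit_ls, fit_rob, worst, fit_drop, and slopes are illustrative.

```r
library(MASS)  # provides Cars93 and rlm()

f <- Max.Price ~ Min.Price
fit_ls  <- lm(f, data = Cars93)    # ordinary least squares
fit_rob <- rlm(f, data = Cars93)   # robust regression

# Row with the largest absolute least-squares residual
worst <- which.max(abs(resid(fit_ls)))
print(Cars93[worst, c("Manufacturer", "Model", "Min.Price", "Max.Price")])

# Least-squares fit with that single row removed
fit_drop <- lm(f, data = Cars93[-worst, ])

# Compare the three slope estimates side by side
slopes <- c(ls      = coef(fit_ls)[["Min.Price"]],
            robust  = coef(fit_rob)[["Min.Price"]],
            ls_drop = coef(fit_drop)[["Min.Price"]])
print(round(slopes, 3))
```

Comparing the three slopes shows how much a single point contributes to the least-squares estimate.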

A scatterplot matrix may show additional linear relationships. One is produced with the pairs() command, as in pairs(Cars93), but doing so directly produces too many scatterplots to read. We can trim down the data frame first and then plot again. Keeping only the nonfactor variables can be done as follows:
> cars = Cars93[,sapply(Cars93, function(x) !is.factor(x))]
> pairs(cars)

Looking at the plots produced, we see, for example, that variables 1 and 2, 2 and 3, 4 and 5, etc. (numbered by their column position in cars), are linearly related. These variables can be identified from the graphic if the monitor is large enough, or with the command names(cars).
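
When the scatterplot matrix is too dense to read, a correlation matrix gives a numeric summary of the same pairs. The sketch below is a supplement to the graphical approach and not part of the original solution. The cutoff of 0.9 is an arbitrary choice made here for illustration, and the names num, r, idx, and strong are ours.

```r
library(MASS)  # provides the Cars93 data set

# Keep only the nonfactor columns, as in the text
num <- Cars93[, sapply(Cars93, function(x) !is.factor(x))]

# Pairwise correlations; some columns contain missing values
r <- cor(num, use = "pairwise.complete.obs")

# Blank out the diagonal and upper triangle so each pair appears once
r[upper.tri(r, diag = TRUE)] <- NA

# Pairs whose absolute correlation exceeds the (arbitrary) 0.9 cutoff
idx <- which(abs(r) > 0.9, arr.ind = TRUE)
strong <- data.frame(var1 = rownames(r)[idx[, "row"]],
                     var2 = colnames(r)[idx[, "col"]],
                     r    = r[idx])

# Strongest relationships first
print(strong[order(-abs(strong$r)), ])
```

A high correlation only flags a candidate linear relationship, so the corresponding panel of the scatterplot matrix is still worth inspecting.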