A discharge rating curve is a model that describes the relationship between water stage and discharge in a river. The rating curve is estimated from paired observations of stage and discharge and it is used to predict discharge for a given stage. This is the main practical usage of rating curves as stage is substantially easier to directly observe than discharge. Four different discharge rating curve models are implemented in this R package using a Bayesian hierarchical model:

`plm0()`

- Power-law model with a constant variance (hence the 0). This is a Bayesian hierarchical implementation of the most commonly used discharge rating curve model in hydrological practice.

`plm()`

- Power-law model with variance that varies with stage.

`gplm0()`

- Generalized power-law model with a constant variance (hence the 0). The generalized power-law is introduced in Hrafnkelsson et al. (2020).

`gplm()`

- Generalized power-law model with variance that varies with stage. The generalized power-law is introduced in Hrafnkelsson et al. (2020).

For further details about the different models, see Hrafnkelsson et al. (2020). The models differ in their complexity and `gplm` is the most flexible and complex model. We will focus on the use of `gplm` throughout this introduction vignette and explore the different ways to fit the `gplm` and visualize its output. However the API of the functions for the other three models are completely identical so this vignette also helps users to run those models.

```
> library(bdrc)
> set.seed(1) #set seed for reproducibility
```

We will use a dataset from a stream gauging station called Krokfors in Sweden that comes with the package:

```
> data(krokfors)
> krokfors
#> W Q
#> 1 9.478000 10.8211700
#> 2 8.698000 1.5010000
#> 3 9.009000 3.3190000
#> 4 8.097000 0.1595700
#> 5 9.104000 4.5462500
#> 6 8.133774 0.2121178
#> 7 8.569583 1.1580000
#> 8 9.139151 4.8110000
#> 9 9.464250 10.9960000
#> 10 8.009214 0.0984130
#> 11 8.961843 2.7847910
#> 12 8.316000 0.6631890
#> 13 8.828716 1.8911800
#> 14 9.897000 20.2600000
#> 15 7.896000 0.0190000
#> 16 9.534000 12.1000000
#> 17 9.114000 4.3560000
#> 18 8.389000 0.6200000
#> 19 8.999000 2.6800000
#> 20 9.099000 3.7310000
#> 21 8.502000 0.8930000
#> 22 8.873000 1.9000000
#> 23 8.240000 0.3200000
#> 24 9.219000 5.9000000
#> 25 9.271000 6.9000000
#> 26 8.370000 0.4420000
#> 27 9.431000 9.0000000
```

It is very simple to fit a discharge rating curve with the *bdrc* package. All you need are two mandatory input arguments, `formula` and `argument`. The `formula` is of the form `y`~`x` where `y` is discharge in m\(^3/\)s and `x` is stage in m (it is very important that the data is in the correct units). The `data` argument must be a data.frame including `x` and `y` as column names. In our case, in Krokfors, a column named Q includes discharge measurements and W the stage measurements. We are ready to fit a discharge rating curve using the `gplm` function:

`> gplm.fit <- gplm(Q~W,data=krokfors,parallel=TRUE,num_cores=2) # parallel=TRUE by default and by default, the number of cores is detected on the machine`

`gplm` returns an object of class “gplm” which we can summarize and visualize using familiar functions such as

```
> summary(gplm.fit)
#>
#> Formula:
#> Q ~ W
#> Latent parameters:
#> lower-2.5% median-50% upper-97.5%
#> a 1.50 1.94 2.23
#> b 1.83 1.84 1.84
#>
#> Hyperparameters:
#> lower-2.5% median-50% upper-97.5%
#> c 7.71211 7.8102 7.856
#> sigma_beta 0.42368 0.7066 1.288
#> phi_beta 0.57452 1.1910 2.874
#> sigma_eta 0.00329 0.0912 0.534
#> eta_1 -4.93445 -4.2566 -3.557
#> eta_2 -5.94891 -4.0659 -2.280
#> eta_3 -6.87756 -4.2049 -1.611
#> eta_4 -7.71587 -4.4949 -1.223
#> eta_5 -8.24321 -4.6293 -0.680
#> eta_6 -8.78471 -4.6121 -0.250
#>
#> DIC: -1.470964
```

and

`> plot(gplm.fit)`

In the next section we will dive deeper into visualizing the “gplm” object.

The *bdrc* package provides several tools to visualize the results from model objects which can give insight into the physical properties of the river at hand. For instance, the hyperparameter \(c\) corresponds to the stage of zero discharge. To visualize the posterior of \(c\), we can write

`> plot(gplm.fit,type='histogram',param='c')`

Technically, instead of inferring \(c\) directly, \(h_{min}-c\) is inferred where \(h_{min}\) is the lowest stage value in the data. Since \(h_{min}-c\) is strictly positive, it is transformed during Bayesian inference s.t. it has support on the real line with \(\zeta=log(h_{min}-c)\). To plot the transformed posterior we write

`> plot(gplm.fit,type='histogram',param='c',transformed=TRUE)`

the `param` argument can also be a vector of different parameter names. To visualize the posterior distribution of many parameters at once we can for instance write

`> plot(gplm.fit,type='histogram',param=c('a','c'))`

There is a shorthand to visualize all hyperparameters at once

`> plot(gplm.fit,type='histogram',param='hyperparameters')`

Similarly, writing ‘latent_parameters’ plots the latent parameters in one plot. To plot the hyperparameters transformed, on the same scale as in the Bayesian inference, we write

`> plot(gplm.fit,type='histogram',param='hyperparameters',transformed=TRUE)`

Finally, we can visualize certain components of the model that varies with stage, the power-law exponent, \(f(h)\), and the standard deviation on the data level \(\sigma_{\varepsilon}(h)\). `gplm0` and `gplm` generalize the power law exponent, model it as a stochastic process,\(f(h)=b+\beta(h)\), which assumed to be twice differentiable.`plm` and `plm0` on the other hand model the exponent as a fixed constant \(f(h)=b\). We can plot the inferred power-law exponent with

`> plot(gplm.fit,type='f')`

`plm` and `gplm` model the standard deviation, \(\sigma_{\varepsilon}\) on the data level as a function of stage, namely using B-splines basis functions, while `plm0` and `gplm0` model the variance is a fixed constant. We can plot the standard deviation by writing

`> plot(gplm.fit,type='sigma_eps')`

To get a visualization summary of your model, the ‘panel’ option in the plot method is useful:

`> plot(gplm.fit,type='panel',transformed=TRUE)`

The package has several functions for convergence diagnostics of a bdrc model, most notably the residual plot, tracing plots, autocorrelation plot and the Gelman-Rubin statistic plot. The log-residuals can be plotted with

`> plot(gplm.fit,type='residuals')`

The log-residuals are calculated by subtracting the posterior estimate of log-discharge,\(log(\hat{Q})\), from the observed log-discharge. Additionally, the plot includes the 95% predictive intervals or log(Q) (- -) and 95% credible intervals for the expected value of log(Q) (—), reflecting the rating curve uncertainty.

`> plot(gplm.fit,type='trace',param='c',transformed=TRUE)`

To plot a trace plot for all hyperparameters, you write similarly as above

`> plot(gplm.fit,type='trace',param='hyperparameters',transformed=TRUE)`