---
title: "Stand-level"
author: "Anika Seppelt"
#date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{2. Stand-level}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

Once all trees were detected from the normalized point cloud and the tree list was generated, metrics and variables at stand level can be estimated. These estimations can be based on different plot designs, that are circular fixed area, k-tree and angle-count plots. The sizes of the plots can be regulated by defining the radius, k and the basal area factor (BAF) respectively.

```{r include=FALSE}
dir.data <- system.file("data", package="FORTLS")
setwd(dir.data)
load("Rioja.data.RData")
tree.list <- Rioja.data$tree.tls

library(FORTLS)
```

## Computing stand-level metrics and variables

In order to calculate metrics and variables at stand level, the function **`metrics.variables`** is applied as follows:

```{r eval=FALSE, include=TRUE}
metrics <- metrics.variables(tree.tls = tree.list, 
                  tree.ds = tree.ds, tree.field = tree.field,
                  plot.design = c("fixed.area", "k.tree", "angle.count"),
                  plot.parameters = data.frame(radius = 10, k = 10, BAF = 1),
                  scan.approach = "single", var.metr = NULL,
                  dbh.min = 4, h.min = 1.3, max.dist = Inf, 
                  dir.data = dir.data, save.result = FALSE, dir.result = NULL)
```

The input data frame is introduced in **`tree.tls`** and should contain information about the trees detected from TLS based data e.g., the output data frame of the functions `tree.detection.single.scan` or `tree.detection.multi.scan`. The path of the directory of the .txt files containing the reduced point cloud data generated by the function `normalize` can be specified in **`dir.data`** otherwise the current working directory will be assigned to this argument. If **`save.result`** is set to `save.result = TRUE` (default setting), the output files will be saved to the directory indicated in **`dir.result`**.

The scan approach, either TLS single-scan (`"single"`) or TLS multi-scan and SLAM point clouds approaches (`"multi"`) has to be specified in the argument **`scan.approach`**. By default, the argument is set to `"multi"`.

If only a selection of metrics and variables should be calculated, the metrics and variables of interest can be specified as a vector in **`var.metr`**. If nothing is specified, the argument is set to `NULL` and all possible metrics and variables will be calculated. 

#### Distance sampling (`distance.sampling` function)

The distance sampling method is used for the correction of occlusion effects. When the `distance.sampling` function was applied, the list with tree detection probabilities (i.e., the output file of the mentioned function) can be introduced in **`tree.ds`**. By default, this argument is set to `NULL` and metrics using distance sampling corrections will not be calculated. The `distance.sampling` is applied as follows:

```{r eval=TRUE, include=TRUE}
tree.ds <- distance.sampling(tree.tls = tree.list,
                             id.plots = NULL,
                             strata.attributes = NULL)
```

The **`distance.sampling`** function computes the probability of detection of trees depending on their distance to the TLS device. Detection functions are fitted to the histogram representing the distribution of the trees relative to their horizontal distances. The computation of the detection functions is based on the data frame of the detected trees, i.e. the output file of the `tree.detection.single.scan` or `tree.detection.multi.scan` (here `tree.list`), which is introduced in the argument **`tree.tls`**. The plots to be analysed can be specified in **`id.plots`** by inserting a vector containing the plot identification numbers. If not specified, this argument is set to `id.plots = NULL` and all plots will be considered.  

Two detection functions are fitted, that are the half normal and hazard rate functions, with and without $dbh$ as covariate. These functions describe the decreasing detection probability with increasing distance. The probabilities are used for correcting estimation bias of the stand-level variables caused by the lack of detection of trees due to occlusion.

A list with three elements is generated by the `distance.sampling` function. The data frame **`tree`** represents the detection probabilities for each tree of all plots according to the four different detection functions. The columns `P.hn` and `P.hn.cov` show the detection probabilities calculated by the half normal function without and with $dbh$ as covariate, while `P.hr` and `P.hr.cov` contain the probabilities according to the hazard rate function without and with covariate respectively. 

```{r eval=TRUE, include=TRUE}
head(tree.ds$tree)
```

The data frame **`par`** shows the parameters of the detection functions and **`AIC`** the Akaike information criterion (AIC) for every detection function fit. 

```{r eval=TRUE, include=TRUE}
head(tree.ds$par)

head(tree.ds$AIC)
```

#### Additional information about trees through field data

If supplementary field data is available for the sample plots, a data frame containing this information can be included in **`tree.field`**. Each row of the data frame must represent a tree and the data frame must contain at least the following columns:

- `id`: plot identification number (character string or numeric), which must coincide with those in the `id` column of the `tree.tls` argument i.e., the list of trees yielded by the tree detection functions
- `tree`: number of the tree
- `h.dist`: horizontal distance (in m) of the tree's center to the plot's center, which must coincide with the centers of their corresponding TLS plots
- `dbh`: tree diameter (in cm) at breast height (1.3 m)
- `h`: total tree height (in m)
- `dead`: integer value indicating whether the tree is dead (1) or not (NA)

Further optional columns are `h.blc` (height based live crown, in m), `h.bdc` (height based dead crown, in m), `v.user` (tree volume, in m$^3$) and `w.user` (tree biomass, in Mg).

#### Selecting trees to be included in the calculations

The arguments **`dbh.min`**, **`h.min`** and **`max.dist`** are used to determine the trees that are included in the calculations. The minimum diameter at breast height ($dbh$) and the minimum tree height can be specified in `dbh.min` (in cm) and `h.min` (in m) and are set to 4 cm and 1.3 m respectively by default. In `max.dist`, the maximal horizontal distance (in m) of a tree from the plot's center to be included in the calculations can be defined. If not specified, no trees are discarded because of their distance.

#### Plot design and parameters

The metrics and variables can be computed for different plot designs which are specified by **`plot.design`** and **`plot.parameters`**. There are three different plot designs, which are similar to the procedure used in conventional forest inventories. The plot design, which is to be used can be specified in `plot.design`. By default, all plot designs will be considered.

As the name implies, circular fixed area plots (`"fixed.area"`) are plots with a circular area defined by the plot radius (in m). The size of k-tree plots (`"k.tree"`) is determined by the number of trees (k) that enter the plot. The basal area factor (BAF in m$^2$/ha) defines the size of angle-count plots (`"angle.count"`). The parameters `radius`, `k` and `BAF` have to be specified as data frame in `plot.parameters`. If one of these parameters is not specified, the corresponding plot design will not be considered in the calculations. To calculate the dominant height and diameter, the number of dominant trees per ha (trees/ha) can be indicated by the argument `num.tree`. By default, the number of dominant trees is set to 100 trees/ha.

#### Output files

The function `metrics.variables` generates a list in which each element represents one of the considered plot designs (fixed area, k-tree and angle-count plots). These elements are data frames containing the estimated metrics and variables at stand-level as columns (described below). Further columns include information about the plot, such as the identification number (`id`) and the stratum identification (`stratum`) both coinciding with the respective columns in the output file of the `tree.detection` functions. Each row represents a simulated plot i.e., a plot with a certain size (defined by the values for `radius`, `k` and `BAF` shown in the respective columns). 

If `save.result = TRUE`, the data frames will be saved as seperate .csv files (one for each plot design) to the directory indicated in `dir.result` (or if not specified to the working directory). The .csv files (without row names) will be created using the `write.csv` function from the [utils](https://CRAN.R-project.org/package=R.utils) package.

### Stand-level metrics

```{r include=FALSE}
dir.data <- system.file("exdata", package="FORTLS")
metrics <- read.csv(paste(dir.data, "metrics.variables.fixed.area.plot.csv", sep = "/"))
```

```{r eval=FALSE, include=TRUE}
metrics[1:6, -c(3:36)]
```

```{r echo=FALSE}
kableExtra::scroll_box(kable_input = kableExtra::kable(metrics[1:6, -c(3:36)], format = "html"), 
                       width = "100%")
```

The stand-level metrics are statistical descriptive values, such as percentiles, standard deviation or means, as well as the number of points belonging to the normal section of the point cloud. 

The values **`n.pts`**, **`n.pts.est`**, **`n.pts.red`** and **`n.pts.red.est`** indicate the number of points and the estimated number of points respectively that belong to the tree's normal section (1.3 m +/- 0.05 m). This is calculated for the original point cloud (`n.pts`, `n.pts.est`) and the reduced point cloud (`n.pts.red`, `n.pts.red.est`).

Furthermore, the height percentiles (`P01`, `P05`, `P10`, `P20`, `P25`, `P30`, `P40`, `P50`, `P60`, `P70`, `P75`, `P80`, `P90`, `P95`, `P99`, numbers indicating the *k*-th percentile) are calculated for the $z$ coordinate of the TLS point cloud. Therefore they denominate the height (in m) above ground level.

#### Statistics of the z, rho and r

To describe the tendencies and distribution of the spherical coordinates `z`, $\rho$ (horizontal distance, `rho`) and `r` (euclidean distance), the following statistics are calculated:

- arithmetic (`mean.arit.z/rho/r`), quadratic (`mean.qua.z/rho/r`), geometric (`mean.geom.z/rho/r`) and harmonic means (`mean.harm.z/rho/r`)
- median (`median.z/rho/r`) 
- mode (`mode.z/rho/r`)
- variance (`var.z/rho/r`)
- standard deviation (`sd.z/rho/r`)
- coefficient of variation (`CV.z/rho/r`)
- range (`D.z/rho/r`)
- interquartile range (`ID.z/rho/r`)
- maximum (`max.z/rho/r`)
- minimum (`min.z/rho/r`)
- kurtosis (`kurtosis.z/rho/r`)
- skewness (`skewness.z/rho/r`)
- percentage of points above the arithmetic mean (`p.a.mean.z/rho/r`) and the mode (`p.a.mode.z/rho/r`) 
- percentage of points below the arithmetic mean (`p.b.mean.z/rho/r`) and the mode (`p.b.mode.z/rho/r`) 
- percentage of points above a height of 2 m (`p.a.2m.z`, only for the $z$ coordinate)
- percentage of points below a height of 2 m (`p.b.2m.z`, only for the $z$ coordinate)
- L-moments of order 2, 3 and 4 (`L2.z/rho/r`, `L3.z/rho/r` and `L4.z/rho/r`)
- third and forth central L-moments (`L3.mu.z/rho/r` and `L4.mu.z/rho/r`)
- ratio of L1 and L2 moments (`L.CV.z/rho/r`)
- median of the absolute deviation from the overall median (`median.a.d.z/rho/r`)
- mode of the absolute deviation from the overall mode (`mode.a.d.z/rho/r`)
- Canopy relief ratio i.e., the ration of `mean.z/rho/r` to `max.z/rho/r` (`CRR.z/rho/r`)
- scale and shape parameters of the Weibull distribution fitted for the spherical coordinates (`weibull_c.z/rho/r`, `weibull_b.z/rho/r`)

### Stand-level variables

```{r eval=FALSE, include=TRUE}
metrics[1:6, 1:36]
```

```{r echo=FALSE}
kableExtra::scroll_box(kable_input = kableExtra::kable(metrics[1:6, 1:36], format = "html"), 
                       width = "100%")
```

The variables are estimated based on the tree attributes of the detected trees from the point cloud data (output of the functions `tree.deteection.single.scan` and `tree.deteection.multi.scan`)  similar to the procedure of conventional forest inventories. The values are computed at stand-level and are therefore extended to an area of one ha.

#### Stand density (N), volume (V) and basal area (G) 

The stand density (**`N.tls`**, trees/ha) is calculated by the following equations. For the circular fixed area and k-tree plots, `N.tls` is calculated by 

$$
N.tls=\frac{10000}{\pi R^2}\cdot n
$$ 

with $R$ being the radius of the plot (in m) and $n$ the number of detected trees. The density of angle-count plots is calculated by 

$$
N.tls=\sum_{i=1}^{n} \frac{BAF}{g_i}
$$ 

with $BAF$ being the basal area factor (in m$^2$/ha) and $g_i$ the basal area of the tree $i$ (in m$^2$).

The basal area (**`G.tls`**, in m$^2$/ha) is estimated for circular fixed area and k-tree plots by the following equation:

$$
G.tls=\frac{10000}{\pi R^2}\sum_{i=1}^{n} g_i
$$

and for angle-count plots, the following equation is applied:

$$
G.tls=BAF \cdot n
$$

The stem volume (**`V.tls`**, in m$^3$/ha) is estimated by modelling the stem profile as a paraboloid. The volume is calculated as the volume of the revolution of the paraboloid function. The total heights of the detected trees are estimated as the 99th percentile of the points of the $z$ coordinate delimited by Voronoi polygons ($h_{P_{99}i}$, in m). The equation used for the calculation for circular fixed area and k-tree plots is

$$
V.tls=\frac{10000}{\pi R^2}\sum_{i=1}^{n} \pi \cdot \frac{h_{P_{99}i}^2}{2} \cdot \frac{(\frac{1}{2} \cdot dbh_i)^2}{(h_{P_{99}i} - 1.3)^2}
$$

and for angle-count plots is

$$
V.tls=\sum_{i=1}^{n} \frac{BAF}{g_i} \cdot \pi \cdot \frac{h_{P_{99}i}^2}{2} \cdot \frac{(\frac{1}{2} \cdot dbh_i)^2}{(h_{P_{99}i} - 1.3)^2}
$$

#### Occlusion correction

The above mentioned calculations of the stand-level variables $N$, $V$ and $G$ do not consider possible occlusions of trees. Therefore, different occlusion correction approaches are included in the function. 

For angle-count plots, the occlusion correction is based on the Poisson attenuation model. This approach calculates geometric gap probabilities which decreases with increasing distance from the TLS device and follow a Poisson distribution. The calculations of stand density, volume and basal area are corrected with a factor accounting for the gap probabilities and the corrected variables are called **`N.pam`**, **`V.pam`** and **`G.pam`** respectively. For further details see Strahler et al. (2008^[Strahler, A.H., Jupp, D.L.B., Woodcock, C.E., Schaaf, C.B., Yao, T., Zhao, F., Yang, X., Lovell, J., Culvenor, D., Newnham, G., Ni-Miester, W., Boykin-Morris, W., 2008. Retrieval of forest structural parameters using a ground-based lidar instrument (Echidna®). Can. J. Rem. Sens. 34 (Suppl. 2), S426–S440. https://doi.org/10.5589/m08-046.]) and Lovell et al. (2011^[Lovell, J.L., Jupp, D.L.B., Newnham, G.J., Culvenor, D.S., 2011. Measuring tree stem diameters using intensity profiles from ground-based scanning lidar from a fixed viewpoint. ISPRS J. Photogrammetry Remote Sens. 66 (1), 46–55. https://doi.org/10.1016/j.isprsjprs.2010.08.006.]).

For the other two plot designs (circular fixed area and k-tree plots) different occlusion correction approaches are applied. One approach is based on distance sampling data and can therefore only be used when the function `distance.sampling` was used (i.e., when the `tree.ds` argument is specified other than `NULL`). Functions based on point transect sampling that describe how the probability of tree detection decreases with increasing distance from the TLS device are used for the calculation. These functions are applied for the calculation of the variables **`N.hn`**/**`V.hn`**/**`G.hn`** and **`N.hr`**/**`V.hr`**/**`G.hr`** and are Half-Normal and Hazard-Rate respectively. Additionally, **`N.hn.cov`**/**`V.hn.cov`**/**`G.hn.cov`** and **`N.hr.cov`**/**`V.hr.cov`**/**`G.hr.cov`** are calculated with expanding the scale component of the function with $dbh$ as covariate. 

The other approach corrects shadowing effects. Tthis method calculates an expansion factor to correct the variables. The expansion factor is based on the percentage of the shaded area i.e., unsample area which is not seen from the TLS device due to masking by trees. The corrected calculated variables are called **`N.sh`**/**`V.sh`**/**`G.sh`**. For more details see Seidel and Ammer (2014^[Seidel, D., Ammer, C., 2014. Efficient measurements of basal area in short rotation forests based on terrestrial laser scanning under special consideration of shadowing. iFor. Biogeosci. For. 7 (4), 227–232. https://doi.org/10.3832/ifor1084-007.]).

#### Mean and dominant heights (h) and diameters (d)

The mean heights (in m) and diameters (in cm) are calculated by the arithmetic (**`h.tls`**, **`d.tls`**), quadratic (**`h.g`**, **`d.g`**), geometric (**`h.geom`**, **`d.geom`**) and harmonic means (**`h.harm`**, **`d.harm`**). To calculate the dominant heights and diameters, only the $n$ largest trees per hectar are considered. If not otherwise specified in the argument `num.tree` (see above), the number of dominant trees per hectare is set to 100 trees/ha. Dominant heights and diameters are also calculated as the arithmetic (**`h.0.tls`**, **`d.0.tls`**), quadratic (**`h.0.g`**, **`d.0.g`**), geometric (**`h.0.geom`**, **`d.0.geom`**) and harmonic means (**`h.0.harm`**, **`d.0.harm`**).