Human males generally live shorter and reproduce later than females. These sex-specific processes affect kinship dynamics in a number of ways. For example, the degree to which an average member of the population, call her Focal, has a living grandparent is affected by differential mortality affecting the parental generation at older ages. We may also be interested in considering how kinship structures vary by Focal’s sex: a male Focal may have a different number of grandchildren than a female Focal given differences in fertility by sex. Documenting these differences matters since women often face greater expectations to provide support and informal care to relatives. As they live longer, they may find themselves at greater risk of being having no living kin. The function kin2sex implements two-sex kinship models as introduced by Caswell (2022). This vignette show how to run two-sex models and highlights some of the advantages of this model over one-sex models in populations with time-invariant and time-variant rates.

library(DemoKin)
library(tidyr)
library(dplyr)
library(ggplot2)
library(knitr)

1. Demographic rates by sex

Data on female fertility by age is less common than female fertility. Schoumaker (2019) shows that male TFR is almost always higher than female Total Fertility Rates (TFR) using a sample of 160 countries. For this example, we use data from 2012 France to exemplify the use of the two-sex function. Data on female and male fertility and mortality are included in DemoKin. In this population, male and female TFR is almost identical (1.98 and 1.99) but the distributions of fertility by sex varies over age:

fra_fert_f <- fra_asfr_sex[,"ff"]
fra_fert_m <- fra_asfr_sex[,"fm"]
fra_surv_f <- fra_surv_sex[,"pf"]
fra_surv_m <- fra_surv_sex[,"pm"]

sum(fra_fert_m)-sum(fra_fert_f)
## [1] -0.0115
data.frame(value = c(fra_fert_f, fra_fert_m, fra_surv_f, fra_surv_m),
           age = rep(0:100, 4),
           sex = rep(c(rep("f", 101), rep("m", 101)), 2),
           risk = c(rep("fertility rate", 101 * 2), rep("survival probability", 101 * 2))) %>%
  ggplot(aes(age, value, col=sex)) + 
  geom_line() + 
  facet_wrap(~ risk, scales = "free_y") + 
  theme_bw()

2. Time-invariant two-sex kinship models

We now introduce the functions kin2sex, which is similar to the one-sex function kin (see ?kin) with two exceptions. First, the user needs to specify mortality and fertility by sex. Second, the user must indicate the sex of Focal (which was assumed to be female in the one-sex model). Let us first consider the application for time-invariant populations:

kin_result <- kin2sex(
  pf = fra_surv_f,
  pm = fra_surv_m, 
  ff = fra_fert_f, 
  fm = fra_fert_m, 
  time_invariant = TRUE,
  sex_focal = "f", 
  birth_female = .5
  )

The output of kin2sex is equivalent to that of kin, except that it includes a column sex_kin to specify the sex of the given relatives.

Let’s group aunts and siblings to visualize the number of living kin by Focal’s age.

kin_out <- kin_result$kin_summary %>% 
  mutate(kin = case_when(kin %in% c("s", "s") ~ "s",
                         kin %in% c("ya", "oa") ~ "a",
                         T ~ kin)) %>%
  filter(kin %in% c("d", "m", "gm", "ggm", "s", "a"))

kin_out %>% 
  group_by(kin, age_focal, sex_kin) %>%
  summarise(count=sum(count_living)) %>%
  ggplot(aes(age_focal, count, fill=sex_kin))+
  geom_area()+
  theme_bw() +
  facet_wrap(~kin)

A note on terminology

The function kin2sex uses the same codes as kin to identify relatives (see demokin_codes()). Note that when running a two-sex model, the code ‘m’ refers to either mothers or fathers! Use the column sex_kin to determine the sex of a given relatives. For example, in order to consider only sons and ignore daughters, use:

kin_result$kin_summary %>% 
  filter(kin == "d", sex_kin == "m") %>% 
  head()
## # A tibble: 6 × 11
##   age_focal kin   sex_kin year  cohort count_living mean_age sd_age count_dead
##       <int> <chr> <chr>   <lgl> <lgl>         <dbl>    <dbl>  <dbl>      <dbl>
## 1         0 d     m       NA    NA                0      NaN    NaN          0
## 2         1 d     m       NA    NA                0      NaN    NaN          0
## 3         2 d     m       NA    NA                0      NaN    NaN          0
## 4         3 d     m       NA    NA                0      NaN    NaN          0
## 5         4 d     m       NA    NA                0      NaN    NaN          0
## 6         5 d     m       NA    NA                0      NaN    NaN          0
## # ℹ 2 more variables: count_cum_dead <dbl>, mean_age_lost <dbl>

Information on kin availability by sex allows us to consider sex ratios, a traditional measure in demography, with females often in denominator. The following figure, for example, shows that a 25yo French woman in our hypothetical population can expect to have 0.5 grandfathers for every grandmother:

kin_out %>% 
  group_by(kin, age_focal) %>%
  summarise(sex_ratio=sum(count_living[sex_kin=="m"], na.rm=T)/sum(count_living[sex_kin=="f"], na.rm=T)) %>%
  ggplot(aes(age_focal, sex_ratio))+
  geom_line()+
  theme_bw() +
  facet_wrap(~kin, scales = "free")

The experience of kin loss for Focal depends on differences in mortality between sexes. A female Focal starts losing fathers earlier than mothers. We see a slightly different pattern for grandparents since Focal’s experience of grandparental loss is dependent on the initial availability of grandparents (i.e. if Focal’s grandparent died before her birth, she will never experience his death).

# sex ratio
kin_out %>%
  group_by(kin, sex_kin, age_focal) %>%
  summarise(count=sum(count_dead)) %>%
  ggplot(aes(age_focal, count, col=sex_kin))+
  geom_line()+
  theme_bw() +
  facet_wrap(~kin)

3. Time-variant two-sex kinship models

We look at populations where demographic rates are not static but change on a yearly basis. For this, we consider the case of Sweden using data pre-loaded in DemoKin. For this example, we will create ‘pretend’ male fertility rates by slightly perturbing the existing female rates. This is a toy example, since a real two-sex model should use actual female and male rates as inputs.

years <- ncol(swe_px)
ages <- nrow(swe_px)
swe_surv_f_matrix <- swe_px
swe_surv_m_matrix <- swe_px ^ 1.5 # artificial perturbation for this example
swe_fert_f_matrix <- swe_asfr
swe_fert_m_matrix <- rbind(matrix(0, 5, years),  
            swe_asfr[-((ages-4):ages),]) * 1.05 # artificial perturbation for this example

This is how it looks for year 1900:

bind_rows(
  data.frame(age = 0:100, sex = "Female", component = "Fertility rate", value = swe_fert_f_matrix[,"1900"]),
  data.frame(age = 0:100, sex = "Male", component = "Fertility rate", value = swe_fert_m_matrix[,"1900"]),
  data.frame(age = 0:100, sex = "Female", component = "Survival probability", value = swe_surv_f_matrix[,"1900"]),
  data.frame(age = 0:100, sex = "Male", component = "Survival probability", value = swe_surv_m_matrix[,"1900"])) %>% 
  ggplot(aes(age, value, col = sex)) +
  geom_line() +
  theme_bw() +
  facet_wrap(~component, scales = "free")

We now run the time-variant two-sex models (note the time_invariant = FALSE argument):

kin_out_time_variant <- kin2sex(
                      pf = swe_surv_f_matrix, 
                      pm = swe_surv_m_matrix,
                      ff = swe_fert_f_matrix, 
                      fm = swe_fert_m_matrix,
                      sex_focal = "f",
                      time_invariant = FALSE,
                      birth_female = .5,
                      output_cohort = 1900
                      )
## Preparing output...
## Assuming stable population before 1900.

We can plot data on kin availability alongside values coming from a time-invariant model to show how demographic change matters: the time-variant models take into account changes derived from the demographic transition, whereas the time-invariant models assume never-changing rates.

kin_out_time_invariant <- kin2sex(
                      swe_surv_f_matrix[,"1900"], swe_surv_m_matrix[,"1900"], 
                      swe_fert_f_matrix[,"1900"], swe_fert_m_matrix[,"1900"], 
                      sex_focal = "f", birth_female = .5)


kin_out_time_variant$kin_summary %>%
  filter(cohort == 1900) %>% mutate(type = "variant") %>%
  bind_rows(kin_out_time_invariant$kin_summary %>% mutate(type = "invariant")) %>% 
  mutate(kin = case_when(kin %in% c("ys", "os") ~ "s",
                         kin %in% c("ya", "oa") ~ "a",
                         T ~ kin)) %>%
  filter(kin %in% c("d", "m", "gm", "ggm", "s", "a")) %>%
  group_by(type, kin, age_focal, sex_kin) %>%
  summarise(count=sum(count_living)) %>%
  ggplot(aes(age_focal, count, linetype=type))+
  geom_line()+ theme_bw() +
  facet_grid(cols = vars(kin), rows=vars(sex_kin), scales = "free")

4. Approximations

Caswell (2022) introduced two approaches for approximating two-sex kinship structures for cases when male demographic rates are not available. The first is the androgynous approximation, which assumes equal fertility and survival for males and females. The second is the use of GKP factors apply to each type of relative (e.g., multiplying mothers by two to obtain the number of mothers and fathers).

Here, we present a visual evaluation of the accuracy of these approximations by comparing to ‘true’ two-sex models using the French data included with DemoKin for time-invariant models (Caswell 2022). We start by considering the androgynous approximation. We compare expected kin counts by age and find high levels of consistency for all kin types, except for grandfathers and great-grandfathers:

kin_out <- kin2sex(fra_surv_f, fra_surv_m, fra_fert_f, fra_fert_m, sex_focal = "f", birth_female = .5)

kin_out_androgynous <- kin2sex(fra_surv_f, fra_surv_f, fra_fert_f, fra_fert_f, sex_focal = "f", birth_female = .5)

bind_rows(
  kin_out$kin_summary %>% mutate(type = "full"),
  kin_out_androgynous$kin_summary %>% mutate(type = "androgynous")) %>%
  group_by(kin, age_focal, sex_kin, type) %>%
  summarise(count = sum(count_living)) %>%
  ggplot(aes(age_focal, count, linetype = type)) +
  geom_line() +
  theme_bw() +
  theme(legend.position = "bottom", axis.text.x = element_blank()) +
  facet_grid(row = vars(sex_kin), col = vars(kin), scales = "free")

Next, we consider the use of GKP factors and find that it also approximates relatively accurately kin counts at different ages of Focal. These are presented as examples only. Users are invited to perform more rigorous comparisons of these approximations.

# with gkp
kin_out_1sex <- kin(fra_surv_f, fra_fert_f, birth_female = .5)

kin_out_GKP <- kin_out_1sex$kin_summary%>%
  mutate(count_living = case_when(kin == "m" ~ count_living * 2,
                            kin == "gm" ~ count_living * 4,
                            kin == "ggm" ~ count_living * 8,
                            kin == "d" ~ count_living * 2,
                            kin == "gd" ~ count_living * 4,
                            kin == "ggd" ~ count_living * 4,
                            kin == "oa" ~ count_living * 4,
                            kin == "ya" ~ count_living * 4,
                            kin == "os" ~ count_living * 2,
                            kin == "ys" ~ count_living * 2,
                            kin == "coa" ~ count_living * 8,
                            kin == "cya" ~ count_living * 8,
                            kin == "nos" ~ count_living * 4,
                            kin == "nys" ~ count_living * 4))

bind_rows(
  kin_out$kin_summary %>% mutate(type = "full"),
  kin_out_androgynous$kin_summary %>% mutate(type = "androgynous"),
  kin_out_GKP %>% mutate(type = "gkp")) %>%
  mutate(kin = case_when(kin %in% c("ys", "os") ~ "s",
                          kin %in% c("ya", "oa") ~ "a",
                          kin %in% c("coa", "cya") ~ "c",
                          kin %in% c("nys", "nos") ~ "n",
                          T ~ kin)) %>%
  filter(age_focal %in% c(5, 15, 30, 60, 80)) %>%
  group_by(kin, age_focal, type) %>%
  summarise(count = sum(count_living)) %>%
  ggplot(aes(type, count)) +
  geom_bar(aes(fill=type), stat = "identity") +
  theme_bw()+theme(axis.text.x = element_text(angle = 90), legend.position = "bottom")+
  facet_grid(col = vars(kin), row = vars(age_focal), scales = "free")

References

Caswell, Hal. 2022. “The Formal Demography of Kinship IV: Two-Sex Models and Their Approximations.” Demographic Research 47 (September): 359–96. https://doi.org/10.4054/DemRes.2022.47.13.