scrm uses a syntax compatible with the popular program ms. There are, however, a few differences to ms:

- scrm cannot simulate gene conversions (`-c` in ms) or a fixed number of segregating sites (`-s`),
- the option `-L` produces a slightly different output, and
- it additionally implements the flags `-l` (approximation), `-sr` (changing recombination rate), `-st` (changing mutation rate), `-eI` (sampling haplotypes at multiple time points) and `-oSFS` (generates frequency spectra).
- We do not support changing the number of populations with `-ema`. Our version of the command is just `-ema <t> <M11> <M12> ...` instead of `-ema <t> <npop> <M11> <M12> ...`.
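The `-ema` difference can be handled mechanically when porting an ms command. The following is a minimal sketch, not part of scrm: the helper name `ms_ema_to_scrm` is hypothetical, and it assumes the ms command keeps the number of populations unchanged, since scrm does not support changing it.

```python
def ms_ema_to_scrm(args):
    """Drop the <npop> argument from every ms-style -ema in an argument list."""
    out = []
    i = 0
    while i < len(args):
        if args[i] == "-ema":
            t = args[i + 1]
            npop = int(args[i + 2])          # ms-only argument, removed for scrm
            n_entries = npop * npop          # the full npop x npop matrix follows
            out += ["-ema", t] + args[i + 3 : i + 3 + n_entries]
            i += 3 + n_entries
        else:
            out.append(args[i])
            i += 1
    return out


# Hypothetical ms-style arguments for a two-population model.
ms_args = ["-I", "2", "3", "2", "-ema", "0.5", "2", "x", "1.0", "2.0", "x"]
scrm_args = ms_ema_to_scrm(ms_args)
print(" ".join(scrm_args))
```

Other arguments are passed through untouched, so commands containing `-c` or `-s` still need to be adapted by hand.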

For all other options, you can also refer to the ms manual for a detailed description of what the commands do. scrm should happily execute any ms command that does not contain `-c`, `-s` or `-ema`. Also, scrm has somewhat stricter requirements regarding the order of arguments if population admixture (`-es`) is involved.

The arguments for calling *scrm* are

`scrm <nhap> <nrep> [...]`

where *nhap* is the total number of haplotypes (in all populations and at all times) that are simulated at each locus, and *nrep* is the number of independent loci that will be produced. The `[...]` is an optional placeholder for an arbitrary number of command line flags described below.
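Because *nhap* counts haplotypes in all populations and at all times, it has to include samples taken with `-I` as well as those taken later with `-eI`. The sketch below shows one way to compute it; the sample sizes, the sampling time and the migration rate are made-up values for illustration.

```python
# Hypothetical sampling scheme: two populations sampled at present (-I)
# and two more haplotypes sampled from population 1 at time 0.1 (-eI).
initial_samples = [4, 6]
ancient_samples = {0.1: [2, 0]}

# nhap is the total over all populations and all sampling times.
nhap = sum(initial_samples) + sum(sum(s) for s in ancient_samples.values())
nrep = 5

args = ["scrm", str(nhap), str(nrep)]
# "1.0" is the optional symmetric migration rate of -I (illustrative value).
args += ["-I", str(len(initial_samples))] + [str(s) for s in initial_samples] + ["1.0"]
for t, samples in sorted(ancient_samples.items()):
    args += ["-eI", str(t)] + [str(s) for s in samples]

command = " ".join(args)
print(command)
```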

`-r <R> <L>`

: Set the recombination rate to *R = 4 N0 r* and the length of all loci to *L* base pairs. *r* is the expected number of recombinations on the locus per generation.

`-l <l>`

: Use approximation rather than simulating the exact ARG. Within a sliding window of length *l* base pairs, all linkage information is considered when building the genealogy. Some linkage to positions outside of this window is ignored. Setting *l=0* produces the SMC’ and *l=-1* deactivates the approximation. Since v1.6.0, it’s also possible to specify the window’s length in number of recombinations. To do so, use `-l <x>r`, where *x* is the number of recombinations (e.g. `-l 100r` for a window spanning 100 recombinations). Also starting with version 1.6.0, **approximation is turned on by default** using a conservative window length of 500 recombinations. For most applications, it should be fine to reduce this value to 100 - 250 recombinations if runtime is a critical factor.
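As a worked example of the scaling *R = 4 N0 r*, the sketch below derives *R* from a per-base-pair rate. The numbers are hypothetical, and multiplying the per-base-pair rate by the locus length to obtain the per-locus rate *r* is an assumption made here for illustration.

```python
# Hypothetical inputs, chosen only for illustration.
N0 = 10_000            # reference population size
r_per_bp = 1e-8        # recombinations per base pair per generation
L = 100_000            # locus length in base pairs

# Assumption: per-locus rate approximated as per-bp rate times locus length.
r_locus = r_per_bp * L
R = 4 * N0 * r_locus

# Window of 100 recombinations for the approximation (-l 100r).
command = f"scrm 10 1 -r {R:g} {L} -l 100r"
print(command)
```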

In all commands, migration rates are scaled as *M = 4 N0 m*, where *m* is the per-generation migration rate.

`-I <npop> <s1> ... <sn> [<M>]`

: Use an island model with *npop* populations, where *s1* to *sn* haplotypes are sampled from populations 1 to *n*, respectively. Optionally assume a symmetric migration rate of *M*.

`-M <M>`

: Assume a symmetric migration rate of *M/(npop-1)*.

`-m <i> <j> <M>`

: Set the migration rate from population *j* to population *i* to *M* (looking forward in time) [since v1.3.1].

`-ma <M11> <M12> ... <M21> ...`

: Set the migration matrix (dimension is *npop x npop*). Diagonal elements are ignored but required (you can use `x` or `0`).
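Writing out a full migration matrix by hand is error-prone, so it can help to generate the `-ma` arguments. This is a minimal sketch with a hypothetical helper `ma_args`; it assumes the entries are listed row by row (*M11 M12 ... M21 ...*), as the `-ema` signature suggests, and uses made-up rates.

```python
def ma_args(matrix):
    """Flatten a square migration matrix into -ma arguments, row by row."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("migration matrix must be npop x npop")
    args = ["-ma"]
    for i in range(n):
        for j in range(n):
            # Diagonal entries are ignored by scrm but must be present.
            args.append("x" if i == j else f"{matrix[i][j]:g}")
    return args


# Hypothetical rates for three populations.
M = [
    [0.0, 0.5, 1.0],
    [0.5, 0.0, 2.0],
    [1.0, 2.0, 0.0],
]
command = " ".join(["scrm", "6", "1", "-I", "3", "2", "2", "2"] + ma_args(M))
print(command)
```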

For exponential growth/decline of a population, the parameter *a* changes the size of a population according to the formula *N(s) = N(0) exp(-as)*, where *s* is the time, measured backwards in units of _4*N0_ generations, since the growth rate was set. This applies to `-g`, `-G`, `-eg` and `-eG`.

`-n <i> <n>`

: Set the present day size of population *i* to _n*N0_.

`-G <a>`

: Set the exponential growth rate of all populations to *a*.

`-g <i> <a>`

: Set the exponential growth rate of population *i* to *a*.
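The growth formula *N(s) = N(0) exp(-as)* can be evaluated directly to check what a given *a* implies. The sketch below assumes that *s* is measured backwards in time from the point where the growth rate was set; the helper name `pop_size` and the example values are illustrative only.

```python
import math


def pop_size(n_start, a, s):
    """Population size after time s (pastwards) under growth rate a."""
    return n_start * math.exp(-a * s)


# With a positive a the population was smaller in the past,
# i.e. it grows forwards in time.
past_size = pop_size(1.0, 2.0, 0.5)
print(f"relative size at s = 0.5: {past_size:.4f}")

# With a negative a the population was larger in the past (decline).
larger_past = pop_size(1.0, -2.0, 0.5)
print(f"relative size at s = 0.5 with a = -2: {larger_past:.4f}")
```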

`-t <theta>`

: Set the mutation rate to \(\theta = 4N_0u\), where *u* is the neutral mutation rate per locus. If this option is given, scrm generates the segregating sites output.

`-transpose-segsites` or `--transpose-segsites`

: If given, the segregating sites are printed with each row representing a mutation and each column representing a haplotype, rather than the other way round. Additionally, the time at which a mutation occurred is reported (in units of *4 * N0* generations) [since v1.7.0].

`-T`

: Print the local genealogies in newick format.

`-O`

: Print the local genealogies in the `oriented forest` format as described in Kelleher *et al.* (2014) [since v1.2].

`-L`

: Print the TMRCA and the local tree length for each segment (behaves differently from ms). Both values are scaled in coalescent time units, i.e. in units of *4 * N0* generations.

`-oSFS`

: Print the site frequency spectrum. Requires that the mutation rate \(\theta\) is given with the `-t` option.

`-SC [ms|rel|abs]`

: Scaling of sequence positions. Either relative to the locus length between 0 and 1 (`rel`), absolute in base pairs (`abs`) or `ms`’s scaling (`ms`, the default), where the positions in the *segregating sites* output are relative and the positions in the trees output are absolute [since v1.3.0].

`-seed <SEED> [<SEED2> <SEED3>]`

: Specifies a seed for the simulation. You can input up to three non-negative numbers. If no seed is given, scrm generates one using entropy provided by the operating system. To reproduce a previous simulation, use the single number in the second line of the output.

`-print-model`, `--print-model`

: Prints information about the model defined by the command line arguments, including calculated population sizes. Can be useful for debugging or verifying the model [since v1.5.0].

`-p <digits>`

: Sets the number of significant digits used in the output [since v1.4.0].

`-h`, `--help`

: Prints a help text.

`-v`, `--version`

: Prints version information.
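To reproduce a run, the seed has to be read back from the second line of a previous output and passed to `-seed`. The following sketch shows the idea; the helper `seed_from_output` is hypothetical and the sample text is made up for illustration, not real scrm output.

```python
def seed_from_output(text):
    """Return the seed reported on the second line of a previous output."""
    lines = text.splitlines()
    return int(lines[1].strip())


# Made-up placeholder for the first lines of an earlier run.
previous_output = "scrm 4 1 -t 5\n1234567\n\n//\n"

seed = seed_from_output(previous_output)
rerun = ["scrm", "4", "1", "-t", "5", "-seed", str(seed)]
print(" ".join(rerun))
```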

The commands in this section all have a time *t* as their first parameter. Changes made by the commands affect the time from *t* further back into the past. All times are in units of _4*N0_ generations.

`-eI <t> <s1> ... <sn>`

: Sample *s1* to *sn* haplotypes from populations 1 to *n*, respectively, at time *t*.

`-eM <t> <M>`

: Assume a symmetric migration rate of *M/(npop-1)* at time *t*.

`-em <t> <i> <j> <M>`

: Set the migration rate from population *j* to population *i* to *M* (looking forward in time) at time *t* [since v1.3.1].

`-ema <t> <M11> <M12> ... <M21> ...`

: Set the migration matrix at time *t* (dimension is *npop x npop*). Diagonal elements are ignored but required (use `x` or `0`). The rates apply pastwards from time *t*.

`-eN <t> <n>`

: Set the size of all populations to _n*N0_ at time *t*.

`-en <t> <i> <n>`

: Set the size of population *i* to _n*N0_ at time *t*.

`-eg <t> <i> <a>`

: Set the exponential growth rate of population *i* to *a* at time *t*.

`-eG <t> <a>`

: Set the exponential growth rate of all populations to *a* at time *t*.
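Since times are given in units of *4 N0* generations and sizes relative to *N0*, values in generations and individuals need to be rescaled before use. The sketch below does this for a size change with `-eN`; the helper names and all numbers are hypothetical.

```python
def generations_to_scaled(generations, N0):
    """Convert a time in generations to units of 4*N0 generations."""
    return generations / (4 * N0)


def size_to_scaled(size, N0):
    """Express a population size as a multiple of N0."""
    return size / N0


# Hypothetical scenario: all populations had 5,000 individuals
# from 2,000 generations ago and further back.
N0 = 10_000
t = generations_to_scaled(2_000, N0)
n = size_to_scaled(5_000, N0)

args = ["-eN", f"{t:g}", f"{n:g}"]
print(" ".join(args))
```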

`-es <t> <i> <p>`

: Population admixture. Replaces a fraction *1-p* of population *i* with haplotypes from a population *npop+1*. Technically (and looking backwards in time), a new population *npop+1* with size *N0* is created at time *t*. Migration rates (to and from) and the growth rate of this population are initially 0. Each line in population *i* is moved to the new population with probability *1-p*. Please sort multiple `-es` arguments by their time to avoid confusion about the numbering of populations. Please give the arguments that affect the whole population (`-M`, `-N`, `-G` & `-ma`) before giving the first `-es`. Also, the command line positions of their timed equivalents (`-eM`, `-eN`, `-eG`, `-eI` & `-ema`) must be sorted by time, at least relative to the `-es` argument. `scrm` throws an error if any of these conditions is not met. If in doubt, just sort all command line arguments by their time.

`-eps <t> <i> <j> <p>`

: Partial admixture. Similar to `-es`, but replaces a fraction *1-p* of population *i* with haplotypes from population *j* at time *t*. Unlike with `-es`, population *j* is a normal population that continues to exist at times more recent than *t*. Viewed backwards in time, this moves a fraction *1-p* of the lineages in population *i* to population *j*. This does not change the number of populations, population sizes, growth or migration rates in any way [since v1.5.0].

`-ej <t> <j> <i>`

: Adds a speciation event in population *i* that creates population *j* (forwards in time). Technically (and looking backwards in time), it moves all lines from population *j* into population *i* at time *t*. Migration rates into population *j* are set to 0 for the time further back into the past.

When multiple `-es`, `-eps` or `-ej` arguments are given for the same time *t*, the migrations are executed in the order in which the commands are given. For example, if we have `-es 0.08 2 .2 -ej 0.08 3 1`, first 80% of pop 2 move to a newly created pop 3 (viewed backwards in time), then everyone that just moved to pop 3 moves on to pop 1. This is equivalent to `-eps 0.08 2 1 .2`, except that the latter does not create the empty population 3.
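One way to follow the advice of sorting all timed arguments is to sort them programmatically. The sketch below uses a stable sort, so events sharing the same time keep the order in which they were written, which matters for `-es`, `-eps` and `-ej`. The event list is a made-up example.

```python
# Each entry: (time, argument tokens). Values are illustrative only.
events = [
    (0.08, ["-es", "0.08", "2", "0.2"]),
    (0.02, ["-en", "0.02", "1", "0.5"]),
    (0.08, ["-ej", "0.08", "3", "1"]),
    (0.50, ["-ej", "0.5", "2", "1"]),
]

# sorted() is stable: ties on the time keep their original relative order,
# so -es at 0.08 still comes before -ej at 0.08.
ordered = sorted(events, key=lambda event: event[0])

timed_args = [token for _, tokens in ordered for token in tokens]
print(" ".join(timed_args))
```

Arguments that affect the whole population, such as `-M` or `-ma`, would still go before the first `-es`.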

The following commands change model parameters starting at a sequence position *s*. You should still set the initial rate with `-r` or `-t`, respectively, and then use the commands prefixed with `s` for all changes. Note that `-r` also takes the total length of the sequence as its second argument, while `-sr` takes only the position and the rate.

`-sr <s> <R>`

: Set the recombination rate to *R* starting at position *s*.

`-st <s> <theta>`

: Set the mutation rate to \(\theta\) starting at position *s*.
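A piecewise recombination map can be turned into arguments by using `-r` for the initial rate and `-sr` for every later change. This is a minimal sketch with a hypothetical helper `recombination_args`; the positions and rates are placeholders, and the units scrm expects for them are not specified in this section.

```python
def recombination_args(segments, length):
    """Build -r / -sr arguments from (start_position, rate) pairs."""
    segments = sorted(segments)
    if not segments or segments[0][0] != 0:
        raise ValueError("the first segment must start at position 0")
    # The initial rate and the total sequence length go to -r.
    args = ["-r", f"{segments[0][1]:g}", str(length)]
    # Every later change of the rate is expressed with -sr.
    for position, rate in segments[1:]:
        args += ["-sr", f"{position:g}", f"{rate:g}"]
    return args


# Hypothetical map: a region with elevated rate between 2000 and 6000.
segments = [(0, 10), (2000, 50), (6000, 10)]
rec_args = recombination_args(segments, 10000)
print(" ".join(rec_args))
```

The same pattern would apply to `-t` and `-st` for a varying mutation rate.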