Characterizing the Transmission Potential of Zoonotic Infections from Minor Outbreaks

However, distinguishing between these pathogen and population-specific properties typically requires detailed serological studies, which are rarely available in the early stages of an outbreak. Using a simple transmission model that incorporates age-stratified social mixing patterns, we present a novel method for characterizing the transmission potential of subcritical infections, which have effective reproduction number R<1, from readily available data on the size of outbreaks. We show that the model can identify the extent to which outbreaks are driven by inherent pathogen transmissibility and preexisting population immunity, and can generate unbiased estimates of the effective reproduction number. Applying the method to real-life infections, we obtained accurate estimates for the degree of age-specific immunity against monkeypox, influenza A(H5N1) and A(H7N9), and refined existing estimates of the reproduction number. Our results also suggest minimal preexisting immunity to MERS-CoV in humans. The approach we describe can therefore provide crucial information about novel infections before serological surveys and other detailed analyses are available. The methods would also be applicable to data stratified by factors such as profession or location, which would make it possible to measure the transmission potential of emerging infections in a wide range of settings.

However, it can be difficult to measure these properties if there are limited experimental studies of population immunity. By incorporating social contact patterns into a mathematical model of disease transmission, we show that it is possible to estimate both pathogen transmissibility and preexisting immunity from available data on the size of outbreaks. When an infection does not transmit efficiently between humans, estimates often have to be made using case data from a limited number of small outbreaks. We find that, even with limited data, our technique can accurately evaluate the transmission potential of ‘stuttering’ chains of infection.

However, novel pathogens do not always transmit efficiently when first introduced into human populations. Outbreaks of infections such as Middle East respiratory syndrome coronavirus (MERS-CoV) [3, 4] and monkeypox [5] have generally occurred as ‘stuttering chains’ of transmission [6], generating a relatively small number of linked clusters of cases without evidence of sustained transmission. Infections such as influenza A(H5N1) [7] and A(H7N9) [8] also appear to be subcritical at present, having so far failed to transmit efficiently between humans.

Transmissibility can be summarised using the effective reproduction number, R, defined as the average number of secondary cases produced by a typical infectious host [9]. The reproduction number can be separated into two components: the inherent transmissibility of a pathogen, and the level of susceptibility in the host population. In some circumstances, susceptibility might be reduced as a result of preexisting immunity from previous vaccination campaigns, as is the case with monkeypox [5, 10], or prior exposure to a similar pathogen, as has been suggested for influenza A/H1N1p [11]. Such immunity will not necessarily be distributed evenly across the population: if pathogens circulate over an extended period of time, or vaccination campaigns have been discontinued, preexisting immunity is more likely to be found in older age groups [12].

However, existing techniques for estimating transmission potential from outbreak size data generally represent transmission in the host population using a single-type branching process [15, 16, 17, 18]. As a result, it is not possible to distinguish between inherent pathogen transmissibility and population susceptibility. For instance, a highly transmissible pathogen in a mostly immune population might have the same effective reproduction number as an infection with lower inherent transmissibility spreading between fully susceptible hosts.

Individuals of different ages have heterogeneous social contact patterns and hence different risks of infection during an outbreak [19, 20, 21]. Preexisting immunity in older age groups can alter this pattern [22], making it possible to separate the reproduction number into its pathogen and population-specific components. We made use of this observation by developing a novel age-structured model of stuttering transmission chains, which combined reported social contact data with a multi-type branching process [23, 24].

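First we derived an expression for the outbreak size distribution in an age-stratified population, in which transmission between different age groups depended on the number of physical contacts reported in the POLYMOD survey in Great Britain. Next, we used simulated outbreaks to examine whether the model could distinguish between different types of infection using only age-stratified final outbreak size data. Finally, we analysed observed outbreak data for monkeypox, influenza A(H5N1), A(H7N9) and MERS-CoV, and found that it was possible to accurately characterize pathogen transmissibility and preexisting host immunity.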

We explored the age pattern of infection by calculating the joint outbreak size distribution across different age groups. It has been suggested that the post-childhood drop in risky contacts that occurs around age 20 is a dominant factor shaping influenza dynamics [25], and the intense contacts between children make them an important epidemiological group for respiratory infections [26, 12]. We therefore divided the population into two groups: under 20 and over 20 year olds.

(Fig. 1A). When the infection was introduced into the under 20 age group, the outbreak size distribution was therefore relatively symmetric between the two groups (Fig. 1B). When the offspring distribution of secondary cases depended on reported physical contacts between different groups in the UK (S1A Fig.), this pattern changed. Each infected host could generate secondary cases in either group, and the mean number of cases generated depended on which group the infected host was in (Fig. 1C). We assumed a fully susceptible population, which meant that the average number of secondary cases generated by a typical infectious individual was equal to the basic reproduction number, R0 [9]. If infection started in the under 20 age group, there was a noticeable bias in the outbreak size distribution, with large outbreaks in under 20 year-olds more likely than large outbreaks in the over 20s (Fig. 1D). When the infection started in the over 20 age group (Fig. 1F), the offspring distribution shifted, and the probability of large outbreaks in the under 20 age group decreased (Fig. 1F).

We used the outbreak size distribution to identify what constitutes an anomalously large outbreak for a particular R0. We defined this as an outbreak size that has a less than $10^{-3}$ probability of occurring in our model. When the infection was introduced into the under 20 age group, there was an asymmetry in the threshold for an unusually large outbreak in the UK (Fig. 2A). If R0 = 0.7, a chain of at least 8 cases was not unusual if some of the secondary cases were children, yet it was if the secondary cases were all adults. The conditions for an anomalously large outbreak shifted when infection started in the eldest group (Fig. 2B). In some cases the thresholds curved inwards. In Fig. 2A, when R0 = 0.7 an outbreak of size 7 was anomalously large if all secondary cases were in the youngest group, but an outbreak of size 10 was not unusual if between 2–8 secondary cases were in the eldest group. As the infection was introduced in the youngest group, this suggested that chains of transmission were more likely to persist if they crossed into the eldest age group. The threshold also curved inwards when the infection started in the eldest group (Fig. 2B). An outbreak of size 5 was unusual if all the secondary cases were in the youngest group, but an outbreak of size 8 was not anomalous if there were 3 cases in the eldest group. This implies that having a single case in the introductory age group and several in the other group was unlikely when R0 = 0.7. As suggested by the next generation matrix (S1A Fig.), the primary case would generally create additional cases within the same group rather than infect only individuals in the other group.

We simulated outbreaks using a multi-type branching process with two groups, then used the outbreak size distribution to infer R0 and relative immunity in older individuals. We assumed that the under 20 age group was fully susceptible to infection, and the relative susceptibility of the over 20 age group, denoted S, could vary. Each outbreak was seeded randomly in the susceptible population. In the UK the under 20 age group makes up 24% of the total population, so in the absence of immunity, the probability of the outbreak starting in this group was 0.24.
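The simulation step described above can be sketched as follows. This is a minimal illustration rather than the authors' supplementary R code: the contact matrix values are hypothetical placeholders (not POLYMOD data), all function names are invented for this example, and offspring counts are assumed to be geometric with mean R_ij, the k = 1 case described in the Methods.

```python
import numpy as np

def next_generation_matrix(contacts, r0, s):
    """K[i, j] = mean infections in group i caused by one case in group j."""
    contacts = np.asarray(contacts, dtype=float)
    lam = np.max(np.abs(np.linalg.eigvals(contacts)))  # dominant eigenvalue of M
    suscept = np.array([1.0, s])  # group 1 (under 20) fully susceptible
    return r0 * suscept[:, None] * contacts / lam

def seed_probability(s, young_fraction=0.24):
    """Chance a chain starts in the under 20 group when seeding is
    proportional to the susceptible population in each group."""
    return young_fraction / (young_fraction + (1.0 - young_fraction) * s)

def simulate_outbreak(K, seed_group, rng, max_cases=10_000):
    """Final case counts per group for one transmission chain."""
    totals = np.zeros(2, dtype=int)
    active = np.zeros(2, dtype=int)
    active[seed_group] = 1
    totals[seed_group] = 1
    while active.sum() > 0 and totals.sum() < max_cases:
        new = np.zeros(2, dtype=int)
        for j in range(2):        # infecting group
            for i in range(2):    # infected group
                if active[j] > 0 and K[i, j] > 0:
                    # Sum of active[j] geometric draws, each with mean K[i, j]
                    p = 1.0 / (1.0 + K[i, j])
                    new[i] += rng.negative_binomial(active[j], p)
        totals += new
        active = new
    return totals

def simulate_outbreaks(contacts, r0, s, n_outbreaks, rng):
    K = next_generation_matrix(contacts, r0, s)
    p_young = seed_probability(s)
    results = []
    for _ in range(n_outbreaks):
        seed = 0 if rng.random() < p_young else 1
        results.append((seed, simulate_outbreak(K, seed, rng)))
    return results

# Hypothetical child-dominated contact matrix (illustrative values only)
M_HYPOTHETICAL = np.array([[7.0, 1.5], [2.0, 3.0]])
rng = np.random.default_rng(1)
outbreaks = simulate_outbreaks(M_HYPOTHETICAL, r0=0.7, s=0.2, n_outbreaks=50, rng=rng)
```

Each element of `outbreaks` pairs the seeding group with the final case counts in the two groups, which is the age-stratified outbreak size data the inference step works from.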

First, we examined two infections with the same R0 = 0.2, but different levels of immunity in the over 20 age group. In one scenario, only 20% of hosts over age 20 were susceptible to infection (i.e. S = 0.2); in the other, the population was fully susceptible (S = 1). We simulated 50 spillover events, and found the maximum likelihood estimate of R0 and S. We repeated this process for 1000 sets of outbreaks, obtaining reliable estimates of both R0 and S (Figs. 3A-B). Next, we considered the same two susceptibility values, but for an infection with R0 = 0.7. The model was again able to distinguish between the different scenarios (Figs. 3C-D). The structure of the reproduction matrix (Equation 2) means that R0 and S should always be identifiable in the model, given enough data, because R0 scales the entire matrix, whereas S only scales the transmission rate to the older age group.

We compared these values with estimates from an inference framework based on a single-type branching process [15, 16, 17, 18]. In all four scenarios, our estimates for R are less biased in the age-structured model (Table 1). However, the relative sum-squared error is smaller in the single-type model when R0 is small. This is because accurate inference across the two age groups requires sampling from the tail of the joint outbreak size distribution, which is achieved either when R0 is larger (Table 1), or when more outbreak data are available. When inference is performed using data from a larger number of outbreaks, the relative error for the age-structured model is smaller than for the non-stratified framework (S2 Fig).

This bias is the result of our assumption that introductions occur randomly across the susceptible population, and illustrates an important caveat to inference of R from the mean outbreak size in a single-type branching process model. If the proportion of cases that are introduced to each age group is equal to the dominant eigenvector of the reproduction matrix, it is possible to obtain unbiased estimates for R using only the mean outbreak size (see Text S1). However, if the true proportion of introductions in the under 20 group is less than the proportion implied by the dominant eigenvector, we will underestimate R in a single-type model (S3 Fig). Conversely, if the true proportion of introductions is larger, we overestimate R. In our model of transmission chains in Great Britain, we assumed a child-dominated social contact matrix but relatively flat population structure. In the absence of immunity, the probability the infection starts in the under 20 age group was therefore 0.24. However, the relevant component of the dominant eigenvector of the reproduction matrix is 0.68. If the probability of introduction is less than this, as it is in our model, the homogeneous mixing assumption will lead to an underestimate of R (S3 Fig). The age-structured model avoids dependency on age-specific exposure risk by accounting for which age group the infection started in when performing inference (Equation 13). If there were a disproportionate number of introductions in a particular age group, the structure of the likelihood function means that it would not bias our estimate for R.

We also tested whether our inference approach, which assumed social contact data reflects age-specific transmission, was sensitive to misspecification of the ‘true’ transmission process. We simulated data using different assumptions about age-specific infection rates but left the inference model unchanged. First, we simulated outbreak data using a multi-type branching process with 15 age groups. As in the inference model, transmission between different groups depended on reported physical contacts from the POLYMOD survey in Great Britain. Although the inference model only used two age groups, it correctly identified the four different combinations of transmissibility and susceptibility (S4 Fig.). Next, we simulated data using two age groups, but with transmission based on the average number of reported physical contacts across 8 European countries in the POLYMOD study (S1B Fig.). The relative error in R was generally slightly larger (S1 Table), but we were still able to obtain accurate scenario estimates (S5 Fig.). When we considered a generic child-dominated next generation matrix (S1C Fig.), our estimates for S were more variable, but we were still able to distinguish between pathogen transmissibility and preexisting immunity (S6 Fig.). Finally, we considered a transmission matrix in which adults were dominant (S1D Fig.). As expected in such a heavily mis-specified model, we were not able to accurately estimate S and R0 (S7 Fig.).
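The dependence of the single-type estimate on where infections are introduced can be checked numerically. The sketch below assumes a hypothetical child-dominated contact matrix (illustrative values, not POLYMOD data) and uses two standard branching process results: the expected total outbreak size is 1^T (I - K)^{-1} a for next generation matrix K and introduction distribution a, and the single-type mean outbreak size is 1 / (1 - R). Function names are invented for this example.

```python
import numpy as np

def mean_total_outbreak_size(K, intro):
    """Expected total cases, 1^T (I - K)^{-1} a; valid when all |eigenvalues| < 1."""
    K = np.asarray(K, dtype=float)
    a = np.asarray(intro, dtype=float)
    return float(np.ones(len(a)) @ np.linalg.solve(np.eye(len(a)) - K, a))

def single_type_estimate(mean_size):
    """Invert mean size = 1 / (1 - R) from the single-type model."""
    return 1.0 - 1.0 / mean_size

def dominant_eigenpair(K):
    """Dominant eigenvalue and its eigenvector, normalised to sum to 1."""
    vals, vecs = np.linalg.eig(np.asarray(K, dtype=float))
    idx = int(np.argmax(np.abs(vals)))
    v = np.real(vecs[:, idx])
    return float(np.real(vals[idx])), v / v.sum()

# Hypothetical contact matrix, rescaled so the dominant eigenvalue is R = 0.7
M_HYPOTHETICAL = np.array([[7.0, 1.5], [2.0, 3.0]])
lam, _ = dominant_eigenpair(M_HYPOTHETICAL)
K = 0.7 * M_HYPOTHETICAL / lam
R_true, v = dominant_eigenpair(K)

scenarios = [
    ("introductions follow eigenvector", v),
    ("24% of introductions in under 20s", np.array([0.24, 0.76])),
]
for label, intro in scenarios:
    est = single_type_estimate(mean_total_outbreak_size(K, intro))
    print(f"{label}: single-type estimate {est:.3f}, dominant eigenvalue {R_true:.3f}")
```

When introductions follow the dominant eigenvector the single-type estimate recovers the dominant eigenvalue; with fewer introductions in the child-dominated group it falls below it, which is the direction of bias described in the text.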

As we could not be certain that the under 20 age group was fully susceptible, we did not infer the basic reproduction number, R0. Instead, we defined p to be the effective reproduction number when both groups were equally susceptible (i.e. S = 1). If in reality the under 20 age group had no immunity to the infection then p = R0. For our analysis of MERS and monkeypox outbreak data, we used the average reported physical contacts from POLYMOD across 8 European countries (S1B Fig). For H5N1 and H7N9, we used physical contact data from Southern China (S1E Fig.).

Our maximum likelihood estimates suggest that the over 20 age group had substantial preexisting immunity against monkeypox and H5N1, and no immunity against H7N9 or MERS-CoV (Fig. 4). These estimates agree with values derived from detailed studies of vaccination and infection history (Table 3). We could not perform such a comparison for MERS-CoV, however, as we could find no studies reporting measurements of population-level immunity for humans.

This was likely the result of the small number of clusters we analysed. To examine whether a larger number of clusters might improve our model estimates, we performed a simulation study using an infection with limited transmissibility in a population with preexisting immunity (i.e. a similar scenario to monkeypox transmission). We simulated 50 spillover events, with R0 = 0.25 and S = 0.5, then attempted to infer the parameters from the age-stratified outbreak size data. We found that the 95% confidence interval of the joint distribution of R0 and S was very broad (S8 Fig.). However, when we simulated 150 or 250 spillover events instead, the uncertainty in our estimates shrank, and we were able to obtain more precise parameter estimates (S8 Fig).

Characterization of transmission potential from observed outbreak size distributions. Each point shows the joint maximum likelihood estimate of the effective reproduction number if both age groups were equally susceptible, p, and the relative susceptibility of over 20s, S. Dark line indicates 80% confidence interval (CI); light line is 95% CI. Blue, influenza A(H7N9); green, influenza A(H5N1); pink, monkeypox; orange, MERS.

Our estimates were similar to previously published estimates that assumed a single-type population. However, the confidence intervals for our estimates were generally smaller (Table 3). Influenza A(H7N9) had an effective reproduction number of 0.08 (95% CI 0.02–0.23), influenza A(H5N1) had R = 0.10 (0.05–0.18) and monkeypox R = 0.08 (0.02–0.22). Our estimate of R for MERS-CoV was 0.73 (0.54–0.96), whereas in a single-type branching process model R = 0.63 (0.49–0.85). The discrepancy was caused by the age distribution of the largest outbreak clusters. One cluster of 26 infections consisted entirely of over 20s: if transmission was indeed driven by social mixing patterns, such an outbreak would require a large R to persist in only one group.

It is particularly important to account for such censoring when infections are near the R = 1 boundary [17]. To test the robustness of our estimates for MERS-CoV when outbreak size data were censored, we extended our inference framework to account for incomplete outbreaks (methods in Text S1). When censoring was included, our estimate for R increased slightly to 0.77 (0.57–1.03), but our maximum likelihood estimate for S remained the same.

However, for emerging infections estimates often have to be made using case data from a limited number of small outbreaks. Using a multi-type branching process, we developed an inference framework to make better use of age-structured outbreak size data.

Based on observed outbreak size distributions, we estimated that individuals over age 20 had susceptibility to monkeypox reduced by a factor of 0.4 compared with younger hosts. This value agrees well with published estimates of population susceptibility (Table 3), with cross-immunity coming from the smallpox vaccination campaigns that ended in the two decades preceding the outbreaks [5]. We also found evidence of preexisting immunity to influenza A(H5N1) in older individuals; it has previously been suggested that such immunity could result either from prior exposure to H5N1, or from cross-immunity from previous infection with influenza A(H1N1) [27]. In contrast, we estimated that both age groups had similar levels of susceptibility to MERS and influenza A(H7N9). Given that immunity from vaccination and natural infection tends to increase with age, this suggests that there was little preexisting immunity to these pathogens. While serological studies have found no evidence of preexisting immunity to H7N9 virus in these locations [28, 29], serological analysis remains challenging for novel coronaviruses such as MERS-CoV [30]. The approach we describe can therefore provide crucial information about the degree of population susceptibility before serological surveys are available.

In a single-type branching process framework, the threshold is a single number: the total size of the outbreak [31, 16]. In an age-structured model, however, the threshold depends on outbreak size in each age group. The age breakdown of cases can therefore provide additional information about what constitutes an unusual outbreak, which would not be available with only overall outbreak sizes. Moreover, the shape of the thresholds in Fig. 2 suggests that the infection must pass between age groups to persist. Such dynamics could be important in understanding how pathogens adapt to a new host or invade a new population, and could be explored further in future using the models we have described here.
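For the single-type case, the threshold can be computed directly from the closed-form chain size distribution for geometric offspring, P(size = n) = Γ(2n − 1) / (Γ(n) Γ(n + 1)) · R^(n−1) / (1 + R)^(2n−1). The sketch below is illustrative only: it reads "less than 10^-3 probability of occurring" as a tail probability P(size ≥ n), which is an assumption about the definition, and the function names are invented for this example.

```python
from math import exp, lgamma, log

def chain_size_pmf(n, R):
    """P(final size = n) for a single-type chain with geometric offspring, mean R < 1."""
    log_p = (lgamma(2 * n - 1) - lgamma(n) - lgamma(n + 1)
             + (n - 1) * log(R) - (2 * n - 1) * log(1 + R))
    return exp(log_p)

def anomalous_size_threshold(R, cutoff=1e-3, n_max=10_000):
    """Smallest n such that P(final size >= n) < cutoff."""
    tail = 1.0  # P(size >= 1)
    for n in range(1, n_max + 1):
        if tail < cutoff:
            return n
        tail -= chain_size_pmf(n, R)  # now P(size >= n + 1)
    raise ValueError("threshold not reached; is R < 1?")

for R in (0.2, 0.7):
    print(R, anomalous_size_threshold(R))
```

In the age-structured model the same idea applies to the joint distribution over both groups, so the threshold becomes a curve in the (under 20, over 20) plane rather than a single number.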

First, we assumed that secondary cases are drawn from a geometric distribution with mean R (or Rij in the two group model). This is akin to assuming that recovery times are exponentially distributed in the standard SIR model. Other studies have assumed that the offspring distribution for secondary cases follows a negative binomial distribution, and have suggested that an increased level of over-dispersion is often appropriate when modelling disease emergence [16, 14]. However, some of this over-dispersion is captured implicitly in our model as a result of the variation that comes from including social contact structure. Given appropriate data, it would be interesting to see whether individual variation in transmission can be explained by social behaviour rather than processes such as virus shedding. This would have implications for how the over-dispersion parameter should be interpreted in an age-structured framework.

This simplification is reasonable for infections with a small effective reproduction number, but depletion of susceptibles would need to be accounted for if R were close to 1 [16]. In addition, we assumed that transmission potential between age groups was captured entirely by social contacts. Because we used simulated data to infer parameters, and hence had knowledge of the true model, we were also assuming that these contacts were reported accurately. We tested the accuracy of parameter estimation when the transmission process was mis-specified, and found that it was still possible to distinguish between different scenarios as long as transmission matrices in both the simulation and inference models were dominated by intense mixing between children. This is a reasonable assumption, as it has been suggested that such mixing patterns drive observed outbreaks of respiratory infections [21, 25, 32]. Although published contact matrix data were not available for Central Africa, where monkeypox cases were reported, preliminary results from a social contact survey in Uganda suggest that age mixing patterns are qualitatively similar to those found in the POLYMOD study, with a clear pattern of assortative mixing between different age groups, and children reporting a larger number of contacts relative to adults (Olivier Le Polain de Waroux, personal communication).

By accounting for the age structure of a population, we show that it is possible to obtain unbiased estimates of the reproduction number, and distinguish between pathogen transmissibility and immunity from outbreak size data. During an outbreak, cluster data may be difficult to obtain; cases are typically reported as aggregated totals by health ministries and WHO [33]. Our results illustrate the value of making higher resolution outbreak data available, with cluster information and covariates such as age reported alongside overall case numbers.

For example, if a vaccination campaign that protects against an infection is to change, or be discontinued, it would be important to understand how the pathogen could transmit in a fully susceptible population. This question motivated early studies of monkeypox transmission [34]. However, in studies of monkeypox outbreaks it was relatively straightforward to identify a case’s smallpox vaccination history, because the smallpox vaccine, which provided cross-immunity to monkeypox, left a distinctive scar. The same might not be true for other vaccines.

Depending on the pathogen, transmission rates may also depend on factors such as profession or setting (for example, hospital versus community transmission). With appropriately stratified outbreak data, it would be possible to infer relative immunity and transmissibility in a range of different groups. While spillover infections such as avian influenza and MERS-CoV are a natural application for our approach, population structure could also influence the dynamics of transmission chains following introduction via other routes. For example, novel pathogen strains could emerge via resistance-conferring mutations [35] or adaptation to a human host [36], or be introduced to a population through air travel [37]. By collecting secondary information such as the age distribution of cases, and combining these data with models such as the one outlined here, it should be possible to develop a better understanding of stuttering chains of infection and their transmission potential. During an outbreak, our framework would also be able to generate estimates of epidemiological parameters from a commonly available data source, and hence characterize transmission risk before serological surveys and other detailed analyses are available.

Contact data came from the POLYMOD study, a diary-based survey conducted in Europe

In both studies, participants reported the age of their contacts on a specified day, defined as either a face-to-face conversation in the physical presence of another person, or physical skin-to-skin contact. In our simulation study, we used data on reported physical contacts from the POLYMOD survey in Great Britain (S1A Fig.) to define the level of transmission between different age groups, as there is evidence that this type of contact is a better proxy for respiratory pathogen transmission than total contacts [25, 32]. Similar qualitative mixing patterns can be found in other European countries (S1B Fig.) and Southern China (S1E Fig), as well as Southeast Asian countries such as Vietnam [39] and Hong Kong [25]. Outbreak size distributions for different infections were calculated from reported cases (Table 2). In the influenza A(H5N1) data, it was not always clear whether an outbreak cluster was seeded by a single primary case (with all other infections secondary) or multiple co-primary cases. We made the conservative assumption that each cluster had only one primary case: our estimate for R can therefore be considered to be an upper bound on potential transmissibility given available outbreak size data.

To model age-dependent infection, we defined $m_{ij}$ to be the mean number of contacts with individuals in age group i reported by participants in age group j, and λ to be the maximal eigenvalue of the matrix M with entries $m_{ij}$. Defining S to be the relative susceptibility of group 2 compared to group 1, the average number of infections to group i from group j was therefore given by [40]:
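One form consistent with these definitions, and with the statement in the Results that R0 scales the entire matrix while S scales only transmission to the older group, is sketched below. The symbol $s_i$ is introduced here for illustration, and the exact notation of the paper's Equation 1 is an assumption.

```latex
\[
R_{ij} \;=\; R_0 \, s_i \, \frac{m_{ij}}{\lambda},
\qquad s_1 = 1, \quad s_2 = S .
\]
```

With S = 1 this reduces to $R_0 M / \lambda$, whose dominant eigenvalue is $R_0$, as expected for a fully susceptible population.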

We defined the next generation matrix, R, to be the matrix with entries $R_{ij}$, and the reproduction number R to be its dominant eigenvalue.

If the population was fully susceptible, then R was equal to the basic reproduction number, R0. If S = 1, but we did not know whether the population as a whole was fully susceptible, then we defined the dominant eigenvalue to be p.

We used a multi-type branching process to model secondary infections (see Text S1 for details). Given two different types of individuals, the offspring distribution of individual i was described by a probability generating function, where $p_{s_1,s_2}$ was the probability that an infectious individual of type i generated $s_1$ secondary cases of type 1 and $s_2$ cases of type 2. We assumed that stochasticity in transmission was represented by a Poisson process, and that the individual offspring distribution followed a negative binomial distribution [14]. It was possible to separate this probability generating function into two components. Extending approaches used for a single-type population [16], we could specify the probability that a certain number of cases of type i are generated by infectives of type j (see Text S1 for details), obtained by inserting the relevant part of Equation 4 into Equation 7. Note that in this paper we set the negative binomial dispersion parameter k = 1. This was equivalent to assuming that recovery times were exponentially distributed, as in the standard SIR model.
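For reference, the negative binomial offspring distribution with mean $R_{ij}$ and dispersion parameter k can be written in its standard form, which reduces to a geometric distribution when k = 1. The per-pair indexing below (x secondary cases in group i from one case in group j) is an assumption based on the description of geometric offspring with mean Rij; the paper's own equations may be arranged differently.

```latex
\[
P(X_{ij} = x) \;=\; \frac{\Gamma(k + x)}{x!\,\Gamma(k)}
\left(\frac{R_{ij}}{R_{ij} + k}\right)^{x}
\left(\frac{k}{R_{ij} + k}\right)^{k},
\qquad x = 0, 1, 2, \dots
\]
\[
k = 1: \qquad P(X_{ij} = x) \;=\; \frac{1}{1 + R_{ij}}
\left(\frac{R_{ij}}{1 + R_{ij}}\right)^{x}.
\]
```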

We used the offspring distribution to calculate the probability that an outbreak results in the following outcome: n total cases in group 1; m total cases in group 2; $a_{12}$ infections in group 1 caused by infective hosts in group 2; and $a_{21}$ infections in group 2 caused by infective hosts in group 1. Finally, we used Equations 9–10 to calculate $r^{1}_{n,m}$, the probability the infection will cause an outbreak of size n in group 1 and m in group 2, given that the initial case was in group 1.

If $N^{i}_{n,m}$ was the number of chains that started in group i and resulted in n cases in group 1 and m cases in group 2, then by Equation 11 the likelihood of parameter set θ given data X was the product of the corresponding outbreak size probabilities over all observed chains. When only the total number of cases in a cluster was known, and not the age distribution, we instead inferred the reproduction number from the overall outbreak size distribution [16]. If $N_n$ was the number of chains of size n, and $r_n$ was the probability a transmission chain has size n, the likelihood function was the analogous product over chain sizes.
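Written out, the two likelihoods described here take the usual product form for independent transmission chains. The superscript and subscript placement on $r^{i}_{n,m}$ and $N^{i}_{n,m}$ is an assumption about notation, not a transcription of the paper's Equations 12 and 13.

```latex
\[
L(\theta \mid X) \;=\; \prod_{i=1}^{2} \; \prod_{n,\,m}
\left( r^{\,i}_{n,m} \right)^{N^{\,i}_{n,m}},
\qquad
L(R \mid X) \;=\; \prod_{n} \left( r_{n} \right)^{N_{n}} .
\]
```

Because the age-structured likelihood conditions on the group i in which each chain started, the proportion of introductions in each group does not enter the estimate.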

For a higher dimensional model, it might be necessary to use an alternative technique, such as Markov chain Monte Carlo [41], to ensure robust and efficient parameter estimation. Confidence intervals were calculated using profile likelihoods: for each value of R0, we found the maximum likelihood across all possible values of S; the 95% confidence interval was equivalent to the region of parameter space that was within 1.92 log-likelihood points of the maximum-likelihood estimate for both parameters [42].
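The 1.92 threshold is half the 95% quantile of a chi-squared distribution with one degree of freedom (3.84 / 2). A minimal sketch of the procedure is given below for the single-type likelihood only, where the chain size probability has a closed form for geometric offspring; the chain counts are made-up illustrative data and the function names are invented for this example. In the two-parameter model, the same comparison would be made after maximising over S at each value of R0.

```python
import numpy as np
from math import lgamma, log

def log_chain_size_pmf(n, R):
    """log P(final size = n), single-type chain with geometric offspring."""
    return (lgamma(2 * n - 1) - lgamma(n) - lgamma(n + 1)
            + (n - 1) * log(R) - (2 * n - 1) * log(1 + R))

def log_likelihood(R, size_counts):
    """size_counts maps chain size n to the number of chains of that size."""
    return sum(c * log_chain_size_pmf(n, R) for n, c in size_counts.items())

def mle_and_interval(size_counts, grid=None, drop=1.92):
    """Grid-search MLE and the range of R within `drop` log-likelihood units."""
    grid = np.linspace(0.001, 0.999, 999) if grid is None else np.asarray(grid)
    ll = np.array([log_likelihood(R, size_counts) for R in grid])
    best = int(np.argmax(ll))
    inside = grid[ll >= ll[best] - drop]
    return float(grid[best]), float(inside.min()), float(inside.max())

# Hypothetical data: 50 chains with sizes 1, 2, 3, 5 and 8
counts = {1: 30, 2: 10, 3: 6, 5: 3, 8: 1}
r_hat, lower, upper = mle_and_interval(counts)
print(f"R = {r_hat:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

For this offspring distribution the grid estimate should sit next to the analytic value 1 − (number of chains) / (total cases), which gives a quick sanity check on the search.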

It was not possible to obtain a tractable expression for the maximum likelihood (ML) estimates of p and S, and hence R, using Equation 13. Instead we calculated the ML estimate of the reproduction number, $\hat{R}$, using the numerically estimated maximum likelihood values for p and S. We used two metrics to assess the accuracy of $\hat{R}$: the estimator bias and the relative error [15]. Having generated M sets of outbreak data using the same R, and found $\hat{R}_i$ for each set i, we calculated the estimator bias and the root mean square relative error across the M sets.

Mean outbreak size in two group model

Let μ denote the mean outbreak size matrix. If we denote entries of μ by $\mu_{ij}$, then $\sum_j \mu_{ij}$ is the mean outbreak size in group i. If the eigenvalues of the next generation matrix R, denoted $\lambda_i$, are such that $|\lambda_i| < 1$ for all i, the mean outbreak size is finite and can be written in terms of R and $a_i$, the probability the primary infection was in group i.
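The two accuracy metrics named in this section can be sketched as below. The exact definitions are assumptions: bias is taken as the mean difference between estimate and true value, and relative error as the root mean square of the differences scaled by the true R. The paper's own formulas (following [15]) may normalise differently, and the function names are invented for this example.

```python
import numpy as np

def estimator_bias(estimates, true_R):
    """Mean of (R_hat_i - R) over the M simulated data sets."""
    est = np.asarray(estimates, dtype=float)
    return float(np.mean(est - true_R))

def rms_relative_error(estimates, true_R):
    """Root mean square of (R_hat_i - R) / R over the M simulated data sets."""
    est = np.asarray(estimates, dtype=float)
    return float(np.sqrt(np.mean(((est - true_R) / true_R) ** 2)))

# Two hypothetical estimates scattered symmetrically around a true R of 0.7
example = [0.6, 0.8]
print(estimator_bias(example, 0.7), rms_relative_error(example, 0.7))
```

An estimator can have near-zero bias and still a large relative error, which is the pattern reported for the age-structured model when R0 is small and few outbreaks are available.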

(A) R0 = 0.2 and S = 0.2. Blue line, relative error in maximum likelihood estimate for R in single-type model; red line,

Blue line, population fully susceptible (S = 1); green line, over 20 age group have susceptibility reduced by half relative to under 20 group (S = 0.5). If the probability that the infection is introduced into

We simulated 1000 sets of 50 outbreaks, and found the maximum likelihood estimates (MLEs) for parameters for each set. White dots show true parameter values; heat map shows distribution of the 1000 MLEs.


In simulations, R0 = 0.25 and S = 0.5. Age-specific contact patterns were based on reported physical contacts in Great Britain in the POLYMOD study [20].

Accuracy of R estimation when inference matrix is mis-specified (matrix in S1B Fig.). (PDF)

Simulation and inference code. The simulation model generates stochastic multi-type outbreaks from a two-class mixing matrix. The inference model generates maximum likelihood estimates of R0 and S from outbreak size data. (R)

We would like to thank Paul Fine, Theo Kypraios and Jamie Lloyd-Smith for useful discussions.

Appears in 30 sentences as: R0 (32)

In *Characterizing the Transmission Potential of Zoonotic Infections from Minor Outbreaks*

- We assumed a fully susceptible population, which meant that the average number of secondary cases generated by a typical infectious individual was equal to the basic reproduction number, R0 [9]. Page 3, “Outbreak size distributions for age-structured populations”
- We used the outbreak size distribution to identify what constitutes an anomalously large outbreak for a particular R0. Page 4, “Identifying anomalously large outbreaks”
- If R0 = 0.7, a chain of at least 8 cases was not unusual if some of the secondary cases are children, yet it is if the secondary cases are all adults. Page 4, “Identifying anomalously large outbreaks”
- 2A, when R0 = 0.7 an outbreak of size 7 was anomalously large if all secondary cases were in the youngest group, but an outbreak of size 10 was not unusual if between 2–8 secondary cases were in the eldest group. Page 4, “Identifying anomalously large outbreaks”
- This implies that having a single case in the introductory age group and several in the other group was unlikely when R0 = 0.7. Page 4, “Identifying anomalously large outbreaks”
- We simulated outbreaks using a multi-type branching process with two groups, then used the outbreak size distribution to infer R0 and relative immunity in older individuals. Page 4, “Estimating transmissibility and pre-existing immunity”
- First, we examined two infections with the same R0 = 0.2, but different levels of immunity in the over 20 age group. Page 5, “Estimating transmissibility and pre-existing immunity”
- We simulated 50 spillover events, and found the maximum likelihood estimate of R0 and S. We repeated this process for 1000 sets of outbreaks, obtaining reliable estimates of both R0 and S (Figs. Page 5, “Estimating transmissibility and pre-existing immunity”
- Next, we considered the same two susceptibility values, but for an infection with R0 = 0.7. Page 5, “Estimating transmissibility and pre-existing immunity”
- The structure of the reproduction matrix (Equation 2) means that R0 and S should always be identifiable in the model, given enough data, because R0 scales the entire matrix, whereas S only scales the transmission rate to the older age group. Page 5, “Estimating transmissibility and pre-existing immunity”
- We used our estimates of R0 and relative immunity in the over 20 age group to calculate the effective reproduction number. Page 6, “Estimating transmissibility and pre-existing immunity”
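
The sentences above describe a two-group branching process in which R0 scales the whole reproduction matrix and S scales only transmission to the older group. A minimal Python sketch of that idea follows. It is not the authors' R code: the Poisson offspring distribution, the normalisation of the contact matrix to a dominant eigenvalue of 1, and the values in `EXAMPLE_CONTACTS` are all assumptions made for illustration (the numbers are placeholders, not POLYMOD data), and the function names are hypothetical.

```python
import numpy as np

# Hypothetical contact matrix (rows: group receiving infection, columns: group
# transmitting). Index 0 = under 20s, index 1 = over 20s. Placeholder values only.
EXAMPLE_CONTACTS = [[8.0, 3.0],
                    [3.0, 5.0]]


def reproduction_matrix(r0, s, contacts):
    """Entry [i, j] is the mean number of cases in group i caused by one case in group j."""
    base = np.asarray(contacts, dtype=float)
    # Assumption: rescale so the dominant eigenvalue is 1, so that R0 is the
    # dominant eigenvalue of the matrix when both groups are fully susceptible.
    base = base / np.max(np.abs(np.linalg.eigvals(base)))
    m = r0 * base          # R0 scales the entire matrix
    m[1, :] *= s           # S scales only transmission to the older group
    return m


def simulate_outbreak(m, seed_group, rng, max_cases=10_000):
    """Simulate one outbreak; return the final number of cases in each group."""
    sizes = np.zeros(2, dtype=int)
    sizes[seed_group] = 1              # one introduced (index) case
    active = sizes.copy()              # cases in the current generation
    while active.sum() > 0 and sizes.sum() < max_cases:
        # With Poisson offspring (an assumption), the total number of new cases
        # in group i is Poisson with mean sum_j m[i, j] * active[j].
        new = rng.poisson(m @ active)
        sizes += new
        active = new
    return sizes


if __name__ == "__main__":
    rng = np.random.default_rng(2015)
    m = reproduction_matrix(r0=0.7, s=0.5, contacts=EXAMPLE_CONTACTS)
    outbreaks = [simulate_outbreak(m, seed_group=0, rng=rng) for _ in range(50)]
    print(np.array(outbreaks)[:5])
```

Under these assumptions, the dominant eigenvalue of the matrix returned for S < 1 is one standard way to define the effective reproduction number; whether that matches the paper's Equation 2 exactly cannot be confirmed from the excerpts here.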

Appears in 14 sentences as: maximum likelihood (14)

In *Characterizing the Transmission Potential of Zoonotic Infections from Minor Outbreaks*

- We simulated 50 spillover events, and found the maximum likelihood estimate of R0 and S. We repeated this process for 1000 sets of outbreaks, obtaining reliable estimates of both R0 and S (Figs. Page 5, “Estimating transmissibility and pre-existing immunity”
- Our maximum likelihood estimates suggest that the over 20 age group had substantial preexisting immunity against monkeypox and H5N1, and no immunity against H7N9 or MERS-CoV (Fig. Page 7, “Application to real outbreaks”
- Each point shows joint maximum likelihood estimate of the effective reproduction number if both age groups were equally susceptible, p, and the relative susceptibility of over 20s, S. Page 8, “Application to real outbreaks”
- When censoring was included, our estimate for R increased slightly to 0.77 (0.57–1.03), but our maximum likelihood estimate for S remained the same. Page 9, “Application to real outbreaks”
- We obtained maximum likelihood estimates for θ = {R0, S} by calculating the two-dimensional likelihood surface and using a simple grid-search algorithm to find the maximum point. Page 13, “Inference”
- Confidence intervals were calculated using profile likelihoods: for each value of R0, we found the maximum likelihood across all possible values of S; the 95% confidence interval was equivalent to the region of parameter space that was within 1.92 log-likelihood points of the maximum-likelihood estimate for both parameters [42]. Page 13, “Inference”
- It was not possible to obtain a tractable expression for the maximum likelihood (ML) estimates of p and S, and hence R, using Equation 13. Page 13, “Performance metrics”
- Instead we calculated the ML estimate of the reproduction number, R, using the numerically estimated maximum likelihood values for p and S. Page 13, “Performance metrics”
- Blue line, relative error in maximum likelihood estimate for R in single-type model; red line, Page 14, “Supporting Information”
- We simulated 1000 sets of 50 outbreaks, and found the maximum likelihood estimates (MLEs) for parameters for each set. Page 14, “Supporting Information”
- We simulated 1000 sets of 50 outbreaks, and found the maximum likelihood estimates (MLEs) for parameters for each set. Page 15, “Supporting Information”

Appears in 11 sentences as: branching process (11)

In *Characterizing the Transmission Potential of Zoonotic Infections from Minor Outbreaks*

- However, existing techniques for estimating transmission potential from outbreak size data generally represent transmission in the host population using a single-type branching process [15, 16, 17, 18]. Page 2, “Introduction”
- We made use of this observation by developing a novel age-structured model of stuttering transmission chains, which combined reported social contact data with a multi-type branching process [23, 24]. Page 2, “Introduction”
- We simulated outbreaks using a multi-type branching process with two groups, then used the outbreak size distribution to infer R0 and relative immunity in older individuals. Page 4, “Estimating transmissibility and pre-existing immunity”
- We compared these values with estimates from an inference framework based on a single-type branching process [15, 16, 17, 18]. Page 6, “Estimating transmissibility and pre-existing immunity”
- This bias is the result of our assumption that introductions occur randomly across the susceptible population, and illustrates an important caveat to inference of R from the mean outbreak size in a single-type branching process model. Page 6, “Estimating transmissibility and pre-existing immunity”
- First, we simulated outbreak data using a multi-type branching process with 15 age groups. Page 7, “Estimating transmissibility and pre-existing immunity”
- Our estimate of R for MERS-CoV was 0.73 (0.54–0.96), whereas in a single-type branching process model R = 0.63 (0.49–0.85). Page 9, “Application to real outbreaks”
- Using a multi-type branching process, we developed an inference framework to make better use of age-structured outbreak size data. Page 9, “Discussion”
- In a single-type branching process framework, the threshold is a single number: the total size of the outbreak [31, 16]. Page 9, “Discussion”
- We used a multi-type branching process to model secondary infections (see Text S1 for details). Page 12, “Offspring distribution”
- Estimates of R0 and relative susceptibility, S, when simulation model is a multi-type branching process with 15 age groups. Page 14, “Supporting Information”
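
For comparison with the multi-type approach, the single-type inference from mean outbreak size mentioned above can be sketched in a few lines. In a subcritical single-type branching process started by one index case, the expected total outbreak size is 1/(1 − R), so R can be estimated as 1 − 1/(mean size). The sketch assumes every outbreak starts from exactly one introduction and is fully observed; the function name is hypothetical.

```python
def estimate_r_from_sizes(sizes):
    """Estimate R from total outbreak sizes (index case included) in a
    single-type branching process, using E[size] = 1 / (1 - R)."""
    sizes = list(sizes)
    if not sizes or min(sizes) < 1:
        raise ValueError("each outbreak must contain at least the index case")
    mean_size = sum(sizes) / len(sizes)
    return 1.0 - 1.0 / mean_size


if __name__ == "__main__":
    # Four hypothetical outbreaks with 1, 1, 2 and 4 cases: mean size 2.
    print(estimate_r_from_sizes([1, 1, 2, 4]))
```

For Poisson-distributed offspring this moment estimator coincides with the maximum likelihood estimate. As the quoted caveat notes, it can be biased when susceptibility differs between groups, which is the situation the multi-type model is designed to handle.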

Appears in 5 sentences as: confidence interval (4) Confidence intervals (1) confidence intervals (1)

In *Characterizing the Transmission Potential of Zoonotic Infections from Minor Outbreaks*

- Our estimate of S for monkeypox exhibited considerable uncertainty: the 95% confidence interval spanned 0.02–1. Page 7, “Application to real outbreaks”
- We found that the 95% confidence interval of the joint Page 7, “Application to real outbreaks”
- Dark line indicates 80% confidence interval (CI); light line is 95% CI. Page 8, “Application to real outbreaks”
- However, the confidence intervals for our estimates were generally smaller (Table 3). Page 9, “Application to real outbreaks”
- Confidence intervals were calculated using profile likelihoods: for each value of R0, we found the maximum likelihood across all possible values of S; the 95% confidence interval was equivalent to the region of parameter space that was within 1.92 log-likelihood points of the maximum-likelihood estimate for both parameters [42]. Page 13, “Inference”
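
The grid search and the 1.92 log-likelihood threshold can be illustrated on a simpler, one-parameter problem. The sketch below is not the paper's two-parameter likelihood: it assumes a single-type branching process with Poisson offspring, for which the total outbreak size n follows the Borel distribution, P(n) = exp(−Rn)(Rn)^(n−1)/n!. The grid resolution and function names are illustrative choices. In the two-parameter setting described above, one would first maximise over S at each value of R0 before applying the same threshold.

```python
import math


def borel_loglik(r, sizes):
    """Log-likelihood of reproduction number r given total outbreak sizes,
    assuming Poisson offspring (Borel-distributed outbreak sizes)."""
    return sum((n - 1) * math.log(r * n) - r * n - math.lgamma(n + 1)
               for n in sizes)


def grid_mle_and_ci(sizes, grid_step=0.001, drop=1.92):
    """Grid-search the MLE of r on (0, 1) and return (mle, lower, upper),
    where the interval keeps every grid value whose log-likelihood is
    within `drop` of the maximum (1.92 corresponds to 95% for one parameter)."""
    grid = [k * grid_step for k in range(1, int(round(1 / grid_step)))]
    loglik = [borel_loglik(r, sizes) for r in grid]
    best = max(loglik)
    r_hat = grid[loglik.index(best)]
    inside = [r for r, value in zip(grid, loglik) if best - value <= drop]
    return r_hat, min(inside), max(inside)


if __name__ == "__main__":
    # Hypothetical outbreak sizes, for illustration only.
    r_hat, lower, upper = grid_mle_and_ci([1, 1, 2, 4, 1, 3, 1, 1, 2, 6])
    print(r_hat, lower, upper)
```

The value 1.92 is half of 3.84, the 95% quantile of a chi-squared distribution with one degree of freedom. Because the interval is read off the grid, its precision is limited by `grid_step`.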

Appears in 3 sentences as: parameter estimates (1) parameter estimation (2)

In *Characterizing the Transmission Potential of Zoonotic Infections from Minor Outbreaks*

- However, when we simulated 150 or 250 spillover events instead, the uncertainty in our estimates shrank, and we were able to obtain more precise parameter estimates (S8 Fig). Page 9, “Application to real outbreaks”
- We tested the accuracy of parameter estimation when the transmission process was mis-specified, and found that it was still possible to distinguish between different scenarios as long as transmission matrices in both the simulation and inference models were dominated by intense mixing between children. Page 10, “Discussion”
- For a higher dimensional model, it might be necessary to use an alternative technique, such as Markov chain Monte Carlo [41], to ensure robust and efficient parameter estimation. Page 13, “Inference”

Appears in 3 sentences as: simulated data (3)

In *Characterizing the Transmission Potential of Zoonotic Infections from Minor Outbreaks*

- We simulated data using different assumptions about age-specific infection rates but left the inference model unchanged. Page 6, “Estimating transmissibility and pre-existing immunity”
- Next, we simulated data using two age groups, but with transmission based on the average number of reported physical contacts across 8 European countries in the POLYMOD study (S1B Fig. Page 7, “Estimating transmissibility and pre-existing immunity”
- Because we used simulated data to infer parameters, and hence had knowledge of the true model, we were also assuming that these contacts were reported accurately. Page 10, “Discussion”
