Measuring unequal distribution of pandemic severity across census years, variants of concern and interventions

Background The COVID-19 pandemic stressed public health systems worldwide due to emergence of several highly transmissible variants of concern. Diverse and complex intervention policies deployed over the last years have shown varied effectiveness in controlling the pandemic. However, a systematic analysis and modelling of the combined effects of different viral lineages and complex intervention policies remains a challenge due to the lack of suitable measures of pandemic inequality and nonlinear effects. Methods Using large-scale agent-based modelling and a high-resolution computational simulation matching census-based demographics of Australia, we carried out a systematic comparative analysis of several COVID-19 pandemic scenarios. The scenarios covered two most recent Australian census years (2016 and 2021), three variants of concern (ancestral, Delta and Omicron), and five representative intervention policies. We introduced pandemic Lorenz curves measuring an unequal distribution of the pandemic severity across local areas. We also quantified pandemic biomodality, distinguishing between urban and regional waves, and measured bifurcations in the effectiveness of interventions. Results We quantified nonlinear effects of population heterogeneity on the pandemic severity, highlighting that (i) the population growth amplifies pandemic peaks, (ii) the changes in population size amplify the peak incidence more than the changes in density, and (iii) the pandemic severity is distributed unequally across local areas. We also examined and delineated the effects of urbanisation on the incidence bimodality, distinguishing between urban and regional pandemic waves. Finally, we quantified and examined the impact of school closures, complemented by partial interventions, and identified the conditions when inclusion of school closures may decisively control the transmission. Conclusions Public health response to long-lasting pandemics must be frequently reviewed and adapted to demographic changes. To control recurrent waves, mass-vaccination rollouts need to be complemented by partial NPIs. Healthcare and vaccination resources need to be prioritised towards the localities and regions with high population growth and/or high density.


Introduction
On 30 January 2020 the COVID-19 was recognised by the World Health Organisation (WHO) as a public health emergency of international concern: the WHO's highest level of alert.On 11 March 2020 this was followed by the WHO declaring the outbreak a pandemic [1].On 5 May 2023, that is, 170 weeks since announcing the global health emergency, the WHO declared an end to the emergency, while continuing to refer to the COVID-19 as a pandemic [2].Over this time, the COVID-19 pandemic has had a profound impact, causing significant loss of life, reducing life expectancy [3,4], seriously challenging healthcare systems [5], and adversely affecting socio-economic activity worldwide [6].By mid-June 2023, the pandemic had caused almost 800 million confirmed cases and 7 million confirmed deaths [7].
Over the course of pandemic, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) which causes COVID-19 has mutated from its ancestral strain into a number of lineages and sub-lineages, varying in terms of infectivity and virulence.Several of these variants have been designated by the WHO as variants of concern, including the highly transmissible lineages B.1.617.2 (Delta) [8,9] and B.1.1.529(Omicron) [10,11].In Australia, these variants have triggered significant new waves of the pandemic [12,13,14,15].
In response, the public health systems worldwide employed a diverse range of intervention policies.The initial reactions involved non-pharmaceutical interventions (NPIs) [16,17,18].Typically, the NPI interventions combined various components, such as border closures and travel restrictions, case isolation, home quarantine, school closures, and comprehensive "stay-at-home" orders comprising social distancing.Once safe and effective vaccines became available, this was followed by mass vaccination campaigns around the globe [19,20,21].Vaccination rollouts differed with respect to (a) population coverage, ranging from partial to nearly complete; (b) rollout schemes, e.g., preemptive, progressive, or boosting; as well as (c) different vaccine combinations, e.g., priority and general vaccines [21].Each vaccine had efficacy variations with respect to (i) the susceptibility-reducing efficacy, (ii) the disease-preventing efficacy, and (iii) the transmission-limiting efficacy [19,21].In addition, vaccine effectiveness was non-linearly diminishing over time [22].The practice of dealing with multiple variables, objectives and constraints confounded many potential effects of various intervention policies: for example, school closures have been found to contribute differently under different circumstances [18,23].
The combined effects of the evolving viral lineages and complex intervention policies have been difficult to systematically analyse, model and predict.For example, the persistence of Omicron variant in Australia and the resulting recurrent waves were explained by a nuanced combination of the new transmissible sub-variants, the fluctuating adoption of NPIs, and the waning immunity from prior infections and vaccinations [15].Importantly, such complex effects become sensitive to demographic variations in heterogeneous populations spanning different age groups, household sizes, socio-economic profiles and jurisdictions.In general, to study a pandemic which has lasted more than three years, one needs to account for demographic changes which play an increasingly salient role.This influence often remains concealed due to the lack of high-resolution data and presence of jurisdictional barriers, socio-political biases, and other factors [24,25].
Here, we aim to examine some of these public health challenges and carry out a systematic simulation-based analysis of several COVID-19 pandemic scenarios, using Australia as a case study.We use two most recent census years (2016 and 2021) as the alternative demographic settings within which each pandemic scenario is simulated.Our comparative analysis contrasts three variants of concern which made an impact in Australia: the ancestral strain, the Delta and the Omicron lineages.For every scenario, five representative intervention policies are compared, ranging from (1) baseline (i.e., no interventions), to (2) partial NPIs without vaccination, (3) partial preemptive vaccination without NPIs, (4) mixed intervention with both partial NPIs and partial vaccination, and (5) partial lockdown including school closures.
In order to compare 30 possible scenarios (2×3×5) we apply an agent-based model (ABM) which simulates an artificial population generated using the high-resolution Australian census data.The ABM has been previously calibrated and validated for several pandemic stages in Australia during the last four years [13,15,18,21,26].
This study identifies and explains, in context of different variants and policies, several coupled nonlinear effects of the population growth and heterogeneity on the pandemic severity.In particular, we study how pandemic peaks may be amplified by distributed changes in the population size or the changes in population density.
Importantly, the study introduces a novel measure of pandemic inequality -pandemic Lorenz curves -and demonstrates that the pandemic severity may be distributed unequally across local areas.For example, while the pandemic inequality may reduce when the population or the disease transmissibility grow, it may increase with more stringent interventions or in non-urban areas.We also measure pandemic biomodality, which characterises formation of distinct urban and regional pandemic waves.In addition, we measure bifurcations in the effectiveness of interventions, e.g., school closures, relative to variants of concern.

Agent-based modelling COVID-19 pandemic with Australian census
We simulated several scenarios of the COVID-19 pandemic in Australia using a well-established agent-based model previously validated for several pandemic waves and variants of concern [13,15,18,21].Our model has two fundamental components: (i) a simulated Australian population generated to represent key demographic features of the Australian census data, and (ii) a stochastic agent-based model for disease transmission and control, detailed in Appendix B and Appendix C respectively.
Our model comprises stochastically-generated anonymous agents that represent the population of Australia: about 23.4 million using 2016 census, and 25.4 million using 2021 census.The population is partitioned into Statistical Areas (SAs) at different resolutions, e.g., SA2 level represents suburbs.
Disease transmission follows a discrete-time simulation, updating states of each agent over time.Following initial infections "seeded" around international airports, the transmission is probabilistically simulated by considering agent interactions across multiple social layers (mixing contexts), given different contact and transmission rates within both residential and work/study contexts.Epidemiological characteristics and variant-specific natural history of the disease are described in Appendix C, and in Section 2.2.1.The transmission process is also affected by intervention policies, including NPIs and vaccinations.The transmission probabilities are modified across social contexts of the agents in response to their compliance with specific intervention policies.

Variants of concern
Our model has been calibrated to match key COVID-19 characteristics across three variants of concern: ancestral (i.e., the strain initially detected in Wuhan, which was prevalent in Australia in 2020) [18], Delta (i.e., B.1.617.2 variant, prevalent in Australia in 2021) [13], and Omicron (i.e., B.1.1.529variant, prevalent in Australia during 2022) [15].Over the last four years, these variants have not only evolved towards higher infectivity (i.e., higher basic reproductive number, R 0 ), but have also exhibited distinct characteristics in the disease natural history.In this study, we performed a lateral comparison of these three variants, each following a different natural disease history (as shown in Figure 1), combined with the intervention policies described in Section 2.2.2, and using the data from two most recent census years (2016 and 2021).See Appendix C for a detailed parameterisation of the relevant epidemiological characteristics.

19.56
Figure 1: Model of the natural history of three COVID-19 variants: ancestral (blue), Delta (green), and Omicron (red).The illustrated profiles are sampled from 2 random agents.Each profile rises exponentially until reaching the infectivity peak, followed by a linear decrease until full recovery.Vertical lines mark the mean incubation period for the three considered variants (ancestral: blue, Delta: green, and Omicron: red), with the means following a log normal distribution.The mean incubation period and recovery period for each of the variants are reported in Appendix Table C.2.The inset shows R 0 of the three considered variants.

Intervention policies
NPIs considered in our model include case isolation (CI), home quarantine (HQ), social distancing (SD), and school closures (SC), each affecting different agents based on their health states (infected or susceptible), age groups (schoolaged or not), and household compositions (if there is an infected household member).
In simulating pandemic scenarios, we assumed a moderate level of preemptive vaccination coverage of the population (50%), accounting for the combined effects of a relatively high vaccination coverage in Australia [27] and low diminishing vaccine efficacy [28].In line with prior studies [13,15,21], the vaccination scheme distributes two types of vaccines (priority and general), each with the vaccine efficacy defined in terms of reducing susceptibility, preventing symptoms of the disease, and limiting further transmission, as described in Appendix Section C.3.
We simulated five specific policies, each with a different combination of preemptive vaccination coverage and NPIs.These policies cover a wide range of scenarios, starting from the "live-as-usual" intervention-free scenario to the lockdown-like scenario with strong restrictions limiting population mobility and social interactions, as shown in Figure 2. Specifically, these policies can be summarised as follows: • Policy 1 -Baseline: no NPIs and no preemptive vaccination coverage, representing a scenario without any interventions.
• Policy 2 -Partial NPIs: some NPIs implemented (CI, HQ and SD at 70% compliance level), with no preemptive vaccination coverage.This represents a pandemic intervention scenario feasible without vaccines.
• Policy 3 -Partial vaccination: 50% preemptive vaccination of the population before a pandemic wave.This represents a scenario developing in a population with partially acquired immunity, but without any restrictions on social interactions during the pandemic.
• Policy 4 -Mixed intervention: some NPIs implemented during a pandemic wave (CI, HQ and SD at 70% compliance level) assuming that 50% of the population has been preemptively vaccinated prior to the pandemic wave.This represents a scenario with partial acquired immunity in the population, followed by further restrictions on social interactions during the wave.
• Policy 5 -Partial lockdown: all NPIs implemented (CI, HQ, SD at 70% compliance level, and SC in addition) and 50% preemptive vaccination of the population.This scenario represents a lockdown of the partially immunised population using strong restrictions on social interactions. -

Lorenz curves: measuring unequal distribution of pandemic severity
Different communities may experience impacts of interventions in significantly different ways, and these complex effects may not uniformly aggregate into the national pandemic dynamics (e.g., nationwide incidence and cumulative incidence).To examine distribution of the overall pandemic severity across different local areas, we introduce a novel technique based on Lorenz curves.
Lorenz curves, initially proposed to evaluate the degree of inequality in wealth distribution in society [29], have since been applied in many other domains, such as economics [30,31], underpinning the well-known Gini index which measures income/wealth distribution across a population [32], and biology [33,34].In this study, we proposed and constructed the pandemic Lorenz curves that capture inequality in the distribution of cumulative incidence at the SA2 level, and compared their shapes across the considered scenarios.The pandemic Lorenz curve dissects the nationwide cumulative incidence at the SA2 level, tracing it across all SA2 areas and assessing their relative contribution to the pandemic severity.
Figure 3 shows a simplified example to demonstrate possible shapes of pandemic Lorenz curve where the x-axis represents the cumulative fraction of SA2 residential population, ranked by their local attack rate (cumulative incidence over the SA2 residential population), and the y-axis represents the fraction of cumulative incidence at the 'global' national level, contributed by the corresponding fraction of SA2 residential population.Appendix D provides a detailed explanation of the Lorenz curves introduced in our study to measure the unequal distribution of pandemic severity.

Results
We present our results in three parts matching our key objectives, firstly examining the effects of population heterogeneity on pandemic severity across two census years (Section 3.1), then exploring pandemic spread, under the considered policies, in terms of urbanisation (Section 3.2), and finally examining varying effects of a specific intervention policyschool closures -across the three considered variants of concern (Section 3.3).

Effects of population heterogeneity on pandemic severity
We related the population heterogeneity with the pandemic severity observed in simulated scenarios across different intervention policies.Here, we measured the pandemic severity as the normalised incidence per million (unless specified otherwise), computed as the ratio between the detected incidence cases to the total population (in millions) for the considered census year.We assessed the population heterogeneity in terms of population increase at the 'global' national level (Section 3.1.1)and at the 'local' SA2 level (Sections 3.1.2and 3.1.3).Appendix A provides more information about the population structure captured by the Australian Bureau of Statistics (ABS).

Population growth amplifies pandemic peaks
Our results show that the population growth, i.e., the 8.6% population increase between 2016 and 2021, produced a non-linear response effect on the pandemic severity.If the pandemic severity was proportional to the population growth, the relative change in incidence between the two years could be expected to be zero (i.e., flat line in Figure 4, bottom row).However, comparison of the simulated pandemic scenarios between 2016 and 2021 produces a relative change in normalised incidence which non-trivially diverges from a flat profile (Figure 4, bottom row), especially in scenarios with less stringent policies (e.g., policies 1, 2 and 3).It is clear that the divergence is positive (i.e., a higher normalised incidence in 2021, indicating an amplified non-linear response to population growth) around the incidence peak, followed by a negative oscillation (i.e., a lower normalised incidence in 2021, suggesting a negative response to population growth) after the peak.Figure 5 (top row) directly contrasts the pandemic profiles across two census years.This non-linear response pattern (an early amplification compensated by a late negative oscillation) is manifested for the baseline scenarios across all variants of concern, as well as the scenarios with less stringent policies for the Omicron variant, as shown in Figure 4 (bottom row).In general, the amplification effect is more notable in the scenarios associated with more transmissible variants (Figure 5).Specifically, for the Omicron variant, the incidence peak for Policy 1 (baseline) simulated for 2021 census data (Figure 5 (a), red profiles, solid and dashed lines), is 12.21% higher than its counterpart produced for 2016 census.This non-linearity arises due to the population distribution which non-uniformly affects the pandemic severity, amplifying the peak incidence (see the following subsection).

Changes in population size amplify incidence peak more than changes in density
To investigate possible causes that have contributed to the non-linear response effects observed at national level, we considered the demographic changes which occurred between the two census years at SA2 level.The higher resolution offered at this level enabled us to trace how the local areas contribute to the disproportionate response in relative incidence between the two census years, 2016 and 2021 (see Appendix A for the demographic statistics and structure captured by census data).Specifically, we examined changes in the incidence peaks across 2,147 SA2 areas which were registered in both 2016 and 2021 census years.In doing so, we used the baseline scenario implementing Policy 1 (no interventions) and correlate the peak incidence with specific demographic changes: the residential population size and the population density at SA2 level.
We found that there is a strong positive correlation between the changes of peak incidence and the changes in the "usual residential" population.This finding is supported by high correlation coefficients across all variants (0.64 ≤ r ≤ 0.91, as shown in Figure 6).That is, an SA2 with a greater net population influx in the five-year period between 2016 and 2021 is highly likely to have a higher spike in the incidence peak.The impact of population increase is further amplified for highly transmissible variants, resulting in a greater peak incidence difference shown by the steeper slope observed for the Omicron variant (Figure 6, in red).In addition to the correlation with the usual residential population, we also found a weaker positive correlation (0.34 ≤ r ≤ 0.45) between the changes in peak incidence and the changes in population density, suggesting that SA2 areas which develop a higher population density may also have a higher incidence peak across all variants (Appendix Figure E.2).

Pandemic severity is distributed unequally across local (SA2) areas
Using Lorenz curves, we examined how the local (SA2) areas are impacted by different intervention policies across two demographic settings representing two census years.Comparing across two census years, we note that Lorenz curves produced for 2021 census are closer to the line of equality (i.e., the diagonal line) for the less transmissible variants (i.e., ancestral and Delta).This suggests that for these two variants, the SA2 areas contribute to the aggregate attack rate at the national level more equally in 2021 scenarios than in their 2016 counterparts.
For the ancestral and Delta variants, more stringent intervention policies led to a more diverging contribution pattern (i.e., Lorenz curve shaped further away from the line of equality).For example, for the Delta variant, using 2016 census data, we found that the increase in the fraction of cumulative incidence from 20% to 80% under Policy 1 (Figure 7 (a) top row, blue profile) is attained by an equal increase (i.e., 20% to 80%) in the population ranked by their local SA2 attack rate.However, the same increase in the incidence fraction (i.e., 20% to 80%) under Policy 4 (Figure 7 (a) top row, purple profile), was attained by a smaller set of SA2s comprising only approximately 40% to 85% of the population ranked by their local attack rate.This finding indicates that the pandemic severity is distributed more unequally under more stringent interventions (including NPIs and vaccinations).Specifically, SA2 areas with a higher local attack rate (i.e., ranked higher on x-axis) account for a higher fraction of the cumulative incidence.
We also note that the unequal distribution of pandemic severity diminishes for the highly transmissible Omicron variant (shown in Figure 7 (c)) where all simulated intervention policies failed to adequately slow down the transmission.This resulted in all SA2 areas contributing equally to the national-level cumulative incidence, with all Lorenz curves overlapping with the line of equality.

Effects of urbanisation on pandemic spread
The bimodal pandemic dynamics is characterised by the incidence peak (attributable to predominantly urban spread), followed by an inflexion point which shapes around a smaller secondary wave (attributable to mostly regional and rural areas) [35].Typically, the first wave triggered by the international air traffic is rapidly shaped in urban populations concentrated near international airports.In contrast, the pandemic progression into non-urban regions (i.e., areas further away from international airports) is significantly slower.The confluence of these factors resulting in bimodal dynamics in Australia was detected and described in context of simulating the pandemic influenza using 2011 and 2016 census data [35].Our simulations of the COVID-19 pandemic scenarios in this study also produced bimodal dynamics, especially for the ancestral and Delta variants using 2016 census data (Figure 8).However, the bimodality is less prominent in scenarios using 2021 census data, indicating a shift of the pandemic dynamics between urban and regional regions.This observation is also supported by the pandemic dynamics examined at the SA2 level.Appendix Figure E.1 shows that in comparison to the simulated results for 2016 census, the time gap between the first and second pandemic waves (defined in terms of the number of SA2 areas that peaked on a given day) for 2021 census has become shorter at the SA2 level.These findings indicate that there are more intricate structural demographic changes in the population which occurred between 2016 and 2021, beyond a uniform population growth.
To explain this phenomenon, we assessed the pandemic progression during the baseline scenario (Policy 1), by tracing pandemic waves in urban and non-urban SA2 areas (see Appendix Table E.2 for the population statistics).Figure 8 compares the Greater Capital Cities (GCCs) against other areas (i.e., all remaining SA2s).We note that the two distinct pandemic progression profiles (urban and non-urban) are mostly separated by the initial conditions: the national incidence peak is largely attributed to urban SA2 areas while the inflexion point observable at a later stage of the pandemic is caused by a secondary wave emerging in non-urban SA2 areas.For the baseline scenarios traced for two census years, our results showed that the bimodal dynamics is diminishing from 2016 to 2021 data.This can be explained by two factors: (i) a higher incidence peak occurring in both GCCs and other urban areas, and (ii) an accelerated pandemic progression into non-urban areas (or at least, into areas outside of GCCs).This effect was observed for scenarios across all variants of concern, with more transmissible variants showing a greater increase in the incidence peak.This observation suggests a faster pandemic spread for 2021 scenarios, indicating that the urbanisation increased over the five-year period, captured by the census statistics (see Appendix Section A), reducing the bimodal dynamics.
We also observed that the bimodality is weakened for the Omicron variant.This is explained by a reduced difference in the peak incidence timing between GCCs and non-GCCs.In other words, the two waves, urban (primary) and non-urban (secondary), tend to merge into a primary significant wave with a single incidence peak.This can be attributed to the high transmissibility of the Omicron variant which suppresses the impact of population heterogeneity.When the transmission was adequately slowed down (due to interventions, for example, implementing Policy 4 and Policy 5), the two pandemic waves became more separable, leading to notable bimodality (Appendix Figure E.4).
It is also worth noting that we do not consider re-infections in these scenarios.In other words, a higher incidence peak occurring earlier in the simulation corresponds to a reduction of the susceptible population, thus exhausting the susceptible population sooner and consequently weakening bimodality.

Effects of school closures across variants of concern
Finally, we examined the effects of school closures across three variants of concern by comparing pandemic scenarios between Policy 4 (Mixed intervention) and Policy 5 (Partial lockdown).Our results suggested that the effectiveness of school closures varies significantly for different variants.The effects of school closures were most prominent for the Delta variant with a two-order reduction in peak incidence (from over 1,000 cases per million to under 10 cases per million), resulting in a sharp difference shown in Figure 9 (b).Such a bifurcation was observed only in scenarios for the Delta variant and was not detected for variants with either lower R 0 (i.e., ancestral variant, Figure 9 (a)) or a higher R 0 (i.e., the Omicron variant, Figure 9 (c)).
For the ancestral variant, Policy 4 was sufficiently effective in containing the spread with new cases kept at a very low level.Although school closures could further reduce the peak incidence, the reduction (from around 2 cases per million to around 0.6 cases per million) would be marginal.This could be observed for both 2016 and 2021 census years (Figure 9 (a)).
For the Omicron variant, school closures delay the incidence peak by approximately 25 days with a sizable reduction from nearly 9,000 cases per million to under 6,000 cases per million (Figure 9 (c)).This reduction, however, would still be insufficient to curb the spread.This finding suggests that school closures could slow down the spread of the Omicron variant to some extent but would be inadequate for suppressing incidence, due to the extremely high R 0 of the Omicron variant.
Overall, these observations suggest that when coupled with NPIs and partial vaccination, school closures can be a highly effective policy for pandemic suppression of variants comparable in transmissibility with the Delta variant.These benefits, however, may not eventuate for either significantly less or significantly more transmissible variants of concern.

Discussion
In this study, we carried out a systematic comparison of pandemic scenarios across census years, variants of concern and intervention policies.The considered scenarios combined five different intervention policies with three dominant variants impacting Australia between 2020 and 2023 (ancestral, Delta, and Omicron), using a surrogate population generated based on 2016 and 2021 Australia census data.Our simulation results revealed that the population growth and heterogeneity non-linearly affect pandemic dynamics, amplifying the peak incidence for all variants of concern.These non-linear response effects highlight a complex interplay between the demographic and pandemic characteristics, which may amplify pandemic impacts on continuously growing populations worldwide.
Firstly, we focused on the Australian population which grew by 8.6% between 2016 and 2021 (net gain of nearly 2 million people).However, the population growth has been distributed non-uniformly across the country, affecting both local residential population size and population density.Such unevenly distributed demographic changes directly affect pandemic progression, and our results demonstrated that the estimation of pandemic severity cannot be accomplished by a simple scaling of the outcomes obtained for previous datasets, such as 2016 census data.This strongly suggests that pandemic models and public health policies need to be frequently reviewed and adapted to changes in the population heterogeneity, especially when dealing with long-lasting pandemics similar to COVID-19.We showed that the effectiveness of intervention policies differed across variants of concern (Figures 4 and 5).Coupled with the observed amplification effects of demographic changes (Figure 6), these findings call for specific interventions aimed to reduce the amplification effect, e.g., prioritising (1) the SA2 areas with a higher population growth, and (2) the SA2 areas with a higher density increase.
At the same time, some policy-related findings were independent of the considered variants or demographics.In particular, a partial preemptive vaccination rollout with 50% coverage and limited vaccine efficacy (Policy 3) can be effective only in a combination with partial NPIs, comprising a mixed intervention (Policy 4).Applied on its own, Policy 3 was unable to control the pandemic spread across all considered scenarios, as shown in Figure 4. Thus, the ongoing and future immunity boosting vaccination rollouts need to aim at high population coverage comparable with original mass-vaccination campaigns, while still being complemented by partial NPIs.Similarly, partial NPIs (Policy 2) cannot provide a principled solution on its own.While managing to control pandemic scenarios for the less transmissible ancestral variant, partial NPIs did not succeed in preventing sizable incidence peaks in scenarios for the Delta and Omicron variants, but only delayed these peaks (Figure 4).In general, such delays may be useful in rolling out booster vaccinations, reinforcing the point that a mixed intervention which combines partial NPIs and partial vaccination (such as Policy 4) may provide an adequate intervention.
Our study has also highlighted the "pandemic inequality", with certain SA2 areas contributing to the nationwide cumulative incidence stronger than others.We found that the pandemic inequality reduces when the population or the disease transmissibility grows (Figure 7).However, the opposite tendency -increasing pandemic inequality -was observed with more stringent interventions (Appendix Figure E.3) or in non-urban areas (Appendix Figure E.6).While this inequality was lesser in scenarios for the highly transmissible Omicron variant, the findings still suggest that a resource prioritisation scheme is needed, with the interventions targeting the localities and regions which have been experiencing a high population growth and/or developing a high density.
Secondly, we quantified the pandemic effects of urbanisation, distinguishing between the first wave affecting Greater Capital Cities and a delayed second wave developing in regional and rural areas (Figure 8).This bimodality indicates that, to be effective, the intervention efforts need to adapt during a pandemic progression, with the initial focus on metropolitan centres followed by a shift to non-urban areas, in anticipation of the corresponding peaks.Such geo-spatial redistribution of healthcare and vaccination resources needs to account for the transmissibility of dominant variants of concern, as more transmissible variants accelerate the pandemic, bring the urban and non-urban waves closer in time, and shorten the time between their peaks.
Finally, we evaluated the role of school closures in suppressing the pandemic transmission caused by different variants of concern.Crucially, we demonstrated that the effects of school closures are highly dependent on the dominant variant, with the more decisive effect observed only for the Delta variant (given the range of other policy-defining parameters, i.e., the vaccination coverage of 50% and the NPI compliance with 70% SD), as illustrated in Figure 9.This also reinforces the suggestions that policy makers should not assume that interventions will have the same effect across different variants.In addition, this highlights the possibility that some interventions can compensate others in specific circumstances: for example, when the preemptive vaccination or booster uptake is lower or slower than anticipated, school closures and stricter NPIs may be required to compensate for the lack of immunity.
In summary, the study highlighted the need for geo-spatially and demographically tailored, proactive and agile interventions, in contrast to general-purpose, reactive and rigid policies.

Limitations and future work
This study of pandemic severity did not include considerations of (a) socio-economic factors, and (b) disease burden in terms of hospitalisations, ICU occupancy and mortality.As demonstrated in our previous studies [13,15,26], these components can be included within an ABM study but substantially increase its scope.
Our simulations ran over a period of 196 days, without considering re-infections.Given the considered simulation horizon, this limitation has a minor effect discussed in our study of recurrent waves [15].
Our focus on the three dominant variants of concern rather than on their numerous sub-lineages (which may co-circulate) allowed us to distill some of the salient public health lessons.We believe these lessons would remain relevant across other sub-variants, including co-circulating ones.
Finally, we did not model the differences in vaccine efficacy across variants of concern (including ancestral, Delta, and Omicron variants).This should not impact the outcomes over the considered simulation horizon.Nevertheless, our model will be extended in near future, addressing these limitations (reinfections and multiple co-circulating variants with different immunity profiles).

Conclusion
In pursuing our objectives, we solved several methodological challenges, extending the range of applicability for agent-based pandemic modelling.Firstly, we incorporated the ABS census data for 2021, thus accounting for the most recent demographic information for Australia, in terms of the population structure, age distribution, household composition, and commuting flow patterns.This "upgrade" is important because previous similar studies used the data from the Australian census of 2016, scaling the modelling outcomes by approximately 10% to account for the larger population.Our results showed that the demographic changes over the five-year period contribute to the pandemic outcomes in more subtle non-linear ways that often cannot be captured by a uniform scaling.To study these nuanced contributions we employed Lorenz curves characterising an unequal distribution of pandemic effects.
Secondly, we addressed a well-known inconsistency between low-resolution and aggregated high-resolution census data brought about in 2016 by the ABS anonymity policy compliance system [36,37].In order to reduce this mismatch, we reconstructed a surrogate high-resolution 2021 commuter topology (Section B.2 of Supplementary Material).This allowed us to examine nuanced effects of the pandemic scenarios on urban and regional areas, and measure pandemic bimodality.
Overall, the extended ABM, coupled with the reconstruction techniques, offered a versatile approach to model comparative scenarios with multiple variants of concern, simulated across different demographic settings (census years) and for distinct intervention policies.A combination of census-based ABM and pandemic Lorenz curves provided a unique high resolution method to not only simulate different pandemic scenarios across varying demographics and variants, but also evaluate the unequally distributed effects of feasible intervention policies.This, in particular, allowed us to emphasise the divergent role of school closures as a complementary NPI -with respect to the disease transmissibility, exemplifying a bifurcation in the effectiveness of school closures.
In summary, the presented results illustrate how comparative analysis measuring distribution of the pandemic severity across different dimensions can help in improving public health preparedness and response to future pandemics.In particular, the study highlights that rigorous pandemic modelling can provide insights into the impact of complex demographic factors on the spread of infectious diseases over medium-to long-term.
of AMTraC-19.We also thank Vitali Sintchenko and Tania Sorrell for their insights on epidemiological and public health aspects of this study, as well as Tim C. Germann and Sara Del Valle for helpful discussions regarding pandemic agent-based modelling.We thankfully acknowledge the use of high-performance computing cluster, Artemis, provided by the Sydney Informatics Hub at the University of Sydney.

A.1 Census data structure
The 2021 census reports that the Australian population reached 25.4 million, following an 8.6% increase (2,020,896 million people) since 2016 census counts.There are eight states and territories in Australia, each with a capital city.In addition to the residential population, a wide range of other demographic features are also reported for each level.
The following features are incorporated into the artificial population generation: age, gender, household composition, and commuting patterns.See Section B for more details regarding how these features are incorporated in our model.
In order to reduce the risk of individual-or household-level identification, all data reported in the census are anonymised via several perturbation methods including, but not limited to, the removal of aggregated microdata falling under a set threshold.It is important to note that the ABS data perturbation policies may differ across datasets and may change between census years.

A.2 Mobility: travel to work
The short-distance commuting patterns are reported via the travel-to-work (TTW) dataset, detailing the accumulated number of employed individuals commuting between their place of usual residence (UR) and a place of work (POW).We use this dataset to construct a commuter mobility network (detailed in Section B.2). Commuter counts are accumulated on several statistical levels: • UR: reported at SA1, SA2, SA3, SA4 level • POW: reported at DZN, SA2, SA3, SA4 level • number of employed commuters (either full-time or part-time) Another level (DZN, destination zone) is used to describe the POW population at a resolution comparable to SA1 but following a different set of partition rules.Both SA1 and DZN accumulate to attain exact partitions at SA2 level.
The perturbation protocol that the ABS adopted for privacy protection removes some edges between UR and POW (that is, UR ↔ POW edges).This removal is done if the commuter counts are below a certain threshold.In our previous work [37], we showed that the edge removal causes large discrepancies in the edge counts between different statistical area levels.When generating the artificial population, we address this problem by minimising the discrepancies between different SA levels, improving consistency of the population representation across the levels (see Section B.2.2).

A.3 Urban structure and international air traffic
Australia is a highly urbanised society with heterogeneously distributed population as annotated, see Figure A.1.The eight capital cities, despite their rather small geographical area size, account for more than two thirds of the Australian residential population, while the remaining land (mostly regional areas) comprises only one third of the population.
The capital cities and their surrounding areas are defined by the ABS as the Greater Capital Cities (GCC) including the urban areas (city centres) and the areas in some proximity, to account for the residents who work, shop, and socialise within the city.We examine the urban structure, particularly GCCs, for two reasons.
Firstly, the population densities of GCCs are considerably higher compared to other areas of Australia, allowing for a greater social interaction from both residents and working commuters.This increased likelihood of mixing aggravates the spread of infectious diseases [42,43,44].In addition, both residential and working population densities continued to increase between 2016 and 2021 for all GCCs despite the brief disruption of migration and movement during the COVID-19 pandemic, as shown in Figure A.2. Furthermore, the urbanisation level in GCCs greatly varies across the country.Secondly, the local and international air traffic is concentrated in GCCs, increasing the potential of disease introduction by long-range travel (i.e., international sources) [45], as well as its subsequent spread.Due to Australia's geographic isolation, the international airports located in GCCs take the vast majority of international passenger inflow.We note that airports of several non-GCC cities, such as Newcastle (NTL) and Sunshine Coast (MCY), may also take a relatively small number of international incoming passengers, as shown in Figure A.4.The air traffic data is reported by the Australian Bureau of Infrastructure, Transport, and Regional Economics (BITRE) [46], and we use this data to scale the number of initial infections in proportion to the international air traffic inflow, following earlier studies [18,47].The actual traffic data in 2021 were severely impacted due to travel restrictions imposed during the COVID pandemic.For visual clarity, some Australian islands are not shown.Hobart is not annotated as there were no incoming international passengers to Hobart during the considered period (i.e., in 2019).The 3-letter code names for the annotated international airports are described in Table B.3.

B.1.3 Generation of simulated population.
While each agent is assigned with demographic and household attributes as detailed in earlier sections, the total residential population and household count in each SA1 area are constrained by the aggregate number per SA1 reported by census.We assign the attributes as following: • Step 1: For a given SA1 area, if the number of generated agents is less than the number of residential population reported by census, sample a household based on the CDF of households constructed from the ABS data.
• Step 2: Within each of the generated household, assign agents with genders and age groups (male/female adults and/or children) based on the corresponding CDFs.
For example, if the household instance sampled from the census data is a couple comprising two adults of opposing gender, with one child, the generated household will consist of three household members: • an agent representing the father in this household, generated with its age sampled from the male adult CDF of age groups, • an agent representing the mother in this household, generated with its age sampled from the female adult CDF of age groups, • an agent representing the child in this household, generated with its gender sampled from the CDF of genders for children, and its age sampled from the CDF of age groups for children corresponding to the sampled gender.
This process continues until all SA1 areas are considered and the aggregate artificial populations matches the population statistics reported by census.During this process, every three consecutively generated households form a household cluster, representing a residential context with close contacts around the agents' residency.
With the population generated at SA1 level, we follow the partitioning method that the ABS uses to merge SA1 population into SA2 population.We note that no agents are generated during this process as all SA1 areas are included within SA2 areas.We assume that SA2 areas are spatially distributed and the geographic distance between any pair of SA2 areas is calculated as the distance between the centroids of two SA2 areas defined by the ABS-provided shapefiles representing geographical boundaries.This spatial distribution plays a crucial role in generating school placements for children and working group placements for adults.
As a result, the generated aggregate artificial population is a "digital twin" of the Australian population, matching key census statistics, including the number of SA2 and SA1 areas, households, household clusters, and total agents.These statistics are summarised in Table B.1.Note that in addition to residential contexts, a number of other mixing contexts is also reported in Census.The 4th column shows the relative increase for each attribute over the five-year period.The population expansion results in an increase across all residential and mixing contexts.

B.2 Surrogate domestic commuting network
In addition to the residential mixing contexts (see Section B.1), our model also considers interactions across spatially distributed agents which occur in workplaces and schools.The mobility patterns describing commuting to places of work and education are represented by a realistic commuting network constructed using the census commuting data and student registration records from ACARA [48].In this network, we allocate population flows specifically capturing the student and worker flows between mutually exclusive sets, e.g., between the usual residential areas (SA1 areas) and the places of work (DZNs).In this section, we describe the construction of the student and worker flows in the commuting network in two parts: travel to school, applicable for agents from 5 to 18 years of age; and travel to work, applicable for agents over 18 years of age.

B.2.1 Travel to school.
In our surrogate population, nearly 18% of the population are school-aged children (from 5 to 18 years old), with the vast majority of them (approximately 95%) going to school on weekdays during day time and interacting with other school-attending children and teachers.These at-school interactions are enabled by integrating "student flows" into the commuting network.Each flow is set between the student residence (which has been previously assigned during the generation of simulated agents; see Section B.1) and the destination zones (i.e., schools).
Using the school enrollment numbers and locations reported by ACARA, the school locations are pseudodeterministically chosen within destination zones, while accounting for school capacity and geographic proximity to areas with a sufficient number of children and teachers.
The allocation of schools is based on the following assumptions.The probability of students attending a school is set in proportion to the geographic distance between the school and their residence, within the school's catchment zone.This process emulates the tendency of students to initially prioritise schools in the closest proximity.If schools nearby are unavailable, students are assumed to expand their search to nearby options within the maximum travel distance (150km).This ensures that the vast majority of students are assigned to schools by the end of the generation process.
Within each school, students are further organised into smaller groups representing grades and classes, where classes represent the primary units of regular interaction for students with the highest frequency of contacts, followed by grades.The contact rates in the school-environment are summarised in Table C.1.
Upon completion of the student-to-school assignments, we estimate the number of teachers working at a school based on the enrolment numbers of the school, the student-to-staff ratio (2:17) (approximated from historical ABS data), and the estimated number of three teachers per class.We then randomly sample from the ensemble of adults working in the school's DZN to match the estimated number of teachers.The remaining workforce within the DZN is then assigned to non-school-related working groups detailed in the following section.

B.2.2 Travel to work.
We then construct a travel-to-work network for the remaining working population, capturing daily frequent contacts at the workplace where each edge represents a worker flow between agents' usual-residential areas (UR, previously assigned as detailed in Section B.1) and the corresponding places of work (POW).Within the working population, each agent is also assigned to a working group with a maximum of 20 people at work to simulate a realistic working environment where a small group of individuals frequently interact on a daily basis.This assignment process matches the key census attributes reported in travel-to-work (TTW) data [41] (i.e., the census table for employment, income, and education representing the number of commuters between UR and POW).
The work-related census reports partition the population into multiple statistical area (SA) levels and Destination Zones (DZN), in ascending resolution.We note that DZN (i.e., the highest resolution for POW) does not align precisely with SA1 partitions (i.e., the highest resolution for UR).We use DZN and SA1 as two separate partitions simulated in different time cycles, daytime and nighttime respectively (see Section C below for more details).Each agent residing in a SA1 is randomly assigned to work in a DZN and the total number of registered commuters between the SA1 (considered as UR) and the DZN (considered as POW) must match the corresponding commuting volume reported by census.This process uses the high-resolution worker flow data between SA1 and DZN to construct a high-fidelity TTW network between all URs and POWs.However, the implementation of privacy-protecting policies adopted by the ABS heavily perturbed the data, resulting a sizable discrepancy between high-resolution data (e.g., SA1-DZN) to lower-resolution data (e.g., SA2-SA2).We consider and address this discrepancy in detail, as described below.
Inconsistencies in ABS data for travel to work.The census data contains inconsistencies between UR and POW data due to the adoption of new privacy protection measures, implemented within the ABS anonymity policy compliance system over different census years [36,37].Such inconsistencies produce a mismatch in the commuting flows (between UR and POW) reported at different SA partitions.These inconsistencies therefore require a reconstruction of some commuting flows between UR and POW, added to match the aggregate totals.
Specifically, the total number of commuters aggregated at a higher resolution, e.g., between SA1 (UR) and DZN (POW), is usually less than its counterpart obtained from a lower resolution, e.g., between SA2 (UR) and SA2 (POW), as shown in Figure B.1).This results from various ABS privacy protection measures aimed to avoid re-identification from high-resolution commuting data, particularly when commuting volume is low.As a result, the working mobility in 2016 census shows a 34% difference between the aggregated and true commuters [37].Here, we report a similar observation for 2021 census, with a difference of a 25.6% between the SA1-DZN and SA1-SA2 commuting datasets, and a 34.27% difference between the SA1-DZN and the total number of reported commuters in Australia (see Table B B.2: Summary of working flows between the usual residential (UR) and the place-of-work (POW), reported at different resolutions of statistical areas defined by census.At each resolution, the number of edges specifies the number of destinations (connections between these regions), while the number of commuters represents the aggregated number of commuters in all edges at this resolution.The last column shows the discrepancy (a reduction measured in percentage) between the specified resolution and the reference resolution.Note that census data are pre-processed, with all non-geographic regions (e.g., "Migratory/offshore/shipping" and "No usual address") excluded.
Prior work resolving inconsistencies in 2016 census.In our prior work [37] we proposed an algorithm that uses a statistical re-sampling process to generate and add new edges to the SA1-DZN commuting network.This considerably improved the consistency of commuting flow across different resolutions.This algorithm used the commuting data from the older 2011 census as a baseline (existing prior to the implementation of privacy protection protocols), to supplement the commuting data in different resolutions in 2016 census.The main steps of the algorithm are summarised as follows: • Step 1: Calculate the historical (2011) distribution P (w|N SA1 ) of the number of commuters between UR (SA1) and POW (DZN) (i.e., edge weights of the edges (SA1, DZN) in the TTW network, denoted by w), given a working population size in SA1 level (denoted by N SA1 ).• Step 2: Sample additional edges for each UR unit (SA1) based on P (w|N SA1 ).The list of partial candidate edges, L = {(SA1 1 , w 1 ), ..., (SA1 n , w n )}, is generated.The total sampled edge weights for an SA1 must be bounded by the inconsistencies between SA2 and SA1 levels (i.e., the difference between the number Each subplot traces combinations between UR and POW for all possible resolutions.Across the UR ↔ POW flows, considered at comparable or higher resolutions (i.e., SA3-DZN, SA2-DZN, and SA1-DZN), the edge numbers are noticeably lower towards the right hand side on x-axis, with a reduced number of commuters (y-axis).
of commuters residing in an SA2 and the aggregated number of commuters residing in the corresponding decomposed SA1s).• Step 3: Assign workplaces (DZN) for the newly sampled edges, i.e., to complete the list L: L = {(SA1 1 , DZN 1 , w 1 ), ..., (SA1 n , DZN 1 , w n )}.These assignments are constrained by the number of remaining unassigned commuters in each DZN and in the SA2-SA2 commuting network when compared with the SA1-DZN network.• Step 4: Eliminate reclaimed edges (i.e., the SA1-DZN edges that the ABS already declared) and concatenate the remaining generated edges with the existing edges in 2016 census.
An optimised approach to address the inconsistencies in 2021 census.We optimised the algorithm originally presented in [37] and applied the optimisation to construct the travel-to-work network using 2021 census data.The optimised algorithm reduces the discrepancy in SA1-DZN travel-to-work data by using the lower resolution commuting data as a reference point, since the privacy protocols mostly affect the higher resolution commuting data.For example, there is only a marginal reduction (1.42%) in the number of commuters observed in the SA2-SA2 network, compared to a significantly more sizable 34.27% reduction observed in the higher-resolution SA1-DZN network, as summarised in Table B.2.
The optimised re-sampling algorithm reconstructing the surrogate TTW network is detailed in Algorithm 1, with the changes to the original algorithm [37] highlighted in blue.Similar to the original approach, in the optimised algorithm, a number of unassigned working-age agents in an SA1 i region is selected at random and assigned to a random working group in a DZN i region.The assignment of the residential area and the destination zone is constrained by the edge weight w i of the corresponding edge (SA1 i , DZN i , w i ) in the surrogate network.We introduced several additional constraints to check the eligibility of newly generated edges (step (3) of Algorithm 1), before adding them to the edge-list of the surrogate network.This modification ensures that the sampled artificial edges are not duplicates of existing edges and improves the quality of the final surrogate edge-list.This however had a cost of significantly lengthening the processing time relative to the original algorithm [37].To speed up the computation, we adjusted the sampling process by sampling a uniformly random number of edges (rand(N )) in the existing edge-list for each DZN at a time.This method significantly reduced the computational cost.
Comparison of the network reconstruction results.Our optimised algorithm significantly improved the mapping between SA2-SA2 and SA1-DZN networks from 69.77% (between the census SA2-SA2 and SA1-DZN networks) to 92.87% (between our reconstructed SA2-SA2 and census SA2-SA2 networks), thus accounting for missing edges.It provided further improvement in comparison to the prior algorithm [37], which reconstructed 90.66% of the mapping between census SA2-SA2 and SA2-SA2 networks, as shown in The optimised algorithm reconstructs the SA2-SA2 surrogate network with a smaller mean squared error (MSE) and a higher correlation with the true commuter flow (i.e., census SA2-SA2 network), as shown in We note that although our solution effectively rectified most of the discrepancies between SA1-DZN and SA2-SA2 networks using the re-sampling procedure, it cannot provide a complete mapping between the SA2-SA2 network and

B.3 International air traffic
In our model, the initial infection cases are seeded proportionally to the international air traffic inflow.The scenarios for 2016 use the BITRE air traffic data for this year without any modifications.In contrast, to simulate the pandemic scenarios for 2021, we estimate the international air travel for this year based on the BITRE air traffic data between 2003 and 2019.We intentionally do not use the air traffic data between 2020 to 2021 due to the severe disruption of international travel due to the travel restrictions during the COVID-19 pandemic.
We approximated the daily average number of incoming international passengers (IIP) at the international airports across Australia 2021 by utilising the international air traffic data up to 2019 [46] which reports the total number of IIP for a given year (reported by June) at various international airports across the country.We then extrapolated the number of IIP in this period by using a weighted multimodal linear regression model.We placed a higher weight (γ = 0.8) on more recent data points and constructed the linear regression model which produced the most optimised estimates with small weighted losses.We note that while larger international airports (e.g. g. Sample for rand(N) candidate edges without replacement: (SA1 i , DNZ i , w i ) from L f .Add sampled edge(s) to O.

Mixing context
Type of interaction Daily transmission probability (q g j→i ) Household (size 2) Any to child (0 - where σ(i) is defined based on the age of agent i using a piece-wise approach: for adults (age > 18), σ a = 0.67; and for children (age ≤ 18), σ c = 0.268, following previous studies [13,15,18,21].Asymptomatic agents in the model have a lower infectivity compared to their symptomatic counterparts, governed by a scaling factor of α asymp (0 ≤ α asymp ≤ 1).
In reality, not all infection cases are detected, particularly when case detection relies on voluntary self-reporting.To capture this feature, the model employes two variables adjusting the extent of case detection as the probability of detecting symptomatic cases (π symp ) and the probability of detecting asymptomatic cases (π asymp ).
In this study, we model the transmission of three COVID-19 variants of concern in a wide of range of scenarios in combination with different intervention policies.Some epidemiological parameters are set differently across the three variants, while others are kept constant for all variants.A summary of parameterisation is provided in Table C.2.

C.2 Non-pharmaceutical interventions
The model incorporates various non-pharmaceutical interventions (NPIs) aimed to mitigate SARS-CoV-2 transmission, including case isolation (CI), home quarantine (HQ), school closures (SC), and social distancing (SD).Each NPI is characterised by: (i) macro-distancing level, setting a fraction of the population that adopts, or complies with, the intervention, and (ii) micro-distancing parameters which quantify the adjusted (usually reduced) interaction strengths General, V E c 0.5 [49] General, V E s 0.293 derived Table C.4: Simulation parameters for the preemptive vaccine rollout simulated within the ABM.
Moderna; and the "general" vaccine with a medium efficacy (General, V E c = 0.5), representing ChAdOx1 nCoV-19 (Oxford/AstraZeneca).These values align with clinical observations following booster vaccination against the Omicron variant [49].For simplicity, we use the clinical vaccine efficacy against Omicron as a reference point for all considered variants, aiming to investigate the impact of imperfect vaccines in a comparative way.
We further decompose V E c into two efficacy components: the susceptibility-reducing efficacy (V E s ) and the diseasepreventing efficacy (V E d ), following [13,15,21]: with the efficacy components set as specified in Table C.4.
We also consider the transmission-limiting efficacy (V E t ) set at V E t = 0.4, observed for the considered vaccines against the Omicron variant [50].We carried out the sensitivity analysis in terms of vaccine efficacy components in prior studies [13,21], testing a range of V E t and V E c values and showing robustness of the model to changes in these efficacy components.The model parameterisation with respect to vaccination is provided in Table C.4.
The transmission probability of infecting a susceptible agent i, accounting for both vaccination and NPIs, is derived as follows: where for all vaccinated agents j: V E t j = V E t , V E s j = V E s and V E d j = V E d , and for all unvaccinated agents j: V E t j = V E s j = V E d j = 0.The values of V E s , V E d , and V E t are selected according to the vaccine type (priority or general) allocated to the agent.
The probability of an agent experiencing illness (symptomatic) is further influenced by the disease-preventing efficacy, (V E d i ), as follows: given the fractions of symptomatic adults and children, denoted by σ a and σ c respectively.

C.4 Sensitivity analysis
We have carried out an extensive sensitivity analysis for a wide range of model parameters and multiple variants of concern, including different profiles of the disease natural history [13,18,26], the global transmission scalar [15,18,26], the asymptomatic infectivity [15,18,26], the fractions of symptomatic adults and children [13,15,18,26], the NPI compliance levels (e.g., SD compliance [13,15,18]), the SD intervention threshold [26], the interaction strength in different mixing contexts (at home, work, or community [18]), and the vaccination coverage and vaccine efficacy components [18,21].These studies demonstrated robustness of the model to parameter changes within acceptable ranges supported by epidemiological evidence.
Changes in population size amplify incidence peak more than changes in density.

E.2 Effects of urbanisation on pandemic spread
Over the last five years, the Australian population has increased in both urban and non-urban (i.e., regional and rural) areas, leading to redefinition of SA2 boundaries (see Table E.2).There are several non-GCC cities in Australia with international airports (see Figure A.4 in Supplementary Material, shown in green).The inclusion of the SA2 areas surrounding these cities into analysis of the urban areas does not affect the observations reported in Section 3.2 of main manuscript.In this section, we compare the effects of growing urbanisation on pandemic dynamics for Greater Capital Cities (GCC) considered without and with cities having international airports (AP), as shown by Figures E. 4

Figure 2 :
Figure 2: Five simulated intervention policy scenarios.PRE-VAC: preemptive vaccination prior to the pandemic.NPIs: non-pharmaceutical interventions.Policies are considered to be more stringent moving from left to right.The macroand micro-parameters for NPI-related policies are summarised in Appendix Table C.3.Parameters relating to the vaccination coverage and vaccine efficacy are summarised in Appendix Table C.4.

Figure 3 :
Figure 3: Pandemic Lorenz curves measuring inequality in distribution of the pandemic severity.The black line represents the line of equality where each SA2 contributes equally to the cumulative incidence.A curve closer to the line of equality (i.e., Lorenz Curve A, shown in red) indicates that the contributions of SA2 residential areas towards the aggregate cumulative incidence in response to a specific intervention policy A are more equally distributed than the contributions of these areas under policy B which are traced by the curve shaped further away from the line of equality (i.e., Lorenz Curve B, shown in blue).

Figure 4 :
Figure 4: Impact of different intervention policies on pandemic severity for three considered variants simulated for two census years (top row: 2016; middle row: 2021; bottom row: relative change between years).Each column compares the impact of five intervention policies for one variant of concern: (a) ancestral; (b) Delta; (c) Omicron.See Figure 2 for a detailed description of the considered intervention policies.Coloured shaded areas around solid lines show standard deviation.Each profile corresponds to one intervention policy and is computed as the average over 100 runs.

5 Figure 5 :
Figure 5: A comparison of pandemic severity for different policies across three considered variants (ancestral: blue; Delta: green; Omicron: red) and two census years (solid line: 2021; dashed line: 2016).The severity of each variant is measured by cases per million (top row).The change in incidence (bottom row) is calculated as the difference of incidence cases per million between two census years.Each column compares the impact of three variants for one intervention policy: (a) Policy 1; (b) Policy 4; and (c) Policy 5. See Figure 2 for a detailed description of the considered intervention policies.Coloured shaded areas around solid lines (in bottom row) show standard deviation.Each profile (solid and dashed lines) corresponds to one intervention policy and is computed as the average over 100 runs.

Figure 7 (
a) and (b) show that for the baseline scenario without any interventions (Policy 1) the pandemic effects are distributed equally across the areas, for all considered variants simulated for both census years (Appendix Figure E.3 provides a different layout of the same results).

Figure 6 :
Figure 6:  Positive correlation between the usual residential population difference and the peak incidence difference between 2016 and 2021 at SA2 resolution for three considered variants: Ancestral (blue), Delta (green), and Omicron (red).Dashed lines represent linear fitting for each of the profiles (see Appendix TableE.1 for statistical analysis).Data points corresponding to each SA2 are computed as the average over 100 runs.Total number of overlapping SA2 between 2016 and 2021 census years: 2147.Pearson correlation coefficients: r Ancestral = 0.7717, r Delta = 0.6447, r Omicron = 0.9002.

Figure 7 :
Figure 7: Pandemic Lorenz curves measuring distribution of pandemic effects across SA2 areas for considered variants, years and policies.Each column compares the impact of five intervention policies for one variant: (a) ancestral; (b) Delta; (c) Omicron.Top raw: 2016; bottom raw: 2021.Refer to Figure2for a detailed description of the considered intervention policies.Each profile corresponds to one intervention policy and is computed as the average over 100 runs.

Figure 8 :
Figure 8: Comparison of pandemic waves in Greater Capital Cities (GCCs) and all other areas.Each column compares the baseline scenario (Policy 1) for 2016 and 2021 census data, for a variant of concern: (a) ancestral; (b) Delta; (c) Omicron.Each profile is computed as the average over 100 runs.

Figure 9 :
Figure 9: Effects of school closures combined with NPIs for three considered variants (log scale): (a) ancestral; (b) Delta; (c) Omicron.School closures effectively control the spread of Delta variant, producing a sharp difference in incidence.Such a bifurcation is not observed for the ancestral and Omicron variants.Each profile corresponds to one intervention policy and is computed as the average over 100 runs.Appendix Figure E.7 shows these plots on linear scale.
Figure A.1 shows the geographical representation of the states and their corresponding capital cities which comprise about two thirds of Australian population.

Figure A. 1 :
Figure A.1: Snapshot of the Australian population: locations and geographical areas.Capital city of each state is annotated with their usual residential population (P) and geographical area (A) in 2016 (in red) and in 2021 (in blue) respectively.The state names are abbreviated as follows: Western Australia (WA), Northern Territory (NT), South Australia (SA), Queensland (QLD), New South Wales (NSW), Victoria (VIC) and Tasmania (TAS).Note that changes in the size of geographical areas are due to boundary changes.
Figure A.2 (a) and Figure A.3 illustrate a stark contrast between the population-dense GCCs (Sydney, NSW GCC; and Melbourne, VIC GCC) and cities with noticeably lower population density, e.g., Hobart (TAS GCC).This large variability also suggests a high heterogeneity within the GCC population.

Figure A. 2 :Figure A. 3 :Figure A. 4 :
Figure A.2: Comparison of residential population density and working population density in different states and territories (ST/T) of Australia partitioned by GCCs (a), and other areas (b).Note that the population density scale in (b)is more than 50 times lower than that in (a).The Australian Capital Territory (ACT) predominantly comprises urban areas and thus does not show a comparable density for non-GCC areas.

Figure B. 1 :
Figure B.1: Visualisation of the discrepancies at different resolutions observed in 2021 census commuting data.Each subplot traces combinations between UR and POW for all possible resolutions.Across the UR ↔ POW flows, considered at comparable or higher resolutions (i.e., SA3-DZN, SA2-DZN, and SA1-DZN), the edge numbers are noticeably lower towards the right hand side on x-axis, with a reduced number of commuters (y-axis).

Figure B. 2 :
Figure B.2: Visualisation of commuting patterns between usual residences (URs) and places of work (POWs), using 2021 census data.Each commuting flow is represented by an edge connecting the centroids of the corresponding UR and POW.Red lines represent the commuting edges between URs (in SA1 level) and POWs (in DZN level) directly taken from 2021 census, while blue lines depict the reconstructed edges created by our algorithm to match the aggregate totals.
Figure B.3.a).The re-sampling process reduces the number of unassigned commuters in the SA2-SA2 TTW network from 3,469,391 to 835,534, as shown in Figure B.3.(b).
Figure B.3 (c) and B.3 (d).The resultant commuting patterns generated from 2021 census are shown in Figure B.2 with the Greater Capital Cities (GCC) annotated within each state.

Figure B. 3 :
Figure B.3: Results of network reconstruction using the 2021 TTW data.(a) Number of commuters between the reference TTW SA2-SA2 network (from ABS Data), the TTW SA1-DZN network (from ABS Data), the TTW SA1-DZN network (refined by the original algorithm of Fair et al. [37], adapted for census 2021), and the TTW SA1-DZN network (refined by the optimised algorithm).(b) Number of unassigned commuters, counted in SA2-SA2 network but not in SA1-DZN network, over refining iterations.(c) Mean squared error between the SA2-SA2 TTW network from ABS and the reconstructed SA2-SA2 TTW network from the refined SA1-DZN networks.(d) Pearson's correlation coefficient between the SA2-SA2 TTW network from ABS and the reconstructed SA2-SA2 TTW network from the refined SA1-DZN networks.
From [I,O]: determine the remaining unassigned working adults (not in D ∪ O) in DZN i : R(DZN i ).e. From [F,O]: determine the remaining unassigned commuters (not in D ∪O) in the UR-POW (SA2 UR,i − SA2 POW ) TTW edge: R(SA2 UR,i , SA2 POW ). f

( 4 )
Concatenate O and D to reconstruct the SA1-DZN TTW network.

Figure B. 4 :
Figure B.4: Weighted linear regression models for the considered airports across Australia.The incoming international passengers (IIP) numbers in previous years up to 2019 are shown in blue; the projected IIP numbers in 2021 are shown in red.Solid red line represents the fitted linear regression model.

Figure E. 2 : 6 .
Figure E.2:Correlation between the usual residential population density difference and the peak-incidence difference, computed between 2016 and 2021 at the SA2 resolution for the three considered variants: ancestral (blue), Delta (green), and Omicron (red).Data points corresponding to each SA2 area are derived as the averages over 100 runs.Dashed lines represent linear fit for each of the profiles.Total number of overlapping SA2 areas between 2016 and 2021 census years: 2,147.Pearson correlation coefficients: r ancestral = 0.4150, r Delta = 0.3426, r Omicron = 0.4509.These are lower correlation coefficients than the ones obtained for the population size difference, as reported in main manuscript, Figure6.

5 Figure E. 3 :
Figure E.3: Pandemic Lorenz curves for the considered policies across the three variants and two census years.Each row compares the impact of a policy using 2016 census (left) and 2021 census (right) for three variants.Refer to Figure 2 in the main manuscript for a detailed description of the considered intervention policies.Each profile corresponds to one intervention policy and is computed as the average over 100 runs.

5 Figure E. 4 :( 5 Figure E. 5 :
Figure E.4: GCCs versus other non-urban areas.Effects of urbanisation on pandemic dynamics for the considered policies, across three variants, using 2016 and 2021 census years.We partition SA2 areas in Australia as Greater Capital Cities (GCCs), and other non-urban areas.Each column compares the impact of five intervention policies for a variant of concern: (a) ancestral; (b) Delta; (c) Omicron.Each plot is averaged over 100 Table C.3.Parameters relating to the vaccination coverage and vaccine efficacy are summarised in Appendix Table C.4.
Each state is further divided into different levels of statistical areas (without gaps or overlaps) following a nested hierarchical framework of geographic levels[40].Each succeeding level is a breakdown of the previous level into smaller areas defined as follows: States and Territories (ST/T), Statistical Area Level 4 (SA4), Statistical Area Level 3 (SA3), Statistical Area Level 2 (SA2), Statistical Area Level 1 (SA1), and Mesh Blocks (MB).TableA.1 summarises the number of statistical areas at each level and the corresponding usual residential population range.Note that 2021 census reported more statistical areas across all levels due to changes in the partitioning approach.Due to merging and splitting of areas, the new partitioning approach resulted in a net increase of 7.5% for the number of SA1 areas and 7% increase for the number of SA2 areas.
Table A.1: The hierarchical levels in Australian census structure.The population for each level refers to the usual residential population, excluding visitors. .2).

Table B .
3 Sydney, Melbourne, Brisbane, and Adelaide, shown in FigureB.4) have a sustained inflow of IIP since 2003, small airports (e.g., Darwin) may not have a continuous IIP until the more recent years.In such cases, we could not produce an accurate weighted linear model due to the lack of data; instead, we estimated the IIP in 2021 by averaging the IIP in available years.A comprehensive summary of the estimation methods employed for each international airport is provided in TableB.3.The linear regression model for eligible airports is shown in Figure B.4. 3: Estimated number of incoming international passengers (IIP) in the international airports across in Australia in 2021, projected from the pre-pandemic BITRE data.The estimated IIP is used for the initial infection seeding (see Section3).An edge weight in the SA1-DZN travel-to-work network, representing the number of commuters living in a SA1 and working in a DZN.N SA1 : Total number of commuters living in a SA1, or working population size of a SA1.N SA1 is grouped into bins, used in the construction of P (w|N SA1 ).An illustrative example of the distribution P (w|N SA1 ) is shown below.Each column represents a distribution of edge weight given a bin value of N SA1 .Repeat until convergence.For each DZN i in I: a. From K: search for SA2 POW that DZN i locates.
(2) Construct lists of additional partial edge candidates (SA1,w) for each SA1 based on P (w|N SA1 ) and [C,G]:For each SA1 i listed in G: a. From C: determine its bin size N SA1i .b. From G: determine the number of working adults residing in SA1 i , but not assigned a DZN: R(SA1 i ).c. Determine P (w|N SA1i ) from step 1.d.Construct L(SA1 i ) = w i w i ∼ P (w|N SA1i ) wi∈L(SA1i) w i ≤ R(SA1 i )(3)Construct additional complete edges (SA1 i , DZN i , w i ) based on [E,I,K,L] and L(SA1 i ): b.From E: construct L

Table C .
1: Daily transmission probabilities q g j→i from infected agent j to susceptible agent i for different mixing contexts and interaction types.Numbers in brackets show age groups.

Table E .
and E.5 respectively.We also examine pandemic Lorenz curves in context of this urbanisation (Figure E.6).2: Population distribution of 2016 and 2021 census years, following the urban and non-urban partition explained in Section 3.2 in main manuscript, for Greater Capital Cities (GCCs) and cities with international airports (APs).