$$\begin{align}
totalCases_t & = a b^t \\
& = a(1+r)^t \\
& = a e^{kt}; \ \ (where \ k=ln(1+r)=ln(b)) \\
\end{align}$$
In these equations, $a$ is the total cases at the start (when $t=0$). The equation is subject to the constraints ($a>0, b>1, r>0, k>0$). Some of you will recognise $a(1+r)^t$ as the formula for compound interest. Compound interest is another form of exponential growth.
The interesting thing about exponential growth in cumulative cases is that it depends on exponential growth in new cases. Let's take a moment to prove this. We can solve the number of new cases for any point in time with a difference equation. For example, if our time period is days, then the number of new cases today is the total number of cases we have at the end of today, minus the total number of cases we had at the end of yesterday.
$$\begin{align}
newCases_t & = totalCases_t - totalCases_{t-1} \\
& = a b^t - a b^{t-1}\\
& = a(b^t - b^{t-1})\\
& = a((b-1)b^{t-1})\\
& = (ab-a) \left(\frac{b^t}{b}\right) \\
& = a\left( 1-\frac{1}{b}\right) b^t \\
& = c b^t; \ \ (where \ c=a\left( 1-\frac{1}{b}\right))
\end{align}$$
This last equation is demonstrably an exponential equation, in the same general form as the very first line in the previous block of equations. Also, it maintains the original constrains, as if ($a>0$, and $b>1$), then ($c>0$). Therefore, if the cumulative number of cases is growing exponentially over time, then the new cases on each day are also growing exponetially (QED).
We can quickly check our algebra with some python code.
In [1]: import pandas as pd ...: a = 7 # day 0 staring point - selected at random - must be greater than 0 ...: b = 1.5 # growth rate - selected at random - must be greater than 1 ...: t = pd.Series(list(range(1,11))) ...: total_cases_f = a * (b ** t) ...: nu_cases_diff = total_cases_f - total_cases_f.shift(1).fillna(a) ...: nu_cases_f = (a * (1 - 1/b)) * (b ** t) ...: print(pd.DataFrame([t, total_cases_f, nu_cases_diff, nu_cases_f], ...: index=['t', 'Total Cases', ...: 'New Cases (by difference)', ...: 'New Cases (by formula)']).T) ...: t Total Cases New Cases (by difference) New Cases (by formula) 0 1.0 10.500000 3.500000 3.500000 1 2.0 15.750000 5.250000 5.250000 2 3.0 23.625000 7.875000 7.875000 3 4.0 35.437500 11.812500 11.812500 4 5.0 53.156250 17.718750 17.718750 5 6.0 79.734375 26.578125 26.578125 6 7.0 119.601562 39.867188 39.867188 7 8.0 179.402344 59.800781 59.800781 8 9.0 269.103516 89.701172 89.701172 9 10.0 403.655273 134.551758 134.551758
And thankfully, it checks out. We get the same answer for new cases from the difference method, and from the exponential equation for new cases, we derived above.
Why all that boring math. Well when we over-plot total cases and new cases on the same chart (but to different scales), it is easy to pick if both series are growing exponentially, or in some other way.
We can demonstrate this with a couple of synthetic examples, and then apply what we have learnt from the synthetic charts to real data from the COVID-19 pandemic. For the synthetic data, we will produce two quick plots, the first will be an exponential growth of 20 per cent per day for thirty days. In the second chart, we will do exponential growth for 25 days, followed by linear growth for the last 10 days.
The first plot - of pure exponential growth - looks like this. The blue line of cumulative exponential growth (with a scale on the right-hand side) maps perfectly to the red bars of exponential growth in new cases reported each day (with a scale on the left-hand side). They overlap perfectly because the axes for the two charts are scaled to fill the available chart space (and because of the relationship identified above).
In the next chart, we have 25 days of exponential growth followed by five days of constant growth. Just because it looks like more red on the chart, notice that the left-hand axis only goes to 16 in this chart. It went to 40 in the previous chart. Also note that the blue-line stops curving upwards from day 25, and tracks in a linear fashion (remember $y=mx+b$ from high school).
The key thing to note when we have less than exponential growth, the chart has a lot of red above the dotted blue line for the cumulative total (which is scaled on the right-hand side)
Let's apply this observation to some real-world examples.
If we look at the growth in cases in the United States, it looks pretty close to being exponential. While there is some red above the blue line, that is probably a result of noise in the data, and perhaps constraints with making testing available.
Conversely, if we look at the growth in cases in Italy, we can see a lot of red above the blue line. The spike on 16 March, is probably making up for the limited additional cases reported on 15 March: a data reporting issue. But more generally, it looks like exponential growth progressed to around 22 March, and since then we have seen linear growth in the order of 5000 to 6000 new cases a day. While that might seem a lot, it is much less than it would have been if the prior exponential growth had continued unabated.
And that is why I am a fan of these charts that plot cumulative growth over time alongside new cases or new deaths for each day.
The full set of charts for each nation can be found on my GitHub gist site. But please note, you won't be able to view the gist from a mobile phone.
Finally, the python code I used to generate the synthetic charts above.
import numpy as np import pandas as pd import matplotlib.pyplot as plt plt.style.use('ggplot') def plot_incidence(name, incidence, cumulative, mode): assert((incidence.index==cumulative.index).all()) # plot daily incidence ax = plt.subplot(111) labels = [f'{int(p)}' for p in incidence.index] ax.bar(labels, incidence.values, color='#dd0000', label=f'New {mode}') ax.set_title(f'Synthetic: {name}') ax.set_xlabel('Day') ax.set_ylabel(None) # plot cumulative total axr = ax.twinx() axr.plot(labels, cumulative.values, lw=2.0, color='#0000dd', ls='--', label=f'Cumulative {mode}') axr.set_ylabel(None) axr.grid(False) # manually label the x-axis MAX_LABELS = 9 ticks = ax.xaxis.get_major_ticks() if len(ticks): modulus = int(np.floor(len(ticks) / MAX_LABELS) + 1) for i in range(len(ticks)): if i % modulus: ticks[i].label1.set_visible(False) # put in a legend ax.legend(loc='upper left') axr.legend(loc='upper center') # align the base of the left and right scales yliml = list(ax.get_ylim()) yliml[0] = 0 ax.set_ylim(yliml) ylimr = list(axr.get_ylim()) ylimr[0] = 0 axr.set_ylim(ylimr) # adjust plot width xspace = 0.01 * len(ticks) setting = (-0.5-xspace, len(ticks)-0.5+xspace) ax.set_xlim(*setting) axr.set_xlim(*setting) # start work on the broader canvas fig = ax.figure fig.set_size_inches(8, 4) fig.tight_layout(pad=1) # y-axis labels - the hard way lHeight = 0.96 lInstep = 0.02 fig.text(1.0-lInstep, lHeight, f'Cumulative {mode}', ha='right', va='top', fontsize=11, color='#333333') fig.text(lInstep, lHeight, f'New {mode}', ha='left', va='top', fontsize=11, color='#333333') # save - goodbye fig.savefig(f'./incidence-{name}.png', dpi=125) plt.close() # fake up the synthetic data n = 30 days = list(range(1,n+1)) t = pd.Series(days, index=days) a = 1 b = 1.2 cumulative = a * (b ** t) incidence = cumulative - cumulative.shift(1).fillna(a) plot_incidence('1 - Exponential Growth', incidence, cumulative, 'cases') incidence.iloc[25:] = incidence.iloc[24] cumulative = incidence.cumsum() plot_incidence('2 - Exponential then Linear Growth', incidence, cumulative, 'cases')
No comments:
Post a Comment