Fitting COVID data

I propose here a simple method to fit experimental data common to epidemiological spreads, such as the present COVID-19 pandemic, using the inverse gaussian distribution. This follows the general incompregension of my answer to the question Is the COVID-19 pandemic curve a Gaussian curve? on StackOverflow. My initial point is to say that a Gaussian is not adapted as it handles a distribution on real numbers, while such a curve (the variable being number of days) handles numbers on the half line. Inspired by the excellent A Theory of Reaction Time Distributions by Dr Fermin Moscoso del Prado Martin a constructive approach is to propose another distribution, such as the inverse Gaussian distribution.

This notebook develops this same idea on real data and proves numerically how bad the Gaussian fit is compared to the latter. Thinking about the importance of doing a proper inference in such a case, I conclude

In this notebook, I define a fitting method using pytorch which fits data to a the inverse gaussian distribution :

$$ f ( x ; \mu , \lambda ) = \sqrt {\frac {\lambda }{2\pi x^3}} \exp (-\frac {\lambda (x-\mu )^{2}}{2\mu ^{2}x} ) $$

To learn more about some properties / implementation of this pdf, see scipy.stats documentation.

In [1]:
from __future__ import division, print_function
import numpy as np
np.set_printoptions(precision=6, suppress=True)
import os
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt
phi = (np.sqrt(5)+1)/2
fig_width = 10
figsize = (fig_width, fig_width/phi)

getting real COVID data

retrieve data

We will here use data provided openly by the french government: https://github.com/opencovid19-fr/data and use the following URL:

In [2]:
URL, data_cache = 'https://raw.githubusercontent.com/opencovid19-fr/data/master/dist/chiffres-cles.json', '/tmp/covid_fr.json'
In [3]:
import pandas as pd

Let's cache the data locally to avoid downloading it again and again:

In [4]:
try:
    df = pd.read_json(data_cache)
    print('loading cache')
except:
    df = pd.read_json(URL)
    print('loading from internet')
    df.to_json(data_cache)
loading from internet
In [5]:
df
Out[5]:
date source sourceType nom code casConfirmes hospitalises deces donneesRegionales paysTouches ... hospitalise hospitalisesReadaptation hospitalisesAuxUrgences hospitalisesConventionnelle reanimations capaciteReanimation decesEhpad casEhpad casConfirmesEhpad casPossiblesEhpad
0 2020-01-24 {'nom': 'ARS Nouvelle-Aquitaine', 'url': 'http... agences-regionales-sante Charente DEP-16 0.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 2020-01-24 {'nom': 'ARS Nouvelle-Aquitaine', 'url': 'http... agences-regionales-sante Charente-Maritime DEP-17 0.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 2020-01-24 {'nom': 'ARS Nouvelle-Aquitaine', 'url': 'http... agences-regionales-sante Corrèze DEP-19 0.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 2020-01-24 {'nom': 'ARS Nouvelle-Aquitaine', 'url': 'http... agences-regionales-sante Creuse DEP-23 0.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 2020-01-24 {'nom': 'ARS Nouvelle-Aquitaine', 'url': 'http... agences-regionales-sante Dordogne DEP-24 0.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
28966 2020-10-14 {'nom': 'OpenCOVID19-fr'} opencovid19-fr Nouvelle-Aquitaine REG-75 NaN 443.0 555.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
28967 2020-10-14 {'nom': 'OpenCOVID19-fr'} opencovid19-fr Occitanie REG-76 NaN 660.0 664.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
28968 2020-10-14 {'nom': 'OpenCOVID19-fr'} opencovid19-fr Auvergne-Rhône-Alpes REG-84 NaN 1432.0 2062.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
28969 2020-10-14 {'nom': 'OpenCOVID19-fr'} opencovid19-fr Provence-Alpes-Côte d'Azur REG-93 NaN 956.0 1244.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
28970 2020-10-14 {'nom': 'OpenCOVID19-fr'} opencovid19-fr Corse REG-94 NaN 31.0 68.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

28971 rows × 31 columns

Let's convert the date column into a proper date:

In [6]:
df['date'] = pd.to_datetime(df['date'])
df 
Out[6]:
date source sourceType nom code casConfirmes hospitalises deces donneesRegionales paysTouches ... hospitalise hospitalisesReadaptation hospitalisesAuxUrgences hospitalisesConventionnelle reanimations capaciteReanimation decesEhpad casEhpad casConfirmesEhpad casPossiblesEhpad
0 2020-01-24 {'nom': 'ARS Nouvelle-Aquitaine', 'url': 'http... agences-regionales-sante Charente DEP-16 0.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 2020-01-24 {'nom': 'ARS Nouvelle-Aquitaine', 'url': 'http... agences-regionales-sante Charente-Maritime DEP-17 0.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 2020-01-24 {'nom': 'ARS Nouvelle-Aquitaine', 'url': 'http... agences-regionales-sante Corrèze DEP-19 0.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 2020-01-24 {'nom': 'ARS Nouvelle-Aquitaine', 'url': 'http... agences-regionales-sante Creuse DEP-23 0.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 2020-01-24 {'nom': 'ARS Nouvelle-Aquitaine', 'url': 'http... agences-regionales-sante Dordogne DEP-24 0.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
28966 2020-10-14 {'nom': 'OpenCOVID19-fr'} opencovid19-fr Nouvelle-Aquitaine REG-75 NaN 443.0 555.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
28967 2020-10-14 {'nom': 'OpenCOVID19-fr'} opencovid19-fr Occitanie REG-76 NaN 660.0 664.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
28968 2020-10-14 {'nom': 'OpenCOVID19-fr'} opencovid19-fr Auvergne-Rhône-Alpes REG-84 NaN 1432.0 2062.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
28969 2020-10-14 {'nom': 'OpenCOVID19-fr'} opencovid19-fr Provence-Alpes-Côte d'Azur REG-93 NaN 956.0 1244.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
28970 2020-10-14 {'nom': 'OpenCOVID19-fr'} opencovid19-fr Corse REG-94 NaN 31.0 68.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

28971 rows × 31 columns

... and extract the different data codes:

In [7]:
df['code'].unique()
Out[7]:
array(['DEP-16', 'DEP-17', 'DEP-19', 'DEP-23', 'DEP-24', 'DEP-33',
       'DEP-40', 'DEP-47', 'DEP-64', 'DEP-79', 'DEP-86', 'DEP-87', 'FRA',
       'REG-11', 'REG-75', 'WORLD', 'DEP-34', 'DEP-74', 'REG-84',
       'REG-27', 'DEP-02', 'DEP-25', 'DEP-59', 'DEP-60', 'DEP-62',
       'DEP-80', 'DEP-90', 'REG-32', 'REG-44', 'DEP-21', 'DEP-29',
       'DEP-44', 'DEP-67', 'DEP-06', 'DEP-49', 'DEP-53', 'DEP-76',
       'REG-01', 'REG-02', 'REG-03', 'REG-04', 'REG-06', 'REG-24',
       'REG-28', 'REG-52', 'REG-53', 'REG-76', 'REG-93', 'REG-94',
       'DEP-35', 'COM-977', 'COM-978', 'DEP-56', 'DEP-72', 'DEP-01',
       'DEP-08', 'DEP-10', 'DEP-27', 'DEP-51', 'DEP-52', 'DEP-54',
       'DEP-55', 'DEP-57', 'DEP-68', 'DEP-69', 'DEP-88', 'DEP-03',
       'DEP-07', 'DEP-15', 'DEP-26', 'DEP-30', 'DEP-38', 'DEP-42',
       'DEP-43', 'DEP-63', 'DEP-71', 'DEP-73', 'DEP-12', 'DEP-13',
       'DEP-22', 'DEP-28', 'DEP-37', 'DEP-70', 'DEP-84', 'DEP-973',
       'DEP-05', 'DEP-14', 'DEP-18', 'DEP-2A', 'DEP-2B', 'DEP-31',
       'DEP-36', 'DEP-41', 'DEP-45', 'DEP-50', 'DEP-75', 'DEP-77',
       'DEP-78', 'DEP-83', 'DEP-91', 'DEP-92', 'DEP-93', 'DEP-94',
       'DEP-95', 'DEP-972', 'DEP-39', 'DEP-46', 'DEP-81', 'DEP-82',
       'DEP-85', 'DEP-89', 'DEP-11', 'DEP-58', 'DEP-61', 'DEP-04',
       'DEP-32', 'COM-987', 'COM-974', 'DEP-65', 'DEP-971', 'DEP-974',
       'DEP-66', 'DEP-48', 'DEP-976', 'DEP-09', 'COM-988', 'COM-986'],
      dtype=object)

filtering data

Selecting one region:

In [9]:
code = 'FRA'
code = 'REG-93'
df_loc = df[df['code']==code]
sourceType = "ministere-sante" sourceType = 'agences-regionales-sante' df_loc = df_loc[df_loc['sourceType']==sourceType]

Selecting the two columns of interest:

In [10]:
df_loc = df_loc[['date', 'deces']]
df_loc
Out[10]:
date deces
214 2020-02-28 NaN
292 2020-03-02 NaN
347 2020-03-03 NaN
434 2020-03-04 0.0
435 2020-03-04 NaN
... ... ...
28485 2020-10-10 1212.0
28606 2020-10-11 1215.0
28727 2020-10-12 1226.0
28848 2020-10-13 1237.0
28969 2020-10-14 1244.0

257 rows × 2 columns

Removing that containing NaNs:

In [11]:
df_loc = df_loc[df_loc['deces'].notna()]

Selecting a date range:

In [12]:
start_date = '2020-02-23'
df_loc['date'] > start_date
Out[12]:
434      True
532      True
658      True
765      True
869      True
         ... 
28485    True
28606    True
28727    True
28848    True
28969    True
Name: date, Length: 234, dtype: bool
In [13]:
df_loc = df_loc[start_date < df_loc['date']]
In [14]:
stop_date = '2020-06-23'
df_loc = df_loc[df_loc['date']<stop_date]
In [15]:
fig, ax = plt.subplots(figsize=figsize)
df_loc.plot(x='date', y='deces', ax=ax)
ax.set_title('Cumulative number of deaths');

returning a numpy array

In [16]:
death = np.array(df_loc['deces'])
df_loc['date'], death
Out[16]:
(434     2020-03-04
 532     2020-03-05
 658     2020-03-06
 765     2020-03-07
 869     2020-03-08
            ...    
 14400   2020-06-18
 14521   2020-06-19
 14642   2020-06-20
 14763   2020-06-21
 14884   2020-06-22
 Name: date, Length: 117, dtype: datetime64[ns],
 array([  0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   4.,
          7.,   7.,   9.,   9.,  11.,  11.,  13.,  13.,  15.,  14.,  20.,
         20.,  26.,  26.,  33.,  33.,  44.,  44.,  48.,  55.,  65.,  80.,
        103., 124., 141., 161., 178., 195., 231., 253., 269., 286., 299.,
        317., 328., 348., 374., 393., 421., 447., 463., 470., 499., 526.,
        561., 575., 610., 625., 631., 656., 665., 682., 696., 704., 708.,
        713., 732., 746., 756., 764., 770., 773., 777., 797., 805., 816.,
        824., 831., 836., 840., 849., 855., 862., 869., 874., 876., 878.,
        884., 889., 890., 893., 898., 898., 898., 898., 908., 911., 912.,
        913., 913., 913., 917., 919., 920., 922., 923., 923., 923., 925.,
        927., 927., 929., 933., 935., 935., 936.]))
In [17]:
death.shape, np.diff(death).shape
Out[17]:
((117,), (116,))
In [18]:
death[1:] = np.diff(death)
death[0] = 0
In [19]:
death
Out[19]:
array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  4.,  3.,  0.,
        2.,  0.,  2.,  0.,  2.,  0.,  2., -1.,  6.,  0.,  6.,  0.,  7.,
        0., 11.,  0.,  4.,  7., 10., 15., 23., 21., 17., 20., 17., 17.,
       36., 22., 16., 17., 13., 18., 11., 20., 26., 19., 28., 26., 16.,
        7., 29., 27., 35., 14., 35., 15.,  6., 25.,  9., 17., 14.,  8.,
        4.,  5., 19., 14., 10.,  8.,  6.,  3.,  4., 20.,  8., 11.,  8.,
        7.,  5.,  4.,  9.,  6.,  7.,  7.,  5.,  2.,  2.,  6.,  5.,  1.,
        3.,  5.,  0.,  0.,  0., 10.,  3.,  1.,  1.,  0.,  0.,  4.,  2.,
        1.,  2.,  1.,  0.,  0.,  2.,  2.,  0.,  2.,  4.,  2.,  0.,  1.])
In [20]:
df_loc['death'] = death
df_loc
Out[20]:
date deces death
434 2020-03-04 0.0 0.0
532 2020-03-05 0.0 0.0
658 2020-03-06 0.0 0.0
765 2020-03-07 0.0 0.0
869 2020-03-08 0.0 0.0
... ... ... ...
14400 2020-06-18 929.0 2.0
14521 2020-06-19 933.0 4.0
14642 2020-06-20 935.0 2.0
14763 2020-06-21 935.0 0.0
14884 2020-06-22 936.0 1.0

117 rows × 3 columns

df_loc['dead per day'] = df_loc['deces']
In [21]:
fig, ax = plt.subplots(figsize=figsize)
df_loc.plot(x='date', y='death', ax=ax)
ax.set_title('Daily rate of deaths');

performing the fit

converting data to fit

In [22]:
X, y = df_loc['date'], df_loc['death']
In [23]:
X.shape, y.shape
Out[23]:
((117,), (117,))
In [24]:
X = X.astype(int)
X = np.array(X)
X
Out[24]:
array([1583280000000000000, 1583366400000000000, 1583452800000000000,
       1583539200000000000, 1583625600000000000, 1583712000000000000,
       1583798400000000000, 1583884800000000000, 1583971200000000000,
       1584057600000000000, 1584403200000000000, 1584489600000000000,
       1584489600000000000, 1584576000000000000, 1584576000000000000,
       1584662400000000000, 1584662400000000000, 1584748800000000000,
       1584748800000000000, 1584835200000000000, 1584835200000000000,
       1584921600000000000, 1584921600000000000, 1585008000000000000,
       1585008000000000000, 1585094400000000000, 1585094400000000000,
       1585180800000000000, 1585180800000000000, 1585267200000000000,
       1585353600000000000, 1585440000000000000, 1585526400000000000,
       1585612800000000000, 1585699200000000000, 1585785600000000000,
       1585872000000000000, 1585958400000000000, 1586044800000000000,
       1586131200000000000, 1586217600000000000, 1586304000000000000,
       1586390400000000000, 1586476800000000000, 1586563200000000000,
       1586649600000000000, 1586736000000000000, 1586822400000000000,
       1586908800000000000, 1586995200000000000, 1587081600000000000,
       1587168000000000000, 1587254400000000000, 1587340800000000000,
       1587427200000000000, 1587513600000000000, 1587600000000000000,
       1587686400000000000, 1587772800000000000, 1587859200000000000,
       1587945600000000000, 1588032000000000000, 1588118400000000000,
       1588204800000000000, 1588291200000000000, 1588377600000000000,
       1588464000000000000, 1588550400000000000, 1588636800000000000,
       1588723200000000000, 1588809600000000000, 1588896000000000000,
       1588982400000000000, 1589068800000000000, 1589155200000000000,
       1589241600000000000, 1589328000000000000, 1589414400000000000,
       1589500800000000000, 1589587200000000000, 1589673600000000000,
       1589760000000000000, 1589846400000000000, 1589932800000000000,
       1590019200000000000, 1590105600000000000, 1590192000000000000,
       1590278400000000000, 1590364800000000000, 1590451200000000000,
       1590537600000000000, 1590624000000000000, 1590710400000000000,
       1590796800000000000, 1590883200000000000, 1590969600000000000,
       1591056000000000000, 1591142400000000000, 1591228800000000000,
       1591315200000000000, 1591401600000000000, 1591488000000000000,
       1591574400000000000, 1591660800000000000, 1591747200000000000,
       1591833600000000000, 1591920000000000000, 1592006400000000000,
       1592092800000000000, 1592179200000000000, 1592265600000000000,
       1592352000000000000, 1592438400000000000, 1592524800000000000,
       1592611200000000000, 1592697600000000000, 1592784000000000000])
In [25]:
INTS_PER_DAY = 86400000000000 
In [26]:
X = (X-X[0])//INTS_PER_DAY
In [27]:
np.array(X)
X
Out[27]:
array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  13,  14,  14,
        15,  15,  16,  16,  17,  17,  18,  18,  19,  19,  20,  20,  21,
        21,  22,  22,  23,  24,  25,  26,  27,  28,  29,  30,  31,  32,
        33,  34,  35,  36,  37,  38,  39,  40,  41,  42,  43,  44,  45,
        46,  47,  48,  49,  50,  51,  52,  53,  54,  55,  56,  57,  58,
        59,  60,  61,  62,  63,  64,  65,  66,  67,  68,  69,  70,  71,
        72,  73,  74,  75,  76,  77,  78,  79,  80,  81,  82,  83,  84,
        85,  86,  87,  88,  89,  90,  91,  92,  93,  94,  95,  96,  97,
        98,  99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110])
In [28]:
y = np.array(y)
y
Out[28]:
array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  4.,  3.,  0.,
        2.,  0.,  2.,  0.,  2.,  0.,  2., -1.,  6.,  0.,  6.,  0.,  7.,
        0., 11.,  0.,  4.,  7., 10., 15., 23., 21., 17., 20., 17., 17.,
       36., 22., 16., 17., 13., 18., 11., 20., 26., 19., 28., 26., 16.,
        7., 29., 27., 35., 14., 35., 15.,  6., 25.,  9., 17., 14.,  8.,
        4.,  5., 19., 14., 10.,  8.,  6.,  3.,  4., 20.,  8., 11.,  8.,
        7.,  5.,  4.,  9.,  6.,  7.,  7.,  5.,  2.,  2.,  6.,  5.,  1.,
        3.,  5.,  0.,  0.,  0., 10.,  3.,  1.,  1.,  0.,  0.,  4.,  2.,
        1.,  2.,  1.,  0.,  0.,  2.,  2.,  0.,  2.,  4.,  2.,  0.,  1.])

using torchor "when you have a new hammer, everything looks like a nail, seehttps://en.wikipedia.org/wiki/Law_of_the_instrument

Let's use the definition of the pdf:

$$ f ( x ; \mu , \lambda ) = \sqrt {\frac {\lambda }{2\pi x^3}} \exp (-\frac {\lambda (x-\mu )^{2}}{2\mu ^{2}x} ) $$

and a previous post where I used pyTorch to fit psychophysical data:

In [29]:
import torch
from torch.utils.data import TensorDataset, DataLoader

torch.set_default_tensor_type("torch.DoubleTensor")
criterion = torch.nn.MSELoss(reduction="sum")
# https://pytorch.org/docs/stable/generated/torch.nn.L1Loss.html#torch.nn.L1Loss
#criterion = torch.nn.L1Loss(reduction="sum")


class CovidRegressionModel(torch.nn.Module):
    def __init__(self, mu=80, tau=24, 
                 log_amp=torch.log(500*torch.ones(1)),
                 log_lambda=torch.log(100*torch.ones(1)),
                ):
        super(CovidRegressionModel, self).__init__()
        self.tau = torch.nn.Parameter(tau * torch.ones(1))
        self.mu = torch.nn.Parameter(mu * torch.ones(1))
        # when modeling a stricly positive number, a good habit is to use their log as the fitted parameter:
        self.log_amp = torch.nn.Parameter(log_amp * torch.ones(1))
        self.log_lambda = torch.nn.Parameter(log_lambda * torch.ones(1))

    def forward(self, x):
        out = (x > self.tau) * torch.exp(self.log_amp)
        date = (x-self.tau) # 
        date[x<=self.tau] = 1 # to avoid NaNs in the output value (and in the gradient)
        out *= torch.sqrt(torch.exp(self.log_lambda) / (2 * np.pi * date ** 3)) 
        out *= torch.exp(-torch.exp(self.log_lambda) * (date - self.mu) ** 2 / (2 * self.mu ** 2 * date))
        out[x<=self.tau] = 0 # overwriting values before the onset
        return out

learning_rate = 0.005
beta1, beta2 = 0.9, 0.999
betas = (beta1, beta2)
num_epochs = 2 ** 9 + 1
batch_size = 16
amsgrad = False # gives similar results
amsgrad = True  # gives similar results

def fit_data(
    X,
    y,
    learning_rate=learning_rate,
    batch_size=batch_size,  # gamma=gamma,
    num_epochs=num_epochs,
    betas=betas,
    verbose=False, **kwargs
):

    variables, labels = torch.Tensor(X[:, None]), torch.Tensor(y[:, None])
    loader = DataLoader(
        TensorDataset(variables, labels), batch_size=batch_size, shuffle=True
    )

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    covid_model = CovidRegressionModel()
    covid_model = covid_model.to(device)
    covid_model.train()
    optimizer = torch.optim.Adam(
        covid_model.parameters(), lr=learning_rate, betas=betas, amsgrad=amsgrad
    )
    for epoch in range(int(num_epochs)):
        covid_model.train()
        losses = []
        for variables_, labels_ in loader:
            variables_, labels_ = variables_.to(device), labels_.to(device)
            outputs = covid_model(variables_)
            loss = criterion(outputs, labels_)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            losses.append(loss.item())

        if verbose and (epoch % (num_epochs // 32) == 0):
            print(f"Iteration: {epoch} - Loss: {np.sum(losses)/len(variables):.5f}")

    covid_model.eval()
    variables, labels = torch.Tensor(X[:, None]), torch.Tensor(y[:, None])
    outputs = covid_model(variables)
    loss = criterion(outputs, labels).item() / len(variables)
    return covid_model, loss
In [30]:
covid_model, loss = fit_data(X, y, verbose=True)
print("Final loss =", loss)
Iteration: 0 - Loss: 95.98448
Iteration: 16 - Loss: 53.55947
Iteration: 32 - Loss: 33.12017
Iteration: 48 - Loss: 31.34294
Iteration: 64 - Loss: 30.80274
Iteration: 80 - Loss: 30.38041
Iteration: 96 - Loss: 30.07997
Iteration: 112 - Loss: 29.78837
Iteration: 128 - Loss: 29.52766
Iteration: 144 - Loss: 29.35324
Iteration: 160 - Loss: 29.20898
Iteration: 176 - Loss: 29.01493
Iteration: 192 - Loss: 28.93391
Iteration: 208 - Loss: 28.82388
Iteration: 224 - Loss: 28.73245
Iteration: 240 - Loss: 28.73530
Iteration: 256 - Loss: 28.59497
Iteration: 272 - Loss: 28.56916
Iteration: 288 - Loss: 28.49581
Iteration: 304 - Loss: 28.46384
Iteration: 320 - Loss: 28.41391
Iteration: 336 - Loss: 28.39930
Iteration: 352 - Loss: 28.39156
Iteration: 368 - Loss: 28.34303
Iteration: 384 - Loss: 28.30992
Iteration: 400 - Loss: 28.28816
Iteration: 416 - Loss: 28.26931
Iteration: 432 - Loss: 28.27551
Iteration: 448 - Loss: 28.23936
Iteration: 464 - Loss: 28.24632
Iteration: 480 - Loss: 28.21003
Iteration: 496 - Loss: 28.22573
Iteration: 512 - Loss: 28.19352
Final loss = 28.158525623907405
In [31]:
covid_model.eval()
outputs = covid_model(torch.Tensor(X[:, None]))
y_pred = outputs.detach().numpy()
In [32]:
fig, ax = plt.subplots(figsize=figsize)
ax.plot(X, y, '*', label='data')
ax.plot(X, y_pred, '--', label='fit')
ax.vlines(covid_model.mu.item(), y.min(), y.max(), colors='g', linestyles='--', label=f'$\mu$={covid_model.mu.item():.1f} (mean date)')
ax.vlines(covid_model.tau.item(), y.min(), y.max(), colors='r', linestyles='--', label=fr'$\tau$={covid_model.tau.item():.1f} (day of onset)')
ax.plot([], [], label=f'A={torch.exp(covid_model.log_amp).item():.1f} (amplitude)')
ax.plot([], [], label=f'$\lambda$={torch.exp(covid_model.log_lambda).item():.1f} (spread in days)')
ax.legend()
ax.set_xlabel(f'# number of days since {start_date}');
ax.set_ylabel(f'# number of dead per day');
ax.set_title('Fitting Daily rate of deaths (PACA-FR)');

some book keeping for the notebook

In [33]:
%load_ext watermark
%watermark -i -h -m -v -p numpy,pandas,matplotlib,torch  -r -g -b
2020-10-15T09:07:48+02:00

CPython 3.8.6
IPython 7.16.1

numpy 1.16.5
pandas 1.0.5
matplotlib 3.2.2
torch 1.6.0

compiler   : Clang 12.0.0 (clang-1200.0.32.2)
system     : Darwin
release    : 19.6.0
machine    : x86_64
processor  : i386
CPU cores  : 36
interpreter: 64bit
host name  : fortytwo
Git hash   : c82aa98290e82785ebf6339af5216c1ef0c50458
Git repo   : https://github.com/laurentperrinet/sciblog.git
Git branch : master