COVID-19 Overview

Tracking total coronavirus cases, deaths, and new cases by country. A detailed view is also provided for the US (by state) and Europe.

  • comments: true
  • author: Pratap Vardhan
  • categories: [overview, interactive]
  • image: images/covid-overview.png
  • permalink: /covid-overview/

In [1]:
#hide
print('''
Example of using Jupyter Notebook, pandas (data transformations), and jinja2 (HTML, visuals)
to create visual dashboards with fastpages.
See also the live version at https://gramener.com/enumter/covid19/
''')


Example of using Jupyter Notebook, pandas (data transformations), and jinja2 (HTML, visuals)
to create visual dashboards with fastpages.
See also the live version at https://gramener.com/enumter/covid19/


In [2]:
#hide
import numpy as np
import pandas as pd
from jinja2 import Template
from IPython.display import HTML

In [3]:
#hide

# FETCH
import getpass
base_url = 'https://raw.githubusercontent.com/pratapvardhan/notebooks/master/covid19/'
base_url = '' if (getpass.getuser() == 'Pratap Vardhan') else base_url
paths = {
    'mapping': base_url + 'mapping_countries.csv',
    'overview': base_url + 'overview.tpl'
}

def get_mappings(url):
    df = pd.read_csv(url)
    return {
        'df': df,
        'replace.country': dict(df.dropna(subset=['Name']).set_index('Country')['Name']),
        'map.continent': dict(df.set_index('Name')['Continent'])
    }

mapping = get_mappings(paths['mapping'])

def get_template(path):
    from urllib.parse import urlparse
    if bool(urlparse(path).netloc):
        from urllib.request import urlopen
        return urlopen(path).read().decode('utf8')
    # local path: read the template and close the file handle afterwards
    with open(path) as f:
        return f.read()

def get_frame(name):
    url = (
        'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/'
        f'csse_covid_19_time_series/time_series_covid19_{name}_global.csv')
    df = pd.read_csv(url)
    # rename countries
    df['Country/Region'] = df['Country/Region'].replace(mapping['replace.country'])
    return df

def get_dates(df):
    dt_cols = df.columns[~df.columns.isin(['Province/State', 'Country/Region', 'Lat', 'Long'])]
    LAST_DATE_I = -1
    # sometimes last column may be empty, then go backwards
    for i in range(-1, -len(dt_cols), -1):
        if not df[dt_cols[i]].fillna(0).eq(0).all():
            LAST_DATE_I = i
            break
    return LAST_DATE_I, dt_cols
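
The backwards scan in `get_dates` can be tried on a small frame. The sketch below is illustrative only: the frame `toy` and its values are made up, and the loop simply mirrors the one above rather than calling it.

```python
import numpy as np
import pandas as pd

# Hypothetical frame in the same layout: metadata columns, then one column per date.
# The last date column is entirely empty, as if the feed had not been filled in yet.
toy = pd.DataFrame({
    'Province/State': [np.nan, np.nan],
    'Country/Region': ['A', 'B'],
    'Lat': [0.0, 1.0],
    'Long': [0.0, 1.0],
    '3/23/20': [5, 2],
    '3/24/20': [7, 3],
    '3/25/20': [np.nan, np.nan],
})

meta = ['Province/State', 'Country/Region', 'Lat', 'Long']
toy_dt_cols = toy.columns[~toy.columns.isin(meta)]

# Walk backwards from the last date column until one holds a non-zero value.
last_i = -1
for i in range(-1, -len(toy_dt_cols), -1):
    if not toy[toy_dt_cols[i]].fillna(0).eq(0).all():
        last_i = i
        break

print(last_i, toy_dt_cols[last_i])
```

Here the empty `3/25/20` column is skipped, so the index points at `3/24/20`.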

In [4]:
#hide
COL_REGION = 'Country/Region'
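# Load confirmed cases and deaths (the recovered series is not used here)
df = get_frame('confirmed')
# dft_: time series, dfc_: per-country totals on the latest date, dfp_: per-country totals 5 days earlier
dft_cases = df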
dft_deaths = get_frame('deaths')
LAST_DATE_I, dt_cols = get_dates(df)

dt_today = dt_cols[LAST_DATE_I]
dt_5ago = dt_cols[LAST_DATE_I-5]

dfc_cases = dft_cases.groupby(COL_REGION)[dt_today].sum()
dfc_deaths = dft_deaths.groupby(COL_REGION)[dt_today].sum()
dfp_cases = dft_cases.groupby(COL_REGION)[dt_5ago].sum()
dfp_deaths = dft_deaths.groupby(COL_REGION)[dt_5ago].sum()

In [5]:
#hide
df_table = (pd.DataFrame(dict(Cases=dfc_cases, Deaths=dfc_deaths, PCases=dfp_cases, PDeaths=dfp_deaths))
             .sort_values(by=['Cases', 'Deaths'], ascending=[False, False])
             .reset_index())
for c in 'Cases, Deaths'.split(', '):
    df_table[f'{c} (+)'] = (df_table[c] - df_table[f'P{c}']).clip(0)  # DATA BUG: clip negative differences to 0
df_table['Fatality Rate'] = (100 * df_table['Deaths'] / df_table['Cases']).round(1)
df_table['Continent'] = df_table['Country/Region'].map(mapping['map.continent'])
df_table.head(15)


Out[5]:
    Country/Region  Cases  Deaths  PCases  PDeaths  Cases (+)  Deaths (+)  Fatality Rate  Continent
 0  China           81661    3285   81250     3253        411          32            4.0  Asia
 1  Italy           74386    7503   47021     4032      27365        3471           10.1  Europe
 2  US              65778     942   19100      244      46678         698            1.4  North America
 3  Spain           49515    3647   20410     1043      29105        2604            7.4  Europe
 4  Germany         37323     206   19848       67      17475         139            0.6  Europe
 5  Iran            27017    2077   19644     1433       7373         644            7.7  Asia
 6  France          25600    1333   12758      451      12842         882            5.2  Europe
 7  Switzerland     10897     153    5294       54       5603          99            1.4  Europe
 8  United Kingdom   9640     466    4014      178       5626         288            4.8  Europe
 9  South Korea      9137     126    8652       94        485          32            1.4  Asia
10  Netherlands      6438     357    3003      107       3435         250            5.5  Europe
11  Austria          5588      30    2388        6       3200          24            0.5  Europe
12  Belgium          4937     178    2257       37       2680         141            3.6  Europe
13  Canada           3251      30     943       12       2308          18            0.9  North America
14  Norway           3084      14    1914        7       1170           7            0.5  Europe
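
To make the derived columns concrete, here is a minimal sketch that repeats the same arithmetic on two rows. The Italy figures are taken from the table above; the second row, `Example`, is hypothetical and only shows what `.clip(0)` does when an earlier count is higher than the latest one.

```python
import pandas as pd

toy_table = pd.DataFrame({
    'Country/Region': ['Italy', 'Example'],
    'Cases': [74386, 200],
    'Deaths': [7503, 1],
    'PCases': [47021, 210],   # 'Example' has a higher earlier count (hypothetical)
    'PDeaths': [4032, 1],
})

# Change over the 5-day window, with negative differences clipped to 0.
for c in ['Cases', 'Deaths']:
    toy_table[f'{c} (+)'] = (toy_table[c] - toy_table[f'P{c}']).clip(0)

# Deaths as a percentage of confirmed cases, rounded to one decimal place.
toy_table['Fatality Rate'] = (100 * toy_table['Deaths'] / toy_table['Cases']).round(1)

print(toy_table)
```

For Italy this reproduces the values shown above: 27365 new cases, 3471 new deaths, and a fatality rate of 10.1.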

In [6]:
#hide
# world, china, europe, us
metrics = ['Cases', 'Deaths', 'Cases (+)', 'Deaths (+)']
s_china = df_table[df_table['Country/Region'].eq('China')][metrics].sum().add_prefix('China ')
s_us = df_table[df_table['Country/Region'].eq('US')][metrics].sum().add_prefix('US ')
s_eu = df_table[df_table['Continent'].eq('Europe')][metrics].sum().add_prefix('EU ')
summary = {'updated': pd.to_datetime(dt_today), 'since': pd.to_datetime(dt_5ago)}
summary = {**summary, **df_table[metrics].sum(), **s_china, **s_us, **s_eu}
summary


Out[6]:
{'updated': Timestamp('2020-03-25 00:00:00'),
 'since': Timestamp('2020-03-20 00:00:00'),
 'Cases': 467593,
 'Deaths': 21180,
 'Cases (+)': 195608,
 'Deaths (+)': 9886,
 'China Cases': 81661,
 'China Deaths': 3285,
 'China Cases (+)': 411,
 'China Deaths (+)': 32,
 'US Cases': 65778,
 'US Deaths': 942,
 'US Cases (+)': 46678,
 'US Deaths (+)': 698,
 'EU Cases': 248822,
 'EU Deaths': 14177,
 'EU Cases (+)': 119470,
 'EU Deaths (+)': 8116}
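
The summary dictionary relies on two small pandas features: `Series.add_prefix` to relabel the totals, and `**` unpacking of a Series into a dict. A minimal sketch on a three-row sample (the rows come from the table above, so the totals here cover only these three countries, not the world):

```python
import pandas as pd

sample = pd.DataFrame({
    'Country/Region': ['China', 'Italy', 'US'],
    'Cases': [81661, 74386, 65778],
    'Deaths': [3285, 7503, 942],
})
sample_metrics = ['Cases', 'Deaths']

# Totals for one country, relabelled as 'China Cases' and 'China Deaths'.
sample_china = (sample[sample['Country/Region'].eq('China')][sample_metrics]
                .sum().add_prefix('China '))

# A Series unpacks like a mapping: index labels become keys.
sample_summary = {**sample[sample_metrics].sum(), **sample_china}
print(sample_summary)
```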

In [7]:
#hide
dft_ct_cases = dft_cases.groupby(COL_REGION)[dt_cols].sum()
dft_ct_new_cases = dft_ct_cases.diff(axis=1).fillna(0).astype(int)
dft_ct_new_cases.head()


Out[7]:
1/22/20 1/23/20 1/24/20 1/25/20 1/26/20 1/27/20 1/28/20 1/29/20 1/30/20 1/31/20 ... 3/16/20 3/17/20 3/18/20 3/19/20 3/20/20 3/21/20 3/22/20 3/23/20 3/24/20 3/25/20
Country/Region
Afghanistan 0 0 0 0 0 0 0 0 0 0 ... 5 1 0 0 2 0 16 0 34 10
Albania 0 0 0 0 0 0 0 0 0 0 ... 9 4 4 5 6 6 13 15 19 23
Algeria 0 0 0 0 0 0 0 0 0 0 ... 6 6 14 13 3 49 62 29 34 38
Andorra 0 0 0 0 0 0 0 0 0 0 ... 1 37 0 14 22 13 25 20 31 24
Angola 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 1 1 0 1 0 0

5 rows × 64 columns
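
The daily new-case counts come from differencing the cumulative series along the date axis. A minimal sketch with made-up numbers (the countries `A` and `B` and their values are hypothetical):

```python
import pandas as pd

# Cumulative confirmed cases: one row per country, one column per date.
cumulative = pd.DataFrame(
    {'3/23/20': [10, 0], '3/24/20': [15, 2], '3/25/20': [15, 5]},
    index=pd.Index(['A', 'B'], name='Country/Region'))

# diff(axis=1) subtracts the previous date's column; the first column has no
# predecessor, so it becomes NaN and is filled with 0.
new_cases = cumulative.diff(axis=1).fillna(0).astype(int)
print(new_cases)
```

Note that the first date always shows 0 new cases under this approach, whatever the cumulative count on that day.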


In [8]:
#hide_input
template = Template(get_template(paths['overview']))
html = template.render(
    D=summary, table=df_table.head(20),  # REMOVE .head(20) to see all values
    newcases=dft_ct_new_cases.loc[:, dt_cols[LAST_DATE_I-50]:dt_cols[LAST_DATE_I]],
    np=np, pd=pd, enumerate=enumerate)
HTML(f'<div>{html}</div>')


Out[8]:
World
  Confirmed Cases: 467,593 (+195,608)
  Deaths: 21,180 (+9,886)

Updated on March 25, 2020 (+change since 5 days ago)

China
  Cases: 81,661 (+411)
  Deaths: 3,285 (+32)
Europe
  Cases: 248,822 (+119,470)
  Deaths: 14,177 (+8,116)
U.S.
  Cases: 65,778 (+46,678)
  Deaths: 942 (+698)

In the last 5 days, 195,608 new coronavirus cases have been reported worldwide, of which 119,470 (61%) are from Europe. China has reported 411 new cases in the last 5 days.
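
The Europe share quoted above can be checked directly from the summary values. This is a quick arithmetic check, not the template's own code; the variable names are illustrative.

```python
world_new = 195608    # 'Cases (+)' in the summary
europe_new = 119470   # 'EU Cases (+)' in the summary

# Share of new cases reported from Europe, as a whole-number percentage.
europe_share = round(100 * europe_new / world_new)
print(f'{europe_new:,} of {world_new:,} new cases ({europe_share}%)')
```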

[Chart column "New Cases": per-country daily new cases, Feb. 04 to Mar. 25, scale marks 10, 100, 1000]
(+NEW) is the change since Mar. 20.

Country         Total Cases (+NEW)  Deaths (+NEW)   Fatality
China           81,661 (+411)       3,285 (+32)     4.0%
Italy           74,386 (+27,365)    7,503 (+3,471)  10.1%
US              65,778 (+46,678)    942 (+698)      1.4%
Spain           49,515 (+29,105)    3,647 (+2,604)  7.4%
Germany         37,323 (+17,475)    206 (+139)      0.6%
Iran            27,017 (+7,373)     2,077 (+644)    7.7%
France          25,600 (+12,842)    1,333 (+882)    5.2%
Switzerland     10,897 (+5,603)     153 (+99)       1.4%
United Kingdom  9,640 (+5,626)      466 (+288)      4.8%
South Korea     9,137 (+485)        126 (+32)       1.4%
Netherlands     6,438 (+3,435)      357 (+250)      5.5%
Austria         5,588 (+3,200)      30 (+24)        0.5%
Belgium         4,937 (+2,680)      178 (+141)      3.6%
Canada          3,251 (+2,308)      30 (+18)        0.9%
Norway          3,084 (+1,170)      14 (+7)         0.5%
Portugal        2,995 (+1,975)      43 (+37)        1.4%
Brazil          2,554 (+1,761)      59 (+48)        2.3%
Sweden          2,526 (+887)        62 (+46)        2.5%
Turkey          2,433 (+2,074)      59 (+55)        2.4%
Israel          2,369 (+1,840)      5 (+5)          0.2%
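
The dashboard above is produced by passing the summary dict and the table to a jinja2 template. The contents of `overview.tpl` are not shown in this notebook, so the sketch below uses a small hypothetical template string, `demo_source`, just to illustrate how `Template.render` fills in values and loops over rows.

```python
from jinja2 import Template

# Hypothetical stand-in for overview.tpl: one summary value and one list of rows.
demo_source = (
    '<p>Cases: {{ D["Cases"] }}</p>'
    '<ul>{% for name, cases in rows %}<li>{{ name }}: {{ cases }}</li>{% endfor %}</ul>'
)

demo_html = Template(demo_source).render(
    D={'Cases': 467593},
    rows=[('China', 81661), ('Italy', 74386)],
)
print(demo_html)
```

In the notebook itself the rendered string is wrapped in `HTML(...)` so that Jupyter displays it as markup rather than as text.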