Download files from a website using urllib

In this notebook we present a simple example of how to use the Python module urllib to download files from a website. The cells use Python 2 syntax (the print statement and urllib.urlretrieve).


In [1]:
import urllib # import the module urllib

In [2]:
url = "https://nomads.ncdc.noaa.gov/data/gfs4/201702/20170201/" # Link of the directory with the files to download
outpath = "." # Define the directory where you want to store the data

Download one file

The filename "gfs_4_20170201_0000_000.grb2" is built from the following parts:

  • _20170201 : date (year/month/day)
  • _0000 : start time of the model run
  • _000 : forecast hour
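
The naming pattern described above can be captured in a small helper. The following is a minimal sketch in Python 3 (the notebook cells themselves use Python 2 syntax). The function name `gfs_filename` and its parameters are hypothetical and introduced only for illustration, and the 'HHMM' reading of the run field is an assumption based on the example filename.

```python
def gfs_filename(date, run, forecast_hour):
    """Build a filename such as gfs_4_20170201_0000_000.grb2.

    date: 'YYYYMMDD' string
    run: run start time as a 4-character string (assumed to be 'HHMM')
    forecast_hour: integer, zero-padded to three digits in the filename
    (Hypothetical helper, not part of urllib.)
    """
    return "gfs_4_{}_{}_{:03d}.grb2".format(date, run, forecast_hour)


print(gfs_filename("20170201", "0000", 0))  # gfs_4_20170201_0000_000.grb2
```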

In [3]:
filename = "gfs_4_20170201_0000_000.grb2"  
print url+filename # print the full link of the file


https://nomads.ncdc.noaa.gov/data/gfs4/201702/20170201/gfs_4_20170201_0000_000.grb2

Download the file


In [8]:
import os
urllib.urlretrieve(url+filename, os.path.join(outpath, filename)) # arg1: link of the file, arg2: output path (os.path.join adds the path separator)


Out[8]:
('./gfs_4_20170201_0000_000.grb2',
 <httplib.HTTPMessage instance at 0x7f4965861ab8>)
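
For readers on Python 3, where `urlretrieve` lives in `urllib.request`, the same single-file download might look like the sketch below. The helper name `download` is hypothetical. Joining the directory and the filename with `os.path.join` means the result does not depend on whether `outpath` ends with a path separator.

```python
import os
import urllib.request


def download(url, filename, outpath="."):
    """Download url + filename into outpath and return the local path.

    Hypothetical helper for illustration; url is expected to end with '/'.
    """
    os.makedirs(outpath, exist_ok=True)       # create the output directory if needed
    target = os.path.join(outpath, filename)  # inserts the path separator
    local_path, headers = urllib.request.urlretrieve(url + filename, target)
    return local_path


# Example call (needs network access, so it is left commented out):
# download("https://nomads.ncdc.noaa.gov/data/gfs4/201702/20170201/",
#          "gfs_4_20170201_0000_000.grb2")
```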

Download multiple files

Let's download all the forecasts up to forecast hour 120.

Create a list of integers


In [9]:
rng = range(1, 121)
print rng


[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120]

Convert each integer to a string and zero-pad it to three digits, as used in the filenames


In [10]:
for r in rng:
    print str(r).zfill(3)


001
002
003
004
005
006
007
008
009
010
011
012
013
014
015
016
017
018
019
020
021
022
023
024
025
026
027
028
029
030
031
032
033
034
035
036
037
038
039
040
041
042
043
044
045
046
047
048
049
050
051
052
053
054
055
056
057
058
059
060
061
062
063
064
065
066
067
068
069
070
071
072
073
074
075
076
077
078
079
080
081
082
083
084
085
086
087
088
089
090
091
092
093
094
095
096
097
098
099
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
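
The zero-padding can also be done with a format specification instead of `str(r).zfill(3)`. A minimal Python 3 sketch follows; the variable names `hours` and `padded` are chosen here for illustration.

```python
hours = range(1, 121)

# "{:03d}" formats an integer with at least three digits, padding with zeros
padded = ["{:03d}".format(h) for h in hours]

print(padded[0], padded[9], padded[-1])  # 001 010 120
```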

Create a list with the filenames to be downloaded


In [11]:
basename = "gfs_4_20170201_0000_"

In [12]:
basename + str(r).zfill(3)+".grb2" # full filename; r still holds the last value of the loop above (120)


Out[12]:
'gfs_4_20170201_0000_120.grb2'

In [14]:
filenames = [basename + str(r).zfill(3)+".grb2" for r in rng]
for filename in filenames:
    print filename


gfs_4_20170201_0000_001.grb2
gfs_4_20170201_0000_002.grb2
gfs_4_20170201_0000_003.grb2
gfs_4_20170201_0000_004.grb2
gfs_4_20170201_0000_005.grb2
gfs_4_20170201_0000_006.grb2
gfs_4_20170201_0000_007.grb2
gfs_4_20170201_0000_008.grb2
gfs_4_20170201_0000_009.grb2
gfs_4_20170201_0000_010.grb2
gfs_4_20170201_0000_011.grb2
gfs_4_20170201_0000_012.grb2
gfs_4_20170201_0000_013.grb2
gfs_4_20170201_0000_014.grb2
gfs_4_20170201_0000_015.grb2
gfs_4_20170201_0000_016.grb2
gfs_4_20170201_0000_017.grb2
gfs_4_20170201_0000_018.grb2
gfs_4_20170201_0000_019.grb2
gfs_4_20170201_0000_020.grb2
gfs_4_20170201_0000_021.grb2
gfs_4_20170201_0000_022.grb2
gfs_4_20170201_0000_023.grb2
gfs_4_20170201_0000_024.grb2
gfs_4_20170201_0000_025.grb2
gfs_4_20170201_0000_026.grb2
gfs_4_20170201_0000_027.grb2
gfs_4_20170201_0000_028.grb2
gfs_4_20170201_0000_029.grb2
gfs_4_20170201_0000_030.grb2
gfs_4_20170201_0000_031.grb2
gfs_4_20170201_0000_032.grb2
gfs_4_20170201_0000_033.grb2
gfs_4_20170201_0000_034.grb2
gfs_4_20170201_0000_035.grb2
gfs_4_20170201_0000_036.grb2
gfs_4_20170201_0000_037.grb2
gfs_4_20170201_0000_038.grb2
gfs_4_20170201_0000_039.grb2
gfs_4_20170201_0000_040.grb2
gfs_4_20170201_0000_041.grb2
gfs_4_20170201_0000_042.grb2
gfs_4_20170201_0000_043.grb2
gfs_4_20170201_0000_044.grb2
gfs_4_20170201_0000_045.grb2
gfs_4_20170201_0000_046.grb2
gfs_4_20170201_0000_047.grb2
gfs_4_20170201_0000_048.grb2
gfs_4_20170201_0000_049.grb2
gfs_4_20170201_0000_050.grb2
gfs_4_20170201_0000_051.grb2
gfs_4_20170201_0000_052.grb2
gfs_4_20170201_0000_053.grb2
gfs_4_20170201_0000_054.grb2
gfs_4_20170201_0000_055.grb2
gfs_4_20170201_0000_056.grb2
gfs_4_20170201_0000_057.grb2
gfs_4_20170201_0000_058.grb2
gfs_4_20170201_0000_059.grb2
gfs_4_20170201_0000_060.grb2
gfs_4_20170201_0000_061.grb2
gfs_4_20170201_0000_062.grb2
gfs_4_20170201_0000_063.grb2
gfs_4_20170201_0000_064.grb2
gfs_4_20170201_0000_065.grb2
gfs_4_20170201_0000_066.grb2
gfs_4_20170201_0000_067.grb2
gfs_4_20170201_0000_068.grb2
gfs_4_20170201_0000_069.grb2
gfs_4_20170201_0000_070.grb2
gfs_4_20170201_0000_071.grb2
gfs_4_20170201_0000_072.grb2
gfs_4_20170201_0000_073.grb2
gfs_4_20170201_0000_074.grb2
gfs_4_20170201_0000_075.grb2
gfs_4_20170201_0000_076.grb2
gfs_4_20170201_0000_077.grb2
gfs_4_20170201_0000_078.grb2
gfs_4_20170201_0000_079.grb2
gfs_4_20170201_0000_080.grb2
gfs_4_20170201_0000_081.grb2
gfs_4_20170201_0000_082.grb2
gfs_4_20170201_0000_083.grb2
gfs_4_20170201_0000_084.grb2
gfs_4_20170201_0000_085.grb2
gfs_4_20170201_0000_086.grb2
gfs_4_20170201_0000_087.grb2
gfs_4_20170201_0000_088.grb2
gfs_4_20170201_0000_089.grb2
gfs_4_20170201_0000_090.grb2
gfs_4_20170201_0000_091.grb2
gfs_4_20170201_0000_092.grb2
gfs_4_20170201_0000_093.grb2
gfs_4_20170201_0000_094.grb2
gfs_4_20170201_0000_095.grb2
gfs_4_20170201_0000_096.grb2
gfs_4_20170201_0000_097.grb2
gfs_4_20170201_0000_098.grb2
gfs_4_20170201_0000_099.grb2
gfs_4_20170201_0000_100.grb2
gfs_4_20170201_0000_101.grb2
gfs_4_20170201_0000_102.grb2
gfs_4_20170201_0000_103.grb2
gfs_4_20170201_0000_104.grb2
gfs_4_20170201_0000_105.grb2
gfs_4_20170201_0000_106.grb2
gfs_4_20170201_0000_107.grb2
gfs_4_20170201_0000_108.grb2
gfs_4_20170201_0000_109.grb2
gfs_4_20170201_0000_110.grb2
gfs_4_20170201_0000_111.grb2
gfs_4_20170201_0000_112.grb2
gfs_4_20170201_0000_113.grb2
gfs_4_20170201_0000_114.grb2
gfs_4_20170201_0000_115.grb2
gfs_4_20170201_0000_116.grb2
gfs_4_20170201_0000_117.grb2
gfs_4_20170201_0000_118.grb2
gfs_4_20170201_0000_119.grb2
gfs_4_20170201_0000_120.grb2
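
Before downloading, it can help to pair each remote URL with its local destination. The sketch below is Python 3 and uses `urllib.parse.urljoin` and `os.path.join`. The function name `build_jobs` is hypothetical, and the sketch assumes, as the notebook does, that one file exists per forecast hour.

```python
import os
from urllib.parse import urljoin


def build_jobs(url, basename, hours, outpath="."):
    """Return a list of (remote_url, local_path) pairs (hypothetical helper)."""
    jobs = []
    for h in hours:
        filename = "{}{:03d}.grb2".format(basename, h)
        # urljoin appends the filename because url ends with '/'
        jobs.append((urljoin(url, filename), os.path.join(outpath, filename)))
    return jobs


jobs = build_jobs(
    "https://nomads.ncdc.noaa.gov/data/gfs4/201702/20170201/",
    "gfs_4_20170201_0000_",
    range(1, 121),
)
print(len(jobs))   # 120
print(jobs[0][0])  # https://nomads.ncdc.noaa.gov/data/gfs4/201702/20170201/gfs_4_20170201_0000_001.grb2
```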

Perform the download


In [ ]:
import os
for filename in filenames:
    print filename
    urllib.urlretrieve(url+filename, os.path.join(outpath, filename)) # os.path.join adds the path separator


gfs_4_20170201_0000_001.grb2
gfs_4_20170201_0000_002.grb2
gfs_4_20170201_0000_003.grb2
gfs_4_20170201_0000_004.grb2
gfs_4_20170201_0000_005.grb2
gfs_4_20170201_0000_006.grb2
gfs_4_20170201_0000_007.grb2
gfs_4_20170201_0000_008.grb2
gfs_4_20170201_0000_009.grb2
gfs_4_20170201_0000_010.grb2
gfs_4_20170201_0000_011.grb2
gfs_4_20170201_0000_012.grb2
gfs_4_20170201_0000_013.grb2
gfs_4_20170201_0000_014.grb2
gfs_4_20170201_0000_015.grb2
gfs_4_20170201_0000_016.grb2
gfs_4_20170201_0000_017.grb2
gfs_4_20170201_0000_018.grb2
gfs_4_20170201_0000_019.grb2
gfs_4_20170201_0000_020.grb2
gfs_4_20170201_0000_021.grb2
gfs_4_20170201_0000_022.grb2
gfs_4_20170201_0000_023.grb2
gfs_4_20170201_0000_024.grb2
gfs_4_20170201_0000_025.grb2
gfs_4_20170201_0000_026.grb2
gfs_4_20170201_0000_027.grb2
gfs_4_20170201_0000_028.grb2
gfs_4_20170201_0000_029.grb2
gfs_4_20170201_0000_030.grb2
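
The output above stops at forecast hour 030, so a rerun would benefit from skipping files that are already on disk and from recording failures instead of stopping at the first one. The following Python 3 sketch shows one way to do that. The function name `download_all` and the `.part` temporary suffix are assumptions made for illustration, not part of urllib.

```python
import os
import urllib.request


def download_all(url, filenames, outpath="."):
    """Download each file; return (downloaded, skipped, failed) lists.

    Hypothetical helper: files already present in outpath are skipped,
    and errors are collected rather than raised.
    """
    os.makedirs(outpath, exist_ok=True)
    downloaded, skipped, failed = [], [], []
    for filename in filenames:
        target = os.path.join(outpath, filename)
        if os.path.exists(target):
            skipped.append(filename)
            continue
        tmp = target + ".part"  # write to a temporary name first
        try:
            urllib.request.urlretrieve(url + filename, tmp)
            os.replace(tmp, target)  # rename only after a complete download
            downloaded.append(filename)
        except OSError as err:  # URLError and HTTPError are OSError subclasses
            if os.path.exists(tmp):
                os.remove(tmp)
            failed.append((filename, str(err)))
    return downloaded, skipped, failed
```

Calling `download_all(url, filenames, outpath)` a second time only fetches the files that are still missing.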
