tabula-py is a tool for converting PDF tables into pandas DataFrames. It is a wrapper around tabula-java, so Java must be installed on your machine. tabula-py can also convert PDF tables into CSV/TSV files.
tabula-py's extraction accuracy is the same as that of tabula-java and the tabula app, the GUI tool for tabula. If you want to gauge how well tabula-py will perform on your PDFs, I highly recommend trying the tabula app first.
tabula-py is good for:
In [1]:
!java -version
openjdk version "11.0.7" 2020-04-14
OpenJDK Runtime Environment (build 11.0.7+10-post-Ubuntu-2ubuntu218.04)
OpenJDK 64-Bit Server VM (build 11.0.7+10-post-Ubuntu-2ubuntu218.04, mixed mode, sharing)
After confirming the Java environment, install tabula-py with pip.
In [2]:
# To be more precise, it's better to use `{sys.executable} -m pip install tabula-py`
!pip install -q tabula-py
|████████████████████████████████| 10.4MB 2.7MB/s
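The comment in the cell above suggests invoking pip through the running interpreter. A minimal sketch of that idea follows; the helper name `pip_install_command` is hypothetical, and the actual install call is left commented out so the snippet has no side effects.

```python
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter that runs this code."""
    # sys.executable is the path of the current Python, so the package lands
    # in the same environment the notebook kernel uses.
    return [sys.executable, "-m", "pip", "install", "-q", package]


cmd = pip_install_command("tabula-py")
print(cmd)

# To actually install, uncomment the following lines:
# import subprocess
# subprocess.check_call(cmd)
```

Building the command as a list avoids shell quoting problems when the interpreter path contains spaces.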
Before trying tabula-py, check your environment with tabula-py's environment_info() function, which shows the Python version, Java version, and OS environment.
In [3]:
import tabula
tabula.environment_info()
Python version:
3.6.9 (default, Apr 18 2020, 01:56:04)
[GCC 8.4.0]
Java version:
openjdk version "11.0.7" 2020-04-14
OpenJDK Runtime Environment (build 11.0.7+10-post-Ubuntu-2ubuntu218.04)
OpenJDK 64-Bit Server VM (build 11.0.7+10-post-Ubuntu-2ubuntu218.04, mixed mode, sharing)
tabula-py version: 2.1.0
platform: Linux-4.19.104+-x86_64-with-Ubuntu-18.04-bionic
uname:
uname_result(system='Linux', node='385ad8f65f50', release='4.19.104+', version='#1 SMP Wed Feb 19 05:26:34 PST 2020', machine='x86_64', processor='x86_64')
linux_distribution: ('Ubuntu', '18.04', 'bionic')
mac_ver: ('', ('', '', ''), '')
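If environment_info() reports no Java version, a quick independent check is to look for a `java` executable on PATH. The sketch below uses only the standard library; the function name `java_on_path` is hypothetical, and tabula-py may locate Java differently, so treat it as a rough diagnostic rather than a description of the library's own lookup.

```python
import shutil


def java_on_path():
    """Return the full path of the `java` executable on PATH, or None."""
    return shutil.which("java")


java = java_on_path()
print("java found at:", java if java else "not found")
```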
In [5]:
import tabula
pdf_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.pdf"
dfs = tabula.read_pdf(pdf_path, stream=True)
# read_pdf returns a list of DataFrames
print(len(dfs))
dfs[0]
'pages' argument isn't specified.Will extract only from page 1 by default.
Got stderr: Jun 04, 2020 8:22:21 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font Symbol
Jun 04, 2020 8:22:21 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font ZapfDingbats
1
Out[5]:
             Unnamed: 0   mpg  cyl   disp   hp  drat     wt   qsec  vs  am  gear  carb
0             Mazda RX4  21.0    6  160.0  110  3.90  2.620  16.46   0   1     4     4
1         Mazda RX4 Wag  21.0    6  160.0  110  3.90  2.875  17.02   0   1     4     4
2            Datsun 710  22.8    4  108.0   93  3.85  2.320  18.61   1   1     4     1
3        Hornet 4 Drive  21.4    6  258.0  110  3.08  3.215  19.44   1   0     3     1
4     Hornet Sportabout  18.7    8  360.0  175  3.15  3.440  17.02   0   0     3     2
5               Valiant  18.1    6  225.0  105  2.76  3.460  20.22   1   0     3     1
6            Duster 360  14.3    8  360.0  245  3.21  3.570  15.84   0   0     3     4
7             Merc 240D  24.4    4  146.7   62  3.69  3.190  20.00   1   0     4     2
8              Merc 230  22.8    4  140.8   95  3.92  3.150  22.90   1   0     4     2
9              Merc 280  19.2    6  167.6  123  3.92  3.440  18.30   1   0     4     4
10            Merc 280C  17.8    6  167.6  123  3.92  3.440  18.90   1   0     4     4
11           Merc 450SE  16.4    8  275.8  180  3.07  4.070  17.40   0   0     3     3
12           Merc 450SL  17.3    8  275.8  180  3.07  3.730  17.60   0   0     3     3
13          Merc 450SLC  15.2    8  275.8  180  3.07  3.780  18.00   0   0     3     3
14   Cadillac Fleetwood  10.4    8  472.0  205  2.93  5.250  17.98   0   0     3     4
15  Lincoln Continental  10.4    8  460.0  215  3.00  5.424  17.82   0   0     3     4
16    Chrysler Imperial  14.7    8  440.0  230  3.23  5.345  17.42   0   0     3     4
17             Fiat 128  32.4    4   78.7   66  4.08  2.200  19.47   1   1     4     1
18          Honda Civic  30.4    4   75.7   52  4.93  1.615  18.52   1   1     4     2
19       Toyota Corolla  33.9    4   71.1   65  4.22  1.835  19.90   1   1     4     1
20        Toyota Corona  21.5    4  120.1   97  3.70  2.465  20.01   1   0     3     1
21     Dodge Challenger  15.5    8  318.0  150  2.76  3.520  16.87   0   0     3     2
22          AMC Javelin  15.2    8  304.0  150  3.15  3.435  17.30   0   0     3     2
23           Camaro Z28  13.3    8  350.0  245  3.73  3.840  15.41   0   0     3     4
24     Pontiac Firebird  19.2    8  400.0  175  3.08  3.845  17.05   0   0     3     2
25            Fiat X1-9  27.3    4   79.0   66  4.08  1.935  18.90   1   1     4     1
26        Porsche 914-2  26.0    4  120.3   91  4.43  2.140  16.70   0   1     5     2
27         Lotus Europa  30.4    4   95.1  113  3.77  1.513  16.90   1   1     5     2
28       Ford Pantera L  15.8    8  351.0  264  4.22  3.170  14.50   0   1     5     4
29         Ferrari Dino  19.7    6  145.0  175  3.62  2.770  15.50   0   1     5     6
30        Maserati Bora  15.0    8  301.0  335  3.54  3.570  14.60   0   1     5     8
31           Volvo 142E  21.4    4  121.0  109  4.11  2.780  18.60   1   1     4     2
In [6]:
help(tabula.read_pdf)
Help on function read_pdf in module tabula.io:
read_pdf(input_path, output_format=None, encoding='utf-8', java_options=None, pandas_options=None, multiple_tables=True, user_agent=None, **kwargs)
Read tables in PDF.
Args:
input_path (str, path object or file-like object):
File like object of tareget PDF file.
It can be URL, which is downloaded by tabula-py automatically.
output_format (str, optional):
Output format for returned object (``dataframe`` or ``json``)
encoding (str, optional):
Encoding type for pandas. Default: ``utf-8``
java_options (list, optional):
Set java options.
Example:
``["-Xmx256m"]``
pandas_options (dict, optional):
Set pandas options.
Example:
``{'header': None}``
Note:
With ``multiple_tables=True`` (default), pandas_options is passed
to pandas.read_csv, otherwise it is passed to pandas.DataFrame.
Those two functions are different for accept options like ``dtype``.
multiple_tables (bool):
It enables to handle multiple tables within a page. Default: ``True``
Note:
If `multiple_tables` option is enabled, tabula-py uses not
:func:`pd.read_csv()`, but :func:`pd.DataFrame()`. Make
sure to pass appropriate `pandas_options`.
user_agent (str, optional):
Set a custom user-agent when download a pdf from a url. Otherwise
it uses the default ``urllib.request`` user-agent.
kwargs:
Dictionary of option for tabula-java. Details are shown in
:func:`build_options()`
Returns:
list of DataFrames or dict.
Raises:
FileNotFoundError:
If downloaded remote file doesn't exist.
ValueError:
If output_format is unknown format, or if downloaded remote file size is 0.
tabula.errors.CSVParseError:
If pandas CSV parsing failed.
tabula.errors.JavaNotFoundError:
If java is not installed or found.
subprocess.CalledProcessError:
If tabula-java execution failed.
Examples:
Here is a simple example.
Note that :func:`read_pdf()` only extract page 1 by default.
Notes:
As of tabula-py 2.0.0, :func:`read_pdf()` sets `multiple_tables=True` by
default. If you want to get consistent output with previous version, set
`multiple_tables=False`.
>>> import tabula
>>> pdf_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.pdf"
>>> tabula.read_pdf(pdf_path, stream=True)
[ Unnamed: 0 mpg cyl disp hp drat wt qsec vs am gear carb
0 Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
1 Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
2 Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
3 Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
4 Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
5 Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
6 Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
7 Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
8 Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
9 Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
10 Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
11 Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
12 Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
13 Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
14 Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
15 Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
16 Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
17 Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
18 Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
19 Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
20 Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
21 Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
22 AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
23 Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
24 Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
25 Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
26 Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
27 Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
28 Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
29 Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
30 Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
31 Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2]
If you want to extract all pages, set ``pages="all"``.
>>> dfs = tabula.read_pdf(pdf_path, pages="all")
>>> len(dfs)
4
>>> dfs
[ 0 1 2 3 4 5 6 7 8 9
0 mpg cyl disp hp drat wt qsec vs am gear
1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4
2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4
3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4
4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3
5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3
6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3
7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3
8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4
9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4
10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4
11 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4
12 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3
13 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3
14 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3
15 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3
16 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3
17 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3
18 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4
19 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4
20 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4
21 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3
22 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3
23 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3
24 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3
25 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3
26 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4
27 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5
28 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5
29 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5
30 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5
31 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5, 0 1 2 3 4
0 Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa, 0 1 2 3 4 5
0 NaN Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 145 6.7 3.3 5.7 2.5 virginica
2 146 6.7 3.0 5.2 2.3 virginica
3 147 6.3 2.5 5.0 1.9 virginica
4 148 6.5 3.0 5.2 2.0 virginica
5 149 6.2 3.4 5.4 2.3 virginica
6 150 5.9 3.0 5.1 1.8 virginica, 0
0 supp
1 VC
2 VC
3 VC
4 VC
5 VC
6 VC
7 VC
8 VC
9 VC
10 VC
11 VC
12 VC
13 VC
14 VC]
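The Raises section of the help text above lists several exceptions. The sketch below shows one way to guard a call against the standard-library ones among them; the wrapper name `read_tables_or_empty` is hypothetical, the reader is passed in as an argument so the snippet runs without tabula-py, and the tabula.errors exceptions are deliberately not handled here.

```python
import subprocess


def read_tables_or_empty(reader, path, **kwargs):
    """Call reader(path, **kwargs); return [] on the stdlib errors named in the help text."""
    try:
        return reader(path, **kwargs)
    except (FileNotFoundError, ValueError, subprocess.CalledProcessError) as exc:
        # Report the failure and fall back to "no tables found".
        print(f"extraction failed: {type(exc).__name__}: {exc}")
        return []


# Usage with tabula-py (needs tabula-py and Java installed):
# import tabula
# dfs = read_tables_or_empty(tabula.read_pdf, pdf_path, pages="all")
```

Returning an empty list keeps downstream loops simple, at the cost of hiding the error from callers; re-raising may be the better choice in a pipeline.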
In [7]:
help(tabula.io.build_options)
Help on function build_options in module tabula.io:
build_options(pages=None, guess=True, area=None, relative_area=False, lattice=False, stream=False, password=None, silent=None, columns=None, format=None, batch=None, output_path=None, options='')
Build options for tabula-java
Args:
pages (str, int, `list` of `int`, optional):
An optional values specifying pages to extract from. It allows
`str`,`int`, `list` of :`int`. Default: `1`
Examples:
``'1-2,3'``, ``'all'``, ``[1,2]``
guess (bool, optional):
Guess the portion of the page to analyze per page. Default `True`
If you use "area" option, this option becomes `False`.
Note:
As of tabula-java 1.0.3, guess option becomes independent from
lattice and stream option, you can use guess and lattice/stream option
at the same time.
area (list of float, list of list of float, optional):
Portion of the page to analyze(top,left,bottom,right).
Default is entire page.
Note:
If you want to use multiple area options and extract in one table, it
should be better to set ``multiple_tables=False`` for :func:`read_pdf()`
Examples:
``[269.875,12.75,790.5,561]``,
``[[12.1,20.5,30.1,50.2], [1.0,3.2,10.5,40.2]]``
relative_area (bool, optional):
If all area values are between 0-100 (inclusive) and preceded by ``'%'``,
input will be taken as % of actual height or width of the page.
Default ``False``.
lattice (bool, optional):
Force PDF to be extracted using lattice-mode extraction
(if there are ruling lines separating each cell, as in a PDF of an
Excel spreadsheet)
stream (bool, optional):
Force PDF to be extracted using stream-mode extraction
(if there are no ruling lines separating each cell, as in a PDF of an
Excel spreadsheet)
password (str, optional):
Password to decrypt document. Default: empty
silent (bool, optional):
Suppress all stderr output.
columns (list, optional):
X coordinates of column boundaries.
Example:
``[10.1, 20.2, 30.3]``
format (str, optional):
Format for output file or extracted object.
(``"CSV"``, ``"TSV"``, ``"JSON"``)
batch (str, optional):
Convert all PDF files in the provided directory. This argument should be
directory path.
output_path (str, optional):
Output file path. File format of it is depends on ``format``.
Same as ``--outfile`` option of tabula-java.
options (str, optional):
Raw option string for tabula-java.
Returns:
list:
Built list of options
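The relative_area description above says percentages are taken relative to the page's height or width. As a worked illustration of that arithmetic, the sketch below converts a (top, left, bottom, right) percentage area into absolute coordinates. It assumes top and bottom scale with page height and left and right with page width, and it uses a US Letter page of 612 x 792 points purely as an example; neither the helper name nor the page size comes from tabula-py or from data.pdf.

```python
def percent_area_to_points(area_pct, page_width, page_height):
    """Convert (top, left, bottom, right) percentages to absolute coordinates."""
    for value in area_pct:
        if not 0 <= value <= 100:
            raise ValueError("percentages must be between 0 and 100")
    top, left, bottom, right = area_pct
    return [
        page_height * top / 100,     # top edge, measured down the page
        page_width * left / 100,     # left edge
        page_height * bottom / 100,  # bottom edge
        page_width * right / 100,    # right edge
    ]


# Assumed page size: US Letter, 612 x 792 points (not taken from data.pdf).
print(percent_area_to_points([10, 10, 50, 90], page_width=612, page_height=792))
```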
Let's set the pages option. Here is the extraction result of page 3:
In [8]:
# set pages option
dfs = tabula.read_pdf(pdf_path, pages=3, stream=True)
dfs[0]
Got stderr: Jun 04, 2020 8:22:51 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font Symbol
Jun 04, 2020 8:22:51 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font ZapfDingbats
Out[8]:
     len  supp  dose
0    4.2    VC   0.5
1   11.5    VC   0.5
2    7.3    VC   0.5
3    5.8    VC   0.5
4    6.4    VC   0.5
5   10.0    VC   0.5
6   11.2    VC   0.5
7   11.2    VC   0.5
8    5.2    VC   0.5
9    7.0    VC   0.5
10  16.5    VC   1.0
11  16.5    VC   1.0
12  15.2    VC   1.0
13  17.3    VC   1.0
14  22.5    VC   1.0
In [9]:
# pass pages as string
tabula.read_pdf(pdf_path, pages="1-2,3", stream=True)
Got stderr: Jun 04, 2020 8:23:57 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font Symbol
Jun 04, 2020 8:23:57 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font ZapfDingbats
Out[9]:
[ Unnamed: 0 mpg cyl disp hp ... qsec vs am gear carb
0 Mazda RX4 21.0 6 160.0 110 ... 16.46 0 1 4 4
1 Mazda RX4 Wag 21.0 6 160.0 110 ... 17.02 0 1 4 4
2 Datsun 710 22.8 4 108.0 93 ... 18.61 1 1 4 1
3 Hornet 4 Drive 21.4 6 258.0 110 ... 19.44 1 0 3 1
4 Hornet Sportabout 18.7 8 360.0 175 ... 17.02 0 0 3 2
5 Valiant 18.1 6 225.0 105 ... 20.22 1 0 3 1
6 Duster 360 14.3 8 360.0 245 ... 15.84 0 0 3 4
7 Merc 240D 24.4 4 146.7 62 ... 20.00 1 0 4 2
8 Merc 230 22.8 4 140.8 95 ... 22.90 1 0 4 2
9 Merc 280 19.2 6 167.6 123 ... 18.30 1 0 4 4
10 Merc 280C 17.8 6 167.6 123 ... 18.90 1 0 4 4
11 Merc 450SE 16.4 8 275.8 180 ... 17.40 0 0 3 3
12 Merc 450SL 17.3 8 275.8 180 ... 17.60 0 0 3 3
13 Merc 450SLC 15.2 8 275.8 180 ... 18.00 0 0 3 3
14 Cadillac Fleetwood 10.4 8 472.0 205 ... 17.98 0 0 3 4
15 Lincoln Continental 10.4 8 460.0 215 ... 17.82 0 0 3 4
16 Chrysler Imperial 14.7 8 440.0 230 ... 17.42 0 0 3 4
17 Fiat 128 32.4 4 78.7 66 ... 19.47 1 1 4 1
18 Honda Civic 30.4 4 75.7 52 ... 18.52 1 1 4 2
19 Toyota Corolla 33.9 4 71.1 65 ... 19.90 1 1 4 1
20 Toyota Corona 21.5 4 120.1 97 ... 20.01 1 0 3 1
21 Dodge Challenger 15.5 8 318.0 150 ... 16.87 0 0 3 2
22 AMC Javelin 15.2 8 304.0 150 ... 17.30 0 0 3 2
23 Camaro Z28 13.3 8 350.0 245 ... 15.41 0 0 3 4
24 Pontiac Firebird 19.2 8 400.0 175 ... 17.05 0 0 3 2
25 Fiat X1-9 27.3 4 79.0 66 ... 18.90 1 1 4 1
26 Porsche 914-2 26.0 4 120.3 91 ... 16.70 0 1 5 2
27 Lotus Europa 30.4 4 95.1 113 ... 16.90 1 1 5 2
28 Ford Pantera L 15.8 8 351.0 264 ... 14.50 0 1 5 4
29 Ferrari Dino 19.7 6 145.0 175 ... 15.50 0 1 5 6
30 Maserati Bora 15.0 8 301.0 335 ... 14.60 0 1 5 8
31 Volvo 142E 21.4 4 121.0 109 ... 18.60 1 1 4 2
[32 rows x 12 columns],
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
0 5.1 3.5 1.4 0.2 setosa
1 4.9 3.0 1.4 0.2 setosa
2 4.7 3.2 1.3 0.2 setosa
3 4.6 3.1 1.5 0.2 setosa
4 5.0 3.6 1.4 0.2 setosa
5 5.4 3.9 1.7 0.4 setosa,
Unnamed: 0 Sepal.Length Sepal.Width Petal.Length Petal.Width Species
0 145 6.7 3.3 5.7 2.5 virginica
1 146 6.7 3.0 5.2 2.3 virginica
2 147 6.3 2.5 5.0 1.9 virginica
3 148 6.5 3.0 5.2 2.0 virginica
4 149 6.2 3.4 5.4 2.3 virginica
5 150 5.9 3.0 5.1 1.8 virginica,
len supp dose
0 4.2 VC 0.5
1 11.5 VC 0.5
2 7.3 VC 0.5
3 5.8 VC 0.5
4 6.4 VC 0.5
5 10.0 VC 0.5
6 11.2 VC 0.5
7 11.2 VC 0.5
8 5.2 VC 0.5
9 7.0 VC 0.5
10 16.5 VC 1.0
11 16.5 VC 1.0
12 15.2 VC 1.0
13 17.3 VC 1.0
14 22.5 VC 1.0]
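The string form of pages used in the cell above combines ranges and single pages. To make the notation concrete, here is a small sketch that expands such a string into page numbers. It is an illustration of the notation only, not tabula-java's own parser, and the function name `expand_pages` is hypothetical.

```python
def expand_pages(spec):
    """Expand a page spec such as '1-2,3' into a sorted list of page numbers."""
    if spec == "all":
        # Expanding 'all' would need the document's page count.
        raise ValueError("'all' cannot be expanded without the page count")
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            pages.update(range(start, end + 1))  # ranges are inclusive
        else:
            pages.add(int(part))
    return sorted(pages)


print(expand_pages("1-2,3"))  # [1, 2, 3]
```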
You can set pages="all" to extract all pages. If you hit an out-of-memory (OOM) error in Java, set an appropriate -Xmx option via java_options.
In [10]:
# extract all pages
tabula.read_pdf(pdf_path, pages="all", stream=True)
Got stderr: Jun 04, 2020 8:24:02 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font Symbol
Jun 04, 2020 8:24:02 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font ZapfDingbats
Out[10]:
[ Unnamed: 0 mpg cyl disp hp ... qsec vs am gear carb
0 Mazda RX4 21.0 6 160.0 110 ... 16.46 0 1 4 4
1 Mazda RX4 Wag 21.0 6 160.0 110 ... 17.02 0 1 4 4
2 Datsun 710 22.8 4 108.0 93 ... 18.61 1 1 4 1
3 Hornet 4 Drive 21.4 6 258.0 110 ... 19.44 1 0 3 1
4 Hornet Sportabout 18.7 8 360.0 175 ... 17.02 0 0 3 2
5 Valiant 18.1 6 225.0 105 ... 20.22 1 0 3 1
6 Duster 360 14.3 8 360.0 245 ... 15.84 0 0 3 4
7 Merc 240D 24.4 4 146.7 62 ... 20.00 1 0 4 2
8 Merc 230 22.8 4 140.8 95 ... 22.90 1 0 4 2
9 Merc 280 19.2 6 167.6 123 ... 18.30 1 0 4 4
10 Merc 280C 17.8 6 167.6 123 ... 18.90 1 0 4 4
11 Merc 450SE 16.4 8 275.8 180 ... 17.40 0 0 3 3
12 Merc 450SL 17.3 8 275.8 180 ... 17.60 0 0 3 3
13 Merc 450SLC 15.2 8 275.8 180 ... 18.00 0 0 3 3
14 Cadillac Fleetwood 10.4 8 472.0 205 ... 17.98 0 0 3 4
15 Lincoln Continental 10.4 8 460.0 215 ... 17.82 0 0 3 4
16 Chrysler Imperial 14.7 8 440.0 230 ... 17.42 0 0 3 4
17 Fiat 128 32.4 4 78.7 66 ... 19.47 1 1 4 1
18 Honda Civic 30.4 4 75.7 52 ... 18.52 1 1 4 2
19 Toyota Corolla 33.9 4 71.1 65 ... 19.90 1 1 4 1
20 Toyota Corona 21.5 4 120.1 97 ... 20.01 1 0 3 1
21 Dodge Challenger 15.5 8 318.0 150 ... 16.87 0 0 3 2
22 AMC Javelin 15.2 8 304.0 150 ... 17.30 0 0 3 2
23 Camaro Z28 13.3 8 350.0 245 ... 15.41 0 0 3 4
24 Pontiac Firebird 19.2 8 400.0 175 ... 17.05 0 0 3 2
25 Fiat X1-9 27.3 4 79.0 66 ... 18.90 1 1 4 1
26 Porsche 914-2 26.0 4 120.3 91 ... 16.70 0 1 5 2
27 Lotus Europa 30.4 4 95.1 113 ... 16.90 1 1 5 2
28 Ford Pantera L 15.8 8 351.0 264 ... 14.50 0 1 5 4
29 Ferrari Dino 19.7 6 145.0 175 ... 15.50 0 1 5 6
30 Maserati Bora 15.0 8 301.0 335 ... 14.60 0 1 5 8
31 Volvo 142E 21.4 4 121.0 109 ... 18.60 1 1 4 2
[32 rows x 12 columns],
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
0 5.1 3.5 1.4 0.2 setosa
1 4.9 3.0 1.4 0.2 setosa
2 4.7 3.2 1.3 0.2 setosa
3 4.6 3.1 1.5 0.2 setosa
4 5.0 3.6 1.4 0.2 setosa
5 5.4 3.9 1.7 0.4 setosa,
Unnamed: 0 Sepal.Length Sepal.Width Petal.Length Petal.Width Species
0 145 6.7 3.3 5.7 2.5 virginica
1 146 6.7 3.0 5.2 2.3 virginica
2 147 6.3 2.5 5.0 1.9 virginica
3 148 6.5 3.0 5.2 2.0 virginica
4 149 6.2 3.4 5.4 2.3 virginica
5 150 5.9 3.0 5.1 1.8 virginica,
len supp dose
0 4.2 VC 0.5
1 11.5 VC 0.5
2 7.3 VC 0.5
3 5.8 VC 0.5
4 6.4 VC 0.5
5 10.0 VC 0.5
6 11.2 VC 0.5
7 11.2 VC 0.5
8 5.2 VC 0.5
9 7.0 VC 0.5
10 16.5 VC 1.0
11 16.5 VC 1.0
12 15.2 VC 1.0
13 17.3 VC 1.0
14 22.5 VC 1.0]
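For the -Xmx advice above, the help text for read_pdf gives ``["-Xmx256m"]`` as an example value of java_options. The sketch below builds such a flag from a number of megabytes; the helper name `heap_option` and the 512 MB figure are illustrative assumptions, and the read_pdf call is commented out because it needs tabula-py and Java.

```python
def heap_option(megabytes):
    """Format a JVM maximum-heap flag such as '-Xmx512m'."""
    if megabytes <= 0:
        raise ValueError("heap size must be positive")
    return f"-Xmx{megabytes}m"


java_options = [heap_option(512)]
print(java_options)  # ['-Xmx512m']

# With tabula-py and Java installed, pass the list through java_options:
# import tabula
# dfs = tabula.read_pdf(pdf_path, pages="all", stream=True, java_options=java_options)
```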
In [11]:
# set area option
dfs = tabula.read_pdf(pdf_path, area=[126,149,212,462], pages=2)
dfs[0]
Got stderr: Jun 04, 2020 8:24:12 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font Symbol
Jun 04, 2020 8:24:12 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font ZapfDingbats
Out[11]:
    Unnamed: 0  Sepal.Width  Petal.Length  Petal.Width  Species
0          5.1          3.5           1.4          0.2   setosa
1          4.9          3.0           1.4          0.2   setosa
2          4.7          3.2           1.3          0.2   setosa
3          4.6          3.1           1.5          0.2   setosa
4          5.0          3.6           1.4          0.2   setosa
5          3.9          1.7           0.4       setosa      NaN
In [12]:
pdf_path2 = "https://github.com/chezou/tabula-py/raw/master/tests/resources/campaign_donors.pdf"
dfs = tabula.read_pdf(pdf_path2, columns=[47, 147, 256, 310, 375, 431, 504], guess=False, pages=1)
df = dfs[0].drop(["Unnamed: 0"], axis=1)
df
Out[12]:
               Apellido             Nombre   Matricula           Cuit       Fecha            Tipo      Importe
0                  MENA        JUAN MARTÍN  27.083.460  20-27083460-5  09/10/2013        EFECTIVO  $ 10.000,00
1                 MOLLE             MATÍAS  25.348.547  20-25348547-8  09/10/2013        EFECTIVO  $ 10.000,00
2               MOLLEVI     FEDERICO OSCAR  25.028.246  20-25028246-0  09/10/2013        EFECTIVO  $ 10.000,00
3               PERAZZO       PABLO DANIEL  25.348.394  20-25348394-7  09/10/2013        EFECTIVO  $ 10.000,00
4               PICARDI     FRANCO EDUARDO  27.382.271  20-27382271-3  09/10/2013        EFECTIVO  $ 10.000,00
5                PISONI     CARLOS ENRIQUE  26.034.823  20-26034823-0  09/10/2013        EFECTIVO  $ 10.000,00
6            PONTORIERO        MARÍA PAULA  23.249.597  27-23249597-4  09/10/2013        EFECTIVO  $ 10.000,00
7              PULESTON        JUAN MIGUEL  11.895.661  20-11895661-4  09/10/2013        EFECTIVO  $ 10.000,00
8                 REMÓN       MABEL AURORA  11.292.939  27-11292939-3  09/10/2013        EFECTIVO  $ 10.000,00
9         SARRABAYROUSE              DIEGO  24.662.899  20-24662899-9  09/10/2013        EFECTIVO  $ 10.000,00
10              SPATOLA      ANALÍA CARMEN  24.560.922  27-24560922-7  09/10/2013        EFECTIVO  $ 10.000,00
11             STAMILLA      SERGIO ADRIÁN  29.364.226  20-29364226-6  09/10/2013        EFECTIVO  $ 10.000,00
12         SZARANGOWICZ  GUSTAVO ALEJANDRO  25.096.244  20-25096244-5  09/10/2013        EFECTIVO  $ 10.000,00
13             TAILHADE       LUIS RODOLFO  21.386.299  20-21386299-6  09/10/2013        EFECTIVO  $ 10.000,00
14             TEDESCHI     ADRIÁN ALBERTO  24.171.507  20-24171507-9  09/10/2013        EFECTIVO  $ 10.000,00
15               URRIZA       MARÍA TERESA  18.135.604  27-18135604-4  09/10/2013        EFECTIVO  $ 10.000,00
16             USTARROZ    GERÓNIMO JAVIER  24.912.947  20-24912947-0  09/10/2013        EFECTIVO  $ 10.000,00
17  VALSANGIACOMO BLANC    OFERNANDO JORGE  26.800.203  20-26800203-1  09/10/2013        EFECTIVO  $ 10.000,00
18              VICENTE        PABLO ARIEL  21.897.586  20-21897586-1  09/10/2013        EFECTIVO  $ 10.000,00
19               AMBURI       HUGO ALBERTO  14.096.560  20-14096560-0  09/10/2013        EFECTIVO  $ 20.000,00
20                BERRA     CLAUDIA SUSANA  14.433.112  27-14433112-0  09/10/2013        EFECTIVO  $ 10.000,00
21               LASALA    RICARDO ALBERTO  21.760.811  20-21760811-3  09/10/2013        EFECTIVO  $ 10.000,00
22            AZPITARTE     MARCELO CARLOS  16.018.569  20-16018569-5  09/10/2013        EFECTIVO  $ 10.000,00
23              BERARDO   CRISTINA BEATRIZ  12.609.615  27-12609615-7  09/10/2013        EFECTIVO  $ 10.000,00
24               OREIRO     RODOLFO MIGUEL  13.655.728  23-13655728-9  09/10/2013        EFECTIVO  $ 10.000,00
25              FAIENZA     MIGUEL ERNESTO  27.264.638  20-27264638-5  09/10/2013        EFECTIVO  $ 20.000,00
26   GRANILLO FERNANDEZ        JUAN MANUEL  27.235.280  20-27235280-2  09/10/2013        EFECTIVO  $ 10.000,00
27  RODRIGUEZ PINGITORE     CLAUDIA MONICA  23.343.332  27-23343332-8  09/10/2013        EFECTIVO  $ 10.000,00
28             CORBELLA  TERESITA GRACIELA  11.353.491  27-11353491-0  09/10/2013        EFECTIVO   $ 5.000,00
29              CORVINO     GERMAN HORACIO  27.491.524  20-27491524-3  09/10/2013        EFECTIVO  $ 20.000,00
30      GARCIA DELATOUR           AGUSTINA  29.501.230  27-29501230-2  09/10/2013        EFECTIVO   $ 5.000,00
31              AGUIRRE    LIDIA ELIZABETH  17.997.634  27-17997634-5  09/10/2013        EFECTIVO   $ 5.000,00
32              FORNERO         RAUL OSCAR  10.177.167  23-10177167-9  09/10/2013        EFECTIVO   $ 5.000,00
33             SKOCILIC  NATALIA ELIZABETH  30.237.737  27-30237737-0  09/10/2013        EFECTIVO   $ 5.000,00
34                SARRA             CARMEN  11.812.381  27-11812381-1  09/10/2013        EFECTIVO  $ 20.000,00
35              VIVALDO        MARÍA ITATÍ  20.458.214  27-20458214-4  09/10/2013        EFECTIVO  $ 10.000,00
36              LORENZO      FERNANDO LUIS  22.608.380  20-22608380-5  09/10/2013        EFECTIVO  $ 15.000,00
37           VESTILLERO   CARLOS SEBASTIAN  25.638.595  20-25638595-4  09/10/2013        EFECTIVO  $ 10.000,00
38               CASTRO     MARIA VICTORIA  22.642.061  23-22642061-4  09/10/2013        EFECTIVO  $ 10.000,00
39      PÉREZ SIMONDINI     ANDREA FABIANA  17.768.992  27-17768992-6  09/10/2013        EFECTIVO  $ 10.000,00
40        CLAVERO MITRE       LILIANA INES  11.176.760  27-11176760-8  09/10/2013        EFECTIVO  $ 10.000,00
41              SÁNCHEZ      MARIANA CLARA  20.205.651  27-20205651-8  09/10/2013        EFECTIVO  $ 10.000,00
42               TALLÓN    MATÍAS NORBERTO  33.498.601  20-33498601-3  09/10/2013        EFECTIVO  $ 10.000,00
43               SUAREZ           FERNANDO  11.633.291  20-11633291-5  09/10/2013        EFECTIVO  $ 10.000,00
44              TARSINO       MARIA EVELYN  23.276.482  27-23276482-7  09/10/2013        EFECTIVO  $ 10.000,00
45     GRIMBERG MARQUEZ              PAULA  27.680.600  27-27680600-4  09/10/2013        EFECTIVO  $ 10.000,00
46                THORP              MARIO  18.179.104  20-18179104-8  09/10/2013        EFECTIVO  $ 10.000,00
47              BARABAN   LORENA ELIZABETH  23.747.106  27-23747106-2  09/10/2013        EFECTIVO  $ 10.000,00
48             FRECHERO           EZEQUIEL  27.746.115  20-27746115-4  09/10/2013        EFECTIVO  $ 10.000,00
49               TORRES        CARLOS HUGO  25.957.223  20-25957223-2  09/10/2013        EFECTIVO  $ 10.000,00
50             BILLERES      GASTON MARTIN  23.301.948  20-23301948-9  09/10/2013        EFECTIVO  $ 10.000,00
51            RODRIGUEZ     EDGARDO JAVIER  21.502.793  20-21502793-8  09/10/2013        EFECTIVO  $ 10.000,00
52            KARPINSKY    ENRIQUE MARCELO  14.877.266  20-14877266-6  09/10/2013        EFECTIVO  $ 10.000,00
53     GRIMBERG MARQUEZ        JUAN CARLOS   7.831.240  20-07831240-9  09/10/2013        EFECTIVO  $ 10.000,00
54      BOGADO GONZALEZ      JULIO BERNABE  27.592.814  20-27592814-4  09/10/2013        EFECTIVO  $ 10.000,00
55                  NaN                NaN         NaN            NaN         ANE  XO I-A-1 Págin   a 12 de 14
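The Importe values in the output above are strings such as "$ 10.000,00", with "." as the thousands separator and "," as the decimal mark. A minimal sketch of converting them to numbers follows; the helper name `parse_importe` is hypothetical and the snippet uses only the standard library.

```python
from decimal import Decimal


def parse_importe(text):
    """Convert a string like '$ 10.000,00' into Decimal('10000.00')."""
    cleaned = text.replace("$", "").strip()
    # Drop thousands separators, then switch the decimal comma to a point.
    cleaned = cleaned.replace(".", "").replace(",", ".")
    return Decimal(cleaned)


print(parse_importe("$ 10.000,00"))  # 10000.00
print(parse_importe("$ 5.000,00"))   # 5000.00
```

The last row of the table (index 55) holds page-footer text rather than a donation, so it would need to be dropped before applying a helper like this to the whole column.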
In [13]:
# read pdf as JSON
tabula.read_pdf(pdf_path, output_format="json")
'pages' argument isn't specified.Will extract only from page 1 by default.
Got stderr: Jun 04, 2020 8:24:24 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font Symbol
Jun 04, 2020 8:24:24 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font ZapfDingbats
Out[13]:
[{'bottom': 520.3189,
'data': [[{'height': 12.186347961425781,
'left': 247.14917,
'text': 'mpg',
'top': 125.17005,
'width': 30.773834228515625},
{'height': 12.186347961425781,
'left': 277.923,
'text': 'cyl',
'top': 125.17005,
'width': 24.407867431640625},
{'height': 12.186347961425781,
'left': 302.33087,
'text': 'disp',
'top': 125.17005,
'width': 34.6478271484375},
{'height': 12.186347961425781,
'left': 336.9787,
'text': 'hp',
'top': 125.17005,
'width': 26.899566650390625},
{'height': 12.186347961425781,
'left': 363.87827,
'text': 'drat',
'top': 125.17005,
'width': 30.24810791015625},
{'height': 12.186347961425781,
'left': 394.12637,
'text': 'wt',
'top': 125.17005,
'width': 34.64752197265625},
{'height': 12.186347961425781,
'left': 428.7739,
'text': 'qsec',
'top': 125.17005,
'width': 34.64801025390625},
{'height': 12.186347961425781,
'left': 463.4219,
'text': 'vs',
'top': 125.17005,
'width': 21.1429443359375},
{'height': 12.186347961425781,
'left': 484.56485,
'text': 'am',
'top': 125.17005,
'width': 25.238006591796875},
{'height': 12.186347961425781,
'left': 509.80286,
'text': 'gear',
'top': 125.17005,
'width': 30.24798583984375}],
[{'height': 12.352630615234375,
'left': 247.14917,
'text': '21.0',
'top': 137.3564,
'width': 30.773834228515625},
{'height': 12.352630615234375,
'left': 277.923,
'text': '6',
'top': 137.3564,
'width': 24.407867431640625},
{'height': 12.352630615234375,
'left': 302.33087,
'text': '160.0',
'top': 137.3564,
'width': 34.6478271484375},
{'height': 12.352630615234375,
'left': 336.9787,
'text': '110',
'top': 137.3564,
'width': 26.899566650390625},
{'height': 12.352630615234375,
'left': 363.87827,
'text': '3.90',
'top': 137.3564,
'width': 30.24810791015625},
{'height': 12.352630615234375,
'left': 394.12637,
'text': '2.620',
'top': 137.3564,
'width': 34.64752197265625},
{'height': 12.352630615234375,
'left': 428.7739,
'text': '16.46',
'top': 137.3564,
'width': 34.64801025390625},
{'height': 12.352630615234375,
'left': 463.4219,
'text': '0',
'top': 137.3564,
'width': 21.1429443359375},
{'height': 12.352630615234375,
'left': 484.56485,
'text': '1',
'top': 137.3564,
'width': 25.238006591796875},
{'height': 12.352630615234375,
'left': 509.80286,
'text': '4',
'top': 137.3564,
'width': 30.24798583984375}],
[{'height': 12.35186767578125,
'left': 247.14917,
'text': '21.0',
'top': 149.70903,
'width': 30.773834228515625},
{'height': 12.35186767578125,
'left': 277.923,
'text': '6',
'top': 149.70903,
'width': 24.407867431640625},
{'height': 12.35186767578125,
'left': 302.33087,
'text': '160.0',
'top': 149.70903,
'width': 34.6478271484375},
{'height': 12.35186767578125,
'left': 336.9787,
'text': '110',
'top': 149.70903,
'width': 26.899566650390625},
{'height': 12.35186767578125,
'left': 363.87827,
'text': '3.90',
'top': 149.70903,
'width': 30.24810791015625},
{'height': 12.35186767578125,
'left': 394.12637,
'text': '2.875',
'top': 149.70903,
'width': 34.64752197265625},
{'height': 12.35186767578125,
'left': 428.7739,
'text': '17.02',
'top': 149.70903,
'width': 34.64801025390625},
{'height': 12.35186767578125,
'left': 463.4219,
'text': '0',
'top': 149.70903,
'width': 21.1429443359375},
{'height': 12.35186767578125,
'left': 484.56485,
'text': '1',
'top': 149.70903,
'width': 25.238006591796875},
{'height': 12.35186767578125,
'left': 509.80286,
'text': '4',
'top': 149.70903,
'width': 30.24798583984375}],
[{'height': 12.35662841796875,
'left': 247.14917,
'text': '22.8',
'top': 162.0609,
'width': 30.773834228515625},
{'height': 12.35662841796875,
'left': 277.923,
'text': '4',
'top': 162.0609,
'width': 24.407867431640625},
{'height': 12.35662841796875,
'left': 302.33087,
'text': '108.0',
'top': 162.0609,
'width': 34.6478271484375},
{'height': 12.35662841796875,
'left': 336.9787,
'text': '93',
'top': 162.0609,
'width': 26.899566650390625},
{'height': 12.35662841796875,
'left': 363.87827,
'text': '3.85',
'top': 162.0609,
'width': 30.24810791015625},
{'height': 12.35662841796875,
'left': 394.12637,
'text': '2.320',
'top': 162.0609,
'width': 34.64752197265625},
{'height': 12.35662841796875,
'left': 428.7739,
'text': '18.61',
'top': 162.0609,
'width': 34.64801025390625},
{'height': 12.35662841796875,
'left': 463.4219,
'text': '1',
'top': 162.0609,
'width': 21.1429443359375},
{'height': 12.35662841796875,
'left': 484.56485,
'text': '1',
'top': 162.0609,
'width': 25.238006591796875},
{'height': 12.35662841796875,
'left': 509.80286,
'text': '4',
'top': 162.0609,
'width': 30.24798583984375}],
[{'height': 12.352005004882812,
'left': 247.14917,
'text': '21.4',
'top': 174.41753,
'width': 30.773834228515625},
{'height': 12.352005004882812,
'left': 277.923,
'text': '6',
'top': 174.41753,
'width': 24.407867431640625},
{'height': 12.352005004882812,
'left': 302.33087,
'text': '258.0',
'top': 174.41753,
'width': 34.6478271484375},
{'height': 12.352005004882812,
'left': 336.9787,
'text': '110',
'top': 174.41753,
'width': 26.899566650390625},
{'height': 12.352005004882812,
'left': 363.87827,
'text': '3.08',
'top': 174.41753,
'width': 30.24810791015625},
{'height': 12.352005004882812,
'left': 394.12637,
'text': '3.215',
'top': 174.41753,
'width': 34.64752197265625},
{'height': 12.352005004882812,
'left': 428.7739,
'text': '19.44',
'top': 174.41753,
'width': 34.64801025390625},
{'height': 12.352005004882812,
'left': 463.4219,
'text': '1',
'top': 174.41753,
'width': 21.1429443359375},
{'height': 12.352005004882812,
'left': 484.56485,
'text': '0',
'top': 174.41753,
'width': 25.238006591796875},
{'height': 12.352005004882812,
'left': 509.80286,
'text': '3',
'top': 174.41753,
'width': 30.24798583984375}],
[{'height': 12.351882934570312,
'left': 247.14917,
'text': '18.7',
'top': 186.76953,
'width': 30.773834228515625},
{'height': 12.351882934570312,
'left': 277.923,
'text': '8',
'top': 186.76953,
'width': 24.407867431640625},
{'height': 12.351882934570312,
'left': 302.33087,
'text': '360.0',
'top': 186.76953,
'width': 34.6478271484375},
{'height': 12.351882934570312,
'left': 336.9787,
'text': '175',
'top': 186.76953,
'width': 26.899566650390625},
{'height': 12.351882934570312,
'left': 363.87827,
'text': '3.15',
'top': 186.76953,
'width': 30.24810791015625},
{'height': 12.351882934570312,
'left': 394.12637,
'text': '3.440',
'top': 186.76953,
'width': 34.64752197265625},
{'height': 12.351882934570312,
'left': 428.7739,
'text': '17.02',
'top': 186.76953,
'width': 34.64801025390625},
{'height': 12.351882934570312,
'left': 463.4219,
'text': '0',
'top': 186.76953,
'width': 21.1429443359375},
{'height': 12.351882934570312,
'left': 484.56485,
'text': '0',
'top': 186.76953,
'width': 25.238006591796875},
{'height': 12.351882934570312,
'left': 509.80286,
'text': '3',
'top': 186.76953,
'width': 30.24798583984375}],
[{'height': 12.356597900390625,
'left': 247.14917,
'text': '18.1',
'top': 199.12141,
'width': 30.773834228515625},
{'height': 12.356597900390625,
'left': 277.923,
'text': '6',
'top': 199.12141,
'width': 24.407867431640625},
{'height': 12.356597900390625,
'left': 302.33087,
'text': '225.0',
'top': 199.12141,
'width': 34.6478271484375},
{'height': 12.356597900390625,
'left': 336.9787,
'text': '105',
'top': 199.12141,
'width': 26.899566650390625},
{'height': 12.356597900390625,
'left': 363.87827,
'text': '2.76',
'top': 199.12141,
'width': 30.24810791015625},
{'height': 12.356597900390625,
'left': 394.12637,
'text': '3.460',
'top': 199.12141,
'width': 34.64752197265625},
{'height': 12.356597900390625,
'left': 428.7739,
'text': '20.22',
'top': 199.12141,
'width': 34.64801025390625},
{'height': 12.356597900390625,
'left': 463.4219,
'text': '1',
'top': 199.12141,
'width': 21.1429443359375},
{'height': 12.356597900390625,
'left': 484.56485,
'text': '0',
'top': 199.12141,
'width': 25.238006591796875},
{'height': 12.356597900390625,
'left': 509.80286,
'text': '3',
'top': 199.12141,
'width': 30.24798583984375}],
[{'height': 12.351959228515625,
'left': 247.14917,
'text': '14.3',
'top': 211.47801,
'width': 30.773834228515625},
{'height': 12.351959228515625,
'left': 277.923,
'text': '8',
'top': 211.47801,
'width': 24.407867431640625},
{'height': 12.351959228515625,
'left': 302.33087,
'text': '360.0',
'top': 211.47801,
'width': 34.6478271484375},
{'height': 12.351959228515625,
'left': 336.9787,
'text': '245',
'top': 211.47801,
'width': 26.899566650390625},
{'height': 12.351959228515625,
'left': 363.87827,
'text': '3.21',
'top': 211.47801,
'width': 30.24810791015625},
{'height': 12.351959228515625,
'left': 394.12637,
'text': '3.570',
'top': 211.47801,
'width': 34.64752197265625},
{'height': 12.351959228515625,
'left': 428.7739,
'text': '15.84',
'top': 211.47801,
'width': 34.64801025390625},
{'height': 12.351959228515625,
'left': 463.4219,
'text': '0',
'top': 211.47801,
'width': 21.1429443359375},
{'height': 12.351959228515625,
'left': 484.56485,
'text': '0',
'top': 211.47801,
'width': 25.238006591796875},
{'height': 12.351959228515625,
'left': 509.80286,
'text': '3',
'top': 211.47801,
'width': 30.24798583984375}],
[{'height': 12.351806640625,
'left': 247.14917,
'text': '24.4',
'top': 223.82997,
'width': 30.773834228515625},
{'height': 12.351806640625,
'left': 277.923,
'text': '4',
'top': 223.82997,
'width': 24.407867431640625},
{'height': 12.351806640625,
'left': 302.33087,
'text': '146.7',
'top': 223.82997,
'width': 34.6478271484375},
{'height': 12.351806640625,
'left': 336.9787,
'text': '62',
'top': 223.82997,
'width': 26.899566650390625},
{'height': 12.351806640625,
'left': 363.87827,
'text': '3.69',
'top': 223.82997,
'width': 30.24810791015625},
{'height': 12.351806640625,
'left': 394.12637,
'text': '3.190',
'top': 223.82997,
'width': 34.64752197265625},
{'height': 12.351806640625,
'left': 428.7739,
'text': '20.00',
'top': 223.82997,
'width': 34.64801025390625},
{'height': 12.351806640625,
'left': 463.4219,
'text': '1',
'top': 223.82997,
'width': 21.1429443359375},
{'height': 12.351806640625,
'left': 484.56485,
'text': '0',
'top': 223.82997,
'width': 25.238006591796875},
{'height': 12.351806640625,
'left': 509.80286,
'text': '4',
'top': 223.82997,
'width': 30.24798583984375}],
[{'height': 12.356719970703125,
'left': 247.14917,
'text': '22.8',
'top': 236.18178,
'width': 30.773834228515625},
{'height': 12.356719970703125,
'left': 277.923,
'text': '4',
'top': 236.18178,
'width': 24.407867431640625},
{'height': 12.356719970703125,
'left': 302.33087,
'text': '140.8',
'top': 236.18178,
'width': 34.6478271484375},
{'height': 12.356719970703125,
'left': 336.9787,
'text': '95',
'top': 236.18178,
'width': 26.899566650390625},
{'height': 12.356719970703125,
'left': 363.87827,
'text': '3.92',
'top': 236.18178,
'width': 30.24810791015625},
{'height': 12.356719970703125,
'left': 394.12637,
'text': '3.150',
'top': 236.18178,
'width': 34.64752197265625},
{'height': 12.356719970703125,
'left': 428.7739,
'text': '22.90',
'top': 236.18178,
'width': 34.64801025390625},
{'height': 12.356719970703125,
'left': 463.4219,
'text': '1',
'top': 236.18178,
'width': 21.1429443359375},
{'height': 12.356719970703125,
'left': 484.56485,
'text': '0',
'top': 236.18178,
'width': 25.238006591796875},
{'height': 12.356719970703125,
'left': 509.80286,
'text': '4',
'top': 236.18178,
'width': 30.24798583984375}],
[{'height': 12.351882934570312,
'left': 247.14917,
'text': '19.2',
'top': 248.5385,
'width': 30.773834228515625},
{'height': 12.351882934570312,
'left': 277.923,
'text': '6',
'top': 248.5385,
'width': 24.407867431640625},
{'height': 12.351882934570312,
'left': 302.33087,
'text': '167.6',
'top': 248.5385,
'width': 34.6478271484375},
{'height': 12.351882934570312,
'left': 336.9787,
'text': '123',
'top': 248.5385,
'width': 26.899566650390625},
{'height': 12.351882934570312,
'left': 363.87827,
'text': '3.92',
'top': 248.5385,
'width': 30.24810791015625},
{'height': 12.351882934570312,
'left': 394.12637,
'text': '3.440',
'top': 248.5385,
'width': 34.64752197265625},
{'height': 12.351882934570312,
'left': 428.7739,
'text': '18.30',
'top': 248.5385,
'width': 34.64801025390625},
{'height': 12.351882934570312,
'left': 463.4219,
'text': '1',
'top': 248.5385,
'width': 21.1429443359375},
{'height': 12.351882934570312,
'left': 484.56485,
'text': '0',
'top': 248.5385,
'width': 25.238006591796875},
{'height': 12.351882934570312,
'left': 509.80286,
'text': '4',
'top': 248.5385,
'width': 30.24798583984375}],
[{'height': 12.3564453125,
'left': 247.14917,
'text': '17.8',
'top': 260.89038,
'width': 30.773834228515625},
{'height': 12.3564453125,
'left': 277.923,
'text': '6',
'top': 260.89038,
'width': 24.407867431640625},
{'height': 12.3564453125,
'left': 302.33087,
'text': '167.6',
'top': 260.89038,
'width': 34.6478271484375},
{'height': 12.3564453125,
'left': 336.9787,
'text': '123',
'top': 260.89038,
'width': 26.899566650390625},
{'height': 12.3564453125,
'left': 363.87827,
'text': '3.92',
'top': 260.89038,
'width': 30.24810791015625},
{'height': 12.3564453125,
'left': 394.12637,
'text': '3.440',
'top': 260.89038,
'width': 34.64752197265625},
{'height': 12.3564453125,
'left': 428.7739,
'text': '18.90',
'top': 260.89038,
'width': 34.64801025390625},
{'height': 12.3564453125,
'left': 463.4219,
'text': '1',
'top': 260.89038,
'width': 21.1429443359375},
{'height': 12.3564453125,
'left': 484.56485,
'text': '0',
'top': 260.89038,
'width': 25.238006591796875},
{'height': 12.3564453125,
'left': 509.80286,
'text': '4',
'top': 260.89038,
'width': 30.24798583984375}],
[{'height': 12.352142333984375,
'left': 247.14917,
'text': '16.4',
'top': 273.24683,
'width': 30.773834228515625},
{'height': 12.352142333984375,
'left': 277.923,
'text': '8',
'top': 273.24683,
'width': 24.407867431640625},
{'height': 12.352142333984375,
'left': 302.33087,
'text': '275.8',
'top': 273.24683,
'width': 34.6478271484375},
{'height': 12.352142333984375,
'left': 336.9787,
'text': '180',
'top': 273.24683,
'width': 26.899566650390625},
{'height': 12.352142333984375,
'left': 363.87827,
'text': '3.07',
'top': 273.24683,
'width': 30.24810791015625},
{'height': 12.352142333984375,
'left': 394.12637,
'text': '4.070',
'top': 273.24683,
'width': 34.64752197265625},
{'height': 12.352142333984375,
'left': 428.7739,
'text': '17.40',
'top': 273.24683,
'width': 34.64801025390625},
{'height': 12.352142333984375,
'left': 463.4219,
'text': '0',
'top': 273.24683,
'width': 21.1429443359375},
{'height': 12.352142333984375,
'left': 484.56485,
'text': '0',
'top': 273.24683,
'width': 25.238006591796875},
{'height': 12.352142333984375,
'left': 509.80286,
'text': '3',
'top': 273.24683,
'width': 30.24798583984375}],
[{'height': 12.35186767578125,
'left': 247.14917,
'text': '17.3',
'top': 285.59897,
'width': 30.773834228515625},
{'height': 12.35186767578125,
'left': 277.923,
'text': '8',
'top': 285.59897,
'width': 24.407867431640625},
{'height': 12.35186767578125,
'left': 302.33087,
'text': '275.8',
'top': 285.59897,
'width': 34.6478271484375},
{'height': 12.35186767578125,
'left': 336.9787,
'text': '180',
'top': 285.59897,
'width': 26.899566650390625},
{'height': 12.35186767578125,
'left': 363.87827,
'text': '3.07',
'top': 285.59897,
'width': 30.24810791015625},
{'height': 12.35186767578125,
'left': 394.12637,
'text': '3.730',
'top': 285.59897,
'width': 34.64752197265625},
{'height': 12.35186767578125,
'left': 428.7739,
'text': '17.60',
'top': 285.59897,
'width': 34.64801025390625},
{'height': 12.35186767578125,
'left': 463.4219,
'text': '0',
'top': 285.59897,
'width': 21.1429443359375},
{'height': 12.35186767578125,
'left': 484.56485,
'text': '0',
'top': 285.59897,
'width': 25.238006591796875},
{'height': 12.35186767578125,
'left': 509.80286,
'text': '3',
'top': 285.59897,
'width': 30.24798583984375}],
[{'height': 12.357086181640625,
'left': 247.14917,
'text': '15.2',
'top': 297.95084,
'width': 30.773834228515625},
{'height': 12.357086181640625,
'left': 277.923,
'text': '8',
'top': 297.95084,
'width': 24.407867431640625},
{'height': 12.357086181640625,
'left': 302.33087,
'text': '275.8',
'top': 297.95084,
'width': 34.6478271484375},
{'height': 12.357086181640625,
'left': 336.9787,
'text': '180',
'top': 297.95084,
'width': 26.899566650390625},
{'height': 12.357086181640625,
'left': 363.87827,
'text': '3.07',
'top': 297.95084,
'width': 30.24810791015625},
{'height': 12.357086181640625,
'left': 394.12637,
'text': '3.780',
'top': 297.95084,
'width': 34.64752197265625},
{'height': 12.357086181640625,
'left': 428.7739,
'text': '18.00',
'top': 297.95084,
'width': 34.64801025390625},
{'height': 12.357086181640625,
'left': 463.4219,
'text': '0',
'top': 297.95084,
'width': 21.1429443359375},
{'height': 12.357086181640625,
'left': 484.56485,
'text': '0',
'top': 297.95084,
'width': 25.238006591796875},
{'height': 12.357086181640625,
'left': 509.80286,
'text': '3',
'top': 297.95084,
'width': 30.24798583984375}],
[{'height': 12.3515625,
'left': 247.14917,
'text': '10.4',
'top': 310.30792,
'width': 30.773834228515625},
{'height': 12.3515625,
'left': 277.923,
'text': '8',
'top': 310.30792,
'width': 24.407867431640625},
{'height': 12.3515625,
'left': 302.33087,
'text': '472.0',
'top': 310.30792,
'width': 34.6478271484375},
{'height': 12.3515625,
'left': 336.9787,
'text': '205',
'top': 310.30792,
'width': 26.899566650390625},
{'height': 12.3515625,
'left': 363.87827,
'text': '2.93',
'top': 310.30792,
'width': 30.24810791015625},
{'height': 12.3515625,
'left': 394.12637,
'text': '5.250',
'top': 310.30792,
'width': 34.64752197265625},
{'height': 12.3515625,
'left': 428.7739,
'text': '17.98',
'top': 310.30792,
'width': 34.64801025390625},
{'height': 12.3515625,
'left': 463.4219,
'text': '0',
'top': 310.30792,
'width': 21.1429443359375},
{'height': 12.3515625,
'left': 484.56485,
'text': '0',
'top': 310.30792,
'width': 25.238006591796875},
{'height': 12.3515625,
'left': 509.80286,
'text': '3',
'top': 310.30792,
'width': 30.24798583984375}],
[{'height': 12.35186767578125,
'left': 247.14917,
'text': '10.4',
'top': 322.6595,
'width': 30.773834228515625},
{'height': 12.35186767578125,
'left': 277.923,
'text': '8',
'top': 322.6595,
'width': 24.407867431640625},
{'height': 12.35186767578125,
'left': 302.33087,
'text': '460.0',
'top': 322.6595,
'width': 34.6478271484375},
{'height': 12.35186767578125,
'left': 336.9787,
'text': '215',
'top': 322.6595,
'width': 26.899566650390625},
{'height': 12.35186767578125,
'left': 363.87827,
'text': '3.00',
'top': 322.6595,
'width': 30.24810791015625},
{'height': 12.35186767578125,
'left': 394.12637,
'text': '5.424',
'top': 322.6595,
'width': 34.64752197265625},
{'height': 12.35186767578125,
'left': 428.7739,
'text': '17.82',
'top': 322.6595,
'width': 34.64801025390625},
{'height': 12.35186767578125,
'left': 463.4219,
'text': '0',
'top': 322.6595,
'width': 21.1429443359375},
{'height': 12.35186767578125,
'left': 484.56485,
'text': '0',
'top': 322.6595,
'width': 25.238006591796875},
{'height': 12.35186767578125,
'left': 509.80286,
'text': '3',
'top': 322.6595,
'width': 30.24798583984375}],
[{'height': 12.357086181640625,
'left': 247.14917,
'text': '14.7',
'top': 335.01135,
'width': 30.773834228515625},
{'height': 12.357086181640625,
'left': 277.923,
'text': '8',
'top': 335.01135,
'width': 24.407867431640625},
{'height': 12.357086181640625,
'left': 302.33087,
'text': '440.0',
'top': 335.01135,
'width': 34.6478271484375},
{'height': 12.357086181640625,
'left': 336.9787,
'text': '230',
'top': 335.01135,
'width': 26.899566650390625},
{'height': 12.357086181640625,
'left': 363.87827,
'text': '3.23',
'top': 335.01135,
'width': 30.24810791015625},
{'height': 12.357086181640625,
'left': 394.12637,
'text': '5.345',
'top': 335.01135,
'width': 34.64752197265625},
{'height': 12.357086181640625,
'left': 428.7739,
'text': '17.42',
'top': 335.01135,
'width': 34.64801025390625},
{'height': 12.357086181640625,
'left': 463.4219,
'text': '0',
'top': 335.01135,
'width': 21.1429443359375},
{'height': 12.357086181640625,
'left': 484.56485,
'text': '0',
'top': 335.01135,
'width': 25.238006591796875},
{'height': 12.357086181640625,
'left': 509.80286,
'text': '3',
'top': 335.01135,
'width': 30.24798583984375}],
[{'height': 12.351531982421875,
'left': 247.14917,
'text': '32.4',
'top': 347.36844,
'width': 30.773834228515625},
{'height': 12.351531982421875,
'left': 277.923,
'text': '4',
'top': 347.36844,
'width': 24.407867431640625},
{'height': 12.351531982421875,
'left': 302.33087,
'text': '78.7',
'top': 347.36844,
'width': 34.6478271484375},
{'height': 12.351531982421875,
'left': 336.9787,
'text': '66',
'top': 347.36844,
'width': 26.899566650390625},
{'height': 12.351531982421875,
'left': 363.87827,
'text': '4.08',
'top': 347.36844,
'width': 30.24810791015625},
{'height': 12.351531982421875,
'left': 394.12637,
'text': '2.200',
'top': 347.36844,
'width': 34.64752197265625},
{'height': 12.351531982421875,
'left': 428.7739,
'text': '19.47',
'top': 347.36844,
'width': 34.64801025390625},
{'height': 12.351531982421875,
'left': 463.4219,
'text': '1',
'top': 347.36844,
'width': 21.1429443359375},
{'height': 12.351531982421875,
'left': 484.56485,
'text': '1',
'top': 347.36844,
'width': 25.238006591796875},
{'height': 12.351531982421875,
'left': 509.80286,
'text': '4',
'top': 347.36844,
'width': 30.24798583984375}],
[{'height': 12.3570556640625,
'left': 247.14917,
'text': '30.4',
'top': 359.71997,
'width': 30.773834228515625},
{'height': 12.3570556640625,
'left': 277.923,
'text': '4',
'top': 359.71997,
'width': 24.407867431640625},
{'height': 12.3570556640625,
'left': 302.33087,
'text': '75.7',
'top': 359.71997,
'width': 34.6478271484375},
{'height': 12.3570556640625,
'left': 336.9787,
'text': '52',
'top': 359.71997,
'width': 26.899566650390625},
{'height': 12.3570556640625,
'left': 363.87827,
'text': '4.93',
'top': 359.71997,
'width': 30.24810791015625},
{'height': 12.3570556640625,
'left': 394.12637,
'text': '1.615',
'top': 359.71997,
'width': 34.64752197265625},
{'height': 12.3570556640625,
'left': 428.7739,
'text': '18.52',
'top': 359.71997,
'width': 34.64801025390625},
{'height': 12.3570556640625,
'left': 463.4219,
'text': '1',
'top': 359.71997,
'width': 21.1429443359375},
{'height': 12.3570556640625,
'left': 484.56485,
'text': '1',
'top': 359.71997,
'width': 25.238006591796875},
{'height': 12.3570556640625,
'left': 509.80286,
'text': '4',
'top': 359.71997,
'width': 30.24798583984375}],
[{'height': 12.351959228515625,
'left': 247.14917,
'text': '33.9',
'top': 372.07703,
'width': 30.773834228515625},
{'height': 12.351959228515625,
'left': 277.923,
'text': '4',
'top': 372.07703,
'width': 24.407867431640625},
{'height': 12.351959228515625,
'left': 302.33087,
'text': '71.1',
'top': 372.07703,
'width': 34.6478271484375},
{'height': 12.351959228515625,
'left': 336.9787,
'text': '65',
'top': 372.07703,
'width': 26.899566650390625},
{'height': 12.351959228515625,
'left': 363.87827,
'text': '4.22',
'top': 372.07703,
'width': 30.24810791015625},
{'height': 12.351959228515625,
'left': 394.12637,
'text': '1.835',
'top': 372.07703,
'width': 34.64752197265625},
{'height': 12.351959228515625,
'left': 428.7739,
'text': '19.90',
'top': 372.07703,
'width': 34.64801025390625},
{'height': 12.351959228515625,
'left': 463.4219,
'text': '1',
'top': 372.07703,
'width': 21.1429443359375},
{'height': 12.351959228515625,
'left': 484.56485,
'text': '1',
'top': 372.07703,
'width': 25.238006591796875},
{'height': 12.351959228515625,
'left': 509.80286,
'text': '4',
'top': 372.07703,
'width': 30.24798583984375}],
[{'height': 12.351531982421875,
'left': 247.14917,
'text': '21.5',
'top': 384.429,
'width': 30.773834228515625},
{'height': 12.351531982421875,
'left': 277.923,
'text': '4',
'top': 384.429,
'width': 24.407867431640625},
{'height': 12.351531982421875,
'left': 302.33087,
'text': '120.1',
'top': 384.429,
'width': 34.6478271484375},
{'height': 12.351531982421875,
'left': 336.9787,
'text': '97',
'top': 384.429,
'width': 26.899566650390625},
{'height': 12.351531982421875,
'left': 363.87827,
'text': '3.70',
'top': 384.429,
'width': 30.24810791015625},
{'height': 12.351531982421875,
'left': 394.12637,
'text': '2.465',
'top': 384.429,
'width': 34.64752197265625},
{'height': 12.351531982421875,
'left': 428.7739,
'text': '20.01',
'top': 384.429,
'width': 34.64801025390625},
{'height': 12.351531982421875,
'left': 463.4219,
'text': '1',
'top': 384.429,
'width': 21.1429443359375},
{'height': 12.351531982421875,
'left': 484.56485,
'text': '0',
'top': 384.429,
'width': 25.238006591796875},
{'height': 12.351531982421875,
'left': 509.80286,
'text': '3',
'top': 384.429,
'width': 30.24798583984375}],
[{'height': 12.357025146484375,
'left': 247.14917,
'text': '15.5',
'top': 396.78052,
'width': 30.773834228515625},
{'height': 12.357025146484375,
'left': 277.923,
'text': '8',
'top': 396.78052,
'width': 24.407867431640625},
{'height': 12.357025146484375,
'left': 302.33087,
'text': '318.0',
'top': 396.78052,
'width': 34.6478271484375},
{'height': 12.357025146484375,
'left': 336.9787,
'text': '150',
'top': 396.78052,
'width': 26.899566650390625},
{'height': 12.357025146484375,
'left': 363.87827,
'text': '2.76',
'top': 396.78052,
'width': 30.24810791015625},
{'height': 12.357025146484375,
'left': 394.12637,
'text': '3.520',
'top': 396.78052,
'width': 34.64752197265625},
{'height': 12.357025146484375,
'left': 428.7739,
'text': '16.87',
'top': 396.78052,
'width': 34.64801025390625},
{'height': 12.357025146484375,
'left': 463.4219,
'text': '0',
'top': 396.78052,
'width': 21.1429443359375},
{'height': 12.357025146484375,
'left': 484.56485,
'text': '0',
'top': 396.78052,
'width': 25.238006591796875},
{'height': 12.357025146484375,
'left': 509.80286,
'text': '3',
'top': 396.78052,
'width': 30.24798583984375}],
[{'height': 12.35205078125,
'left': 247.14917,
'text': '15.2',
'top': 409.13754,
'width': 30.773834228515625},
{'height': 12.35205078125,
'left': 277.923,
'text': '8',
'top': 409.13754,
'width': 24.407867431640625},
{'height': 12.35205078125,
'left': 302.33087,
'text': '304.0',
'top': 409.13754,
'width': 34.6478271484375},
{'height': 12.35205078125,
'left': 336.9787,
'text': '150',
'top': 409.13754,
'width': 26.899566650390625},
{'height': 12.35205078125,
'left': 363.87827,
'text': '3.15',
'top': 409.13754,
'width': 30.24810791015625},
{'height': 12.35205078125,
'left': 394.12637,
'text': '3.435',
'top': 409.13754,
'width': 34.64752197265625},
{'height': 12.35205078125,
'left': 428.7739,
'text': '17.30',
'top': 409.13754,
'width': 34.64801025390625},
{'height': 12.35205078125,
'left': 463.4219,
'text': '0',
'top': 409.13754,
'width': 21.1429443359375},
{'height': 12.35205078125,
'left': 484.56485,
'text': '0',
'top': 409.13754,
'width': 25.238006591796875},
{'height': 12.35205078125,
'left': 509.80286,
'text': '3',
'top': 409.13754,
'width': 30.24798583984375}],
[{'height': 12.351348876953125,
'left': 247.14917,
'text': '13.3',
'top': 421.4896,
'width': 30.773834228515625},
{'height': 12.351348876953125,
'left': 277.923,
'text': '8',
'top': 421.4896,
'width': 24.407867431640625},
{'height': 12.351348876953125,
'left': 302.33087,
'text': '350.0',
'top': 421.4896,
'width': 34.6478271484375},
{'height': 12.351348876953125,
'left': 336.9787,
'text': '245',
'top': 421.4896,
'width': 26.899566650390625},
{'height': 12.351348876953125,
'left': 363.87827,
'text': '3.73',
'top': 421.4896,
'width': 30.24810791015625},
{'height': 12.351348876953125,
'left': 394.12637,
'text': '3.840',
'top': 421.4896,
'width': 34.64752197265625},
{'height': 12.351348876953125,
'left': 428.7739,
'text': '15.41',
'top': 421.4896,
'width': 34.64801025390625},
{'height': 12.351348876953125,
'left': 463.4219,
'text': '0',
'top': 421.4896,
'width': 21.1429443359375},
{'height': 12.351348876953125,
'left': 484.56485,
'text': '0',
'top': 421.4896,
'width': 25.238006591796875},
{'height': 12.351348876953125,
'left': 509.80286,
'text': '3',
'top': 421.4896,
'width': 30.24798583984375}],
[{'height': 12.357208251953125,
'left': 247.14917,
'text': '19.2',
'top': 433.84094,
'width': 30.773834228515625},
{'height': 12.357208251953125,
'left': 277.923,
'text': '8',
'top': 433.84094,
'width': 24.407867431640625},
{'height': 12.357208251953125,
'left': 302.33087,
'text': '400.0',
'top': 433.84094,
'width': 34.6478271484375},
{'height': 12.357208251953125,
'left': 336.9787,
'text': '175',
'top': 433.84094,
'width': 26.899566650390625},
{'height': 12.357208251953125,
'left': 363.87827,
'text': '3.08',
'top': 433.84094,
'width': 30.24810791015625},
{'height': 12.357208251953125,
'left': 394.12637,
'text': '3.845',
'top': 433.84094,
'width': 34.64752197265625},
{'height': 12.357208251953125,
'left': 428.7739,
'text': '17.05',
'top': 433.84094,
'width': 34.64801025390625},
{'height': 12.357208251953125,
'left': 463.4219,
'text': '0',
'top': 433.84094,
'width': 21.1429443359375},
{'height': 12.357208251953125,
'left': 484.56485,
'text': '0',
'top': 433.84094,
'width': 25.238006591796875},
{'height': 12.357208251953125,
'left': 509.80286,
'text': '3',
'top': 433.84094,
'width': 30.24798583984375}],
[{'height': 12.35186767578125,
'left': 247.14917,
'text': '27.3',
'top': 446.19815,
'width': 30.773834228515625},
{'height': 12.35186767578125,
'left': 277.923,
'text': '4',
'top': 446.19815,
'width': 24.407867431640625},
{'height': 12.35186767578125,
'left': 302.33087,
'text': '79.0',
'top': 446.19815,
'width': 34.6478271484375},
{'height': 12.35186767578125,
'left': 336.9787,
'text': '66',
'top': 446.19815,
'width': 26.899566650390625},
{'height': 12.35186767578125,
'left': 363.87827,
'text': '4.08',
'top': 446.19815,
'width': 30.24810791015625},
{'height': 12.35186767578125,
'left': 394.12637,
'text': '1.935',
'top': 446.19815,
'width': 34.64752197265625},
{'height': 12.35186767578125,
'left': 428.7739,
'text': '18.90',
'top': 446.19815,
'width': 34.64801025390625},
{'height': 12.35186767578125,
'left': 463.4219,
'text': '1',
'top': 446.19815,
'width': 21.1429443359375},
{'height': 12.35186767578125,
'left': 484.56485,
'text': '1',
'top': 446.19815,
'width': 25.238006591796875},
{'height': 12.35186767578125,
'left': 509.80286,
'text': '4',
'top': 446.19815,
'width': 30.24798583984375}],
[{'height': 12.35150146484375,
'left': 247.14917,
'text': '26.0',
'top': 458.55002,
'width': 30.773834228515625},
{'height': 12.35150146484375,
'left': 277.923,
'text': '4',
'top': 458.55002,
'width': 24.407867431640625},
{'height': 12.35150146484375,
'left': 302.33087,
'text': '120.3',
'top': 458.55002,
'width': 34.6478271484375},
{'height': 12.35150146484375,
'left': 336.9787,
'text': '91',
'top': 458.55002,
'width': 26.899566650390625},
{'height': 12.35150146484375,
'left': 363.87827,
'text': '4.43',
'top': 458.55002,
'width': 30.24810791015625},
{'height': 12.35150146484375,
'left': 394.12637,
'text': '2.140',
'top': 458.55002,
'width': 34.64752197265625},
{'height': 12.35150146484375,
'left': 428.7739,
'text': '16.70',
'top': 458.55002,
'width': 34.64801025390625},
{'height': 12.35150146484375,
'left': 463.4219,
'text': '0',
'top': 458.55002,
'width': 21.1429443359375},
{'height': 12.35150146484375,
'left': 484.56485,
'text': '1',
'top': 458.55002,
'width': 25.238006591796875},
{'height': 12.35150146484375,
'left': 509.80286,
'text': '5',
'top': 458.55002,
'width': 30.24798583984375}],
[{'height': 12.357025146484375,
'left': 247.14917,
'text': '30.4',
'top': 470.90152,
'width': 30.773834228515625},
{'height': 12.357025146484375,
'left': 277.923,
'text': '4',
'top': 470.90152,
'width': 24.407867431640625},
{'height': 12.357025146484375,
'left': 302.33087,
'text': '95.1',
'top': 470.90152,
'width': 34.6478271484375},
{'height': 12.357025146484375,
'left': 336.9787,
'text': '113',
'top': 470.90152,
'width': 26.899566650390625},
{'height': 12.357025146484375,
'left': 363.87827,
'text': '3.77',
'top': 470.90152,
'width': 30.24810791015625},
{'height': 12.357025146484375,
'left': 394.12637,
'text': '1.513',
'top': 470.90152,
'width': 34.64752197265625},
{'height': 12.357025146484375,
'left': 428.7739,
'text': '16.90',
'top': 470.90152,
'width': 34.64801025390625},
{'height': 12.357025146484375,
'left': 463.4219,
'text': '1',
'top': 470.90152,
'width': 21.1429443359375},
{'height': 12.357025146484375,
'left': 484.56485,
'text': '1',
'top': 470.90152,
'width': 25.238006591796875},
{'height': 12.357025146484375,
'left': 509.80286,
'text': '5',
'top': 470.90152,
'width': 30.24798583984375}],
[{'height': 12.351776123046875,
'left': 247.14917,
'text': '15.8',
'top': 483.25854,
'width': 30.773834228515625},
{'height': 12.351776123046875,
'left': 277.923,
'text': '8',
'top': 483.25854,
'width': 24.407867431640625},
{'height': 12.351776123046875,
'left': 302.33087,
'text': '351.0',
'top': 483.25854,
'width': 34.6478271484375},
{'height': 12.351776123046875,
'left': 336.9787,
'text': '264',
'top': 483.25854,
'width': 26.899566650390625},
{'height': 12.351776123046875,
'left': 363.87827,
'text': '4.22',
'top': 483.25854,
'width': 30.24810791015625},
{'height': 12.351776123046875,
'left': 394.12637,
'text': '3.170',
'top': 483.25854,
'width': 34.64752197265625},
{'height': 12.351776123046875,
'left': 428.7739,
'text': '14.50',
'top': 483.25854,
'width': 34.64801025390625},
{'height': 12.351776123046875,
'left': 463.4219,
'text': '0',
'top': 483.25854,
'width': 21.1429443359375},
{'height': 12.351776123046875,
'left': 484.56485,
'text': '1',
'top': 483.25854,
'width': 25.238006591796875},
{'height': 12.351776123046875,
'left': 509.80286,
'text': '5',
'top': 483.25854,
'width': 30.24798583984375}],
[{'height': 12.356109619140625,
'left': 247.14917,
'text': '19.7',
'top': 495.61032,
'width': 30.773834228515625},
{'height': 12.356109619140625,
'left': 277.923,
'text': '6',
'top': 495.61032,
'width': 24.407867431640625},
{'height': 12.356109619140625,
'left': 302.33087,
'text': '145.0',
'top': 495.61032,
'width': 34.6478271484375},
{'height': 12.356109619140625,
'left': 336.9787,
'text': '175',
'top': 495.61032,
'width': 26.899566650390625},
{'height': 12.356109619140625,
'left': 363.87827,
'text': '3.62',
'top': 495.61032,
'width': 30.24810791015625},
{'height': 12.356109619140625,
'left': 394.12637,
'text': '2.770',
'top': 495.61032,
'width': 34.64752197265625},
{'height': 12.356109619140625,
'left': 428.7739,
'text': '15.50',
'top': 495.61032,
'width': 34.64801025390625},
{'height': 12.356109619140625,
'left': 463.4219,
'text': '0',
'top': 495.61032,
'width': 21.1429443359375},
{'height': 12.356109619140625,
'left': 484.56485,
'text': '1',
'top': 495.61032,
'width': 25.238006591796875},
{'height': 12.356109619140625,
'left': 509.80286,
'text': '5',
'top': 495.61032,
'width': 30.24798583984375}],
[{'height': 12.35247802734375,
'left': 247.14917,
'text': '15.0',
'top': 507.96643,
'width': 30.773834228515625},
{'height': 12.35247802734375,
'left': 277.923,
'text': '8',
'top': 507.96643,
'width': 24.407867431640625},
{'height': 12.35247802734375,
'left': 302.33087,
'text': '301.0',
'top': 507.96643,
'width': 34.6478271484375},
{'height': 12.35247802734375,
'left': 336.9787,
'text': '335',
'top': 507.96643,
'width': 26.899566650390625},
{'height': 12.35247802734375,
'left': 363.87827,
'text': '3.54',
'top': 507.96643,
'width': 30.24810791015625},
{'height': 12.35247802734375,
'left': 394.12637,
'text': '3.570',
'top': 507.96643,
'width': 34.64752197265625},
{'height': 12.35247802734375,
'left': 428.7739,
'text': '14.60',
'top': 507.96643,
'width': 34.64801025390625},
{'height': 12.35247802734375,
'left': 463.4219,
'text': '0',
'top': 507.96643,
'width': 21.1429443359375},
{'height': 12.35247802734375,
'left': 484.56485,
'text': '1',
'top': 507.96643,
'width': 25.238006591796875},
{'height': 12.35247802734375,
'left': 509.80286,
'text': '5',
'top': 507.96643,
'width': 30.24798583984375}]],
'extraction_method': 'lattice',
'height': 395.14886474609375,
'left': 247.14917,
'right': 540.05084,
'top': 125.17005,
'width': 292.90167236328125}]
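The nested output above is a list of tables. Each table holds a 'data' list of rows, and each row is a list of cell dicts carrying a 'text' value plus its coordinates. The following is a minimal sketch of flattening one such table into plain rows. The helper name `table_to_rows` and the trimmed `sample` (3 of the 10 columns, coordinates omitted) are illustrative assumptions and not part of tabula-py.

```python
def table_to_rows(table):
    """Return the cell texts of one tabula JSON table as a list of rows."""
    return [[cell["text"] for cell in row] for row in table["data"]]


# Hand-trimmed sample mirroring the structure shown above:
# only the first three columns, with the coordinate keys left out.
sample = [{
    "extraction_method": "lattice",
    "data": [
        [{"text": "mpg"}, {"text": "cyl"}, {"text": "disp"}],
        [{"text": "21.0"}, {"text": "6"}, {"text": "160.0"}],
        [{"text": "22.8"}, {"text": "4"}, {"text": "108.0"}],
    ],
}]

# Treat the first row as the header and the remaining rows as the body.
header, *body = table_to_rows(sample[0])
records = [dict(zip(header, row)) for row in body]

print(header)
print(records[0])
```

Every value comes back as a string, so numeric columns still need an explicit conversion such as `float()`. If pandas is available, `pd.DataFrame(body, columns=header)` should give a DataFrame from the same rows.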
In [14]:
# convert_into writes the extracted tables straight to a file; output_format can be "json", "csv", or "tsv"
tabula.convert_into(pdf_path, "test.json", output_format="json")
!cat test.json
'pages' argument isn't specified.Will extract only from page 1 by default.
Got stderr: Jun 04, 2020 8:24:28 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font Symbol
Jun 04, 2020 8:24:28 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font ZapfDingbats
[{"extraction_method":"lattice","top":125.17005,"left":247.14917,"width":292.90167236328125,"height":395.14886474609375,"right":540.05084,"bottom":520.3189,"data":[[{"top":125.17005,"left":247.14917,"width":30.773834228515625,"height":12.186347961425781,"text":"mpg"},{"top":125.17005,"left":277.923,"width":24.407867431640625,"height":12.186347961425781,"text":"cyl"},{"top":125.17005,"left":302.33087,"width":34.6478271484375,"height":12.186347961425781,"text":"disp"},{"top":125.17005,"left":336.9787,"width":26.899566650390625,"height":12.186347961425781,"text":"hp"},{"top":125.17005,"left":363.87827,"width":30.24810791015625,"height":12.186347961425781,"text":"drat"},{"top":125.17005,"left":394.12637,"width":34.64752197265625,"height":12.186347961425781,"text":"wt"},{"top":125.17005,"left":428.7739,"width":34.64801025390625,"height":12.186347961425781,"text":"qsec"},{"top":125.17005,"left":463.4219,"width":21.1429443359375,"height":12.186347961425781,"text":"vs"},{"top":125.17005,"left":484.56485,"width":25.238006591796875,"height":12.186347961425781,"text":"am"},{"top":125.17005,"left":509.80286,"width":30.24798583984375,"height":12.186347961425781,"text":"gear"}],[{"top":137.3564,"left":247.14917,"width":30.773834228515625,"height":12.352630615234375,"text":"21.0"},{"top":137.3564,"left":277.923,"width":24.407867431640625,"height":12.352630615234375,"text":"6"},{"top":137.3564,"left":302.33087,"width":34.6478271484375,"height":12.352630615234375,"text":"160.0"},{"top":137.3564,"left":336.9787,"width":26.899566650390625,"height":12.352630615234375,"text":"110"},{"top":137.3564,"left":363.87827,"width":30.24810791015625,"height":12.352630615234375,"text":"3.90"},{"top":137.3564,"left":394.12637,"width":34.64752197265625,"height":12.352630615234375,"text":"2.620"},{"top":137.3564,"left":428.7739,"width":34.64801025390625,"height":12.352630615234375,"text":"16.46"},{"top":137.3564,"left":463.4219,"width":21.1429443359375,"height":12.352630615234375,"text":"0"},{"top":1
37.3564,"left":484.56485,"width":25.238006591796875,"height":12.352630615234375,"text":"1"},{"top":137.3564,"left":509.80286,"width":30.24798583984375,"height":12.352630615234375,"text":"4"}],[{"top":149.70903,"left":247.14917,"width":30.773834228515625,"height":12.35186767578125,"text":"21.0"},{"top":149.70903,"left":277.923,"width":24.407867431640625,"height":12.35186767578125,"text":"6"},{"top":149.70903,"left":302.33087,"width":34.6478271484375,"height":12.35186767578125,"text":"160.0"},{"top":149.70903,"left":336.9787,"width":26.899566650390625,"height":12.35186767578125,"text":"110"},{"top":149.70903,"left":363.87827,"width":30.24810791015625,"height":12.35186767578125,"text":"3.90"},{"top":149.70903,"left":394.12637,"width":34.64752197265625,"height":12.35186767578125,"text":"2.875"},{"top":149.70903,"left":428.7739,"width":34.64801025390625,"height":12.35186767578125,"text":"17.02"},{"top":149.70903,"left":463.4219,"width":21.1429443359375,"height":12.35186767578125,"text":"0"},{"top":149.70903,"left":484.56485,"width":25.238006591796875,"height":12.35186767578125,"text":"1"},{"top":149.70903,"left":509.80286,"width":30.24798583984375,"height":12.35186767578125,"text":"4"}],[{"top":162.0609,"left":247.14917,"width":30.773834228515625,"height":12.35662841796875,"text":"22.8"},{"top":162.0609,"left":277.923,"width":24.407867431640625,"height":12.35662841796875,"text":"4"},{"top":162.0609,"left":302.33087,"width":34.6478271484375,"height":12.35662841796875,"text":"108.0"},{"top":162.0609,"left":336.9787,"width":26.899566650390625,"height":12.35662841796875,"text":"93"},{"top":162.0609,"left":363.87827,"width":30.24810791015625,"height":12.35662841796875,"text":"3.85"},{"top":162.0609,"left":394.12637,"width":34.64752197265625,"height":12.35662841796875,"text":"2.320"},{"top":162.0609,"left":428.7739,"width":34.64801025390625,"height":12.35662841796875,"text":"18.61"},{"top":162.0609,"left":463.4219,"width":21.1429443359375,"height":12.35662841796875,"text":"1"}
,{"top":162.0609,"left":484.56485,"width":25.238006591796875,"height":12.35662841796875,"text":"1"},{"top":162.0609,"left":509.80286,"width":30.24798583984375,"height":12.35662841796875,"text":"4"}],[{"top":174.41753,"left":247.14917,"width":30.773834228515625,"height":12.352005004882812,"text":"21.4"},{"top":174.41753,"left":277.923,"width":24.407867431640625,"height":12.352005004882812,"text":"6"},{"top":174.41753,"left":302.33087,"width":34.6478271484375,"height":12.352005004882812,"text":"258.0"},{"top":174.41753,"left":336.9787,"width":26.899566650390625,"height":12.352005004882812,"text":"110"},{"top":174.41753,"left":363.87827,"width":30.24810791015625,"height":12.352005004882812,"text":"3.08"},{"top":174.41753,"left":394.12637,"width":34.64752197265625,"height":12.352005004882812,"text":"3.215"},{"top":174.41753,"left":428.7739,"width":34.64801025390625,"height":12.352005004882812,"text":"19.44"},{"top":174.41753,"left":463.4219,"width":21.1429443359375,"height":12.352005004882812,"text":"1"},{"top":174.41753,"left":484.56485,"width":25.238006591796875,"height":12.352005004882812,"text":"0"},{"top":174.41753,"left":509.80286,"width":30.24798583984375,"height":12.352005004882812,"text":"3"}],[{"top":186.76953,"left":247.14917,"width":30.773834228515625,"height":12.351882934570312,"text":"18.7"},{"top":186.76953,"left":277.923,"width":24.407867431640625,"height":12.351882934570312,"text":"8"},{"top":186.76953,"left":302.33087,"width":34.6478271484375,"height":12.351882934570312,"text":"360.0"},{"top":186.76953,"left":336.9787,"width":26.899566650390625,"height":12.351882934570312,"text":"175"},{"top":186.76953,"left":363.87827,"width":30.24810791015625,"height":12.351882934570312,"text":"3.15"},{"top":186.76953,"left":394.12637,"width":34.64752197265625,"height":12.351882934570312,"text":"3.440"},{"top":186.76953,"left":428.7739,"width":34.64801025390625,"height":12.351882934570312,"text":"17.02"},{"top":186.76953,"left":463.4219,"width":21.1429443359375,"heig
ht":12.351882934570312,"text":"0"},{"top":186.76953,"left":484.56485,"width":25.238006591796875,"height":12.351882934570312,"text":"0"},{"top":186.76953,"left":509.80286,"width":30.24798583984375,"height":12.351882934570312,"text":"3"}],[{"top":199.12141,"left":247.14917,"width":30.773834228515625,"height":12.356597900390625,"text":"18.1"},{"top":199.12141,"left":277.923,"width":24.407867431640625,"height":12.356597900390625,"text":"6"},{"top":199.12141,"left":302.33087,"width":34.6478271484375,"height":12.356597900390625,"text":"225.0"},{"top":199.12141,"left":336.9787,"width":26.899566650390625,"height":12.356597900390625,"text":"105"},{"top":199.12141,"left":363.87827,"width":30.24810791015625,"height":12.356597900390625,"text":"2.76"},{"top":199.12141,"left":394.12637,"width":34.64752197265625,"height":12.356597900390625,"text":"3.460"},{"top":199.12141,"left":428.7739,"width":34.64801025390625,"height":12.356597900390625,"text":"20.22"},{"top":199.12141,"left":463.4219,"width":21.1429443359375,"height":12.356597900390625,"text":"1"},{"top":199.12141,"left":484.56485,"width":25.238006591796875,"height":12.356597900390625,"text":"0"},{"top":199.12141,"left":509.80286,"width":30.24798583984375,"height":12.356597900390625,"text":"3"}],[{"top":211.47801,"left":247.14917,"width":30.773834228515625,"height":12.351959228515625,"text":"14.3"},{"top":211.47801,"left":277.923,"width":24.407867431640625,"height":12.351959228515625,"text":"8"},{"top":211.47801,"left":302.33087,"width":34.6478271484375,"height":12.351959228515625,"text":"360.0"},{"top":211.47801,"left":336.9787,"width":26.899566650390625,"height":12.351959228515625,"text":"245"},{"top":211.47801,"left":363.87827,"width":30.24810791015625,"height":12.351959228515625,"text":"3.21"},{"top":211.47801,"left":394.12637,"width":34.64752197265625,"height":12.351959228515625,"text":"3.570"},{"top":211.47801,"left":428.7739,"width":34.64801025390625,"height":12.351959228515625,"text":"15.84"},{"top":211.47801,"left":4
63.4219,"width":21.1429443359375,"height":12.351959228515625,"text":"0"},{"top":211.47801,"left":484.56485,"width":25.238006591796875,"height":12.351959228515625,"text":"0"},{"top":211.47801,"left":509.80286,"width":30.24798583984375,"height":12.351959228515625,"text":"3"}],[{"top":223.82997,"left":247.14917,"width":30.773834228515625,"height":12.351806640625,"text":"24.4"},{"top":223.82997,"left":277.923,"width":24.407867431640625,"height":12.351806640625,"text":"4"},{"top":223.82997,"left":302.33087,"width":34.6478271484375,"height":12.351806640625,"text":"146.7"},{"top":223.82997,"left":336.9787,"width":26.899566650390625,"height":12.351806640625,"text":"62"},{"top":223.82997,"left":363.87827,"width":30.24810791015625,"height":12.351806640625,"text":"3.69"},{"top":223.82997,"left":394.12637,"width":34.64752197265625,"height":12.351806640625,"text":"3.190"},{"top":223.82997,"left":428.7739,"width":34.64801025390625,"height":12.351806640625,"text":"20.00"},{"top":223.82997,"left":463.4219,"width":21.1429443359375,"height":12.351806640625,"text":"1"},{"top":223.82997,"left":484.56485,"width":25.238006591796875,"height":12.351806640625,"text":"0"},{"top":223.82997,"left":509.80286,"width":30.24798583984375,"height":12.351806640625,"text":"4"}],[{"top":236.18178,"left":247.14917,"width":30.773834228515625,"height":12.356719970703125,"text":"22.8"},{"top":236.18178,"left":277.923,"width":24.407867431640625,"height":12.356719970703125,"text":"4"},{"top":236.18178,"left":302.33087,"width":34.6478271484375,"height":12.356719970703125,"text":"140.8"},{"top":236.18178,"left":336.9787,"width":26.899566650390625,"height":12.356719970703125,"text":"95"},{"top":236.18178,"left":363.87827,"width":30.24810791015625,"height":12.356719970703125,"text":"3.92"},{"top":236.18178,"left":394.12637,"width":34.64752197265625,"height":12.356719970703125,"text":"3.150"},{"top":236.18178,"left":428.7739,"width":34.64801025390625,"height":12.356719970703125,"text":"22.90"},{"top":236.18178,"l
eft":463.4219,"width":21.1429443359375,"height":12.356719970703125,"text":"1"},{"top":236.18178,"left":484.56485,"width":25.238006591796875,"height":12.356719970703125,"text":"0"},{"top":236.18178,"left":509.80286,"width":30.24798583984375,"height":12.356719970703125,"text":"4"}],[{"top":248.5385,"left":247.14917,"width":30.773834228515625,"height":12.351882934570312,"text":"19.2"},{"top":248.5385,"left":277.923,"width":24.407867431640625,"height":12.351882934570312,"text":"6"},{"top":248.5385,"left":302.33087,"width":34.6478271484375,"height":12.351882934570312,"text":"167.6"},{"top":248.5385,"left":336.9787,"width":26.899566650390625,"height":12.351882934570312,"text":"123"},{"top":248.5385,"left":363.87827,"width":30.24810791015625,"height":12.351882934570312,"text":"3.92"},{"top":248.5385,"left":394.12637,"width":34.64752197265625,"height":12.351882934570312,"text":"3.440"},{"top":248.5385,"left":428.7739,"width":34.64801025390625,"height":12.351882934570312,"text":"18.30"},{"top":248.5385,"left":463.4219,"width":21.1429443359375,"height":12.351882934570312,"text":"1"},{"top":248.5385,"left":484.56485,"width":25.238006591796875,"height":12.351882934570312,"text":"0"},{"top":248.5385,"left":509.80286,"width":30.24798583984375,"height":12.351882934570312,"text":"4"}],[{"top":260.89038,"left":247.14917,"width":30.773834228515625,"height":12.3564453125,"text":"17.8"},{"top":260.89038,"left":277.923,"width":24.407867431640625,"height":12.3564453125,"text":"6"},{"top":260.89038,"left":302.33087,"width":34.6478271484375,"height":12.3564453125,"text":"167.6"},{"top":260.89038,"left":336.9787,"width":26.899566650390625,"height":12.3564453125,"text":"123"},{"top":260.89038,"left":363.87827,"width":30.24810791015625,"height":12.3564453125,"text":"3.92"},{"top":260.89038,"left":394.12637,"width":34.64752197265625,"height":12.3564453125,"text":"3.440"},{"top":260.89038,"left":428.7739,"width":34.64801025390625,"height":12.3564453125,"text":"18.90"},{"top":260.89038,"left":46
3.4219,"width":21.1429443359375,"height":12.3564453125,"text":"1"},{"top":260.89038,"left":484.56485,"width":25.238006591796875,"height":12.3564453125,"text":"0"},{"top":260.89038,"left":509.80286,"width":30.24798583984375,"height":12.3564453125,"text":"4"}],[{"top":273.24683,"left":247.14917,"width":30.773834228515625,"height":12.352142333984375,"text":"16.4"},{"top":273.24683,"left":277.923,"width":24.407867431640625,"height":12.352142333984375,"text":"8"},{"top":273.24683,"left":302.33087,"width":34.6478271484375,"height":12.352142333984375,"text":"275.8"},{"top":273.24683,"left":336.9787,"width":26.899566650390625,"height":12.352142333984375,"text":"180"},{"top":273.24683,"left":363.87827,"width":30.24810791015625,"height":12.352142333984375,"text":"3.07"},{"top":273.24683,"left":394.12637,"width":34.64752197265625,"height":12.352142333984375,"text":"4.070"},{"top":273.24683,"left":428.7739,"width":34.64801025390625,"height":12.352142333984375,"text":"17.40"},{"top":273.24683,"left":463.4219,"width":21.1429443359375,"height":12.352142333984375,"text":"0"},{"top":273.24683,"left":484.56485,"width":25.238006591796875,"height":12.352142333984375,"text":"0"},{"top":273.24683,"left":509.80286,"width":30.24798583984375,"height":12.352142333984375,"text":"3"}],[{"top":285.59897,"left":247.14917,"width":30.773834228515625,"height":12.35186767578125,"text":"17.3"},{"top":285.59897,"left":277.923,"width":24.407867431640625,"height":12.35186767578125,"text":"8"},{"top":285.59897,"left":302.33087,"width":34.6478271484375,"height":12.35186767578125,"text":"275.8"},{"top":285.59897,"left":336.9787,"width":26.899566650390625,"height":12.35186767578125,"text":"180"},{"top":285.59897,"left":363.87827,"width":30.24810791015625,"height":12.35186767578125,"text":"3.07"},{"top":285.59897,"left":394.12637,"width":34.64752197265625,"height":12.35186767578125,"text":"3.730"},{"top":285.59897,"left":428.7739,"width":34.64801025390625,"height":12.35186767578125,"text":"17.60"},{"top":285
.59897,"left":463.4219,"width":21.1429443359375,"height":12.35186767578125,"text":"0"},{"top":285.59897,"left":484.56485,"width":25.238006591796875,"height":12.35186767578125,"text":"0"},{"top":285.59897,"left":509.80286,"width":30.24798583984375,"height":12.35186767578125,"text":"3"}],[{"top":297.95084,"left":247.14917,"width":30.773834228515625,"height":12.357086181640625,"text":"15.2"},{"top":297.95084,"left":277.923,"width":24.407867431640625,"height":12.357086181640625,"text":"8"},{"top":297.95084,"left":302.33087,"width":34.6478271484375,"height":12.357086181640625,"text":"275.8"},{"top":297.95084,"left":336.9787,"width":26.899566650390625,"height":12.357086181640625,"text":"180"},{"top":297.95084,"left":363.87827,"width":30.24810791015625,"height":12.357086181640625,"text":"3.07"},{"top":297.95084,"left":394.12637,"width":34.64752197265625,"height":12.357086181640625,"text":"3.780"},{"top":297.95084,"left":428.7739,"width":34.64801025390625,"height":12.357086181640625,"text":"18.00"},{"top":297.95084,"left":463.4219,"width":21.1429443359375,"height":12.357086181640625,"text":"0"},{"top":297.95084,"left":484.56485,"width":25.238006591796875,"height":12.357086181640625,"text":"0"},{"top":297.95084,"left":509.80286,"width":30.24798583984375,"height":12.357086181640625,"text":"3"}],[{"top":310.30792,"left":247.14917,"width":30.773834228515625,"height":12.3515625,"text":"10.4"},{"top":310.30792,"left":277.923,"width":24.407867431640625,"height":12.3515625,"text":"8"},{"top":310.30792,"left":302.33087,"width":34.6478271484375,"height":12.3515625,"text":"472.0"},{"top":310.30792,"left":336.9787,"width":26.899566650390625,"height":12.3515625,"text":"205"},{"top":310.30792,"left":363.87827,"width":30.24810791015625,"height":12.3515625,"text":"2.93"},{"top":310.30792,"left":394.12637,"width":34.64752197265625,"height":12.3515625,"text":"5.250"},{"top":310.30792,"left":428.7739,"width":34.64801025390625,"height":12.3515625,"text":"17.98"},{"top":310.30792,"left":463.421
9,"width":21.1429443359375,"height":12.3515625,"text":"0"},{"top":310.30792,"left":484.56485,"width":25.238006591796875,"height":12.3515625,"text":"0"},{"top":310.30792,"left":509.80286,"width":30.24798583984375,"height":12.3515625,"text":"3"}],[{"top":322.6595,"left":247.14917,"width":30.773834228515625,"height":12.35186767578125,"text":"10.4"},{"top":322.6595,"left":277.923,"width":24.407867431640625,"height":12.35186767578125,"text":"8"},{"top":322.6595,"left":302.33087,"width":34.6478271484375,"height":12.35186767578125,"text":"460.0"},{"top":322.6595,"left":336.9787,"width":26.899566650390625,"height":12.35186767578125,"text":"215"},{"top":322.6595,"left":363.87827,"width":30.24810791015625,"height":12.35186767578125,"text":"3.00"},{"top":322.6595,"left":394.12637,"width":34.64752197265625,"height":12.35186767578125,"text":"5.424"},{"top":322.6595,"left":428.7739,"width":34.64801025390625,"height":12.35186767578125,"text":"17.82"},{"top":322.6595,"left":463.4219,"width":21.1429443359375,"height":12.35186767578125,"text":"0"},{"top":322.6595,"left":484.56485,"width":25.238006591796875,"height":12.35186767578125,"text":"0"},{"top":322.6595,"left":509.80286,"width":30.24798583984375,"height":12.35186767578125,"text":"3"}],[{"top":335.01135,"left":247.14917,"width":30.773834228515625,"height":12.357086181640625,"text":"14.7"},{"top":335.01135,"left":277.923,"width":24.407867431640625,"height":12.357086181640625,"text":"8"},{"top":335.01135,"left":302.33087,"width":34.6478271484375,"height":12.357086181640625,"text":"440.0"},{"top":335.01135,"left":336.9787,"width":26.899566650390625,"height":12.357086181640625,"text":"230"},{"top":335.01135,"left":363.87827,"width":30.24810791015625,"height":12.357086181640625,"text":"3.23"},{"top":335.01135,"left":394.12637,"width":34.64752197265625,"height":12.357086181640625,"text":"5.345"},{"top":335.01135,"left":428.7739,"width":34.64801025390625,"height":12.357086181640625,"text":"17.42"},{"top":335.01135,"left":463.4219,"wid
th":21.1429443359375,"height":12.357086181640625,"text":"0"},{"top":335.01135,"left":484.56485,"width":25.238006591796875,"height":12.357086181640625,"text":"0"},{"top":335.01135,"left":509.80286,"width":30.24798583984375,"height":12.357086181640625,"text":"3"}],[{"top":347.36844,"left":247.14917,"width":30.773834228515625,"height":12.351531982421875,"text":"32.4"},{"top":347.36844,"left":277.923,"width":24.407867431640625,"height":12.351531982421875,"text":"4"},{"top":347.36844,"left":302.33087,"width":34.6478271484375,"height":12.351531982421875,"text":"78.7"},{"top":347.36844,"left":336.9787,"width":26.899566650390625,"height":12.351531982421875,"text":"66"},{"top":347.36844,"left":363.87827,"width":30.24810791015625,"height":12.351531982421875,"text":"4.08"},{"top":347.36844,"left":394.12637,"width":34.64752197265625,"height":12.351531982421875,"text":"2.200"},{"top":347.36844,"left":428.7739,"width":34.64801025390625,"height":12.351531982421875,"text":"19.47"},{"top":347.36844,"left":463.4219,"width":21.1429443359375,"height":12.351531982421875,"text":"1"},{"top":347.36844,"left":484.56485,"width":25.238006591796875,"height":12.351531982421875,"text":"1"},{"top":347.36844,"left":509.80286,"width":30.24798583984375,"height":12.351531982421875,"text":"4"}],[{"top":359.71997,"left":247.14917,"width":30.773834228515625,"height":12.3570556640625,"text":"30.4"},{"top":359.71997,"left":277.923,"width":24.407867431640625,"height":12.3570556640625,"text":"4"},{"top":359.71997,"left":302.33087,"width":34.6478271484375,"height":12.3570556640625,"text":"75.7"},{"top":359.71997,"left":336.9787,"width":26.899566650390625,"height":12.3570556640625,"text":"52"},{"top":359.71997,"left":363.87827,"width":30.24810791015625,"height":12.3570556640625,"text":"4.93"},{"top":359.71997,"left":394.12637,"width":34.64752197265625,"height":12.3570556640625,"text":"1.615"},{"top":359.71997,"left":428.7739,"width":34.64801025390625,"height":12.3570556640625,"text":"18.52"},{"top":359.71997,
"left":463.4219,"width":21.1429443359375,"height":12.3570556640625,"text":"1"},{"top":359.71997,"left":484.56485,"width":25.238006591796875,"height":12.3570556640625,"text":"1"},{"top":359.71997,"left":509.80286,"width":30.24798583984375,"height":12.3570556640625,"text":"4"}],[{"top":372.07703,"left":247.14917,"width":30.773834228515625,"height":12.351959228515625,"text":"33.9"},{"top":372.07703,"left":277.923,"width":24.407867431640625,"height":12.351959228515625,"text":"4"},{"top":372.07703,"left":302.33087,"width":34.6478271484375,"height":12.351959228515625,"text":"71.1"},{"top":372.07703,"left":336.9787,"width":26.899566650390625,"height":12.351959228515625,"text":"65"},{"top":372.07703,"left":363.87827,"width":30.24810791015625,"height":12.351959228515625,"text":"4.22"},{"top":372.07703,"left":394.12637,"width":34.64752197265625,"height":12.351959228515625,"text":"1.835"},{"top":372.07703,"left":428.7739,"width":34.64801025390625,"height":12.351959228515625,"text":"19.90"},{"top":372.07703,"left":463.4219,"width":21.1429443359375,"height":12.351959228515625,"text":"1"},{"top":372.07703,"left":484.56485,"width":25.238006591796875,"height":12.351959228515625,"text":"1"},{"top":372.07703,"left":509.80286,"width":30.24798583984375,"height":12.351959228515625,"text":"4"}],[{"top":384.429,"left":247.14917,"width":30.773834228515625,"height":12.351531982421875,"text":"21.5"},{"top":384.429,"left":277.923,"width":24.407867431640625,"height":12.351531982421875,"text":"4"},{"top":384.429,"left":302.33087,"width":34.6478271484375,"height":12.351531982421875,"text":"120.1"},{"top":384.429,"left":336.9787,"width":26.899566650390625,"height":12.351531982421875,"text":"97"},{"top":384.429,"left":363.87827,"width":30.24810791015625,"height":12.351531982421875,"text":"3.70"},{"top":384.429,"left":394.12637,"width":34.64752197265625,"height":12.351531982421875,"text":"2.465"},{"top":384.429,"left":428.7739,"width":34.64801025390625,"height":12.351531982421875,"text":"20.01"},{"
top":384.429,"left":463.4219,"width":21.1429443359375,"height":12.351531982421875,"text":"1"},{"top":384.429,"left":484.56485,"width":25.238006591796875,"height":12.351531982421875,"text":"0"},{"top":384.429,"left":509.80286,"width":30.24798583984375,"height":12.351531982421875,"text":"3"}],[{"top":396.78052,"left":247.14917,"width":30.773834228515625,"height":12.357025146484375,"text":"15.5"},{"top":396.78052,"left":277.923,"width":24.407867431640625,"height":12.357025146484375,"text":"8"},{"top":396.78052,"left":302.33087,"width":34.6478271484375,"height":12.357025146484375,"text":"318.0"},{"top":396.78052,"left":336.9787,"width":26.899566650390625,"height":12.357025146484375,"text":"150"},{"top":396.78052,"left":363.87827,"width":30.24810791015625,"height":12.357025146484375,"text":"2.76"},{"top":396.78052,"left":394.12637,"width":34.64752197265625,"height":12.357025146484375,"text":"3.520"},{"top":396.78052,"left":428.7739,"width":34.64801025390625,"height":12.357025146484375,"text":"16.87"},{"top":396.78052,"left":463.4219,"width":21.1429443359375,"height":12.357025146484375,"text":"0"},{"top":396.78052,"left":484.56485,"width":25.238006591796875,"height":12.357025146484375,"text":"0"},{"top":396.78052,"left":509.80286,"width":30.24798583984375,"height":12.357025146484375,"text":"3"}],[{"top":409.13754,"left":247.14917,"width":30.773834228515625,"height":12.35205078125,"text":"15.2"},{"top":409.13754,"left":277.923,"width":24.407867431640625,"height":12.35205078125,"text":"8"},{"top":409.13754,"left":302.33087,"width":34.6478271484375,"height":12.35205078125,"text":"304.0"},{"top":409.13754,"left":336.9787,"width":26.899566650390625,"height":12.35205078125,"text":"150"},{"top":409.13754,"left":363.87827,"width":30.24810791015625,"height":12.35205078125,"text":"3.15"},{"top":409.13754,"left":394.12637,"width":34.64752197265625,"height":12.35205078125,"text":"3.435"},{"top":409.13754,"left":428.7739,"width":34.64801025390625,"height":12.35205078125,"text":"17.30"
},{"top":409.13754,"left":463.4219,"width":21.1429443359375,"height":12.35205078125,"text":"0"},{"top":409.13754,"left":484.56485,"width":25.238006591796875,"height":12.35205078125,"text":"0"},{"top":409.13754,"left":509.80286,"width":30.24798583984375,"height":12.35205078125,"text":"3"}],[{"top":421.4896,"left":247.14917,"width":30.773834228515625,"height":12.351348876953125,"text":"13.3"},{"top":421.4896,"left":277.923,"width":24.407867431640625,"height":12.351348876953125,"text":"8"},{"top":421.4896,"left":302.33087,"width":34.6478271484375,"height":12.351348876953125,"text":"350.0"},{"top":421.4896,"left":336.9787,"width":26.899566650390625,"height":12.351348876953125,"text":"245"},{"top":421.4896,"left":363.87827,"width":30.24810791015625,"height":12.351348876953125,"text":"3.73"},{"top":421.4896,"left":394.12637,"width":34.64752197265625,"height":12.351348876953125,"text":"3.840"},{"top":421.4896,"left":428.7739,"width":34.64801025390625,"height":12.351348876953125,"text":"15.41"},{"top":421.4896,"left":463.4219,"width":21.1429443359375,"height":12.351348876953125,"text":"0"},{"top":421.4896,"left":484.56485,"width":25.238006591796875,"height":12.351348876953125,"text":"0"},{"top":421.4896,"left":509.80286,"width":30.24798583984375,"height":12.351348876953125,"text":"3"}],[{"top":433.84094,"left":247.14917,"width":30.773834228515625,"height":12.357208251953125,"text":"19.2"},{"top":433.84094,"left":277.923,"width":24.407867431640625,"height":12.357208251953125,"text":"8"},{"top":433.84094,"left":302.33087,"width":34.6478271484375,"height":12.357208251953125,"text":"400.0"},{"top":433.84094,"left":336.9787,"width":26.899566650390625,"height":12.357208251953125,"text":"175"},{"top":433.84094,"left":363.87827,"width":30.24810791015625,"height":12.357208251953125,"text":"3.08"},{"top":433.84094,"left":394.12637,"width":34.64752197265625,"height":12.357208251953125,"text":"3.845"},{"top":433.84094,"left":428.7739,"width":34.64801025390625,"height":12.35720825195312
5,"text":"17.05"},{"top":433.84094,"left":463.4219,"width":21.1429443359375,"height":12.357208251953125,"text":"0"},{"top":433.84094,"left":484.56485,"width":25.238006591796875,"height":12.357208251953125,"text":"0"},{"top":433.84094,"left":509.80286,"width":30.24798583984375,"height":12.357208251953125,"text":"3"}],[{"top":446.19815,"left":247.14917,"width":30.773834228515625,"height":12.35186767578125,"text":"27.3"},{"top":446.19815,"left":277.923,"width":24.407867431640625,"height":12.35186767578125,"text":"4"},{"top":446.19815,"left":302.33087,"width":34.6478271484375,"height":12.35186767578125,"text":"79.0"},{"top":446.19815,"left":336.9787,"width":26.899566650390625,"height":12.35186767578125,"text":"66"},{"top":446.19815,"left":363.87827,"width":30.24810791015625,"height":12.35186767578125,"text":"4.08"},{"top":446.19815,"left":394.12637,"width":34.64752197265625,"height":12.35186767578125,"text":"1.935"},{"top":446.19815,"left":428.7739,"width":34.64801025390625,"height":12.35186767578125,"text":"18.90"},{"top":446.19815,"left":463.4219,"width":21.1429443359375,"height":12.35186767578125,"text":"1"},{"top":446.19815,"left":484.56485,"width":25.238006591796875,"height":12.35186767578125,"text":"1"},{"top":446.19815,"left":509.80286,"width":30.24798583984375,"height":12.35186767578125,"text":"4"}],[{"top":458.55002,"left":247.14917,"width":30.773834228515625,"height":12.35150146484375,"text":"26.0"},{"top":458.55002,"left":277.923,"width":24.407867431640625,"height":12.35150146484375,"text":"4"},{"top":458.55002,"left":302.33087,"width":34.6478271484375,"height":12.35150146484375,"text":"120.3"},{"top":458.55002,"left":336.9787,"width":26.899566650390625,"height":12.35150146484375,"text":"91"},{"top":458.55002,"left":363.87827,"width":30.24810791015625,"height":12.35150146484375,"text":"4.43"},{"top":458.55002,"left":394.12637,"width":34.64752197265625,"height":12.35150146484375,"text":"2.140"},{"top":458.55002,"left":428.7739,"width":34.64801025390625,"height
":12.35150146484375,"text":"16.70"},{"top":458.55002,"left":463.4219,"width":21.1429443359375,"height":12.35150146484375,"text":"0"},{"top":458.55002,"left":484.56485,"width":25.238006591796875,"height":12.35150146484375,"text":"1"},{"top":458.55002,"left":509.80286,"width":30.24798583984375,"height":12.35150146484375,"text":"5"}],[{"top":470.90152,"left":247.14917,"width":30.773834228515625,"height":12.357025146484375,"text":"30.4"},{"top":470.90152,"left":277.923,"width":24.407867431640625,"height":12.357025146484375,"text":"4"},{"top":470.90152,"left":302.33087,"width":34.6478271484375,"height":12.357025146484375,"text":"95.1"},{"top":470.90152,"left":336.9787,"width":26.899566650390625,"height":12.357025146484375,"text":"113"},{"top":470.90152,"left":363.87827,"width":30.24810791015625,"height":12.357025146484375,"text":"3.77"},{"top":470.90152,"left":394.12637,"width":34.64752197265625,"height":12.357025146484375,"text":"1.513"},{"top":470.90152,"left":428.7739,"width":34.64801025390625,"height":12.357025146484375,"text":"16.90"},{"top":470.90152,"left":463.4219,"width":21.1429443359375,"height":12.357025146484375,"text":"1"},{"top":470.90152,"left":484.56485,"width":25.238006591796875,"height":12.357025146484375,"text":"1"},{"top":470.90152,"left":509.80286,"width":30.24798583984375,"height":12.357025146484375,"text":"5"}],[{"top":483.25854,"left":247.14917,"width":30.773834228515625,"height":12.351776123046875,"text":"15.8"},{"top":483.25854,"left":277.923,"width":24.407867431640625,"height":12.351776123046875,"text":"8"},{"top":483.25854,"left":302.33087,"width":34.6478271484375,"height":12.351776123046875,"text":"351.0"},{"top":483.25854,"left":336.9787,"width":26.899566650390625,"height":12.351776123046875,"text":"264"},{"top":483.25854,"left":363.87827,"width":30.24810791015625,"height":12.351776123046875,"text":"4.22"},{"top":483.25854,"left":394.12637,"width":34.64752197265625,"height":12.351776123046875,"text":"3.170"},{"top":483.25854,"left":428.7739,
"width":34.64801025390625,"height":12.351776123046875,"text":"14.50"},{"top":483.25854,"left":463.4219,"width":21.1429443359375,"height":12.351776123046875,"text":"0"},{"top":483.25854,"left":484.56485,"width":25.238006591796875,"height":12.351776123046875,"text":"1"},{"top":483.25854,"left":509.80286,"width":30.24798583984375,"height":12.351776123046875,"text":"5"}],[{"top":495.61032,"left":247.14917,"width":30.773834228515625,"height":12.356109619140625,"text":"19.7"},{"top":495.61032,"left":277.923,"width":24.407867431640625,"height":12.356109619140625,"text":"6"},{"top":495.61032,"left":302.33087,"width":34.6478271484375,"height":12.356109619140625,"text":"145.0"},{"top":495.61032,"left":336.9787,"width":26.899566650390625,"height":12.356109619140625,"text":"175"},{"top":495.61032,"left":363.87827,"width":30.24810791015625,"height":12.356109619140625,"text":"3.62"},{"top":495.61032,"left":394.12637,"width":34.64752197265625,"height":12.356109619140625,"text":"2.770"},{"top":495.61032,"left":428.7739,"width":34.64801025390625,"height":12.356109619140625,"text":"15.50"},{"top":495.61032,"left":463.4219,"width":21.1429443359375,"height":12.356109619140625,"text":"0"},{"top":495.61032,"left":484.56485,"width":25.238006591796875,"height":12.356109619140625,"text":"1"},{"top":495.61032,"left":509.80286,"width":30.24798583984375,"height":12.356109619140625,"text":"5"}],[{"top":507.96643,"left":247.14917,"width":30.773834228515625,"height":12.35247802734375,"text":"15.0"},{"top":507.96643,"left":277.923,"width":24.407867431640625,"height":12.35247802734375,"text":"8"},{"top":507.96643,"left":302.33087,"width":34.6478271484375,"height":12.35247802734375,"text":"301.0"},{"top":507.96643,"left":336.9787,"width":26.899566650390625,"height":12.35247802734375,"text":"335"},{"top":507.96643,"left":363.87827,"width":30.24810791015625,"height":12.35247802734375,"text":"3.54"},{"top":507.96643,"left":394.12637,"width":34.64752197265625,"height":12.35247802734375,"text":"3.570"},{
"top":507.96643,"left":428.7739,"width":34.64801025390625,"height":12.35247802734375,"text":"14.60"},{"top":507.96643,"left":463.4219,"width":21.1429443359375,"height":12.35247802734375,"text":"0"},{"top":507.96643,"left":484.56485,"width":25.238006591796875,"height":12.35247802734375,"text":"1"},{"top":507.96643,"left":509.80286,"width":30.24798583984375,"height":12.35247802734375,"text":"5"}]]}]
In [15]:
tabula.convert_into(pdf_path, "test.tsv", output_format="tsv")
!cat test.tsv
'pages' argument isn't specified.Will extract only from page 1 by default.
Got stderr: Jun 04, 2020 8:24:31 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font Symbol
Jun 04, 2020 8:24:31 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font ZapfDingbats
mpg cyl disp hp drat wt qsec vs am gear
21.0 6 160.0 110 3.90 2.620 16.46 0 1 4
21.0 6 160.0 110 3.90 2.875 17.02 0 1 4
22.8 4 108.0 93 3.85 2.320 18.61 1 1 4
21.4 6 258.0 110 3.08 3.215 19.44 1 0 3
18.7 8 360.0 175 3.15 3.440 17.02 0 0 3
18.1 6 225.0 105 2.76 3.460 20.22 1 0 3
14.3 8 360.0 245 3.21 3.570 15.84 0 0 3
24.4 4 146.7 62 3.69 3.190 20.00 1 0 4
22.8 4 140.8 95 3.92 3.150 22.90 1 0 4
19.2 6 167.6 123 3.92 3.440 18.30 1 0 4
17.8 6 167.6 123 3.92 3.440 18.90 1 0 4
16.4 8 275.8 180 3.07 4.070 17.40 0 0 3
17.3 8 275.8 180 3.07 3.730 17.60 0 0 3
15.2 8 275.8 180 3.07 3.780 18.00 0 0 3
10.4 8 472.0 205 2.93 5.250 17.98 0 0 3
10.4 8 460.0 215 3.00 5.424 17.82 0 0 3
14.7 8 440.0 230 3.23 5.345 17.42 0 0 3
32.4 4 78.7 66 4.08 2.200 19.47 1 1 4
30.4 4 75.7 52 4.93 1.615 18.52 1 1 4
33.9 4 71.1 65 4.22 1.835 19.90 1 1 4
21.5 4 120.1 97 3.70 2.465 20.01 1 0 3
15.5 8 318.0 150 2.76 3.520 16.87 0 0 3
15.2 8 304.0 150 3.15 3.435 17.30 0 0 3
13.3 8 350.0 245 3.73 3.840 15.41 0 0 3
19.2 8 400.0 175 3.08 3.845 17.05 0 0 3
27.3 4 79.0 66 4.08 1.935 18.90 1 1 4
26.0 4 120.3 91 4.43 2.140 16.70 0 1 5
30.4 4 95.1 113 3.77 1.513 16.90 1 1 5
15.8 8 351.0 264 4.22 3.170 14.50 0 1 5
19.7 6 145.0 175 3.62 2.770 15.50 0 1 5
15.0 8 301.0 335 3.54 3.570 14.60 0 1 5
In [16]:
tabula.convert_into(pdf_path, "test.csv", output_format="csv", stream=True)
!cat test.csv
'pages' argument isn't specified.Will extract only from page 1 by default.
Got stderr: Jun 04, 2020 8:24:35 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font Symbol
Jun 04, 2020 8:24:35 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font ZapfDingbats
"",mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb
Mazda RX4,21.0,6,160.0,110,3.90,2.620,16.46,0,1,4,4
Mazda RX4 Wag,21.0,6,160.0,110,3.90,2.875,17.02,0,1,4,4
Datsun 710,22.8,4,108.0,93,3.85,2.320,18.61,1,1,4,1
Hornet 4 Drive,21.4,6,258.0,110,3.08,3.215,19.44,1,0,3,1
Hornet Sportabout,18.7,8,360.0,175,3.15,3.440,17.02,0,0,3,2
Valiant,18.1,6,225.0,105,2.76,3.460,20.22,1,0,3,1
Duster 360,14.3,8,360.0,245,3.21,3.570,15.84,0,0,3,4
Merc 240D,24.4,4,146.7,62,3.69,3.190,20.00,1,0,4,2
Merc 230,22.8,4,140.8,95,3.92,3.150,22.90,1,0,4,2
Merc 280,19.2,6,167.6,123,3.92,3.440,18.30,1,0,4,4
Merc 280C,17.8,6,167.6,123,3.92,3.440,18.90,1,0,4,4
Merc 450SE,16.4,8,275.8,180,3.07,4.070,17.40,0,0,3,3
Merc 450SL,17.3,8,275.8,180,3.07,3.730,17.60,0,0,3,3
Merc 450SLC,15.2,8,275.8,180,3.07,3.780,18.00,0,0,3,3
Cadillac Fleetwood,10.4,8,472.0,205,2.93,5.250,17.98,0,0,3,4
Lincoln Continental,10.4,8,460.0,215,3.00,5.424,17.82,0,0,3,4
Chrysler Imperial,14.7,8,440.0,230,3.23,5.345,17.42,0,0,3,4
Fiat 128,32.4,4,78.7,66,4.08,2.200,19.47,1,1,4,1
Honda Civic,30.4,4,75.7,52,4.93,1.615,18.52,1,1,4,2
Toyota Corolla,33.9,4,71.1,65,4.22,1.835,19.90,1,1,4,1
Toyota Corona,21.5,4,120.1,97,3.70,2.465,20.01,1,0,3,1
Dodge Challenger,15.5,8,318.0,150,2.76,3.520,16.87,0,0,3,2
AMC Javelin,15.2,8,304.0,150,3.15,3.435,17.30,0,0,3,2
Camaro Z28,13.3,8,350.0,245,3.73,3.840,15.41,0,0,3,4
Pontiac Firebird,19.2,8,400.0,175,3.08,3.845,17.05,0,0,3,2
Fiat X1-9,27.3,4,79.0,66,4.08,1.935,18.90,1,1,4,1
Porsche 914-2,26.0,4,120.3,91,4.43,2.140,16.70,0,1,5,2
Lotus Europa,30.4,4,95.1,113,3.77,1.513,16.90,1,1,5,2
Ford Pantera L,15.8,8,351.0,264,4.22,3.170,14.50,0,1,5,4
Ferrari Dino,19.7,6,145.0,175,3.62,2.770,15.50,0,1,5,6
Maserati Bora,15.0,8,301.0,335,3.54,3.570,14.60,0,1,5,8
Volvo 142E,21.4,4,121.0,109,4.11,2.780,18.60,1,1,4,2
If your tables have lines separating cells, you can use the lattice option. By default, tabula-py sets guess=True, which matches the default behavior of the tabula app. If your tables don't have separating lines, you can try the stream option.
As mentioned, try the tabula app before struggling with tabula-py options. Alternatively, pdfplumber can be an option, since it uses a different extraction strategy.
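The choice between the lattice and stream options can be sketched as a small helper. This is only an illustration: extraction_kwargs and read_tables are hypothetical names, not part of tabula-py, and the only library call used is tabula.read_pdf with its lattice, stream, guess, and pages arguments. tabula is imported inside the function so the helper can be defined even where Java is not installed.

```python
from typing import Any, Dict, Optional


def extraction_kwargs(has_ruling_lines: Optional[bool]) -> Dict[str, Any]:
    """Pick read_pdf keyword arguments from what is known about the table.

    has_ruling_lines:
        True  -> cells are separated by drawn lines, so use lattice mode
        False -> cells are separated only by whitespace, so use stream mode
        None  -> unknown, so keep tabula-py's default guessing behavior
    """
    if has_ruling_lines is None:
        return {"guess": True}
    if has_ruling_lines:
        return {"lattice": True}
    return {"stream": True}


def read_tables(pdf_path: str, has_ruling_lines: Optional[bool] = None, pages: str = "1"):
    """Hypothetical wrapper around tabula.read_pdf (needs tabula-py and Java)."""
    import tabula  # imported lazily so this module loads without Java

    return tabula.read_pdf(pdf_path, pages=pages, **extraction_kwargs(has_ruling_lines))
```

For example, read_tables(pdf_path, has_ruling_lines=False) would issue the same kind of call as the earlier stream=True examples. If neither mode gives a clean result, checking the PDF in the tabula app first is still the quicker route.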
In [17]:
pdf_path3 = "https://github.com/tabulapdf/tabula-java/raw/master/src/test/resources/technology/tabula/spanning_cells.pdf"
dfs = tabula.read_pdf(
    pdf_path3,
    pages="1",
    lattice=True,
    pandas_options={"header": [0, 1]},
    area=[0, 0, 50, 100],
    relative_area=True,
    multiple_tables=False,
)
dfs[0]
Out[17]:
      Improved operation scenario  Unnamed: 1_level_0  Unnamed: 2_level_0  Unnamed: 3_level_0  Unnamed: 4_level_0  Unnamed: 5_level_0
               Volume servers in:                2007                2008                2009                2010                2011
 0                 Server closets               1,505               1,580               1,643               1,673               1,689
 1                   Server rooms               1,512               1,586               1,646               1,677               1,693
 2         Localized data centers               1,512               1,586               1,646               1,677               1,693
 3          Mid-tier data centers               1,512               1,586               1,646               1,677               1,693
 4  Enterprise-class data centers               1,512               1,586               1,646               1,677               1,693
 5         Best practice scenario                 NaN                 NaN                 NaN                 NaN                 NaN
 6             Volume servers in:                2007                2008                2009                2010                2011
 7                 Server closets               1,456               1,439               1,386               1,296               1,326
 8                   Server rooms               1,465               1,472               1,427               1,334               1,371
 9         Localized data centers               1,465               1,471               1,426               1,334               1,371
10          Mid-tier data centers               1,465               1,471               1,426               1,334               1,371
11  Enterprise-class data centers               1,465               1,471               1,426               1,334               1,371
12      State-of-the-art scenario                 NaN                 NaN                 NaN                 NaN                 NaN
13             Volume servers in:                2007                2008                2009                2010                2011
14                 Server closets               1,485               1,471               1,424               1,315               1,349
15                   Server rooms               1,495               1,573               1,586               1,424               1,485
16         Localized data centers               1,495               1,572               1,585               1,424               1,485
17          Mid-tier data centers               1,495               1,572               1,585               1,424               1,485
18  Enterprise-class data centers               1,495               1,572               1,585               1,424               1,485
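The call in In [17] passes area=[0, 0, 50, 100] together with relative_area=True. As far as I understand the tabula-py documentation, area is ordered (top, left, bottom, right), and with relative_area=True the values are read as percentages of the page height and width instead of points. The sketch below shows the arithmetic only. relative_to_absolute_area is a hypothetical helper, not a tabula-py function, and the US Letter page size (612 x 792 points) is an assumption for illustration, not a property of the sample PDF.

```python
from typing import List, Sequence


def relative_to_absolute_area(
    area_pct: Sequence[float], page_width: float, page_height: float
) -> List[float]:
    """Convert (top, left, bottom, right) percentages into PDF points."""
    top, left, bottom, right = area_pct
    for value in (top, left, bottom, right):
        if not 0 <= value <= 100:
            raise ValueError("relative area values must be between 0 and 100")
    # top/bottom scale with the page height, left/right with the page width
    return [
        page_height * top / 100,
        page_width * left / 100,
        page_height * bottom / 100,
        page_width * right / 100,
    ]


# Assumed page size: US Letter, 612 x 792 points.
print(relative_to_absolute_area([0, 0, 50, 100], page_width=612, page_height=792))
# -> [0.0, 0.0, 396.0, 612.0], i.e. the top half of the page at full width
```

Relative values are convenient when the same region should be extracted from pages of different sizes, since no per-page coordinates are needed.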
In [18]:
template_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.tabula-template.json"
tabula.read_pdf_with_template(pdf_path, template_path)
Got stderr: Jun 04, 2020 8:24:53 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font Symbol
Jun 04, 2020 8:24:53 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font ZapfDingbats
Got stderr: Jun 04, 2020 8:24:55 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font Symbol
Jun 04, 2020 8:24:55 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font ZapfDingbats
Got stderr: Jun 04, 2020 8:24:57 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font Symbol
Jun 04, 2020 8:24:57 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font LiberationSans for base font ZapfDingbats
Out[18]:
[ Unnamed: 0 mpg cyl disp hp ... qsec vs am gear carb
0 Mazda RX4 21.0 6 160.0 110 ... 16.46 0 1 4 4
1 Mazda RX4 Wag 21.0 6 160.0 110 ... 17.02 0 1 4 4
2 Datsun 710 22.8 4 108.0 93 ... 18.61 1 1 4 1
3 Hornet 4 Drive 21.4 6 258.0 110 ... 19.44 1 0 3 1
4 Hornet Sportabout 18.7 8 360.0 175 ... 17.02 0 0 3 2
5 Valiant 18.1 6 225.0 105 ... 20.22 1 0 3 1
6 Duster 360 14.3 8 360.0 245 ... 15.84 0 0 3 4
7 Merc 240D 24.4 4 146.7 62 ... 20.00 1 0 4 2
8 Merc 230 22.8 4 140.8 95 ... 22.90 1 0 4 2
9 Merc 280 19.2 6 167.6 123 ... 18.30 1 0 4 4
10 Merc 280C 17.8 6 167.6 123 ... 18.90 1 0 4 4
11 Merc 450SE 16.4 8 275.8 180 ... 17.40 0 0 3 3
12 Merc 450SL 17.3 8 275.8 180 ... 17.60 0 0 3 3
13 Merc 450SLC 15.2 8 275.8 180 ... 18.00 0 0 3 3
14 Cadillac Fleetwood 10.4 8 472.0 205 ... 17.98 0 0 3 4
15 Lincoln Continental 10.4 8 460.0 215 ... 17.82 0 0 3 4
16 Chrysler Imperial 14.7 8 440.0 230 ... 17.42 0 0 3 4
17 Fiat 128 32.4 4 78.7 66 ... 19.47 1 1 4 1
18 Honda Civic 30.4 4 75.7 52 ... 18.52 1 1 4 2
19 Toyota Corolla 33.9 4 71.1 65 ... 19.90 1 1 4 1
20 Toyota Corona 21.5 4 120.1 97 ... 20.01 1 0 3 1
21 Dodge Challenger 15.5 8 318.0 150 ... 16.87 0 0 3 2
22 AMC Javelin 15.2 8 304.0 150 ... 17.30 0 0 3 2
23 Camaro Z28 13.3 8 350.0 245 ... 15.41 0 0 3 4
24 Pontiac Firebird 19.2 8 400.0 175 ... 17.05 0 0 3 2
25 Fiat X1-9 27.3 4 79.0 66 ... 18.90 1 1 4 1
26 Porsche 914-2 26.0 4 120.3 91 ... 16.70 0 1 5 2
27 Lotus Europa 30.4 4 95.1 113 ... 16.90 1 1 5 2
28 Ford Pantera L 15.8 8 351.0 264 ... 14.50 0 1 5 4
29 Ferrari Dino 19.7 6 145.0 175 ... 15.50 0 1 5 6
30 Maserati Bora 15.0 8 301.0 335 ... 14.60 0 1 5 8
31 Volvo 142E 21.4 4 121.0 109 ... 18.60 1 1 4 2
[32 rows x 12 columns],
Unnamed: 0 Sepal.Width Petal.Length Petal.Width Species
0 5.1 3.5 1.4 0.2 setosa
1 4.9 3.0 1.4 0.2 setosa
2 4.7 3.2 1.3 0.2 setosa
3 4.6 3.1 1.5 0.2 setosa
4 5.0 3.6 1.4 0.2 setosa,
Unnamed: 0 Sepal.Length Sepal.Width Petal.Length Petal.Width Species
0 145 6.7 3.3 5.7 2.5 virginica
1 146 6.7 3.0 5.2 2.3 virginica
2 147 6.3 2.5 5.0 1.9 virginica
3 148 6.5 3.0 5.2 2.0 virginica
4 149 6.2 3.4 5.4 2.3 virginica,
Unnamed: 0 supp dose
0 4.2 VC 0.5
1 11.5 VC 0.5
2 7.3 VC 0.5
3 5.8 VC 0.5
4 6.4 VC 0.5
5 10.0 VC 0.5
6 11.2 VC 0.5
7 11.2 VC 0.5
8 5.2 VC 0.5
9 7.0 VC 0.5
10 16.5 VC 1.0
11 16.5 VC 1.0
12 15.2 VC 1.0
13 17.3 VC 1.0]
If you have any questions, ask on Stack Overflow.
Content source: chezou/tabula-py