The openpyxl
module allows you to manipulate Excel sheets within Python.
Excel files have the following terminology:
.xlsx
extension.
In [1]:
import openpyxl
We must first nagivate to the directory containing the spreadsheets, which for this notebook is the subdirectory 'files'.
In [8]:
# Import OS module to navigate directories
import os
# Change the directory to the excel file location, using relative and absolute paths as previously discussed.
os.chdir('files')
In [6]:
os.listdir()
Out[6]:
We must now open the workbook file.
In [12]:
workbook = openpyxl.load_workbook('example.xlsx')
type(workbook)
Out[12]:
Once the workbook is loaded, we can interact with specific sheets by loading them via workbook methods.
In [13]:
sheet = workbook.get_sheet_by_name('Sheet1')
type(sheet)
Out[13]:
We can also use the .get_sheet_names()
method to print all sheet names, in case we aren't sure.
In [14]:
workbook.get_sheet_names()
Out[14]:
We can now interact with specific cells by creating cell objects, referenced via a sheet method.
In [18]:
# Just references an object exists; requires an additional method to interact with
sheet['A1']
Out[18]:
The .value
method returns the actual value in the cell.
In [19]:
cell = sheet['A1']
cell.value
Out[19]:
This particular cell returns a datetime reference from Excel via Python's own datetime
module. A string value is available by passing into the str()
function:
In [22]:
print(str(cell.value))
print(str(sheet['A1'].value))
All cell values inherit their data types from Excel.
In [35]:
print("The value in cell %s is '%s' and is type %s." %('A1', sheet['A1'].value, type(sheet['A1'].value)))
print("The value in cell %s is '%s' and is type %s." %('B1', sheet['B1'].value, type(sheet['B1'].value)))
print("The value in cell %s is '%s' and is type %s." %('C1', sheet['C1'].value, type(sheet['C1'].value)))
You can also reference cells via rows and columns. Excel rows start at 1 and columns at A.
In [39]:
# B1 Cell
sheet.cell(row = 1, column = 2)
Out[39]:
This can be useful for iterative or looping operations.
In [41]:
for i in range(1,8):
print(i, sheet.cell(row=i, column=2).value)
openpyxl
) handles Excel spreadsheet files (.xlsx
).openpyxl.load_workbook()
opens a workbook file at a location and returns a workbook object..get_sheet_names()
method on a workbook prints all the sheet names inide a workbook..get_sheet_name()
method creates a worksheet object from a sheet in the workbook..value
method which allows you to see the content of the cell. cell()
method also returns a cell object from a sheet, from a defined row and column.