Working with Excel Files in Python
Python is great programming language for manipulating the excel files. Excel files are mainly comprised of the
Workbook , Worksheet and Cells with Rows and Columns.
Python provides a famous module known as the “openpyxl” for working with Excel. Lets explore this module by working with a demo project.
Introduction of the “openpyxl” module
First we have to understand the basics of the Excel components like what is the workbook , worksheet , Cell etc.
Workbook – An Excel spread sheet document is called a workbook
Worksheet – Each workbook can contain multiple sheets(also called worksheets).
Cell – A box at a particular column and row is called a cell. Each cell can contain a number or text value.
Lets install the “Openpyxl”.
[code] pip install –U openpyxl [/code]
Qucik Steps:-
a)Import the openpyxl module.Call the openpyxl.load_workbook() function.
b)Get a Workbook object.Call the get_active_sheet() or get_sheet_by_name() workbook method.
c)Get a Worksheet object.Use indexing or the cell() sheet method with row and column keyword arguments.
d)Get a Cell object.Read the Cell object’s value attribute.
e) Manipulate Cells.
Opening the Excel File
To open the Excel File you need to import the “openpyxl” into your program and call the Function load_workbook.
[python]workbookObject = openpyxl.load_workbook("excel_example.xlsx")[/python]
Get the Sheet Name and Printing them.
[python]sheetNames = workbookObject.get_sheet_names() print(sheetNames)
sheet = workbookObject.get_sheet_by_name(sheetNames[0])[/python]
Lets get the Rows and Columns used
[python]print(sheet.get_highest_row())
print(sheet.get_highest_column())[/python]
Printing Content of Excel File
Now lets go through all rows and column and see what all data Excel Have
[python]
for rowOfCellObj in sheet[‘A1′:’C3’]:
for cellObj in rowOfCellObj:
print(cellObj.coordinate, cellObj.value)
print(‘— ROW END —‘)
[/python]
output :
A1 2015 13
B1 App
C1 73
— ROW END —
A2 2014 03
B2 Mob
C2 85
— ROW END —
Full Code
[python]
import openpyxl
def main():
workbookObject = openpyxl.load_workbook("myexample.xlsx")
sheetNames = workbookObject.get_sheet_names()
print(sheetNames)
sheet = workbookObject.get_sheet_by_name(sheetNames[0])
print(sheet.get_highest_row())
print(sheet.get_highest_column())
##print(sheet)
for rowOfCellObj in sheet[‘A1′:’C3’]:
for cellObj in rowOfCellObj:
print(cellObj.coordinate, cellObj.value)
print(‘— ROW END —‘)
if __name__ == "__main__":
main()
[/python]
We will discuss more about excel in next blog post.