Working with Excel sheets using Python

Photo Credits: pixabay.com

This post is intended as a quick tutorial for reading, writing and modifying excel sheets using python scripts. We will be discussing 3 python libraries that allow us to do so and we will discuss a few of the common operations that can be done with each of the libraries.

Lets take a sample excel file to begin with that contains a list of 5 students and their marks in a test. Lets name it "sample.xlsx".



This is a simple excel sheet that we can work with easily.

Next before we begin, pip install the following python libraries using the following commands on the command prompt -

pip install xlrd
pip install xlwt
pip install openpyxl

The first library we are going to look at is XLRD. This library is primarily used for information extraction/reading from excel sheets.

Lets look at the basic functionalities this library has to offer.


1. Counting no. of rows and columns in the excel sheet -

#import the library
import xlrd 

#setting the location of the file to be read
loc = ("sample.xlsx") 

#crating a workbook to read the file
wb = xlrd.open_workbook(loc) 

#reading the excel sheet
sheet = wb.sheet_by_index(0) 

#setting the access to table cells with starting row and column no.
sheet.cell_value(0, 0) 

#To get the no of rows in the excel sheet
len_rows = sheet.nrows
print("No. of rows :", len_rows)

#To get the no of columns in the excel sheet
len_cols = sheet.ncols
print("No. of columns :", len_cols)




Notice that this has also counted the column headers in the number of rows.


2. Extracting the entire spreadsheet -

#import the library
import xlrd 

#setting the location of the file to be read
loc = ("sample.xlsx") 

#crating a workbook to read the file
wb = xlrd.open_workbook(loc) 

#reading the excel sheet
sheet = wb.sheet_by_index(0) 

#setting the access to table cells with starting row and column no.
sheet.cell_value(0, 0) 

#To get the no of rows in the excel sheet
len_rows = sheet.nrows

#Loop to print all the rows in the spreadsheet
for i in range(len_rows-1):
	print(sheet.row_values(i))




3. Extracting a single row -

#import the library
import xlrd 

#setting the location of the file to be read
loc = ("sample.xlsx") 

#crating a workbook to read the file
wb = xlrd.open_workbook(loc) 

#reading the excel sheet
sheet = wb.sheet_by_index(0) 

#setting the access to table cells with starting row and column no.
sheet.cell_value(0, 0) 

#Printing the second row from the spreadsheet
print(sheet.row_values(2))


4. Extracting a column from the spreadsheet -

#import the library
import xlrd 

#setting the location of the file to be read
loc = ("sample.xlsx") 

#crating a workbook to read the file
wb = xlrd.open_workbook(loc) 

#reading the excel sheet
sheet = wb.sheet_by_index(0) 

#setting the access to table cells with starting row and column no.
sheet.cell_value(0, 0) 

#Getting the no. of rows
len1 = sheet.nrows

#Printing the second column from the spreadsheet
for i in range(len1-1):
	#Printing the cell value for row 'i' and column '1'
	print(sheet.cell_value(i,1))



The XLRD library however doesn't allow writing on an excel sheet. For that we will use the XLWT library.


1. Populating an excel sheet

#import the library
import xlwt 
from xlwt import Workbook 

#Creating a work group
wb = Workbook() 

#Create a sheet
sheet = wb.add_sheet('mySheet') 

#Writing the contents of the Excel 
sheet.write(0, 0, 'NAME') 
sheet.write(1, 0, 'Adam') 
sheet.write(2, 0, 'Bill') 
sheet.write(3, 0, 'Chandler') 

#Save the Workbook
wb.save('sample-1.xls')

2. Adding styles to a cell -

#import the library
import xlwt 
from xlwt import Workbook 

#Creating a work group
wb = Workbook() 

#Create a sheet
sheet = wb.add_sheet('mySheet') 

# Making the Heading Bold and Green
style = xlwt.easyxf('font: bold 1, color green;') 

#Writing the contents of the Excel with style applied to first row 
sheet.write(0, 0, 'NAME',style) 
sheet.write(1, 0, 'Adam') 
sheet.write(2, 0, 'Bill') 
sheet.write(3, 0, 'Chandler') 

#Save the Workbook
wb.save('sample-2.xls')


Now lets look at the third library that allows us to do both. In order to experiment with that library, lets add a column "RESULT" to the sample.xlsx spreadsheet.


Now our requirement is such that any student scoring marks greater than 40 should be given a result of PASS and any student scoring less than 40 will be given a result of fail.

Lets use the openpyxl library to execute this.



1. Conditional code for setting PASS or FAIL -

#Import the library
import openpyxl

#Load the workbook to be modified
wb = openpyxl.load_workbook('sample.xlsx')
sheet = wb['Sheet1']

#Get the no. of rows
row_len = sheet.max_row

#Conditional loop to get set PASS or FAIL
for i in range(2,row_len+1):
	if int(sheet.cell(row=i,column=5).value) >= 40:
		sheet.cell(row=i,column=6).value = "PASS"
	elif (sheet.cell(row=i,column=1).value) < 40:
		sheet.cell(row=i,column=6).value = "FAIL"

#Save the spreadsheet
wb.save('sample.xlsx')



Reactions

Post a Comment

0 Comments