Photo Credits: pixabay.com
This post is intended as a quick tutorial for reading, writing and modifying excel sheets using python scripts. We will be discussing 3 python libraries that allow us to do so and we will discuss a few of the common operations that can be done with each of the libraries.
Lets take a sample excel file to begin with that contains a list of 5 students and their marks in a test. Lets name it "sample.xlsx".
This is a simple excel sheet that we can work with easily.
Next before we begin, pip install the following python libraries using the following commands on the command prompt -
pip install xlrd pip install xlwt pip install openpyxl
The first library we are going to look at is XLRD. This library is primarily used for information extraction/reading from excel sheets.
Lets look at the basic functionalities this library has to offer.
1. Counting no. of rows and columns in the excel sheet -
#import the library import xlrd #setting the location of the file to be read loc = ("sample.xlsx") #crating a workbook to read the file wb = xlrd.open_workbook(loc) #reading the excel sheet sheet = wb.sheet_by_index(0) #setting the access to table cells with starting row and column no. sheet.cell_value(0, 0) #To get the no of rows in the excel sheet len_rows = sheet.nrows print("No. of rows :", len_rows) #To get the no of columns in the excel sheet len_cols = sheet.ncols print("No. of columns :", len_cols)
Notice that this has also counted the column headers in the number of rows.
2. Extracting the entire spreadsheet -
#import the library import xlrd #setting the location of the file to be read loc = ("sample.xlsx") #crating a workbook to read the file wb = xlrd.open_workbook(loc) #reading the excel sheet sheet = wb.sheet_by_index(0) #setting the access to table cells with starting row and column no. sheet.cell_value(0, 0) #To get the no of rows in the excel sheet len_rows = sheet.nrows #Loop to print all the rows in the spreadsheet for i in range(len_rows-1): print(sheet.row_values(i))
3. Extracting a single row -
#import the library import xlrd #setting the location of the file to be read loc = ("sample.xlsx") #crating a workbook to read the file wb = xlrd.open_workbook(loc) #reading the excel sheet sheet = wb.sheet_by_index(0) #setting the access to table cells with starting row and column no. sheet.cell_value(0, 0) #Printing the second row from the spreadsheet print(sheet.row_values(2))
4. Extracting a column from the spreadsheet -
#import the library import xlrd #setting the location of the file to be read loc = ("sample.xlsx") #crating a workbook to read the file wb = xlrd.open_workbook(loc) #reading the excel sheet sheet = wb.sheet_by_index(0) #setting the access to table cells with starting row and column no. sheet.cell_value(0, 0) #Getting the no. of rows len1 = sheet.nrows #Printing the second column from the spreadsheet for i in range(len1-1): #Printing the cell value for row 'i' and column '1' print(sheet.cell_value(i,1))
The XLRD library however doesn't allow writing on an excel sheet. For that we will use the XLWT library.
1. Populating an excel sheet
#import the library import xlwt from xlwt import Workbook #Creating a work group wb = Workbook() #Create a sheet sheet = wb.add_sheet('mySheet') #Writing the contents of the Excel sheet.write(0, 0, 'NAME') sheet.write(1, 0, 'Adam') sheet.write(2, 0, 'Bill') sheet.write(3, 0, 'Chandler') #Save the Workbook wb.save('sample-1.xls')
2. Adding styles to a cell -
#import the library import xlwt from xlwt import Workbook #Creating a work group wb = Workbook() #Create a sheet sheet = wb.add_sheet('mySheet') # Making the Heading Bold and Green style = xlwt.easyxf('font: bold 1, color green;') #Writing the contents of the Excel with style applied to first row sheet.write(0, 0, 'NAME',style) sheet.write(1, 0, 'Adam') sheet.write(2, 0, 'Bill') sheet.write(3, 0, 'Chandler') #Save the Workbook wb.save('sample-2.xls')
Now lets look at the third library that allows us to do both. In order to experiment with that library, lets add a column "RESULT" to the sample.xlsx spreadsheet.
Now our requirement is such that any student scoring marks greater than 40 should be given a result of PASS and any student scoring less than 40 will be given a result of fail.
Lets use the openpyxl library to execute this.
1. Conditional code for setting PASS or FAIL -
#Import the library import openpyxl #Load the workbook to be modified wb = openpyxl.load_workbook('sample.xlsx') sheet = wb['Sheet1'] #Get the no. of rows row_len = sheet.max_row #Conditional loop to get set PASS or FAIL for i in range(2,row_len+1): if int(sheet.cell(row=i,column=5).value) >= 40: sheet.cell(row=i,column=6).value = "PASS" elif (sheet.cell(row=i,column=1).value) < 40: sheet.cell(row=i,column=6).value = "FAIL" #Save the spreadsheet wb.save('sample.xlsx')
0 Comments