How to read a lot of excel files in python pandas?

QBoard » Artificial Intelligence & ML » AI and ML - Python » How to read a lot of excel files in python pandas?

User Dashboard

How to read a lot of excel files in python pandas?

Back To Topics

Tags : python pandas macros

Vaibhav Mali

259

I have lots of excel files(xlsx format) and want to read and handle them.

For example, file names are ex201901, ex201902, 201912.

Its name is made by ex YYYYMM format.

Anyway, to import these files in pandas as an usual case, it's easy.

July 17, 2021 1:44 PM IST

0
Viaan Prakash

461
Given that you'll probably want to somehow work with all data frames at once afterwards, it's a smell if you even put them into separate local variables, and in general, whenever you're experiencing a "this task feels repetitive because I'm doing the same thing over and over again", that calls for introducing loops of some sort. As you're planning to use pandas, chances are that you'll be iterating soon again (now that you have your files, you're probably going to be performing some transformations on the rows of those files), in which case you'll probably be best off looking into how control flow a la loops works in Python (and indeed in pandas) in general; good tutorials are plentiful.

In your particular case, depending on what kind of processing you are planning on doing afterwards, you'd probably benefit from having something like
```
df2019 = [pd.read_excel(rf'C:\users\ex2019{str(i).zfill(2)}.xlsx') for i in range(1, 13)]
```
With that, you can access the individual data frames through e.g. df2019[5]
to get the data frame corresponding to June, or you can collapse all of them into a single data frame using
df = pd.concat(df2019)
if that's what suits your need.
If you have less structure in your file names,
glob
can come in handy. With that, the above could become something like
```
import glob
df2019 = list(map(pd.read_excel, glob.glob(r'C:\users\ex2019*.xlsx')))
```
This post was edited by Viaan Prakash at July 21, 2021 1:54 PM IST
July 21, 2021 1:53 PM IST

0
Samar Patil

346 3
You can use OS module from python. It has a method listdir which stores all the file names in the folder. Check the code below:
```
import os, re
listDir = os.listdir(FILE_PATH)
dfList = []
for aFile in listDir:
    if re.search(r'ex20190[0-9]{1}.xlsx', aFile):
        tmpDf = pd.read_excel(FILE_PATH + aFile)
        dfList.append(tmpDf)

outDf = pd.concat(dfList)
```
July 30, 2021 2:47 PM IST

0
Advika Banerjee

319 1
You can use OS module from python. It has a method listdir which stores all the file names in the folder. Check the code below:
```
import os, re
listDir = os.listdir(FILE_PATH)
dfList = []
for aFile in listDir:
    if re.search(r'ex20190[0-9]{1}.xlsx', aFile):
        tmpDf = pd.read_excel(FILE_PATH + aFile)
        dfList.append(tmpDf)

outDf = pd.concat(dfList)
```
August 1, 2021 11:10 PM IST

0
Rishi Pandya

131 2
```
import pandas
df = pandas.read_excel(open('your_xls_xlsx_filename','rb'), sheetname='Sheet 1')
# or using sheet index starting 0
df = pandas.read_excel(open('your_xls_xlsx_filename','rb'), sheetname=2)
```
August 11, 2021 5:19 PM IST

0

Member Sign In

Member Sign In

Create Account

How to read a lot of excel files in python pandas?

Connect With Us