Python: unable to load all rows from an xlsx file with Pandas read_excel()

python excel pandas

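The file should contain several thousand rows, but reading it as shown below returns only the first few rows in the DataFrame.

File: https://www.hkex.com.hk/eng/services/trading/securities/securitieslists/ListOfSecurities.xlsx

Failing example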

import pandas as pd

url = 'https://www.hkex.com.hk/eng/services/trading/securities/securitieslists/ListOfSecurities.xlsx'
df = pd.read_excel(url, engine='openpyxl', header=2, usecols='A:D', verbose=True)
print(df.shape)

Working example

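Same file. It was first downloaded, opened in Excel, edited (a text change only) and saved without changing the format (still xlsx), then read from the local file with read_excel().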

import os

import pandas as pd
import wget

url = 'https://www.hkex.com.hk/eng/services/trading/securities/securitieslists/ListOfSecurities.xlsx'
path = os.path.join(os.path.dirname(__file__), 'download')
wget.download(url, out=path)
file = os.path.join(path, 'ListOfSecurities.xlsx')

# open to edit and then save in Excel

df = pd.read_excel(file, engine='openpyxl', header=2, usecols='A:D', verbose=True)
print(df.shape)
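The thread does not establish why re-saving the file in Excel makes it readable. One possible cause, not confirmed here, is that the downloaded worksheet declares a used range (its `<dimension ref="...">` element) smaller than the data it actually holds. The openpyxl documentation notes that read-only mode relies on that declaration, and pandas, as far as I know, opens workbooks through openpyxl in read-only mode. The sketch below uses only the standard library to print what each sheet declares; the function name `declared_dimensions` and the demo archive are made up for illustration.

```python
import re
import zipfile
from io import BytesIO


def declared_dimensions(xlsx_source):
    """Map each worksheet part in an .xlsx archive to its declared dimension ref.

    xlsx_source may be a path or a binary file-like object.
    A value of None means the sheet declares no dimension at all.
    """
    result = {}
    with zipfile.ZipFile(xlsx_source) as zf:
        for name in zf.namelist():
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                # The dimension element sits near the top of the sheet XML,
                # so reading the first few KB is enough.
                with zf.open(name) as fh:
                    head = fh.read(4096).decode("utf-8", errors="replace")
                match = re.search(r'<(?:\w+:)?dimension\s+ref="([^"]+)"', head)
                result[name] = match.group(1) if match else None
    return result


def make_demo_archive():
    """Build a tiny stand-in archive (not a complete workbook) for the demo."""
    buf = BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(
            "xl/worksheets/sheet1.xml",
            '<worksheet><dimension ref="A1:D5"/><sheetData/></worksheet>',
        )
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(declared_dimensions(make_demo_archive()))
```

If the declared range of the downloaded file ends after a handful of rows while the re-saved copy declares the full range, that would be consistent with the behaviour described above.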

Update: changes made to the code, given that xlrd cannot be used

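import pandas as pd
import os
import wget

url = 'https://www.hkex.com.hk/eng/services/trading/securities/securitieslists/ListOfSecurities.xlsx'
path = os.path.join(os.path.dirname(__file__), 'download')
wget.download(url, out=path)
filename = os.path.join(path, 'ListOfSecurities.xlsx')

from openpyxl import load_workbook

excel_file = load_workbook(filename)
# Sheet name assumed to match the file name; check excel_file.sheetnames
sheet = excel_file["ListOfSecurities"]
sheet.delete_cols(5, 21)  # use only columns A:D
data = sheet.values
cols = next(data)  # skip row 0
cols = next(data)  # skip row 1
cols = next(data)[0:4]  # cols A:D
df = pd.DataFrame(data, columns=cols)
print(df.shape)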
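The approach above loads the whole workbook in openpyxl's normal mode. A lighter variant, offered only as a sketch and not as the answerer's method, keeps read-only mode but calls `reset_dimensions()` on the worksheet so that openpyxl scans every row instead of trusting the declared range. The helper names `read_sheet` and `make_demo_workbook`, and the demo data, are invented for illustration; the defaults mirror `header=2` and `usecols='A:D'` from the question.

```python
from io import BytesIO

import pandas as pd
from openpyxl import Workbook, load_workbook


def _pad(row, ncols):
    """Trim or pad a row of cell values to exactly ncols entries."""
    values = tuple(row[:ncols])
    return values + (None,) * (ncols - len(values))


def read_sheet(source, header_row=2, ncols=4, sheet_name=None):
    """Read the first ncols columns of a sheet into a DataFrame.

    header_row is the 0-based index of the row holding the column names.
    """
    wb = load_workbook(source, read_only=True, data_only=True)
    try:
        ws = wb[sheet_name] if sheet_name else wb.worksheets[0]
        ws.reset_dimensions()  # ignore the declared range, scan all rows
        rows = ws.iter_rows(values_only=True)
        for _ in range(header_row):
            next(rows)  # skip the rows above the header
        header = _pad(next(rows), ncols)
        data = [_pad(row, ncols) for row in rows]
    finally:
        wb.close()
    return pd.DataFrame(data, columns=list(header))


def make_demo_workbook():
    """Build a small in-memory workbook with two rows above the header."""
    wb = Workbook()
    ws = wb.active
    ws.append(["Illustrative title row"])
    ws.append(["Illustrative note row"])
    ws.append(["Code", "Name", "Category", "Sub-Category", "Board Lot"])
    ws.append([1, "Alpha", "Equity", "Main", 500])
    ws.append([2, "Beta", "Equity", "Main", 1000])
    ws.append([3, "Gamma", "Equity", "GEM", 2000])
    buf = BytesIO()
    wb.save(buf)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(read_sheet(make_demo_workbook()).shape)
```

For the file in the question, the call would presumably be `read_sheet(filename)`; whether it returns the full row count depends on the dimension hypothesis holding for that file.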

I changed the Excel engine in pandas to the default engine (xlrd), and the following code works:

import pandas as pd
import os
import wget

url = 'https://www.hkex.com.hk/eng/services/trading/securities/securitieslists/ListOfSecurities.xlsx'
path = os.path.join(os.path.dirname(__file__), 'download')
wget.download(url, out=path)
filename = os.path.join(path, 'ListOfSecurities.xlsx')
df = pd.read_excel(filename, header=2, usecols='A:D', verbose=True)
print(df.shape)

One inconsistency in the output is that the row count shown is 4 fewer:

Reading sheet 0
(17486, 4)
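Whether the default-engine approach works depends on the installed xlrd: releases from 2.0 onward read only .xls files, so it requires an older xlrd. A minimal sketch of a check on the version string follows; the function name `xlrd_supports_xlsx` is made up, and the parsing assumes an ordinary dotted version such as "1.2.0".

```python
def xlrd_supports_xlsx(version):
    """Return True if an xlrd release with this version string can read .xlsx.

    xlrd dropped support for everything except .xls in version 2.0.
    """
    major = int(version.split(".")[0])
    return major < 2


if __name__ == "__main__":
    # In real use the string would come from xlrd.__version__.
    for v in ("1.2.0", "2.0.1"):
        print(v, xlrd_supports_xlsx(v))
```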

The problem could be caused by the data being used. You should include some sample data so that people can reproduce the problem.

Downloaded the file and saw the same issue: shape (5, 4). It is a data format issue. The "排列表" and "Board Lot" columns show up, but this is still a valid Excel format, so I am looking for a solution here.

Thanks Aasim. However, the xlrd version I am using does not support xlsx files: xlrd.biffh.XLRDError: Excel xlsx file; not supported. xlrd==2.0.1, pandas==1.1.5

What version of pandas are you using?

pandas==1.1.5

@memento I have changed the answer to use openpyxl itself to open the file and then load it into pandas. Let me know if this works.

Works perfectly, thanks.