Python: problem loading a compressed (.gz) CSV file from a URL


I am trying to load a CSV file directly from a URL. The CSV file is compressed as a .gz file:

#Importing libraries
import pandas as pd
import requests
import io

#defining the url
url = "https://data.brasil.io/dataset/covid19/caso_full.csv.gz"

The error is as follows:

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-28-58ebbb6aba80> in <module>
      7 url = "https://data.brasil.io/dataset/covid19/caso_full.csv.gz"
      8 s=requests.get(url).content
----> 9 df=pd.read_csv(io.StringIO(s.decode('utf-8')), sep=',', compression='gzip', index_col=0, quotechar='"')
     10 
     11 #df=pd.read_csv("caso_full.csv.gz")

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte


    s=requests.get(url).content
    df=pd.read_csv(io.StringIO(s.decode('utf-8')), sep=',', compression='gzip', index_col=0, quotechar='"')
Any hints as to why this happens?

Thanks, everyone!

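The problem is that you decode the content and then wrap it in io.StringIO. The downloaded bytes are still gzip-compressed, so they are not valid UTF-8 text.

The solution is to leave the bytes undecoded and use io.BytesIO instead.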
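
To make the failure concrete, here is a minimal sketch that reproduces it without any network access. The sample CSV text and its column names are made up for illustration; only the gzip handling mirrors the question.

```python
import gzip
import io

import pandas as pd

# Hypothetical sample data standing in for the downloaded file.
csv_text = "city,state,new_confirmed\nSão Paulo,SP,1\nRio de Janeiro,RJ,0\n"
content = gzip.compress(csv_text.encode("utf-8"))

# Every gzip stream starts with the two magic bytes 0x1f 0x8b.
print(content[:2])

# Decoding the compressed bytes as UTF-8 fails on the second byte (0x8b),
# which is the same error reported in the question.
try:
    content.decode("utf-8")
except UnicodeDecodeError as exc:
    decode_error = exc
    print(exc)

# Wrapping the raw bytes in io.BytesIO lets pandas handle the decompression.
df = pd.read_csv(io.BytesIO(content), compression="gzip", index_col=0)
print(df)
```

The byte 0x8b at position 1 in the traceback is the second gzip magic byte, which is a strong sign that the data was never decompressed before decoding.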

See this Stack Overflow answer:

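The URL returns its content in GNU zip (gzip) format. pd.read_csv expects a file path or a buffer as its first argument. Because the content is bytes, you must wrap it in an io.BytesIO object. Pandas then decompresses the data and parses it as CSV.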

import io
import pandas as pd
import requests

# defining the url
url = "https://data.brasil.io/dataset/covid19/caso_full.csv.gz"
response = requests.get(url)
content = response.content
print(type(content))
df = pd.read_csv(
    io.BytesIO(content), sep=",", compression="gzip", index_col=0, quotechar='"',
)
print(df.head())
Output:

<class 'bytes'>
           city_ibge_code        date  epidemiological_week  estimated_population_2019  ...  place_type  state  new_confirmed  new_deaths
city                                                                                    ...
São Paulo       3550308.0  2020-02-25                     9                 12252023.0  ...        city     SP              1           0
NaN                  35.0  2020-02-25                     9                 45919049.0  ...       state     SP              1           0
São Paulo       3550308.0  2020-02-26                     9                 12252023.0  ...        city     SP              0           0
NaN                  35.0  2020-02-26                     9                 45919049.0  ...       state     SP              0           0
São Paulo       3550308.0  2020-02-27                     9                 12252023.0  ...        city     SP              0           0

[5 rows x 16 columns]

Can you post the exception? Can you print s and the return code of the request?

Thank you very much!

Please don't forget to accept the answer. A green check mark will appear next to it.
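
For comparison, the decompression step can also be done by hand with the standard-library gzip module, after which decoding to text and using io.StringIO works as the question originally intended. This is only a sketch: the in-memory bytes below are a hypothetical stand-in for response.content, and the names df_manual and df_auto are illustrative.

```python
import gzip
import io

import pandas as pd

# Hypothetical stand-in for response.content from requests.get(url).
content = gzip.compress(b"city,state,new_deaths\nSao Paulo,SP,0\n")

# Decompress first; the result is plain CSV bytes that decode cleanly.
csv_text = gzip.decompress(content).decode("utf-8")

# The text is already uncompressed, so no compression argument is passed.
df_manual = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Letting pandas handle the gzip layer should give the same frame.
df_auto = pd.read_csv(io.BytesIO(content), compression="gzip", index_col=0)
print(df_manual.equals(df_auto))
```

Passing io.BytesIO with compression="gzip" is shorter and avoids holding a decoded copy of the whole file in memory, so the manual route is mainly useful for understanding what happens.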