Python: accessing GitHub data in Jupyter Books

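I get a tokenizing error when trying to read a CSV file from GitHub in Jupyter Books. I have looked through several answers, but none of them helped. Any help would be appreciated. Thanks.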

url = "https://github.com/Kallikrates/bde_at2/blob/3875fd9b03b02b2772129acf2d8d83619971b2eb/2016Census_G01_NSW_LGA.csv"
insert_df = pd.read_csv(url, header=0, sep=',', quotechar='"')
insert_df.head()

Error:

---------------------------------------------------------------------------

ParserError                               Traceback (most recent call last)

<ipython-input-21-21c294baaa45> in <module>()
      1 url = "https://github.com/Kallikrates/bde_at2/blob/3875fd9b03b02b2772129acf2d8d83619971b2eb/2016Census_G01_NSW_LGA.csv"
----> 2 insert_df = pd.read_csv(url, header=0, sep=',', quotechar='"')
      3 insert_df.head()

3 frames

/usr/local/lib/python3.7/dist-packages/pandas/io/parsers.py in read(self, nrows)
   2155     def read(self, nrows=None):
   2156         try:
-> 2157             data = self._reader.read(nrows)
   2158         except StopIteration:
   2159             if self._first_chunk:

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows()

pandas/_libs/parsers.pyx in pandas._libs.parsers.raise_parser_error()

ParserError: Error tokenizing data. C error: Expected 1 fields in line 79, saw 2

Two options:

1st: read the page as HTML

url = "https://github.com/Kallikrates/bde_at2/blob/3875fd9b03b02b2772129acf2d8d83619971b2eb/2016Census_G01_NSW_LGA.csv"
insert_df = pd.read_html(url)
insert_df[0].head(2)
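2nd: read the raw file (note the "raw" in the URL). The /blob/ link returns GitHub's HTML page for the file rather than the CSV itself, which is why read_csv fails to tokenize it.

url = "https://raw.githubusercontent.com/Kallikrates/bde_at2/3875fd9b03b02b2772129acf2d8d83619971b2eb/2016Census_G01_NSW_LGA.csv"
insert_df_raw = pd.read_csv(url, header=0, sep=',', quotechar='"')
insert_df_raw.head(2)

Try insert_df = pd.read_html(url). The result is a list of DataFrames, so your dataset is insert_df[0]; inspect it with insert_df[0].head().

Awesome. Thanks @simpleApp!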
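
The switch from the github.com/.../blob/... link to the raw.githubusercontent.com link can also be done in code. The following is only a minimal sketch: the helper name github_blob_to_raw is hypothetical (it is not part of pandas or any GitHub library), and it assumes the link has the usual owner/repo/blob/ref/path layout seen in the question. Any URL that does not match that layout is returned unchanged.

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(url: str) -> str:
    """Hypothetical helper: rewrite a github.com 'blob' link to its raw link.

    Assumes the path looks like /<owner>/<repo>/blob/<ref>/<file path>.
    URLs that do not match this layout are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc != "github.com":
        return url

    segments = parts.path.lstrip("/").split("/")
    # Need at least: owner, repo, "blob", ref, and one path segment.
    if len(segments) < 5 or segments[2] != "blob":
        return url

    # Drop the "blob" segment and switch to the raw content host.
    raw_path = "/" + "/".join(segments[:2] + segments[3:])
    return urlunsplit(
        ("https", "raw.githubusercontent.com", raw_path, parts.query, parts.fragment)
    )
```

With pandas imported as pd, the converted link can then be passed straight to the reader, for example pd.read_csv(github_blob_to_raw(url), header=0).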