
Parsing a tab-separated file in Python


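I'm fairly new to Python.

I want to parse a file whose values are separated by \t, as shown below. How do I remove the \t from the file and split the values into columns? My code is as follows: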

import pandas as pd
import io
import requests
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00236/seeds_dataset.txt"
s = requests.get(url).content
df = pd.read_csv(io.StringIO(s.decode('utf-8')))

Add sep="\t" to pd.read_csv. The data is messy, so the doubled tabs also need to be replaced with single tabs:

df = pd.read_csv(
    io.StringIO(s.decode('utf-8').replace("\t\t", "\t")), 
    header=None, sep="\t")

If you prefer to use the csv library, you can try:

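import csv

import pandas as pd
import requests

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00236/seeds_dataset.txt"
raw_data = requests.get(url).text  # decoded text rather than raw bytes

# Save the download to disk; the with-block closes the file afterwards.
with open("raw_data.txt", "w", newline="") as f:
    f.write(raw_data)

# In Python 3, csv.reader expects a text-mode file opened with newline="".
with open("raw_data.txt", newline="") as f:
    data = list(csv.reader(f, delimiter="\t"))

df = pd.DataFrame.from_records(data)
print(df)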
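
One caveat with the csv approach: wherever the file has two tabs in a row, csv.reader returns an empty string for the gap, so rows can end up with different lengths. Below is a minimal sketch of one way to deal with that. It uses a small made-up sample in place of the download, and the helper name parse_tab_rows is hypothetical, not part of the original answer. It also assumes that an empty field is always an artifact of a doubled tab and never a genuinely missing value.

```python
import csv
import io

import pandas as pd

# Made-up sample standing in for the downloaded text; row 2 has a doubled tab.
sample = "15.26\t14.84\t0.871\t1\n14.88\t\t14.57\t0.8811\t1\n"


def parse_tab_rows(text):
    """Split tab-separated text into rows of floats, skipping empty fields.

    Hypothetical helper: assumes empty fields only come from doubled tabs.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [[float(field) for field in row if field != ""] for row in reader if row]


df = pd.DataFrame(parse_tab_rows(sample))
print(df.shape)
```

If the real file can contain missing values, dropping empty fields would shift columns, so in that case replacing the doubled tabs before parsing is probably the safer route.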

Comment: I get a ParserError. The traceback (most recent call last) points at this line:
df = pd.read_csv(io.StringIO(s.decode('utf-8')), sep="\t")

Reply: @jkhammerseth Try again, and use a regular expression to split on runs of multiple tabs, for example:
sep=r'\t+'
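
A regex separator of this kind can be sketched as follows, on a small made-up sample rather than the real download. The pattern \t+ makes pandas treat any run of tabs as a single delimiter, so no manual replacement of doubled tabs is needed. pandas handles regex separators with its Python parsing engine, so engine="python" is passed explicitly here to avoid the fallback warning.

```python
import io

import pandas as pd

# Made-up sample standing in for the downloaded text; row 2 has a doubled tab.
sample = "15.26\t14.84\t0.871\t1\n14.88\t\t14.57\t0.8811\t1\n"

# r"\t+" matches one or more consecutive tabs as a single separator.
df = pd.read_csv(io.StringIO(sample), sep=r"\t+", header=None, engine="python")
print(df)
```

For the real file, the same call would take io.StringIO(s.decode('utf-8')) in place of the sample, assuming s holds the downloaded bytes as in the question.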