How do I process a dataset using Python?
I have an input dataset named data.csv with the following contents:
id , name
1 , Jone/Elvis/Tom
2 , Elvis/Tonny
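The name column uses a slash as the delimiter.
I need to process data.csv, and my expected output is:
id, Jone, Elvis, Tom, Tonny
1, 1 , 1 , 1 , 0
2, 0 , 1 , 0 , 1
A 1 means the row's name field contains the name in the column header; a 0 means it does not.
How can I transform the input this way using Python and pandas? Here is what I have so far: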
import pandas as pd

data = pd.read_csv("./data.csv")
data["name"] = data["name"].str.split("/")

jone = [0, 0]
elvis = [0, 0]
tom = [0, 0]
tonny = [0, 0]

for i in data.index:
    if any("Jone" in s for s in data.name[i]):
        jone[i] = 1
    else:
        jone[i] = 0

for i in data.index:
    if any("Elvis" in s for s in data.name[i]):
        elvis[i] = 1
    else:
        elvis[i] = 0

for i in data.index:
    if any("Tom" in s for s in data.name[i]):
        tom[i] = 1
    else:
        tom[i] = 0

for i in data.index:
    if any("Tonny" in s for s in data.name[i]):
        tonny[i] = 1
    else:
        tonny[i] = 0

data['Jone'] = jone
data['Elvis'] = elvis
data['Tom'] = tom
data['Tonny'] = tonny
Let's use pandas and .str.get_dummies with the sep parameter.
Read the data into a dataframe from the clipboard:
df = pd.read_clipboard(sep=r'\s+,\s+')
df
Input dataframe:
   id            name
0   1  Jone/Elvis/Tom
1   2     Elvis/Tonny
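Reading from the clipboard is handy interactively, but the same frame can be built from the file itself. The following is a minimal sketch, assuming data.csv really has spaces around each comma as shown in the question; the `io.StringIO` buffer and the `raw` variable are stand-ins for the file, used here only so the snippet runs on its own.

```python
import io

import pandas as pd

# Stand-in for the contents of data.csv (assumed to match the question).
raw = "id , name\n1 , Jone/Elvis/Tom\n2 , Elvis/Tonny\n"

# A regex separator swallows the spaces around each comma, so the
# column labels come out as "id" and "name" rather than "id " and " name".
# Regex separators need the Python parsing engine.
df = pd.read_csv(io.StringIO(raw), sep=r"\s*,\s*", engine="python")

print(df)
```

With a real file, passing "./data.csv" in place of the buffer should behave the same way, provided the file uses the same spacing.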
Set the index and use the string accessor with get_dummies:
df1 = df.set_index('id')
df1['name'].str.get_dummies(sep='/').reset_index()
Output:
   id  Elvis  Jone  Tom  Tonny
0   1      1     1    1      0
1   2      1     0    0      1
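Note that get_dummies returns the indicator columns in sorted order (Elvis, Jone, Tom, Tonny), while the question asks for Jone, Elvis, Tom, Tonny. One possible way to get that order is to reorder the columns by first appearance in the data. This is a sketch rather than part of the original answer; the names `dummies`, `order`, and `result` are illustrative, and the frame is built inline instead of read from data.csv.

```python
import pandas as pd

# Inline copy of the question's data, so the snippet is self-contained.
df = pd.DataFrame({"id": [1, 2], "name": ["Jone/Elvis/Tom", "Elvis/Tonny"]})

# One 0/1 indicator column per distinct name; columns come back sorted.
dummies = df["name"].str.get_dummies(sep="/")

# Names in order of first appearance; dict.fromkeys drops repeats
# while preserving insertion order.
order = list(dict.fromkeys(n for row in df["name"].str.split("/") for n in row))

# Put id first, then the indicator columns in the desired order.
result = pd.concat([df[["id"]], dummies[order]], axis=1)

print(result)
```

Because the columns are derived from the data, this also avoids hard-coding one list per name as in the loop-based attempt.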