Python: dynamically merging rows that share the same key into one
I have a DataFrame and would like to create additional columns that combine, for rows sharing the same value in QID, the columns whose names start with Answer. Here is an excerpt of the DataFrame:
QID Category Text QType Question Answer0 Answer1
0 16 Automotive Access to car Single Do you have access to a car? I own a car/cars I own a car/cars
1 16 Automotive Access to car Single Do you have access to a car? I lease/ have a company car I lease/have a company car
2 16 Automotive Access to car Single Do you have access to a car? I have access to a car/cars I have access to a car/cars
3 16 Automotive Access to car Single Do you have access to a car? No, I don’t have access to a car/cars No, I don't have access to a car
4 16 Automotive Access to car Single Do you have access to a car? Prefer not to say Prefer not to say
5 17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? Audi Audi
6 17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? Alfa Romeo Alfa Romeo
7 17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? BMW BMW
8 17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? Cadillac Cadillac
9 17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? Chevrolet Chevrolet
10 17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? Chrysler Chrysler
11 17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? Citroen Citroen
12 17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? Daihatsu Daihatsu
13 17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? Fiat Fiat
14 17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? Ford Ford
15 17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? Honda Honda
16 17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? Hyundai Hyundai
...
I would like to get something like this:
QID Category Text QType Question Answer0 Answer1 Answer3 Answer4 Answer5 Answer6 Answer7 Answer8 Answer9 Answer10 Answer11 Answer12 ...
4 16 Automotive Access to car Single Do you have access to a car? I own a car/cars I lease/ have a company car I have access to a car/cars No, I don’t have access to a car/cars Prefer not to say
5 17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? Audi Alfa Romeo BMW Cadillac Chevrolet Chrysler Citroen ...
So far I can combine a given (static) number of columns whose names start with Answer for rows that share the same QID:
import pandas as pd

# assumption: the data is stored as CSV; pd.DataFrame() itself cannot load a file path
df = pd.read_csv('path/to/file')
# lazy - take the first value of every attribute except QID and the Answer columns
agg = {col: "first" for col in list(df.columns) if col != "QID" and "Answer" not in col}
# get a list of all answers in Answer0 for a QID
agg = {**agg, **{"Answer0": lambda s: list(s)}}

# helper function for the row call. not needed, but makes it more readable
def ans(r, i):
    return "" if i >= len(r["AnswerT"]) else r["AnswerT"][i]

# split the list from the aggregation back out into columns using assign
# rename Answer0 to AnswerT after the aggregation so that it can be referred to,
# then drop AnswerT once it is no longer needed
dfgrouped = df.groupby("QID").agg(agg).reset_index().rename(columns={"Answer0": "AnswerT"}).assign(
    Answer0=lambda dfa: dfa.apply(lambda r: ans(r, 0), axis=1),
    Answer1=lambda dfa: dfa.apply(lambda r: ans(r, 1), axis=1),
    Answer2=lambda dfa: dfa.apply(lambda r: ans(r, 2), axis=1),
    Answer3=lambda dfa: dfa.apply(lambda r: ans(r, 3), axis=1),
    Answer4=lambda dfa: dfa.apply(lambda r: ans(r, 4), axis=1),
    Answer5=lambda dfa: dfa.apply(lambda r: ans(r, 5), axis=1),
    Answer6=lambda dfa: dfa.apply(lambda r: ans(r, 6), axis=1),
).drop("AnswerT", axis=1)
print(dfgrouped.to_string(index=False))
How can I combine a dynamic number of columns whose names start with Answer for rows that share the same QID?
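Use merge() with an inner join to put it back together:
import re
import pandas as pd

# fields are separated by two spaces so that values containing single spaces stay intact
data = """QID  Category  Text  QType  Question  Answer0  Answer1
0  16  Automotive  Access to car  Single  Do you have access to a car?  I own a car/cars  I own a car/cars
1  16  Automotive  Access to car  Single  Do you have access to a car?  I lease/ have a company car  I lease/have a company car
2  16  Automotive  Access to car  Single  Do you have access to a car?  I have access to a car/cars  I have access to a car/cars
3  16  Automotive  Access to car  Single  Do you have access to a car?  No, I don’t have access to a car/cars  No, I don't have access to a car
4  16  Automotive  Access to car  Single  Do you have access to a car?  Prefer not to say  Prefer not to say
5  17  Automotive  Make of car/cars  Multiple  If you own/lease a car(s), which brand are they?  Audi  Audi
6  17  Automotive  Make of car/cars  Multiple  If you own/lease a car(s), which brand are they?  Alfa Romeo  Alfa Romeo
7  17  Automotive  Make of car/cars  Multiple  If you own/lease a car(s), which brand are they?  BMW  BMW
8  17  Automotive  Make of car/cars  Multiple  If you own/lease a car(s), which brand are they?  Cadillac  Cadillac
9  17  Automotive  Make of car/cars  Multiple  If you own/lease a car(s), which brand are they?  Chevrolet  Chevrolet
10  17  Automotive  Make of car/cars  Multiple  If you own/lease a car(s), which brand are they?  Chrysler  Chrysler
11  17  Automotive  Make of car/cars  Multiple  If you own/lease a car(s), which brand are they?  Citroen  Citroen
12  17  Automotive  Make of car/cars  Multiple  If you own/lease a car(s), which brand are they?  Daihatsu  Daihatsu
13  17  Automotive  Make of car/cars  Multiple  If you own/lease a car(s), which brand are they?  Fiat  Fiat
14  17  Automotive  Make of car/cars  Multiple  If you own/lease a car(s), which brand are they?  Ford  Ford
15  17  Automotive  Make of car/cars  Multiple  If you own/lease a car(s), which brand are they?  Honda  Honda
16  17  Automotive  Make of car/cars  Multiple  If you own/lease a car(s), which brand are they?  Hyundai  Hyundai"""
# strip the leading row index from each line, then split each line on double spaces
a = [[t.strip() for t in re.split("  ", l) if t != ""] for l in [re.sub("([0-9]+[ ])*(.*)", r"\2", l) for l in data.split("\n")]]
df = pd.DataFrame(data=a[1:], columns=a[0])

# lazy - take the first value of every attribute except QID and the Answer columns
agg = {col: "first" for col in list(df.columns) if col != "QID" and "Answer" not in col}
# get a list of all answers in Answer0 for a QID
agg = {**agg, **{"Answer0": lambda s: list(s)}}

# helper function for a row call; carried over from the question, not used by the merge below
def ans(r, i):
    return "" if i >= len(r["AnswerT"]) else r["AnswerT"][i]

# group by QID and build the new AnswerT column, which holds the list of answers
dfgrouped = df.groupby("QID").agg(agg).reset_index().rename(columns={"Answer0": "AnswerT"})
# build a new DataFrame from AnswerT by passing a standard list-of-dicts structure to the constructor,
# merge it back on QID, and finally drop the temporary AnswerT column
dfgrouped = dfgrouped.merge(
    pd.DataFrame(
        [{**{"QID": r[0]}, **{f"Answer{i}": v for i, v in enumerate(r[1])}}
         for r in dfgrouped[["QID", "AnswerT"]].values.tolist()]
    ), on="QID", how="inner").drop(columns="AnswerT")
print(dfgrouped.to_string(index=False))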
Output
QID Category Text QType Question Answer0 Answer1 Answer2 Answer3 Answer4 Answer5 Answer6 Answer7 Answer8 Answer9 Answer10 Answer11
16 Automotive Access to car Single Do you have access to a car? I own a car/cars I lease/ have a company car I have access to a car/cars No, I don’t have access to a car/cars Prefer not to say NaN NaN NaN NaN NaN NaN NaN
17 Automotive Make of car/cars Multiple If you own/lease a car(s), which brand are they? Audi Alfa Romeo BMW Cadillac Chevrolet Chrysler Citroen Daihatsu Fiat Ford Honda Hyundai
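For comparison, here is a minimal sketch of another way to get a dynamic number of Answer columns. It is not part of the answer above. It numbers the answers within each QID with groupby().cumcount() and pivots those numbers into columns. The small DataFrame is an abbreviated stand-in for the question's data, and the names meta, wide, result and the temporary column n are chosen only for this illustration.

```python
import pandas as pd

# Abbreviated stand-in for the question's data (illustrative values only).
df = pd.DataFrame({
    "QID": [16, 16, 16, 17, 17],
    "Category": ["Automotive"] * 5,
    "Question": ["Do you have access to a car?"] * 3
                + ["If you own/lease a car(s), which brand are they?"] * 2,
    "Answer0": ["I own a car/cars", "I lease/ have a company car",
                "Prefer not to say", "Audi", "Alfa Romeo"],
})

# Columns that are constant within a QID: keep the first value of each.
meta = df.drop(columns="Answer0").groupby("QID", as_index=False).first()

# Number each answer within its QID (0, 1, 2, ...), then pivot those
# numbers into columns so that every QID ends up on a single row.
wide = (
    df.assign(n=df.groupby("QID").cumcount())
      .pivot(index="QID", columns="n", values="Answer0")
      .add_prefix("Answer")
      .rename_axis(None, axis=1)   # drop the leftover "n" label on the columns
      .reset_index()
)

# Inner join on QID, as in the merge-based approach.
result = meta.merge(wide, on="QID", how="inner")
print(result.to_string(index=False))
```

As with the merge-based version, a QID with fewer answers than the longest one gets NaN in the trailing Answer columns.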