Python 3.x: Melt multiple columns into corresponding columns based on their prefix
Assume the following DataFrame:
   id transaction seller0 seller1  seller2    buyer0 buyer1
0   1    Subject1     Tim   Jamie  Melissa     Rosie    NaN
1   2    Subject2    Rima  Derren      NaN  Annalise  Hania
2   3    Subject3    Rosa     NaN      NaN    Joshua    NaN
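To reproduce the frame above, here is a minimal construction sketch. The values are copied from the table; representing the missing names with `np.nan` is an assumption about how the original data was stored.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame; NaN marks a missing seller or buyer.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "transaction": ["Subject1", "Subject2", "Subject3"],
    "seller0": ["Tim", "Rima", "Rosa"],
    "seller1": ["Jamie", "Derren", np.nan],
    "seller2": ["Melissa", np.nan, np.nan],
    "buyer0": ["Rosie", "Annalise", "Joshua"],
    "buyer1": [np.nan, "Hania", np.nan],
})
print(df)
```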
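How can I reshape it into the following format? That is, for each transaction, melt seller0, seller1, seller2 into a single seller column, and buyer0, buyer1 into a single buyer column.
Desired output: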
   id transaction   seller     buyer
0   1    Subject1      Tim     Rosie
1   1    Subject1    Jamie       NaN
2   1    Subject1  Melissa       NaN
3   2    Subject2     Rima  Annalise
4   2    Subject2   Derren     Hania
5   3    Subject3     Rosa    Joshua
Code:
Output:
Desired updated output:
   id transaction    type      name
0   1    Subject1  seller       Tim
1   1    Subject1  seller     Jamie
2   1    Subject1  seller   Melissa
3   2    Subject2  seller      Rima
4   2    Subject2  seller    Derren
5   3    Subject3  seller      Rosa
6   1    Subject1   buyer     Rosie
7   2    Subject2   buyer  Annalise
8   2    Subject2   buyer     Hania
9   3    Subject3   buyer    Joshua
Use pd.wide_to_long:
import pandas as pd
(
    pd.wide_to_long(df,
                    stubnames=["seller", "buyer"],
                    i=["id", "transaction"],
                    j="num")
    .dropna(how="all")      # drop rows where both seller and buyer are missing
    .droplevel(level=-1)    # discard the numeric suffix level ("num")
    .reset_index()
)
   id transaction   seller     buyer
0   1    Subject1      Tim     Rosie
1   1    Subject1    Jamie       NaN
2   1    Subject1  Melissa       NaN
3   2    Subject2     Rima  Annalise
4   2    Subject2   Derren     Hania
5   3    Subject3     Rosa    Joshua
You can also use the pivot_longer function from pyjanitor; currently, you have to install the latest development version from GitHub:
# install latest dev version
# pip install git+https://github.com/ericmjl/pyjanitor.git
import janitor
(
    df.pivot_longer(index=["id", "transaction"],
                    names_to=".value",
                    names_pattern=r"([a-z]+)\d")
    .dropna(subset=["seller", "buyer"], how="all")
)
   id transaction   seller     buyer
0   1    Subject1      Tim     Rosie
1   2    Subject2     Rima  Annalise
2   3    Subject3     Rosa    Joshua
3   1    Subject1    Jamie       NaN
4   2    Subject2   Derren     Hania
6   1    Subject1  Melissa       NaN
Update:
For the updated result, you can stack and make a few minor adjustments:
(
    df.set_index(["id", "transaction"])
    .stack()
    .rename_axis(["id", "transaction", "type"])
    .reset_index(name="name")
    .assign(type=lambda df: df["type"].str[:-1])   # strip the trailing digit
)
   id transaction    type      name
0   1    Subject1  seller       Tim
1   1    Subject1  seller     Jamie
2   1    Subject1  seller   Melissa
3   1    Subject1   buyer     Rosie
4   2    Subject2  seller      Rima
5   2    Subject2  seller    Derren
6   2    Subject2   buyer  Annalise
7   2    Subject2   buyer     Hania
8   3    Subject3  seller      Rosa
9   3    Subject3   buyer    Joshua
You can also use pivot_longer:
result = df.pivot_longer(index=["id", "transaction"],
                         names_to="type",
                         names_pattern=r"([a-z]+)\d",
                         values_to="name").dropna()
result
    id transaction    type      name
0    1    Subject1  seller       Tim
1    2    Subject2  seller      Rima
2    3    Subject3  seller      Rosa
3    1    Subject1  seller     Jamie
4    2    Subject2  seller    Derren
6    1    Subject1  seller   Melissa
9    1    Subject1   buyer     Rosie
10   2    Subject2   buyer  Annalise
11   3    Subject3   buyer    Joshua
13   2    Subject2   buyer     Hania
In both cases, you are dropping the null entries entirely. By default, stack drops nulls.
Comments:
Thanks a lot. What if I want to get the updated output shown in the question? Could you help me?
My bad. See the update. I'll get back to you.
You have the buyer as NaN twice for one Subject; I'm not sure that is correct.
Sorry, my mistake; I will update it again.
Yes. Exactly.
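For comparison, the updated long format can also be sketched with plain `pd.melt`, which needs no extra package. This is not part of the answer above, just a minimal illustration. The frame is rebuilt from the question's table, and `long_df` is a name chosen here. The row order differs from the desired output: rows come out column by column (seller0, seller1, ...).

```python
import numpy as np
import pandas as pd

# Example frame rebuilt from the question; NaN marks a missing name.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "transaction": ["Subject1", "Subject2", "Subject3"],
    "seller0": ["Tim", "Rima", "Rosa"],
    "seller1": ["Jamie", "Derren", np.nan],
    "seller2": ["Melissa", np.nan, np.nan],
    "buyer0": ["Rosie", "Annalise", "Joshua"],
    "buyer1": [np.nan, "Hania", np.nan],
})

long_df = (
    # One row per (id, transaction, original column name).
    df.melt(id_vars=["id", "transaction"], var_name="type", value_name="name")
    # Remove the empty seller/buyer slots.
    .dropna(subset=["name"])
    # Strip the numeric suffix so "seller0" becomes "seller".
    .assign(type=lambda d: d["type"].str.replace(r"\d+$", "", regex=True))
    .reset_index(drop=True)
)
print(long_df)
```

Unlike `stack`, `melt` keeps the NaN rows until `dropna` is called, so the missing slots can be inspected before they are removed.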