Pandas: explode rows using predefined lists while keeping the existing row values


I am struggling to explode rows using predefined lists while keeping the values of the existing rows.

I have a DataFrame like this:

import pandas as pd

df = pd.DataFrame({'id': ['01', '01', '02'],
                   'color': ['red', 'yellow', 'yellow'],
                   'wave': ['1', '2', '2'],
                   'count': [1, 2, 1]})
print(df)
[output]
   id   color wave  count
0  01     red    1      1
1  01  yellow    2      2
2  02  yellow    2      1

I have two lists:

ls_color = ['yellow', 'red', 'blue']
ls_wave = ['1','2']

My expected DataFrame:

    id   color wave  count
0   01  yellow    1      0
1   01  yellow    2      2
2   01     red    1      1
3   01     red    2      0
4   01    blue    1      0
5   01    blue    2      0
6   02  yellow    1      0
7   02  yellow    2      1
8   02     red    1      0
9   02     red    2      0
10  02    blue    1      0
11  02    blue    2      0

Thank you very much for your answers.

Here is one way, using itertools.product:

from itertools import product
# build every (id, color, wave) combination
expand = list(product(df.id.unique(), ls_color, ls_wave))

expand
[('01', 'yellow', '1'), ('01', 'yellow', '2'), ('01', 'red', '1'),
 ('01', 'red', '2'), ('01', 'blue', '1'), ('01', 'blue', '2'),
 ('02', 'yellow', '1'), ('02', 'yellow', '2'), ('02', 'red', '1'),
 ('02', 'red', '2'), ('02', 'blue', '1'), ('02', 'blue', '2')]

# left-merge with the original DataFrame to add the count column;
# combinations missing from df get NaN, which fillna turns into 0
(pd.DataFrame.from_records(expand, columns=['id', 'color', 'wave'])
   .merge(df, how='left')
   .fillna(0))

    id   color wave  count
0   01  yellow    1    0.0
1   01  yellow    2    2.0
2   01     red    1    1.0
3   01     red    2    0.0
4   01    blue    1    0.0
5   01    blue    2    0.0
6   02  yellow    1    0.0
7   02  yellow    2    1.0
8   02     red    1    0.0
9   02     red    2    0.0
10  02    blue    1    0.0
11  02    blue    2    0.0
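
A possible alternative that keeps count as an integer: build the full set of combinations as a MultiIndex and reindex onto it. This is only a sketch, not part of the answer above. It assumes each (id, color, wave) combination appears at most once in df, since reindex raises an error on a non-unique index. The names full_index and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'id': ['01', '01', '02'],
                   'color': ['red', 'yellow', 'yellow'],
                   'wave': ['1', '2', '2'],
                   'count': [1, 2, 1]})
ls_color = ['yellow', 'red', 'blue']
ls_wave = ['1', '2']

# every (id, color, wave) combination, in the order of the input lists
full_index = pd.MultiIndex.from_product(
    [df['id'].unique(), ls_color, ls_wave],
    names=['id', 'color', 'wave'])

# assumption: (id, color, wave) is unique in df, otherwise reindex fails
result = (df.set_index(['id', 'color', 'wave'])
            .reindex(full_index, fill_value=0)  # missing combinations get 0
            .reset_index())
print(result)
```

Because the missing rows are filled directly with 0 instead of passing through NaN, the count column stays an integer column.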

Here is one way, using explode:

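# one row per original row, each holding the full color and wave lists
df2 = pd.DataFrame({'id': df['id'], 'color': [ls_color]*len(df), 'wave': [ls_wave]*len(df)})
# explode both list columns to get every (id, color, wave) combination
df2 = df2.explode('color', ignore_index=True).explode('wave', ignore_index=True)
# left-merge to add count, fill missing values with 0, drop rows repeated for duplicate ids
df2 = df2.merge(df, how='left').fillna(0).drop_duplicates(ignore_index=True)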

Output:

df2

    id   color  wave    count
0   01  yellow     1      0.0
1   01  yellow     2      2.0
2   01     red     1      1.0
3   01     red     2      0.0
4   01    blue     1      0.0
5   01    blue     2      0.0
6   02  yellow     1      0.0
7   02  yellow     2      1.0
8   02     red     1      0.0
9   02     red     2      0.0
10  02    blue     1      0.0
11  02    blue     2      0.0
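
Note that the merged count column comes back as float (0.0, 2.0, ...) because the left merge introduces NaN before fillna runs, while the expected DataFrame in the question shows integers. A minimal sketch of one way to restore the integer dtype, using the itertools.product approach as the base; the name result is illustrative:

```python
from itertools import product

import pandas as pd

df = pd.DataFrame({'id': ['01', '01', '02'],
                   'color': ['red', 'yellow', 'yellow'],
                   'wave': ['1', '2', '2'],
                   'count': [1, 2, 1]})
ls_color = ['yellow', 'red', 'blue']
ls_wave = ['1', '2']

# every (id, color, wave) combination
expand = list(product(df['id'].unique(), ls_color, ls_wave))

result = (pd.DataFrame.from_records(expand, columns=['id', 'color', 'wave'])
            .merge(df, how='left')          # count is NaN where no match exists
            .fillna({'count': 0})           # fill only the count column
            .astype({'count': 'int64'}))    # cast back from float to integer
print(result)
```

The same fillna and astype steps can be appended to the explode version in the same way.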

Thank you very much! It solved the problem!

Thank you very much for your answer!