Python: conditional counting on mixed column data in pandas
I have a CSV file that looks like:
Timestamp Surface_Data
8737.37 Maze_A
8737.42 Maze_A
8740.40 Phone_Surface
8743.23 Desktop_Surface
8765.26 Phone_Surface
8765.29 Maze_A
8765.30 Phone_Surface
8765.56 Maze_B
8766.16 Maze_B
8783.74 Maze_A
8793.20 Maze_A
8840.12 Phone_Surface
8840.40 Phone_Surface
8841.40 Maze_B
I want to add a column that counts the changes from Maze_A to Maze_B or from Maze_B to Maze_A. The result should look like:
Timestamp Surface_Data Maze_Count
8737.37 Maze_A 1
8737.42 Maze_A
8740.40 Phone_Surface
8743.23 Desktop_Surface
8765.26 Phone_Surface
8765.29 Maze_A
8765.30 Phone_Surface
8765.56 Maze_B 2
8766.16 Maze_B
8783.74 Maze_A 3
8793.20 Maze_A
8840.12 Phone_Surface
8840.40 Phone_Surface
8841.40 Maze_B 4
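I tried using cumsum() on the points where the value in the 'Surface_Data' column changes, but that counts every change, including changes to the other values I don't care about. What I want is a counter that increases only when a Maze_A or Maze_B value is encountered and it differs from the previous maze value.
Methods involved: shift, where, cumsum.
One attempt: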
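import numpy as np
# Boolean mask that selects only the maze rows
c = df['Surface_Data'].str.contains('Maze')
# Compare each maze row with the previous maze row, then number the changes
s = df.loc[c, 'Surface_Data']
df['Maze_Count'] = s.ne(s.shift()).astype(int).replace(0, np.nan).cumsum()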
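The same idea can also be written with where instead of replace. The following is a minimal sketch, not the answerer's code: the `sample` frame and its values are a hypothetical stand-in for the CSV, and `maze`, `changed`, and `counts` are names introduced only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; only the Surface_Data column matters here.
sample = pd.DataFrame({
    "Surface_Data": ["Maze_A", "Maze_A", "Phone_Surface", "Maze_A",
                     "Maze_B", "Maze_B", "Desktop_Surface", "Maze_A"],
})

# Keep only the maze rows so other surfaces cannot register as changes.
maze = sample.loc[sample["Surface_Data"].str.contains("Maze"), "Surface_Data"]

# True where a maze row differs from the previous maze row
# (the first maze row has no predecessor, so it counts as a change).
changed = maze.ne(maze.shift())

# Running count of changes, kept only on the rows where a change happened.
counts = changed.cumsum().where(changed)

# Index alignment leaves the non-maze rows as NaN.
sample["Maze_Count"] = counts
print(sample)
```

Filtering before comparing is what keeps an intervening Phone_Surface row from being counted: the Maze_A at index 3 is compared with the Maze_A at index 1, not with the Phone_Surface row between them.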
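You can also filter the dataframe down to 'Maze_A' and 'Maze_B', use shift to find the changes, then apply cumsum and drop_duplicates, and finally assign the result back to the dataframe, relying on intrinsic index alignment: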
x = df.loc[df['Surface_Data'].isin(['Maze_A','Maze_B']), 'Surface_Data']
df.assign(Maze_count=(x != x.shift()).cumsum().drop_duplicates())
Output:
Timestamp Surface_Data Maze_count
0 8737.37 Maze_A 1.0
1 8737.42 Maze_A NaN
2 8740.40 Phone_Surface NaN
3 8743.23 Desktop_Surface NaN
4 8765.26 Phone_Surface NaN
5 8765.29 Maze_A NaN
6 8765.30 Phone_Surface NaN
7 8765.56 Maze_B 2.0
8 8766.16 Maze_B NaN
9 8783.74 Maze_A 3.0
10 8793.20 Maze_A NaN
11 8840.12 Phone_Surface NaN
12 8840.40 Phone_Surface NaN
13 8841.40 Maze_B 4.0
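To try this end to end, here is a self-contained sketch that rebuilds the sample data and applies the filter, shift, cumsum, drop_duplicates steps. It assumes the file is whitespace-delimited as displayed in the question (the real CSV may well use commas), and the names `raw` and `out` are introduced here for illustration.

```python
import io
import pandas as pd

# Sample rows copied from the question; whitespace delimiter is an assumption.
raw = """Timestamp Surface_Data
8737.37 Maze_A
8737.42 Maze_A
8740.40 Phone_Surface
8743.23 Desktop_Surface
8765.26 Phone_Surface
8765.29 Maze_A
8765.30 Phone_Surface
8765.56 Maze_B
8766.16 Maze_B
8783.74 Maze_A
8793.20 Maze_A
8840.12 Phone_Surface
8840.40 Phone_Surface
8841.40 Maze_B
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Maze rows only; the original row labels are preserved for alignment later.
x = df.loc[df["Surface_Data"].isin(["Maze_A", "Maze_B"]), "Surface_Data"]

# Number each run of identical maze values, then keep the first row of each run.
run_ids = (x != x.shift()).cumsum().drop_duplicates()

# assign aligns on the index, so every other row gets NaN.
out = df.assign(Maze_count=run_ids)
print(out)
```

Note that assign returns a new DataFrame rather than modifying df in place, so the result has to be bound to a name (or assigned back to df) to be kept.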