Python: updating DataFrame values with numpy based on multiple where conditions
I want to change the DateWork['Variable'] values based on multiple where conditions and write the result to DateWork['Date']:
If Frequency = 3 and len(Variable) = 6, replace M with "-0" and update DateWork['Date'].
If Frequency = 3 and len(Variable) = 7, replace M with "-" and update DateWork['Date'].
DataFrame: DateWork
Frequency Variable Date
3 1950M2 1950-02-01
3 1950M3 1950-03-01
2 1950-07-01 1950-07-01
3 1950M9 1950-09-01
2 1950-10-01 1950-10-01
3 1950M10 1950-10-01
My code:
DateWork.loc[DateWork['Date']] = np.where(((DateWork['Frequency'] == 3) & (DateWork['variable'].str.len() == 6)), 'M', '-0', DateWork['Date'])
DateWork.loc[DateWork['Date']] = np.where(((DateWork['Frequency'] == 3) & (DateWork['variable'].str.len() == 7)), 'M', '-', DateWork['Date'])
DateWork.loc[DateWork['Frequency'] == 3, 'Date'] = DateWork.loc[DateWork['Frequency'] == 3, 'variable'] + '-01'
This produces the following error:
TypeError: where() takes at most 3 arguments (4 given)
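The error occurs because you are passing an extra argument to np.where. It takes a condition, the value to use where the condition is true, and the value to use otherwise; see the np.where documentation for details. Also, once that is fixed, each call as written overwrites the whole Date column, so the last np.where call would replace the results of all the earlier ones. The calls either need to be "nested" or each one needs to fall back to the current Date column to work properly.
I have also included a solution without np.where, in case you need one.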
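As a minimal sketch of the nesting described above (not necessarily how the answer's author would write it), both conditions can go into a single expression, with the inner np.where acting as the else branch of the outer one. The sample frame is rebuilt from the question's table with Date stored as plain strings, which is an assumption; the names is_monthly and var are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; Date is assumed to hold plain strings.
DateWork = pd.DataFrame({
    "Frequency": [3, 3, 2, 3, 2, 3],
    "Variable": ["1950M2", "1950M3", "1950-07-01", "1950M9", "1950-10-01", "1950M10"],
    "Date": ["1950-02-01", "1950-03-01", "1950-07-01", "1950-09-01", "1950-10-01", "1950-10-01"],
})

is_monthly = DateWork["Frequency"] == 3   # illustrative helper name
var = DateWork["Variable"]                # illustrative helper name

# np.where(condition, value_if_true, value_if_false): exactly three arguments.
# The inner call is the "else" of the outer call, so neither result is overwritten.
DateWork["Date"] = np.where(
    is_monthly & (var.str.len() == 6),
    var.str.replace("M", "-0", regex=False),
    np.where(
        is_monthly & (var.str.len() == 7),
        var.str.replace("M", "-", regex=False),
        DateWork["Date"],                 # rows matching neither condition keep their Date
    ),
)

print(DateWork)
```

This gives the same Date column as applying the two conditions one after the other, at the cost of a denser expression.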
Solution, applied step by step because nested np.where calls are hard to read:
DateWork
Out[32]:
Frequency Variable Date
0 3 1950M2 1950-02-01
1 3 1950M3 1950-03-01
2 2 1950-07-01 1950-07-01
3 3 1950M9 1950-09-01
4 2 1950-10-01 1950-10-01
5 3 1950M10 1950-10-01
First condition: the else value is the original Date column itself.
DateWork['Date'] = np.where((DateWork['Frequency'] == 3) & (DateWork['Variable'].str.len() == 6), DateWork['Variable'].str.replace('M','-0'), DateWork['Date'])
DateWork
Out[34]:
Frequency Variable Date
0 3 1950M2 1950-02
1 3 1950M3 1950-03
2 2 1950-07-01 1950-07-01
3 3 1950M9 1950-09
4 2 1950-10-01 1950-10-01
5 3 1950M10 1950-10-01
Second condition: here the else value is the Date column produced by the previous step.
DateWork['Date'] = np.where((DateWork['Frequency'] == 3) & (DateWork['Variable'].str.len() == 7), DateWork['Variable'].str.replace('M','-'), DateWork['Date'])
DateWork
Out[36]:
Frequency Variable Date
0 3 1950M2 1950-02
1 3 1950M3 1950-03
2 2 1950-07-01 1950-07-01
3 3 1950M9 1950-09
4 2 1950-10-01 1950-10-01
5 3 1950M10 1950-10
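The variant without np.where is not shown in this thread. One possible sketch, which is an assumption and not the answer author's code, assigns through boolean masks with .loc so that only the matching rows are touched and no else branch is needed. The frame is again rebuilt from the question's table with Date as plain strings, and the mask names are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question; Date is assumed to hold plain strings.
DateWork = pd.DataFrame({
    "Frequency": [3, 3, 2, 3, 2, 3],
    "Variable": ["1950M2", "1950M3", "1950-07-01", "1950M9", "1950-10-01", "1950M10"],
    "Date": ["1950-02-01", "1950-03-01", "1950-07-01", "1950-09-01", "1950-10-01", "1950-10-01"],
})

is_monthly = DateWork["Frequency"] == 3
var_len = DateWork["Variable"].str.len()

# One boolean mask per condition (illustrative names).
one_digit_month = is_monthly & (var_len == 6)   # e.g. 1950M2
two_digit_month = is_monthly & (var_len == 7)   # e.g. 1950M10

# .loc writes only to the selected rows; all other rows keep their Date.
DateWork.loc[one_digit_month, "Date"] = (
    DateWork.loc[one_digit_month, "Variable"].str.replace("M", "-0", regex=False)
)
DateWork.loc[two_digit_month, "Date"] = (
    DateWork.loc[two_digit_month, "Variable"].str.replace("M", "-", regex=False)
)

print(DateWork)
```

Because each assignment touches only its own rows, the order of the two statements does not matter here.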