Python: How to rename column values in a DataFrame under certain conditions

python pandas dataframe

I have a pandas DataFrame like this:

 order_id  buyer_id  phone_no
      611      261  9920570003
      681      261  9321613595
      707      261  9768270700
      707      261  9768270700
      707      261  9768270700
      708      261  9820895896
      710      261  7208615775
      710      261  7208615775
      710      261  7208615775
      711      261  9920986486
      800      234    Null
      801      256    Null
      803      289    Null

I need to replace the buyer_id column as follows:

   order_id   buyer_id   phone_no
      611      261_01  9920570003
      681      261_02  9321613595
      707      261_03  9768270700
      707      261_03  9768270700
      707      261_03  9768270700
      708      261_04  9820895896
      710      261_05  7208615775
      710      261_05  7208615775
      710      261_05  7208615775
      711      261_06  9920986486
      800       234       Null
      801       256       Null
      803       289       Null

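So if the phone number is the same, the rows should be treated as the same buyer; otherwise a new entry should be added to the 261 series. I only want to rename buyer_id 261, and the other rows should stay the same, because I treat orders that come in by phone as buyer_id 261.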

I can generate the series for buyer_id with the following code:

for i in range(len(phone_orders)):
    print('261_%d' % i)
    segments_data['buyer_id']  # nothing is assigned to the column here yet

phone_orders contains all the phone orders.

But I don't know how to replace the buyer_id column with the desired output.

Update: running this code

df['buyer_id'] = '261_' + (df['phone_no'] != df['phone_no'].shift()).cumsum().map("{:02}".format)

returns the following output:

  buyer_id    phone_no
  261_01  9920570003
  261_02  9321613595
  261_03  9768270700
  261_03  9768270700
  261_03  9768270700
  261_04  9820895896
  261_05  7208615775
  261_05  7208615775
  261_05  7208615775
  261_06  9920986486
  261_07  9768270700
  261_07  9768270700
  261_07  9768270700
  261_08  9820895896
  261_09  7208615775
  261_09  7208615775
  261_09  7208615775

So phone_no 7208615775 should be 261_05, but the code gives 261_09.

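You can convert the buyer_id column to a string and then append a running count of the phone-number changes, formatted as two digits.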
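The code that originally followed this sentence is not shown here. As a minimal sketch consistent with the explanation below, and assuming the string conversion is done with `astype(str)` (an assumption; the sample frame is rebuilt from the first ten rows of the question), it could look like this:

```python
import pandas as pd

df = pd.DataFrame({
    "order_id": [611, 681, 707, 707, 707, 708, 710, 710, 710, 711],
    "buyer_id": [261] * 10,
    "phone_no": [9920570003, 9321613595, 9768270700, 9768270700, 9768270700,
                 9820895896, 7208615775, 7208615775, 7208615775, 9920986486],
})

# True on the first row of every run of identical consecutive phone numbers.
starts_new_run = df["phone_no"] != df["phone_no"].shift()

# Running count of run starts (1, 2, 3, 3, 3, 4, ...), formatted as two digits.
suffix = starts_new_run.cumsum().map("{:02}".format)

df["buyer_id"] = df["buyer_id"].astype(str) + "_" + suffix
print(df)
```

Because the counter only increases when a phone number differs from the previous row, this numbers runs of consecutive rows, so it relies on equal phone numbers being adjacent.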

Explanation:

print(df['phone_no'] != df['phone_no'].shift())
0     True
1     True
2     True
3    False
4    False
5     True
6     True
7    False
8    False
9     True
Name: phone_no, dtype: bool
print((df['phone_no'] != df['phone_no'].shift()).cumsum())
0    1
1    2
2    3
3    3
4    3
5    4
6    5
7    5
8    5
9    6
Name: phone_no, dtype: int32
print((df['phone_no'] != df['phone_no'].shift()).cumsum().map("{:02}".format))
0    01
1    02
2    03
3    03
4    03
5    04
6    05
7    05
8    05
9    06
Name: phone_no, dtype: object
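Note that the shift/cumsum counter numbers consecutive runs, so a phone number that reappears further down gets a new ID. If equal numbers should always share one ID regardless of position, `pd.factorize` is one option. A small sketch with made-up phone numbers (not taken from the question):

```python
import pandas as pd

# Hypothetical data: 111 appears again after other numbers.
phones = pd.Series([111, 222, 222, 333, 111, 111], name="phone_no")

# Run-based numbering: a number that comes back later starts a new run.
run_ids = (phones != phones.shift()).cumsum()

# Value-based numbering: factorize assigns one code per distinct value,
# in order of first appearance (codes start at 0, hence the + 1).
codes, uniques = pd.factorize(phones)
value_ids = pd.Series(codes + 1, index=phones.index)

print(run_ids.tolist())    # [1, 2, 2, 3, 4, 4]
print(value_ids.tolist())  # [1, 2, 2, 3, 1, 1]
```

Sorting by phone_no before applying the run-based counter would also group equal numbers together, at the cost of changing the row order.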

Edit:

If only rows with a particular value in the buyer_id column should be changed, you can filter them first.
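The filtering code itself is not preserved here. A sketch of one way to do it, assuming a boolean mask on `buyer_id == 261` and missing phone numbers stored as NaN (both assumptions; the names `mask` and `suffix` are illustrative, and the frame is a shortened version of the question's data):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "order_id": [611, 681, 707, 707, 708, 800, 801],
    "buyer_id": [261, 261, 261, 261, 261, 234, 256],
    "phone_no": [9920570003, 9321613595, 9768270700, 9768270700,
                 9820895896, np.nan, np.nan],
})

mask = df["buyer_id"] == 261  # rows that should be renamed

# Number the runs of equal phone numbers among the selected rows only.
phones = df.loc[mask, "phone_no"]
suffix = (phones != phones.shift()).cumsum().map("{:02}".format)

# Switch to strings first so the column can hold values like "261_01".
df["buyer_id"] = df["buyer_id"].astype(str)
df.loc[mask, "buyer_id"] = df.loc[mask, "buyer_id"] + "_" + suffix
print(df)
```

Rows outside the mask are never compared, so their missing phone numbers do not affect the numbering.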

Original question: first find the unique phone numbers and create an ID for each:

id_map = {k: v for v, k in enumerate(df.phone_no.unique(), 1)}

Now go through all entries and append the ID that belongs to each phone number:

df.buyer_id = df.apply(lambda x: '{}_{:02d}'.format(x.buyer_id, id_map[x.phone_no]), axis=1)

Result:

   order_id buyer_id    phone_no
0       611   261_01  9920570003
1       681   261_02  9321613595
2       707   261_03  9768270700
3       707   261_03  9768270700
4       707   261_03  9768270700
5       708   261_04  9820895896
6       710   261_05  7208615775
7       710   261_05  7208615775
8       710   261_05  7208615775
9       711   261_06  9920986486
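The row-wise `apply` works, but the same lookup can probably be done without a Python-level loop over rows by passing the dictionary to `Series.map`. A sketch on a shortened frame in which one number repeats non-adjacently (that repetition is illustrative, not from the question):

```python
import pandas as pd

df = pd.DataFrame({
    "buyer_id": [261, 261, 261, 261],
    "phone_no": [9920570003, 9321613595, 9768270700, 9321613595],
})

# Same idea as id_map above: each distinct phone number gets a 1-based ID.
id_map = {phone: i for i, phone in enumerate(df["phone_no"].unique(), 1)}

# Series.map looks every phone number up in the dictionary.
suffix = df["phone_no"].map(id_map).map("{:02d}".format)
df["buyer_id"] = df["buyer_id"].astype(str) + "_" + suffix
print(df["buyer_id"].tolist())
```

Since the ID depends on the value rather than the row position, the repeated number receives the same suffix both times.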

Applied to buyer id 261 only, the result is:

    order_id buyer_id    phone_no
0        611   261_01  9920570003
1        681   261_02  9321613595
2        707   261_03  9768270700
3        707   261_03  9768270700
4        707   261_03  9768270700
5        708   261_04  9820895896
6        710   261_05  7208615775
7        710   261_05  7208615775
8        710   261_05  7208615775
9        711   261_06  9920986486
10       800      234        Null
11       801      256        Null
12       803      289        Null

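I have null values in the phone_no column; how should I handle them? I only want to take the phone number into account when renaming buyer_id.

@jazrael your code returns the following output. I am editing it.

Is it possible to sort the values by that column before processing?

Thank you very much for the answer :)

Muller, awesome, thank you very much. It works the way I wanted :)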

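The result for buyer id 261 only, shown above, comes from building the ID map from that buyer's phone numbers alone:

# Map each distinct phone number of buyer 261 to a 1-based ID.
id_map = {k: v for v, k in enumerate(df[df.buyer_id == 261].phone_no.unique(), 1)}

def make_buyer_id(x):
    # Phone numbers missing from id_map raise KeyError; keep those buyer ids unchanged.
    try:
        return '{}_{:02d}'.format(x.buyer_id, id_map[x.phone_no])
    except KeyError:
        return x.buyer_id

df.buyer_id = df.apply(make_buyer_id, axis=1)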
