Python pandas: convert rows to a single column based on a condition


I have the following DataFrame named matches:

id  |  name  |  age
1   |  a     |  19
1   |  b     |  25
2   |  c     |  19
2   |  d     |  22
If the value of a column (age) meets a condition (x < 21), I count the matching rows per id (using groupby + count()). The result is written to a new column (new_col):
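The code for this counting step is not shown in the question. A minimal sketch of one way it could be done, assuming the sample table above is loaded into a DataFrame called matches (the variable name counts is hypothetical):

```python
import pandas as pd

# Sample data as given in the question
matches = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "name": ["a", "b", "c", "d"],
    "age": [19, 25, 19, 22],
})

# Count the rows with age < 21 in each id group (groupby + count)
counts = matches.loc[matches["age"] < 21].groupby("id")["name"].count()

# Map the per-id counts back onto every row; ids without a match get 0
matches["new_col"] = matches["id"].map(counts).fillna(0).astype(int)

print(matches)
```

With this data each id has exactly one row under 21, so new_col is 1 for every row.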

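Now I want to output the result in a more readable way: for each group, the name of every row that meets the condition (age < 21) should be written to a new column, e.g. result.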

I would like it to look like this (there may be other ways to achieve it, though; it might even be possible in the first step, where I add new_col):


The last step (adding the result column) is where I am stuck.

This is what I do at the moment: groupby + apply, with the applied function adding the new column:

matches = matches.groupby(['id']).apply(concat)
where concat is:

def concat(group):
    group['result'] = "{%s}" % ', '.join(group['name'][group['age'] < 21])
    return group
Is there another or better solution?

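First filter the rows, aggregate per id, and then join the result back to the original DataFrame (the sample data used here has age 18 for row d):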

# Named aggregation; renaming via a dict on a grouped Series was removed in pandas 1.0
matches1 = (matches[matches.age < 21]
            .groupby('id')['name']
            .agg(new_col=len, result=', '.join))
print (matches1)
    new_col result
id                
1         1      a
2         2   c, d

print (matches.join(matches1, on='id'))
   id name  age  new_col result
0   1    a   19        1      a
1   1    b   25        1      a
2   2    c   19        2   c, d
3   2    d   18        2   c, d
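One thing the join does not show with this data is what happens when an id has no row under 21. The following is a sketch only, with a hypothetical third id added to the sample data and hypothetical variable names summary and out; it assumes that empty groups should end up with a count of 0 and an empty string:

```python
import pandas as pd

# Hypothetical sample data; id 3 has no row with age < 21
matches = pd.DataFrame({
    "id": [1, 1, 2, 2, 3],
    "name": ["a", "b", "c", "d", "e"],
    "age": [19, 25, 19, 18, 40],
})

# Filter first, then aggregate the names per id
summary = (
    matches[matches["age"] < 21]
    .groupby("id")["name"]
    .agg(new_col="count", result=", ".join)
)

# Left join on id: rows of id 3 find no match and get NaN
out = matches.join(summary, on="id")

# Replace the missing values with explicit defaults
out["new_col"] = out["new_col"].fillna(0).astype(int)
out["result"] = out["result"].fillna("")

print(out)
```

Because the join is a left join on id, no rows of the original DataFrame are lost; only the two new columns need defaults.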
To better show why the sort in the transform-based approach is needed, here is a slightly modified df:

print (matches)
   id name  age
0   1    a   25   <- first value is filtered out by the condition
1   1    b   12
2   2    c   19
3   2    d   18

matches = matches.sort_values(['id','age'])
g =  matches[matches.age < 21].groupby(['id'])['name']
matches['new_col'] = g.transform(len)
matches['result'] = g.transform(', '.join)
matches[['new_col','result']] = matches[['new_col','result']].ffill()

print (matches)
   id name  age  new_col result
1   1    b   12        1      b
0   1    a   25        1      b
3   2    d   18        2   d, c
2   2    c   19        2   d, c

print (matches.sort_index())
   id name  age  new_col result
0   1    a   25        1      b
1   1    b   12        1      b
2   2    c   19        2   d, c
3   2    d   18        2   d, c
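Note that the forward fill is not group-aware: if some id had no row under 21 at all, ffill would carry over the values of the previous id. A possible alternative that needs neither the sort nor the fill is to mask the names first and transform over the full DataFrame. This is a sketch, not part of the original answer; the helper name young is hypothetical:

```python
import pandas as pd

# Modified sample data from above
matches = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "name": ["a", "b", "c", "d"],
    "age": [25, 12, 19, 18],
})

# Keep the name where age < 21, NaN elsewhere (index stays aligned)
young = matches["name"].where(matches["age"] < 21)

# count ignores NaN, so it counts only qualifying rows per id
matches["new_col"] = young.groupby(matches["id"]).transform("count")

# Join the non-missing names per id; the scalar is broadcast to the group
matches["result"] = young.groupby(matches["id"]).transform(
    lambda s: ", ".join(s.dropna())
)

print(matches)
```

Here the names are joined in their original row order (c, d rather than d, c), and an id without any qualifying row would get a count of 0 and an empty string.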
With the sample data from the first solution, the same transform-based code gives:

matches = matches.sort_values(['id','age'])
g =  matches[matches.age < 21].groupby(['id'])['name']
matches['new_col'] = g.transform(len)
matches['result'] = g.transform(', '.join)
matches[['new_col','result']] = matches[['new_col','result']].ffill()

print (matches)
   id name  age  new_col result
0   1    a   19        1      a
1   1    b   25        1      a
3   2    d   18        2   d, c
2   2    c   19        2   d, c