Python: Merge rows of a column based on index ranges in a table (python, pandas, list, list-comprehension)


I have a DataFrame with only one column:

Index | column1 |
0         and
1         too
2         ask
3         the
4         but
5         hat
6         hot
7         top
8         tap

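I want to combine the rows that lie between certain indices, chosen by a condition. For example, if the condition is that a row contains the letter "a", the indices would be: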

0, 2, 5, 8

So the rows would be combined as:

(0, 1), (2, 3, 4), (5, 6, 7), (8)

Finally, the output would be:

Index | column1 |
0         and, too
1         ask, the, but
2         hat, hot, top
3         tap

What I tried:

[i for i in range(len(df['column1'])) if 'a' in df['column1'][i]]

which gives me the indices:

[0, 2, 5, 8]

But I am stuck from here. Thanks.

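stuff = ["and", "too", "ask", "the", "but", "hat", "hot", "top", "tap"]

newlist = []      # finished groups
collection = []   # group currently being built
for i in stuff:
    if "a" in i:
        # A word containing "a" starts a new group; store the previous one first.
        if len(collection) > 0:
            newlist.append(collection)
        collection = []
    collection.append(i)
# Store the last group, which the loop never closes.
newlist.append(collection)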

Try something like this.
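The loop above stops at a list of lists. To get the comma-joined rows asked for in the question, each group still has to be joined and put back into a DataFrame. A minimal sketch of that last step, assuming pandas is available; group_by_marker is a hypothetical helper name, not part of the original answer:

```python
import pandas as pd

def group_by_marker(words, marker="a"):
    """Hypothetical helper: start a new group at every word containing `marker`."""
    groups = []
    for word in words:
        # Open a new group on a marker word, or when no group exists yet
        # (so leading words without the marker form their own first group).
        if marker in word or not groups:
            groups.append([])
        groups[-1].append(word)
    return groups

stuff = ["and", "too", "ask", "the", "but", "hat", "hot", "top", "tap"]
groups = group_by_marker(stuff)

# Join each group into one string and rebuild a single-column DataFrame.
result = pd.DataFrame({"column1": [", ".join(g) for g in groups]})
print(result)
```

As with the loop above, words that appear before the first "a" word are kept as a group of their own rather than dropped.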

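Create the groups by testing each row with Series.str.contains and taking the cumsum of the result, then drop a possible first group that contains no "a" values by filtering with g[g > 0], and finally aggregate with ', '.join: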

g = df['column1'].str.contains('a').cumsum()

df = df.groupby(g[g > 0])['column1'].apply(', '.join).reset_index(drop=True).to_frame()
print (df)
         column1
0       and, too
1  ask, the, but
2  hat, hot, top
3            tap
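It may help to look at the intermediate values to see why the cumulative sum works as a group label. The sketch below rebuilds the question's DataFrame and prints the boolean mask next to the resulting labels; the names mask, has_a and group are illustrative only:

```python
import pandas as pd

df = pd.DataFrame(
    {"column1": ["and", "too", "ask", "the", "but", "hat", "hot", "top", "tap"]}
)

# True on every row that contains "a", i.e. every row that starts a new group.
mask = df["column1"].str.contains("a")

# The running count of True values stays constant until the next "a" row,
# so it can serve directly as a group label.
g = mask.cumsum()

print(pd.DataFrame({"column1": df["column1"], "has_a": mask, "group": g}))
```

Each True in the mask increases the running total by one, so all rows up to the next "a" row share the same label.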

When the first value does not contain "a":

print (df)
  column1
1     too
2     ask
3     the
4     but
5     hat
6     hot
7     top
8     tap

g = df['column1'].str.contains('a').cumsum()

df = df.groupby(g[g > 0])['column1'].apply(', '.join).reset_index(drop=True).to_frame()
print (df)
         column1
0  ask, the, but
1  hat, hot, top
2            tap
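The row "too" disappears here because its label is 0 and is filtered out of the key. As far as I understand it, groupby aligns a Series key with the frame's index, so rows missing from the key get a NaN label, and NaN keys are dropped by default. A small sketch that makes this visible; the names keys and aligned are illustrative, and the explicit reindex only mimics what the alignment amounts to:

```python
import pandas as pd

df = pd.DataFrame(
    {"column1": ["too", "ask", "the", "but", "hat", "hot", "top", "tap"]},
    index=range(1, 9),
)

g = df["column1"].str.contains("a").cumsum()
keys = g[g > 0]                    # rows before the first "a" have label 0 and are removed
aligned = keys.reindex(df.index)   # index 1 ("too") has no key, so it becomes NaN

out = df.groupby(keys)["column1"].apply(", ".join).reset_index(drop=True).to_frame()
print(aligned)
print(out)
```

Grouping by g directly, without the g > 0 filter, would instead keep "too" as a first row of its own.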