Python: Extract group titles from a column of a pandas DataFrame into a separate column


python pandas


Imagine I have a pandas DataFrame like this:

0 'A'
1 'some text'
2 'more text'
3 'B'
4 'hello'
5 'hi'


I also have a list containing the title of each group

…and I want to transform df so that it looks like this:


0 'A' 'some text'
1 'A' 'more text'
2 'B' 'hello'
3 'B' 'hi'

Essentially, I want the group to be given in a separate column.

You can use mask followed by ffill to extract the groups:

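# True for ordinary rows, False for rows that hold a group title
s = ~df['str'].isin(lst)

# Blank out the ordinary rows, then forward-fill each title downward
df['group'] = df['str'].mask(s).ffill()
df = df[s]  # drop the title rows themselves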

Output:

   idx        str group
1    1  some text     A
2    2  more text     A
4    4      hello     B
5    5         hi     B
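For anyone who wants to reproduce the output above, here is a minimal self-contained sketch. The column names idx and str are taken from the printed output; the contents of the title list are not shown in the question, so ['A', 'B'] is an assumption, as are the variable names is_text, groups and result.

```python
import pandas as pd

# Assumed input: column names come from the answer's printed output.
df = pd.DataFrame({
    'idx': range(6),
    'str': ['A', 'some text', 'more text', 'B', 'hello', 'hi'],
})
lst = ['A', 'B']  # assumed list of group titles (not shown in the question)

# True for ordinary rows, False for title rows
is_text = ~df['str'].isin(lst)

# mask() replaces the values where is_text is True with NaN, leaving only
# the titles; ffill() then copies each title down to the rows below it.
groups = df['str'].mask(is_text).ffill()

# Attach the group column and keep only the ordinary rows
result = df.assign(group=groups)[is_text]
print(result)
```

Using assign here instead of writing into df leaves the original frame untouched, which can make it easier to inspect the intermediate Series.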

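Do you have quotes in your df, or is that just a representation?

No quotes... sorry.

And are all the groups contained in the list, or can the list contain smaller groups that need to be filtered?

Your solution works, thank you. I will mark it as the answer. However, to better understand the ~df[…] notation, could you point me to some examples or documentation? In particular, I don't know how ~ is used.

~ is the negation operator, and it works on a Series. That line is really ~(df['str'].isin(lst)). You can print df['str'].isin(lst) and then print its negation to see the details.

Awesome! Thanks a lot for your answer ;-)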
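As a small illustration of the ~ operator discussed in these comments, the sketch below prints a boolean Series and its negation side by side. The sample values and the names titles, col, is_title and not_title are made up for the example.

```python
import pandas as pd

titles = ['A', 'B']  # hypothetical list of group titles
col = pd.Series(['A', 'some text', 'B', 'hi'])

is_title = col.isin(titles)  # element-wise membership test
not_title = ~is_title        # element-wise logical NOT

print(is_title.tolist())   # [True, False, True, False]
print(not_title.tolist())  # [False, True, False, True]
```

Note that Python's own not keyword cannot be used here: it asks for a single truth value, and pandas raises a ValueError for a Series with more than one element.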