Python 3.x: How do I extract all the text after a matched pattern in pandas?


My DataFrame is:

     name     type
0    apple    red fruit with red peel that is edible
1    orange   thick peel that is bitter and used dried sometimes
From each row, I want to extract all the text that follows the word peel and put it in a separate column:

     name     type                                                 peel
0    apple    red fruit with red peel that is edible               that is edible
1    orange   thick peel that is bitter and used dried sometimes   that is bitter and used dried sometimes
This is what I am trying:

def get_peel(desc):
    text = desc.split(' ')
    for i,t in enumerate(text):
        if t.lower() == 'peel':
            return text[i:]
    return 'not found'

df['peel'] = df['type'].apply(get_peel)
But the result I get is:

0         not found
1         not found

What am I doing wrong?

Use str.extract with a regular expression.

Ex:

import pandas as pd

df = pd.DataFrame({"name": ['apple', 'orange'], 'type': ['red fruit with red peel that is edible', 'thick peel that is bitter and used dried sometimes']})
df['peel'] = df['type'].str.extract(r"(?<=\bpeel\b)(.*)$")
print(df['peel'])
0                              that is edible
1     that is bitter and used dried sometimes
Name: peel, dtype: object
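
The lookbehind pattern above keeps the space that follows peel in the captured text, and rows without a match come back as NaN. A minimal sketch of a variant that trims that whitespace and ignores case (the question's code used t.lower()). The third row is a hypothetical addition, not part of the original data, included only to show the no-match case:

```python
import re

import pandas as pd

df = pd.DataFrame({
    "name": ["apple", "orange", "banana"],
    "type": [
        "red fruit with red peel that is edible",
        "thick peel that is bitter and used dried sometimes",
        "soft yellow fruit",  # hypothetical row: contains no "peel"
    ],
})

# \bpeel\b matches the whole word only; \s* swallows the whitespace after it,
# so the captured group starts at the first following word.
# expand=False returns a Series instead of a one-column DataFrame.
df["peel"] = df["type"].str.extract(
    r"\bpeel\b\s*(.*)", flags=re.IGNORECASE, expand=False
)

print(df["peel"])
```

Rows with no match hold NaN; if a placeholder such as 'not found' is preferred, df["peel"].fillna("not found") is one way to get it.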

Could you try the following?

Creating df:

import pandas as pd

df = pd.DataFrame({'name':['apple','orange'],
                   'type':['red fruit with red peel that is edible','thick peel that is bitter and used dried sometimes']})
Code to add the new column:

df['peel']=df['type'].replace(regex=True,to_replace=r'.*peel(.*)',value=r'\1')
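
If you would rather keep the apply-based approach from the question, here is a sketch of a corrected get_peel. With the sample data shown, the original function does find the word, but it returns text[i:], which is a list that still includes 'peel' itself. The 'not found' result reported in the question is therefore probably caused by something in the real data that the sample does not show (that part is a guess). The version below skips the keyword and joins the remaining words back into a string:

```python
import pandas as pd


def get_peel(desc):
    """Return the text after the first word 'peel', or 'not found'."""
    words = desc.split()  # split on any run of whitespace
    for i, word in enumerate(words):
        if word.lower() == "peel":
            # i + 1 skips the keyword itself; join rebuilds a single string
            return " ".join(words[i + 1:])
    return "not found"


df = pd.DataFrame({
    "name": ["apple", "orange"],
    "type": [
        "red fruit with red peel that is edible",
        "thick peel that is bitter and used dried sometimes",
    ],
})

df["peel"] = df["type"].apply(get_peel)
print(df["peel"])
```

This only matches peel as a standalone, whitespace-separated word; a token such as 'peel,' would not match, which is one reason the regex answers may be more robust.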
