Pandas: extract all content up to a specific keyword


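I am trying to extract everything from a data frame column up to the point where a specific word appears. I want to keep the whole text up to the first occurrence of any of the following words:

high, medium, low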

Sample view of the text in the data frame:

text
Ticket creation dropped in last 24 hours medium range for cust_a
Calls dropped in last 3 months high range for cust_x

Expected output:

text, new_text
Ticket creation dropped in last 24 hours medium range for cust_a, Ticket creation dropped in last 24 hours
Calls dropped in last 3 months high range for cust_x, Calls dropped in last 3 months

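IIUC, you need str.replace with regex=True.

The idea is to match any of the words in the list, then replace that word and everything after it with an empty string.

We use .* to match everything from the keyword to the end of the string.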
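To see what the pattern does before applying it to a whole column, here is a minimal sketch on a single string using Python's built-in re module. The variable names sample and trimmed are only illustrative. Note that . does not match a newline by default, so on multi-line text the match would stop at the end of the line rather than the end of the string.

```python
import re

# Any of the three keywords, followed by everything up to the end of the line
pattern = r"(high|medium|low).*"

sample = "Ticket creation dropped in last 24 hours medium range for cust_a"

# Replace the keyword and whatever follows it with an empty string
trimmed = re.sub(pattern, "", sample)

# repr() makes the leftover trailing space visible
print(repr(trimmed))
```

The space that preceded the keyword is left in place, which is why the result ends with a trailing space.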

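# Keywords that mark where the text should be cut
words = 'high, medium, low'
match_words = '|'.join(words.split(', '))
# 'high|medium|low'

# Remove the first keyword found and everything after it
df['new_text'] = df['text'].str.replace(f"({match_words}).*", '', regex=True)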


print(df['new_text'])

0    Ticket creation dropped in last 24 hours 
1              Calls dropped in last 3 months 
Name: new_text, dtype: object
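As the output shows, the plain pattern leaves a trailing space, and it would also cut at "low" inside a longer word such as "below". A possible variant, offered as a sketch rather than as part of the original answer, adds \b word boundaries and a leading \s* to the pattern. The data frame below is rebuilt from the sample rows in the question so the snippet runs on its own.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "text": [
        "Ticket creation dropped in last 24 hours medium range for cust_a",
        "Calls dropped in last 3 months high range for cust_x",
    ]
})

words = ["high", "medium", "low"]

# \s* also removes the whitespace before the keyword;
# \b...\b makes sure only whole words match (so "below" is left alone)
pattern = r"\s*\b(?:" + "|".join(words) + r")\b.*"

df["new_text"] = df["text"].str.replace(pattern, "", regex=True)

print(df["new_text"].tolist())
```

If the keywords could contain regex metacharacters, passing each one through re.escape before joining would be a sensible precaution.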