Capitalize a word only when it follows a specific string in Python


python regex pandas


I have some text in a data frame in which I want certain words capitalized. The input text looks like this:

one apple
two oranges
three bananas
an apple

and the desired output is:

one Apple
two Oranges
three Bananas
an Apple

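If any of the words "one", "two", "three" or "an" appears in the array/string/list, I would like to loop with a condition (checking whether any of these words is in the string) and, for the word that immediately follows it, change the first letter from lowercase to uppercase.

Any hints or help would be appreciated.

5 rows of data +++++++++++++++++++++++++++++++++++++++++++++++++++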

ID Reference_time Comment
0 0059 one apple box
1 0156 five oranges left
2 1859 an engineer handling issue
3 1555 two persons have eaten, three still hungry
4 2109 an apple carton is still in stock

+++++++++++++++++++++++++++++++++++++++++++++++++++

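Regex is probably the better approach here. This version uses .str.extract; the only downside is that if a single string contains multiple matches, you will need to handle that case or edit the regular expression.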

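repl = ['one', 'two', 'three', 'five', 'an']
pat = '|'.join(repl)

# Column 0: the first keyword found (\b stops "an" from matching inside e.g. "handling")
# Column 1: the rest of the string after that keyword
s = df["comment"].str.extract(rf"\b({pat})\b(.*\w.*$)")

# Capitalize the first letter of the remainder, then join both parts back together
s[1] = s[1].str.strip().str.capitalize()
df['comment_new'] = s.stack().groupby(level=0).agg(' '.join)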

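Haven't you read the text file and created a data frame yet?
I have already imported it. It is already in a dataframe, but it is too big, so I can't post the whole thing. I can only give a small 3-4 row sample like the one above.
Then share a sample, such as the first 5 rows and the column names.
Would you mind sharing what you have tried?
I couldn't even build the conditional loop for this case change, which is why I am asking for help.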

print(df[['comment_new','comment']])

                                  comment_new  \
0                               one Apple box   
1                           five Oranges left   
2                  an Engineer handling issue   
3  two Persons have eaten, three still hungry   
4           an Apple carton is still in stock   

                                      comment  
0                               one apple box  
1                           five oranges left  
2                  an engineer handling issue  
3  two persons have eaten, three still hungry  
4           an apple carton is still in stock
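The multiple-match limitation mentioned above could be avoided by replacing every occurrence with a callback instead of extracting only the first one. The following is a minimal sketch using only the standard `re` module; the names `KEYWORDS`, `PATTERN` and `capitalize_after_keywords` are made up for illustration and are not part of the answer above. Unlike the output shown above, it also capitalizes the word after "three" in row 3.

```python
import re

# Hypothetical keyword list, copied from the answer above
KEYWORDS = ["one", "two", "three", "five", "an"]

# \b keeps "an" from matching inside words such as "handling".
# Group 1: the keyword plus the whitespace after it.
# Group 2: the first character of the following word.
PATTERN = re.compile(
    r"\b((?:" + "|".join(map(re.escape, KEYWORDS)) + r")\s+)(\w)"
)


def capitalize_after_keywords(text):
    """Uppercase the first letter of every word that directly follows a keyword."""
    return PATTERN.sub(lambda m: m.group(1) + m.group(2).upper(), text)


print(capitalize_after_keywords("two persons have eaten, three still hungry"))
# two Persons have eaten, three Still hungry
```

With pandas, the same function could be applied with df["Comment"].apply(capitalize_after_keywords), assuming the column holds only strings.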

keywords = ['one', 'two', 'three', 'an']

def toUpper(keywords, sentence):
    # Split on whitespace and capitalize the word right after each keyword
    word_list = sentence.split()
    for i in range(len(word_list) - 1):
        if word_list[i] in keywords:
            word_list[i + 1] = word_list[i + 1].capitalize()
    return ' '.join(word_list)

# toUpper must be defined before it is applied to the column
df['Comment'] = df['Comment'].apply(lambda x: toUpper(keywords, x))
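To see how the loop-based function behaves without a data frame, here is a small self-contained check on a few of the sample comments. The `samples` list is an assumption made for illustration. Note that "five" is not in this keyword list, so that row is left unchanged, and that keywords are compared against whole whitespace-separated tokens, so a keyword with punctuation attached (such as "one,") would not match.

```python
def toUpper(keywords, sentence):
    # Capitalize each word that comes right after a keyword
    word_list = sentence.split()
    for i in range(len(word_list) - 1):
        if word_list[i] in keywords:
            word_list[i + 1] = word_list[i + 1].capitalize()
    return ' '.join(word_list)


keywords = ['one', 'two', 'three', 'an']

# Hypothetical sample strings taken from the question's five rows
samples = [
    "one apple box",
    "five oranges left",
    "two persons have eaten, three still hungry",
]

results = [toUpper(keywords, s) for s in samples]
for line in results:
    print(line)
# one Apple box
# five oranges left
# two Persons have eaten, three Still hungry
```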