How to replace strings in text if the string is in a list in Python?
The dataframe has two columns: sentence and list. The requirement is to replace every string in df['sentence'] that is present in df['list'] with the found string followed by |present.
from pandas import DataFrame
df = {'list': [['Ford','Mercedes Benz'],['ford','hyundai','toyota'],['tesla'],[]],
'sentence': ['Ford is less expensive than Mercedes Benz' ,'toyota and hyundai mileage is good compared to ford','tesla is an electric car','toyota too has electric cars']
}
df = DataFrame(df,columns= ['list','sentence'])
The expected output for df['sentence'] is:
Ford|present is less expensive than Mercedes Benz|present
toyota|present and hyundai|present mileage is good compared to ford|present
tesla|present is an electric car
toyota too has electric cars
Using regex substitution (cut from an interactive IPython session):
You can use the apply function and do the regex substitution inside the applied function.
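import re
from pandas import DataFrame
df = {'list': [['Ford','Mercedes Benz'],['ford','hyundai','toyota'],['tesla'],[]],
'sentence': ['Ford is less expensive than Mercedes Benz' ,'toyota and hyundai mileage is good compared to ford','tesla is an electric car','toyota too has electric cars']
}
df = DataFrame(df,columns= ['list','sentence'])
def replace_values(row):
    # Rows with an empty list are returned unchanged.
    if len(row['list']) > 0:
        # \b sits outside the group so it applies to every alternative.
        pat = r"\b(" + "|".join(row['list']) + r")\b"
        print(pat)
        row['sentence'] = re.sub(pat, "\\1|present", row['sentence'])
    return row
df.apply(replace_values, axis=1)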
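The pattern above is built by joining the list entries directly, so an entry that contains a regex metacharacter (such as `.`) would be interpreted as regex syntax, and a short entry could shadow a longer one that starts with it. The following is a minimal sketch of one way to guard against both, using only the standard library. The helper names `build_pattern` and `mark_present` are hypothetical and not part of the answer above.

```python
import re

def build_pattern(words):
    """Compile a whole-word pattern for the given strings (hypothetical helper)."""
    # Longest entries first, so 'Mercedes Benz' is tried before 'Mercedes'.
    ordered = sorted(words, key=len, reverse=True)
    # re.escape makes every entry match literally, even if it contains '.', '(' etc.
    alternation = "|".join(re.escape(w) for w in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b")

def mark_present(sentence, words):
    """Append '|present' to each whole-word occurrence of an entry in words."""
    if not words:
        # An empty list means there is nothing to replace.
        return sentence
    return build_pattern(words).sub(lambda m: m.group(0) + "|present", sentence)
```

Assuming the dataframe from the question, this could be applied row by row with `df['sentence'] = [mark_present(s, w) for s, w in zip(df['sentence'], df['list'])]`.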
You can use a custom function on the dataframe, as follows:
Code
import pandas as pd
df = {'list': [['Ford','hyundai'],['ford','hyundai','toyota'],['tesla'],[]],
'sentence': ['Ford is expensive than hyundai' ,'toyota and hyundai mileage is good compared to ford','tesla is an electric car','toyota too has electric cars']
}
df = pd.DataFrame(df)
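def rep_text(row):
    # Nothing to replace when the list is empty.
    if not row['list']:
        return row
    words = row['sentence'].split()
    # Tag each whitespace-separated word that appears in the row's list.
    new_words = [word + '|present'
                 if word in row['list'] else word
                 for word in words]
    row['sentence'] = ' '.join(new_words)
    return row
df = df.apply(rep_text, axis=1)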
Output
list sentence
0 [Ford, hyundai] Ford|present is expensive than hyundai|present
1 [ford, hyundai, toyota] toyota|present and hyundai|present mileage is ...
2 [tesla] tesla|present is an electric car
3 [] toyota too has electric cars
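Why is the last item,
toyota too has electric cars
not modified?
Because the corresponding list has no strings to replace. It is an empty list.
What have you tried? What is wrong with your approach?
The last item (index=3) is replaced with "|presentt|presento|presento|presentt|presenta|presentt|presento|". @ansev/Dev
Thanks. If we have "Ford's", the word is replaced with "Ford|present's". An exact match would be good, wouldn't it?
The new code does not replace "Fords", but it does replace "Ford's", turning it into "Ford|present's". The 's is added after "present". I am looking for an exact string match.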
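The `\b` word boundary treats an apostrophe as a non-word character, which is why "Ford's" still matches "Ford". One possible way to get the stricter matching asked for in the comments is to replace `\b` with lookarounds that also reject an adjacent apostrophe. This is only a sketch under that assumption. `mark_exact` is a hypothetical name, and it handles the ASCII apostrophe only.

```python
import re

def mark_exact(sentence, words):
    """Tag entries of words in sentence, skipping forms such as "Ford's" or "Fords"."""
    if not words:
        return sentence
    # Escape each entry and try longer entries first.
    alternation = "|".join(
        re.escape(w) for w in sorted(words, key=len, reverse=True)
    )
    # (?<![\w']) : the match must not be preceded by a word character or apostrophe
    # (?![\w'])  : the match must not be followed by a word character or apostrophe
    pattern = re.compile(r"(?<![\w'])(?:" + alternation + r")(?![\w'])")
    return pattern.sub(lambda m: m.group(0) + "|present", sentence)
```

A side effect of this choice is that an entry wrapped in single quotes, such as 'Ford', is also skipped. Whether that is acceptable depends on the data.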