Python: combine for-loop iterations into one line, with handling for no match
This is probably a very basic question, but I'm hoping someone can help. I have the following:
import re

query = ['whole regular milk', 'gatorade is better',
         'whole almond chocolate milk', 'chocolate milk']
types = ['whole', 'regular', 'chocolate']
new_list = []
for i in query:
    for k in types:
        # Match the type only as a whole word
        regex_concat = r"\b" + k + r"\b"
        new_regex = re.search(regex_concat, i)
        if new_regex is not None:
            print(new_regex.group())
        else:
            print('no match')
This produces the following output:
whole
regular
no match
no match
no match
no match
whole
no match
chocolate
no match
no match
chocolate
My ideal output is:
whole | regular
Blank
whole | chocolate
chocolate
The problem:
I thought I should be able to combine the output onto one line using:
print(new_regex.group(), end="|", flush=True)
That gives me:
whole|regular|no match
no match
no match
no match
whole|no match
chocolate|no match
no match
chocolate|
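I can't seem to figure out how to get the output I described above.
Some additional notes:
The query list will be compiled from a pandas DataFrame. From there, I want to take the desired output and convert it to a list > Series to map back onto the DataFrame. That is why I want the blank rows to remain, since the final output should look like this: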
Query                          Type
whole regular milk             whole | regular
gatorade is better
whole almond chocolate milk    whole | chocolate
chocolate milk                 chocolate
If your input is already a DataFrame, you can do the whole operation at the DataFrame level:
import re
import pandas as pd

query = ['whole regular milk', 'gatorade is better',
         'whole almond chocolate milk', 'chocolate milk', 'wholes']
# Precompile one whole-word pattern per type
types = [{'type': t, 'regex': re.compile(r'\b{}\b'.format(t))}
         for t in ['whole', 'regular', 'chocolate']]
df = pd.DataFrame({'Query': query})

def check(q):
    # Join every type whose pattern matches; empty string if none do
    return ' | '.join(type_info['type'] for type_info in types
                      if type_info['regex'].findall(q))

df['Type'] = df['Query'].apply(check)
print(df)
#                          Query               Type
# 0           whole regular milk    whole | regular
# 1           gatorade is better
# 2  whole almond chocolate milk  whole | chocolate
# 3               chocolate milk          chocolate
# 4                       wholes
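For reference, the same idea should also work on a plain list, without pandas. This is only a minimal sketch, assuming the `query` and `types` lists from the question; the helper name `match_types` is made up for illustration, and `re.escape` is an extra precaution in case a type name ever contains regex metacharacters:

```python
import re

query = ['whole regular milk', 'gatorade is better',
         'whole almond chocolate milk', 'chocolate milk']
types = ['whole', 'regular', 'chocolate']

# One whole-word pattern per type, compiled once up front.
patterns = [(t, re.compile(r'\b' + re.escape(t) + r'\b')) for t in types]

def match_types(text):
    """Hypothetical helper: join every matching type with ' | ', or return '' if none match."""
    return ' | '.join(t for t, pattern in patterns if pattern.search(text))

# One entry per query, so rows without a match stay as empty strings.
results = [match_types(q) for q in query]
for line in results:
    print(line)
```

Because `results` has exactly one entry per query, it can be assigned straight back to a DataFrame column or wrapped in a Series.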
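The reason I chose regex is to be able to match the types exactly, since there may be variants in future iterations. For example, if I had types such as wholesaler, who, or hole, would your solution match on substrings? I'm also very new to Python, so please correct me if I've misunderstood something.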
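On the substring concern: the `\b` anchors should make the pattern match the type only as a whole word, so a longer word that merely contains it does not match. Here is a quick sketch to check this; the sample strings are made up for illustration:

```python
import re

pattern = re.compile(r'\bwhole\b')

# Made-up sample strings: some contain 'whole' only as part of a longer word.
candidates = ['whole milk', 'wholesaler', 'wholes', 'who', 'hole', 'a whole-grain loaf']

# True where 'whole' appears as a standalone word.
word_matches = {text: bool(pattern.search(text)) for text in candidates}

# For contrast, a plain substring test also accepts the longer words.
substring_matches = {text: 'whole' in text for text in candidates}

for text in candidates:
    print(text, word_matches[text], substring_matches[text])
```

Note that a hyphen counts as a word boundary, so 'whole-grain' still matches the `whole` pattern. If that is not wanted, the pattern would need a stricter boundary.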