
String parsing/replacement in Python (using regular expressions)

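I am trying to write a parser that converts a string into a query format, and I am stuck at one particular string replacement (by pattern matching).

I can't figure out the regular expression pattern to match.

I have an input string such as: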

ip_query_string = "CITY == 'Mumbai' & LOCATION in ['Harrys Bar & Cafe: Mumbai','Hard Rock Cafe in Mumbai']"

# Note the "&" after CITY == 'Mumbai' and the "in" after LOCATION.
# There is also another "&" and a " in " inside the values of the in-condition.

#My output should be:
op_query_string = "CITY == 'Mumbai' AND LOCATION IN ['Harrys Bar & Cafe: Mumbai','Hard Rock Cafe in Mumbai']"

# If I find ' & ' or ' in ' (with a space before and after), I have to replace them with ' AND ' and ' IN ' respectively. (In this case ip_query_string.replace(' & ', ' AND ').replace(' in ', ' IN ') would do that, BUT read the next point.)
# If they are inside the values of an in-condition, such as 'Harrys Bar & Cafe: Mumbai','Hard Rock Cafe in Mumbai', don't replace them; keep them as is.
# If you look at the in-condition in op_query_string, the & and the in are not replaced.
Please help me work out the logic.


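Or what would the regex pattern be (if & or in appears inside single quotes along with other characters, don't replace it; otherwise replace it)?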

I got it working in a somewhat odd way (maybe not Pythonic), but it works:

def rplc_str(s):
    # Split on single quotes so quoted values end up in their own pieces
    sp = s.split("'")
    print('After split==', sp)
    # Replace ' & ' only in pieces that start right after a closing quote or bracket
    sp1 = [x.replace(' & ', ' AND ') if ((x.startswith(' &')) or (x.startswith('] &'))) else x for x in sp]
    print('After replacing & ==', sp1)
    # Replace ' in ' only in pieces that end just before an opening bracket
    sp2 = [x.replace(' in ', ' IN ') if x.endswith(' [') else x for x in sp1]
    print('After replacing in ==', sp2)
    return "'".join(sp2)

ip_str = "CITY == 'Mumbai' & LOCATION in ['Harrys Bar & Cafe: Mumbai','Hard Rock Cafe in Mumbai']"
op_str = rplc_str(ip_str)
print(op_str)
#After split== ['CITY == ', 'Mumbai', ' & LOCATION in [', 'Harrys Bar & Cafe: Mumbai', ',', 'Hard Rock Cafe in Mumbai', ']']
#After replacing & == ['CITY == ', 'Mumbai', ' AND LOCATION in [', 'Harrys Bar & Cafe: Mumbai', ',', 'Hard Rock Cafe in Mumbai', ']']
#After replacing in == ['CITY == ', 'Mumbai', ' AND LOCATION IN [', 'Harrys Bar & Cafe: Mumbai', ',', 'Hard Rock Cafe in Mumbai', ']']
#CITY == 'Mumbai' AND LOCATION IN ['Harrys Bar & Cafe: Mumbai','Hard Rock Cafe in Mumbai']

Hope it helps someone, but I am still waiting for a more Pythonic answer (I mean a regular expression).
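The split-on-quote idea can probably be simplified: after splitting on a single quote, the pieces at even indices lie outside any quoted value and the pieces at odd indices are the quoted values themselves. Below is a minimal sketch of that, assuming quotes are balanced and values never contain an embedded or escaped single quote. The function name replace_outside_quotes is hypothetical, not part of the original answer.

```python
def replace_outside_quotes(s):
    # Splitting on "'" alternates between text outside quotes (even indices)
    # and quoted values (odd indices), assuming the quotes are balanced.
    parts = s.split("'")
    for i in range(0, len(parts), 2):
        # Only the unquoted pieces get their operators rewritten.
        parts[i] = parts[i].replace(' & ', ' AND ').replace(' in ', ' IN ')
    return "'".join(parts)


ip_str = "CITY == 'Mumbai' & LOCATION in ['Harrys Bar & Cafe: Mumbai','Hard Rock Cafe in Mumbai']"
print(replace_outside_quotes(ip_str))
```

Unlike the startswith/endswith checks, this does not depend on where the operator sits relative to a bracket, but it breaks as soon as a value contains an apostrophe.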

A short solution using the re.sub() function:

import re

ip_query_string = "CITY == 'Mumbai' & LOCATION in ['Harrys Bar & Cafe: Mumbai','Hard Rock Cafe in Mumbai']"
op_query_string  = re.sub(r'^([^[]+?)(in)', r'\1IN', re.sub(r'^([^[]+?)(&)', r'\1AND', ip_query_string))

print(op_query_string)
Output:

CITY == 'Mumbai' AND LOCATION IN ['Harrys Bar & Cafe: Mumbai','Hard Rock Cafe in Mumbai']

Why should 'Harry's become 'Harrys?

Sorry, that was a typo; it is always Harrys. I have edited the question.

This works for the case in the question, but it fails if the value of a normal condition contains '&' or 'in'. For example, with ip_query_string = "CITY == 'Mumbai in Goa' & LOCATION in ['Harrys Bar & Cafe: Mumbai','Hard Rock Cafe in Mumbai']" (I added an 'in' to the CITY value), the regex above gives "CITY == 'Mumbai IN Goa' AND LOCATION in ['Harrys Bar & Cafe: Mumbai','Hard Rock Cafe in Mumbai']", i.e. it changes 'Mumbai in Goa' to 'Mumbai IN Goa'. That should not happen: if '&' or 'in' is found inside single quotes, it should not be changed.
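One way to address the failing case from the last comment is to let the regex consume each quoted value as a whole, so that an operator is only ever matched outside quotes. The following is a sketch under the assumption that values are single-quoted and contain no embedded single quote (so a value like 'Harry's' would still break it). The names TOKEN and to_query are hypothetical.

```python
import re

# First alternative: a complete single-quoted value (matched and left alone).
# Second alternative: a standalone & or in, surrounded by whitespace.
TOKEN = re.compile(r"'[^']*'|(?<=\s)(&|in)(?=\s)")


def to_query(s):
    def repl(m):
        op = m.group(1)
        if op is None:
            # The quoted-value alternative matched: return it unchanged.
            return m.group(0)
        return 'AND' if op == '&' else 'IN'
    return TOKEN.sub(repl, s)


ip_query_string = "CITY == 'Mumbai in Goa' & LOCATION in ['Harrys Bar & Cafe: Mumbai','Hard Rock Cafe in Mumbai']"
print(to_query(ip_query_string))
```

Because the regex engine scans left to right and the quoted-value alternative is tried first, any & or in inside quotes is swallowed by that alternative before the operator alternative can see it.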