(Regex) How do I remove quotes and their contents in Python?
I want to remove the strings in a file (a code file in which every string is quoted), like this:
text = "Hello,"+Tom+"have a nice day!"
text2 = "Thank"+"you."
I want this (not just the quotes, but everything inside them as well):
I can use a regex to pick up each string, reading the file line by line:
readLine = re.findall("[a-zA-Z0-9]*", line)
# there is some trimming I didn't show
But the result is:
['text','Hello','Tom','have', 'a', 'nice', 'day', 'text2', 'Thank', 'you']
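If a regex is not suitable, what other approach is there? Thank you very much for your help.
You can use a positive lookahead in the regex, as follows:
I tried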
re.findall(r'".*"',line)
You can then simply trim the extra quotes at the beginning and end.
Edit:
To trim them, you can use
[match[1:-1] for match in re.findall(r'".*"', line)]
Here you go, this is all you need:
re.findall('"(.*)"', sentence)
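One caveat with the `".*"` patterns above: `.*` is greedy, so on a line that holds more than one string it runs from the first quote to the last one and swallows whatever sits between the strings. The sketch below compares it with a negated character class; the sample line is taken from the question, and the variable names are only illustrative.

```python
import re

line = 'text = "Hello,"+Tom+"have a nice day!"'

# Greedy: .* extends from the first quote to the last quote on the line,
# so the variable Tom ends up inside the single match.
greedy = re.findall(r'".*"', line)

# Negated class: [^"]* cannot cross a quote, so each string matches separately.
per_string = re.findall(r'"[^"]*"', line)

print(greedy)
print(per_string)
```

The negated class does not handle escaped quotes (`\"`) inside a string; a pattern that accounts for backslash escapes is needed for that case.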
Use
import re
expr = r'"(?:[^"\\]|\\[\s\S])*"|(\w+)'
text = r'''text = "Hello,"+Tom+"have a nice day!"
text2 = "Thank"+"you."'''
print(list(filter(None, re.findall(expr, text))))
Result: ['text', 'Tom', 'text2']
Regex explanation
--------------------------------------------------------------------------------
  "                        '"'
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    [^"\\]                 any character except: '"', '\\'
--------------------------------------------------------------------------------
   |                       OR
--------------------------------------------------------------------------------
    \\                     '\'
--------------------------------------------------------------------------------
    [\s\S]                 any character of: whitespace (\n, \r,
                           \t, \f, and " "), non-whitespace (all
                           but \n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
  )*                       end of grouping
--------------------------------------------------------------------------------
  "                        '"'
--------------------------------------------------------------------------------
 |                         OR
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \w+                    word characters (a-z, A-Z, 0-9, _) (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  )                        end of \1
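If the goal is to delete the strings rather than list what remains, the string-literal half of the pattern can be passed to `re.sub` on its own. This is a minimal sketch, assuming only double-quoted strings need handling (no single-quoted or triple-quoted literals, no comments); `STRING_LITERAL` and `strip_string_literals` are hypothetical names introduced for illustration.

```python
import re

# String-literal sub-pattern from the explanation above: an opening quote,
# then any run of non-quote/non-backslash characters or backslash escapes,
# then the closing quote.
STRING_LITERAL = re.compile(r'"(?:[^"\\]|\\[\s\S])*"')

def strip_string_literals(source: str) -> str:
    """Remove every double-quoted string, quotes included (hypothetical helper)."""
    return STRING_LITERAL.sub("", source)

code = 'text = "Hello,"+Tom+"have a nice day!"\ntext2 = "Thank"+"you."'
print(strip_string_literals(code))
# text = +Tom+
# text2 = +
```

Because the pattern treats a backslash and the character after it as one unit, an escaped quote such as `\"` inside a string does not end the match early.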
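In your own words, when you use re.findall, what do you think findall means? Now try looking through the functions of the re module. Do you see one whose description matches what you want to do with the input line?
I believe you are trying to create a list of strings. Try this: my_text = [text, text2]
Do you mean like this?
@Thefthefthbird This really works for me, thank you.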