Python: Replacing multiple regular expressions


In the input below, I am trying to replace the numbers and \n with '' and ' ' respectively.

THE SONNETS\n\n                    1\n\nFrom fairest creatures we desire increase,\nThat thereby beauty’s rose might never die,\nBut as the riper should by time decease,\nHis

she hies,             1189\nAnd yokes her silver doves; by whose swift aid\nTheir mistress mounted through the empty skies,\nIn her light chariot quickly is convey’d;           1192\n  Holding their course to Paphos, where their queen\n  Means to immure herself and not be seen.\n'
input_var is read from a file containing the content above:

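file_name = 'sample.txt'
# `folder` is the directory prefix, defined elsewhere in the asker's script
with open(folder + file_name, mode='r', encoding='utf8') as file:
    input_var = file.read()  # the file is closed automatically on leaving the block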
A screenshot of the file is attached.

The data in the file is:

THE SONNETS

                    1

From fairest creatures we desire increase,
That thereby beauty’s rose might never die,
But as the riper should by time decease,
His

she hies,             1189
And yokes her silver doves; by whose swift aid
Their mistress mounted through the empty skies,
In her light chariot quickly is convey’d;           1192
  Holding their course to Paphos, where their queen
  Means to immure herself and not be seen.
To identify the numbers, I used the regex [\s]{3,}\d{1,}\\n (the number must be preceded by at least 3 whitespace characters; I tested the regex on regex101).

I am using the following code, which I took from a few answers on Stack Overflow, to replace the regex matches and \n.

Code 1 -

# Remove the numbers in sonnets and at the end of lines
pattern = {r'[\s]{3,}\d{1,}\\n' : '',
           r'\\n' : ' '
          }

regex = re.compile('|'.join(map(re.escape, pattern.keys(  ))))
output_var = regex.sub(lambda match: pattern[match.group(0)], input_var)
Code 2 -

rep = dict((re.escape(k), v) for k, v in pattern.items())
pattern_test = re.compile("|".join(rep.keys()))
output_var = pattern_test.sub(lambda m: rep[re.escape(m.group(0))], input_var)
Code 3 -

for i, j in pattern.items():
        output_var = input_var.replace(i, j)
where input_var contains the text above. None of the three replaces anything. I tried:

pattern = {'[\s]{3,}\d{1,}\n' : '',
           '\n' : ' '
          }
I have also tried

pattern = {r'[\s]{3,}\d{1,}\n' : '',
           r'\n' : ' '
          }
but it does not replace anything either.

pattern = {'[\s]{3,}\d{1,}\n' : '',
           '\n' : ' '
          }
replaces only the \n, and the output is as follows:

THE SONNETS                      1  From fairest creatures we desire increase, That thereby beauty’s rose might never die, But as the riper should by time decease, His
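The regex in the dictionary is not being recognized; I think it is treated as a literal string rather than as a regex. How do I specify a regex in a dictionary? The answers I found on Stack Overflow use plain strings rather than regexes.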

The expected result is:

THE SONNETS                       From fairest creatures we desire increase, That thereby beauty’s rose might never die, But as the riper should by time decease, His

    she hies,And yokes her silver doves; by whose swift aid  Their mistress mounted through the empty skies, In her light chariot quickly is convey’d;  Holding their course to Paphos, where their queen   Means to immure herself and not be seen. ' 
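
The following is a minimal sketch, not part of the original question, of why the three attempts may leave the text untouched. It assumes the file contains real newline characters, and it writes the pattern as \s{3,}\d+, which is equivalent to [\s]{3,}\d{1,}. The variable names are made up for illustration.

```python
import re

# One line of the sample, containing a real newline character
text = "she hies,             1189\nAnd yokes"

# r'\\n' is the regex for a backslash followed by 'n', which the text does not contain
literal_backslash = re.search(r'\s{3,}\d+\\n', text)
print(literal_backslash)  # None

# r'\n' matches the actual newline
real_newline = re.search(r'\s{3,}\d+\n', text)
print(real_newline is not None)  # True

# re.escape() (Code 1 and Code 2) turns regex syntax into literal text,
# so the escaped pattern only matches the pattern's own characters
escaped = re.escape(r'\s{3,}\d+\n')
escaped_on_text = re.search(escaped, text)
escaped_on_itself = re.search(escaped, r'\s{3,}\d+\n')
print(escaped_on_text, escaped_on_itself is not None)  # None True

# str.replace() (Code 3) never interprets regex syntax either
unchanged = text.replace(r'\s{3,}\d+\n', '') == text
print(unchanged)  # True
```

Code 3 also reassigns output_var from input_var on every iteration, so even with working patterns only the last replacement would survive.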

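Here is a working example you can run (if you have bs4 etc.). I see you have already had help with the numbering and the regex, but this may help with understanding line returns and so on (I am not entirely sure what the goal is). I could not find a source online with numbering similar to yours, so unfortunately this is not the same kind of source. If you have no other source, it may be worth considering.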

from bs4 import BeautifulSoup
import re
import requests


url = 'http://www.gutenberg.org/cache/epub/1041/pg1041.txt'

page = requests.get(url)
# print(page.status_code)
soup = BeautifulSoup(page.text, 'html.parser')  # unused below: the Gutenberg file is plain text

sonnet = page.text

print(sonnet[780:1500])
print()
print('------')
print()
sonnet = re.sub('\r','',sonnet)
sonnet = re.sub('\n','',sonnet)
print(sonnet[698:1500])

url2 = 'http://shakespeare.mit.edu/Poetry/VenusAndAdonis.html'

page = requests.get(url2)
# print(page.status_code)
soup = BeautifulSoup(page.text, 'html.parser')  # name the parser explicitly to avoid the bs4 warning
print()
print('------')
print('------')
print()
VenusAndAdonis = soup.text
print(type(VenusAndAdonis))
print(VenusAndAdonis[800:1500])
print()
print('------')
print()
VenusAndAdonis = re.sub('\r','',VenusAndAdonis)
VenusAndAdonis = re.sub('\n',' ',VenusAndAdonis)
print(VenusAndAdonis[800:1500])
Output:

I

  From fairest creatures we desire increase,
  That thereby beauty's rose might never die,
  But as the riper should by time decease,
  His tender heir might bear his memory:
  But thou, contracted to thine own bright eyes,
  Feed'st thy light's flame with self-substantial fuel,
  Making a famine where abundance lies,
  Thy self thy foe, to thy sweet self too cruel:
  Thou that art now the world's fresh ornament,
  And only herald to the gaudy spring,
  Within thine own bud buriest thy content,
  And tender churl mak'st waste in niggarding:
    Pity the world, or else this glutton be,
    To eat the world's due, by the grave and thee.

  II

  When forty winters shall besiege thy brow,

------

I  From fairest creatures we desire increase,  That thereby beauty's rose might never die,  But as the riper should by time decease,  His tender heir might bear his memory:  But thou, contracted to thine own bright eyes,  Feed'st thy light's flame with self-substantial fuel,  Making a famine where abundance lies,  Thy self thy foe, to thy sweet self too cruel:  Thou that art now the world's fresh ornament,  And only herald to the gaudy spring,  Within thine own bud buriest thy content,  And tender churl mak'st waste in niggarding:    Pity the world, or else this glutton be,    To eat the world's due, by the grave and thee.  II  When forty winters shall besiege thy brow,  And dig deep trenches in thy beauty's field,  Thy youth's proud livery so gazed on now,  Will be a tatter'd weed of small 

------
------

<class 'str'>
 honour to your heart's content; which I
wish may always answer your own wish and the world's hopeful
expectation.
Your honour's in all duty,
WILLIAM SHAKESPEARE.

EVEN as the sun with purple-colour'd face
Had ta'en his last leave of the weeping morn,
Rose-cheek'd Adonis hied him to the chase;
Hunting he loved, but love he laugh'd to scorn;
Sick-thoughted Venus makes amain unto him,
And like a bold-faced suitor 'gins to woo him.


'Thrice-fairer than myself,' thus she began,
'The field's chief flower, sweet above compare,
Stain to all nymphs, more lovely than a man,
More white and red than doves or roses are;
Nature that made thee, with herself at strife,
Saith that the world hath ending wit

------

 honour to your heart's content; which I wish may always answer your own wish and the world's hopeful expectation. Your honour's in all duty, WILLIAM SHAKESPEARE.  EVEN as the sun with purple-colour'd face Had ta'en his last leave of the weeping morn, Rose-cheek'd Adonis hied him to the chase; Hunting he loved, but love he laugh'd to scorn; Sick-thoughted Venus makes amain unto him, And like a bold-faced suitor 'gins to woo him.   'Thrice-fairer than myself,' thus she began, 'The field's chief flower, sweet above compare, Stain to all nymphs, more lovely than a man, More white and red than doves or roses are; Nature that made thee, with herself at strife, Saith that the world hath ending wit

You need to run the re.sub calls in a loop, but make sure output_var is initialized with the value of input_var:

output_var = input_var
for reg, repl in pattern.items():
  output_var = re.sub(reg, repl, output_var)
Output:

THE SONNETS From fairest creatures we desire increase, That thereby beauty’s rose might never die, But as the riper should by time decease, His  she hies,And yokes her silver doves; by whose swift aid Their mistress mounted through the empty skies, In her light chariot quickly is convey’d;  Holding their course to Paphos, where their queen   Means to immure herself and not be seen.
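
For reference, here is a self-contained sketch of the loop above. The shortened input_var is a stand-in for the contents of sample.txt and is assumed to contain real newlines. The pattern is written as \s{3,}\d+\n, equivalent to the asker's [\s]{3,}\d{1,}\n.

```python
import re

# Shortened sample standing in for the contents of sample.txt (real newlines)
input_var = (
    "THE SONNETS\n\n                    1\n\n"
    "From fairest creatures we desire increase,\n"
    "His\n\nshe hies,             1189\n"
    "And yokes her silver doves; by whose swift aid\n"
)

# Order matters: strip the numbers first, then turn the remaining newlines into spaces
pattern = {
    r'\s{3,}\d+\n': '',
    r'\n': ' ',
}

output_var = input_var          # start from the input, then refine it step by step
for reg, repl in pattern.items():
    output_var = re.sub(reg, repl, output_var)

print(output_var)
```

The loop relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 onward. On older versions, a list of (pattern, replacement) tuples would keep the order explicit.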

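It looks like (judging from the regex101 link at the top) you have been testing against a string literal rather than the literal string. Please add the input_var declaration to the question.
What is your expected result? Please share it so we get a clear picture!
None of the three options replaces anything because you most likely tested the regexp against a string literal rather than against the string you actually have.
Your expected result and the pattern you tried do not match. This is confusing!!!
I see, but there is no way to do it the way you want. You need to run the re.sub calls in a loop: for reg, repl in pattern.items(): output_var = re.sub(reg, repl, output_var)
I am looking for a solution where I can put all the patterns in a dictionary, say replacements, and then run re.sub only once.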
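
One possible way to get close to a single re.sub call, offered here as a sketch rather than as anything proposed in the thread, is to join the patterns into one alternation with a capturing group per pattern and pick the replacement from the group that matched. The names replacements, combined and substitute are illustrative, and the sketch assumes the individual patterns contain no capturing groups of their own.

```python
import re

replacements = {
    r'\s{3,}\d+\n': '',   # numbers preceded by 3+ whitespace characters
    r'\n': ' ',           # any remaining newline
}

# One capturing group per pattern, joined with '|'.
# Assumption: the patterns themselves contain no capturing groups.
combined = re.compile('|'.join(f'({p})' for p in replacements))
repl_values = list(replacements.values())

def substitute(match):
    # match.lastindex is the 1-based number of the group that matched
    return repl_values[match.lastindex - 1]

text = "she hies,             1189\nAnd yokes her silver doves;\nTheir mistress"
result = combined.sub(substitute, text)
print(result)  # she hies,And yokes her silver doves; Their mistress
```

A single pass is not always equivalent to the loop: the alternatives are tried at each position in the original text, so a later pattern never sees the output of an earlier replacement.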