Rearranging text within the same text file, making some changes, and removing specific text using Python
I have some inputs:
1. "Want to learn more? (link to https://www.google.com) Click here."
2 "Want to learn more? (link to https://https://www.google.com) website."
The expected outputs are, respectively:
1 "Want to learn more? [Click here] (https://www.google.com)."
2 "Want to learn more? [website] (https://https://www.google.com)."
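Explanation: I want to remove "link to", move the link text in front of the URL and wrap it in [], keep the URL in (), and do all of this without using regex.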
You can use text.split() to break each line into parts that you can then rearrange. In the code below these parts are shown as a, b, c, d.
text = '''
1. "Want to learn more? (link to https://www.google.com) Click here."
2 "Want to learn more? (link to https://https://www.google.com) website."
'''
for line in text.splitlines():
    if line:
        #print(line)
        a, b = line.split('(link to ')
        b, c = b.split(') ')
        c, d = c.split('.')
        print(' a:', a)
        print(' b:', b)
        print(' c:', c)
        print(' d:', d)
        print('{}[{}] ({}).{}'.format(a, c, b, d))
Result:
a: 1. "Want to learn more?
b: https://www.google.com
c: Click here
d: "
1. "Want to learn more? [Click here] (https://www.google.com)."
a: 2 "Want to learn more?
b: https://https://www.google.com
c: website
d: "
2 "Want to learn more? [website] (https://https://www.google.com)."
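The split-based steps above can also be wrapped in a small reusable function. The following is only a sketch under stated assumptions: the name to_markdown_link and its marker parameter are hypothetical (they do not appear in the original answer), and it swaps str.split() for str.partition(), which always returns three parts and so makes it easy to leave a line unchanged when the marker is missing.

```python
def to_markdown_link(line, marker='(link to '):
    """Rewrite 'before (link to URL) label.rest' as 'before [label] (URL).rest'.

    Hypothetical helper: lines that do not match the expected shape
    are returned unchanged instead of raising an error.
    """
    # Split off everything before the marker; 'found' is empty if it is absent.
    before, found, rest = line.partition(marker)
    if not found:
        return line
    # The URL runs up to the closing parenthesis followed by a space.
    url, closing, tail = rest.partition(') ')
    # The link text runs up to the first period after the URL.
    label, dot, after = tail.partition('.')
    if not closing or not dot:
        return line
    return '{}[{}] ({}).{}'.format(before, label, url, after)


if __name__ == '__main__':
    sample = '1. "Want to learn more? (link to https://www.google.com) Click here."'
    print(to_markdown_link(sample))
```

Unlike the tuple unpacking with split(), this version does not fail on blank lines or on lines without "(link to ", which may matter when processing a whole file.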
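What have you tried? Show your code and the full error message. I would try text.find() or text.split() with "(link to " and ")" to get three elements: the text before "(link to ", the text between "(link to " and ")", and the text after ")". Creating new text with the expected output should then be no problem. Or you can use regex for this.

import re
with open(r'C:\Users\test.txt') as infile, open(r'C:\test1.txt', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "[":
            copy = True
        if copy:  # flipped to include end, as Dan H pointed out
            outfile.write(line)
        if line.strip() == "]":
            copy = False

Always add code, data, and error messages to the question, not in comments. It will be more readable.

The same with re.split():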
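import re

text = '''
1. "Want to learn more? (link to https://www.google.com) Click here."
2 "Want to learn more? (link to https://https://www.google.com) website."
'''

for line in text.splitlines():
    if line:
        # Raw string so the backslash escapes reach the regex engine unchanged.
        # The pattern matches the whole line, so re.split() returns
        # ['', before, url, label, ending, ''].
        a = re.split(r'(.*)\(link to (.*)\) (.*)(\.")', line)
        print(a)
        print('{}[{}] ({}){}'.format(a[1], a[3], a[2], a[4]))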
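If regex is acceptable, the rearrangement can also be done in one pass with re.sub() and backreferences rather than re.split(). This is a sketch, not the answer's original method: the names LINK_PATTERN and rewrite_links are hypothetical, and the pattern assumes the URL contains no ")" and that the link text ends at the first period or double quote on the same line.

```python
import re

# Hypothetical pattern: group 1 is the URL, group 2 is the link text.
# Assumes no ')' inside the URL and that the link text stops at the
# first '.', '"' or newline.
LINK_PATTERN = re.compile(r'\(link to ([^)]*)\) ([^."\n]+)')


def rewrite_links(text):
    """Replace every '(link to URL) label' with '[label] (URL)'."""
    return LINK_PATTERN.sub(r'[\2] (\1)', text)


if __name__ == '__main__':
    text = '''
1. "Want to learn more? (link to https://www.google.com) Click here."
2 "Want to learn more? (link to https://https://www.google.com) website."
'''
    print(rewrite_links(text))
```

Because the substitution runs over the whole string, there is no need to loop over lines, and lines without "(link to " pass through untouched.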