Removing \n with a Python regular expression

I have a problem. What I am trying to do is sort data and start a new line at certain points. Currently, my code looks like this:

from __future__ import print_function
import re
NDoc = raw_input("Enter name of new document ")+".txt"
log = open(NDoc, 'w')
file = raw_input("Enter a file to be sorted ")
extfile = file+".txt"
xfile = open(file+".txt")

for line in xfile:
    l=line.strip()
    l=re.sub("\n","",l)
    n=re.sub("(\B)(?=((MTH|HST|ENG)[|]))","\n",line)

    if len(n) > 0:
        nl=n.split("\n")
        for item in nl:
                log.write(item+"\n")
                    #print(item)

print ("The data from",extfile,"has been sorted into",NDoc)
Everything works, except that an extra blank line appears in my data after the third item (ENG). For example, if my data file looks like this:

MTH|lettersandnumbersHST|lettersandnumbersENG|lettersandnumbers
MTH|lettersandnumbersHST|lettersandnumbersENG|lettersandnumbers
MTH|lettersandnumbersHST|
I want it to look like this:

MTH|lettersandnumbers
HST|lettersandnumbers
ENG|lettersandnumbers
MTH|lettersandnumbers
HST|lettersandnumbers
ENG|lettersandnumbers
MTH|lettersandnumbers
HST|
But instead it gives me this:

MTH|lettersandnumbers
HST|lettersandnumbers
ENG|lettersandnumbers

MTH|lettersandnumbers
HST|lettersandnumbers
ENG|lettersandnumbers

MTH|lettersandnumbers
HST|
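I assumed that running l=re.sub("\n","",l) before inserting the new \n would replace every existing \n with nothing, so why is a line still being added, and only after ENG?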


Thanks in advance for any insight you can offer.

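You are using the wrong variable name. The stripped text is stored in l, but the substitution that follows still reads from line, which keeps its trailing \n.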

l=line.strip()
l=re.sub("\n","",l)
should be

line=line.strip()
line=re.sub("\n","",line)
or simply

line=line.strip().replace('\n', '')
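To see where the blank line comes from, here is a minimal sketch, assuming Python 3. The helper split_records is hypothetical and only stands in for the body of the loop in the question. When the substitution runs on the unstripped line, the trailing \n survives, so split("\n") returns an extra empty string that is then written out as an empty line.

```python
import re

# Same idea as the pattern in the question: break before a subject code
# that is glued to the end of the preceding field.
PATTERN = re.compile(r"\B(?=(?:MTH|HST|ENG)\|)")

def split_records(line, strip_first=True):
    """Hypothetical helper standing in for the loop body in the question."""
    if strip_first:
        line = line.strip()  # removes the trailing "\n" and other edge whitespace
    return PATTERN.sub("\n", line).split("\n")

raw = "MTH|aHST|bENG|c\n"
buggy = split_records(raw, strip_first=False)  # what the original loop did
fixed = split_records(raw)                     # with the renamed variable
print(buggy)
print(fixed)
```

The last element of buggy is the empty string responsible for the blank line after ENG. Note that strip() already removes the trailing newline, so the extra re.sub("\n", "", ...) is harmless but redundant.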

Your source data has spaces after "ENG". Just remove those spaces and you will be fine.

l=re.sub(' ', '', l)

You can use findall to match either of the following patterns:

import re

s = """MTH|lettersandnumbersHST|lettersandnumbersENG|lettersandnumbers
MTH|lettersandnumbersHST|lettersandnumbersENG|lettersandnumbers
MTH|lettersandnumbersHST|"""

# Match a code followed by "|" and a value, or a bare code followed by "|".
r = re.compile(r"([A-Z]+\|[0-9a-z]+|[A-Z]+\|)")
for line in s.splitlines(True):
    print("\n".join(r.findall(line)))
Output:

MTH|lettersandnumbers
HST|lettersandnumbers
ENG|lettersandnumbers
MTH|lettersandnumbers
HST|lettersandnumbers
ENG|lettersandnumbers
MTH|lettersandnumbers
HST|
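The same findall idea can be applied while copying from an input file to an output file. This is only a sketch, assuming Python 3; the function name write_records is hypothetical, and io.StringIO objects stand in for the real files opened in the question.

```python
import io
import re

# Same pattern as above: a code plus "|" plus a value, or a bare code plus "|".
RECORD = re.compile(r"([A-Z]+\|[0-9a-z]+|[A-Z]+\|)")

def write_records(src, dst):
    """Write every record found in src to dst, one per line (hypothetical helper)."""
    for line in src:
        for record in RECORD.findall(line):
            dst.write(record + "\n")

# In-memory stand-ins for the input and output files.
src = io.StringIO("MTH|a1HST|b2ENG|c3\nMTH|d4HST|\n")
dst = io.StringIO()
write_records(src, dst)
print(dst.getvalue(), end="")
```

Because findall only returns the matched records, the newline at the end of each input line is never copied, so no blank lines reach the output.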

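I don't think you are using the right tool.

You probably want a single re.sub call over the whole text, like the one in the example below.

Brief explanation: this captures any of the alternatives MTH, HST or ENG that is not preceded by a \n ([^\n] means "anything except \n"), together with the character before it, and inserts a \n between the two. The result is what you expect.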

For example:

>>> import re
>>> st = """MTH|lettersandnumbersHST|lettersandnumbersENG|lettersandnumbers
... MTH|lettersandnumbersHST|lettersandnumbersENG|lettersandnumbers
... MTH|lettersandnumbersHST|"""
>>> print(re.sub("([^\n])(MTH|HST|ENG)", r"\1\n\2", st))
MTH|lettersandnumbers
HST|lettersandnumbers
ENG|lettersandnumbers
MTH|lettersandnumbers
HST|lettersandnumbers
ENG|lettersandnumbers
MTH|lettersandnumbers
HST|
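Wrapped in a function, the substitution can be run on the whole contents of a file at once, which avoids stripping and re-joining individual lines. A minimal sketch, assuming Python 3; the name break_records and the short sample string are illustrative only.

```python
import re

def break_records(text):
    # Group 1 is the character before the subject code (anything but a newline),
    # group 2 is the code itself; a newline is inserted between the two.
    return re.sub(r"([^\n])(MTH|HST|ENG)", r"\1\n\2", text)

sample = "MTH|aHST|bENG|c\nMTH|dHST|"
result = break_records(sample)
print(result)
```

Codes that already start a line are left alone, because the character before them is either a \n or does not exist, so no extra blank lines are created.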

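l=l.replace("\n", "")
I noticed that you assign to l and then never use it again. Perhaps the second re.sub should be applied to l rather than line?
Yes, that was it. Thanks for catching my silly mistake.
Have you tested this? Because I don't think it works.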