Python: remove the \n and the letters that follow it from the end of words in a list

How can I remove the \n and the letters that follow it? Thanks a lot.

wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']
for x in wordlist:
    ...?
This can be done with re.sub:

>>> help(re.sub)
Help on function sub in module re:

sub(pattern, repl, string, count=0)
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of the pattern in string by the
    replacement repl.  repl can be either a string or a callable;
    if a callable, it's passed the match object and must return
    a replacement string to be used.

You can do this with a regular expression:

import re
wordlist = [re.sub("\n.*", "", word) for word in wordlist]
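The regular expression \n.* matches the first \n and whatever follows it on that line (.*), and re.sub replaces the match with an empty string.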
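A minimal sketch of that substitution applied to the list from the question. The helper name strip_after_newline is made up for illustration and does not appear in the answer.

```python
import re

def strip_after_newline(words):
    """Drop the first newline in each word and everything after it."""
    # '.' does not match a newline by default, so '\n.*' stops at the end of
    # that line; re.sub replaces every match, so any later lines go as well.
    return [re.sub(r"\n.*", "", word) for word in words]

wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']
print(strip_after_newline(wordlist))
# ['Schreiben', 'Schreiben', 'Schreiben', 'Schreiben']
print(strip_after_newline(['a\nb\nc']))
# ['a']
```

Words without a newline pass through unchanged, because the pattern simply finds nothing to replace.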

[w[:w.find('\n')] for w in wordlist]
A few timings:

$ python -m timeit -s "wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']" "[w[:w.find('\n')] for w in wordlist]"
100000 loops, best of 3: 2.03 usec per loop
$ python -m timeit -s "import re; wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']" "[re.sub('\n.*', '', w) for w in wordlist]"
10000 loops, best of 3: 17.5 usec per loop
$ python -m timeit -s "import re; RE = re.compile('\n.*'); wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']" "[RE.sub('', w) for w in wordlist]"
100000 loops, best of 3: 6.76 usec per loop
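The same kind of comparison can be run from inside Python with the timeit module. This is only a sketch: the dictionary of candidates and its labels are assumptions made for illustration, and the absolute numbers depend on the machine and the Python version, so no output is shown.

```python
import re
import timeit

wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']
pattern = re.compile(r"\n.*")
bound_sub = pattern.sub  # look the method up once instead of once per word

candidates = {
    "re.sub": lambda: [re.sub(r"\n.*", "", w) for w in wordlist],
    "compiled": lambda: [pattern.sub("", w) for w in wordlist],
    "bound sub": lambda: [bound_sub("", w) for w in wordlist],
    "split": lambda: [w.split("\n")[0] for w in wordlist],
}

# Check that every candidate produces a result before timing it.
results = {name: func() for name, func in candidates.items()}

loops = 10000
for name, func in candidates.items():
    seconds = timeit.timeit(func, number=loops)
    print(f"{name:10s} {seconds / loops * 1e6:.2f} usec per loop")
```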
Edit:

The solution above is completely wrong (see Peter Hansen's comment). Here is a corrected version:

def truncate(words, s):
    for w in words:
        i = w.find(s)
        yield w[:i] if i != -1 else w
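Because truncate is a generator, it has to be consumed to get a list back. A short usage sketch, repeating the function so the snippet runs on its own:

```python
def truncate(words, s):
    """Yield each word cut off at the first occurrence of s, if there is one."""
    for w in words:
        i = w.find(s)
        # find() returns -1 when s is absent; keep the whole word in that case
        yield w[:i] if i != -1 else w

wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']
print(list(truncate(wordlist, '\n')))
# ['Schreiben', 'Schreiben', 'Schreiben', 'Schreiben']
```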

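I'm not sure, but I guess he also wants to remove the consecutive characters after the \n.

But I also want to remove the \n and the following letters up to the ""! Thank you.

I'm glad you could help me. There are so many functions in Python. You helped a lot. I will certainly read more about the re module now. :)

This is a very bad (i.e. completely untested) answer, because it silently truncates words that contain no newline. str.find() returns -1 when there is no match, and slicing with [:-1] returns every character except the last one. Please remove it.

@Peter Hansen: thanks for the report. I was thinking about how to make it a one-liner and forgot about correctness.

@mg, okay... now please fix the for loop in the edited part. "For w in in words:" has an extra "in".

By using RE = re.compile(...).sub and [RE('', w) ...] you can get a small speedup (~10%): there is no need to look up the sub() method for every word.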
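The slicing pitfall raised in the comments can be reproduced in a few lines. This sketch uses a word without a newline to show what the unguarded slice does:

```python
word = 'Schreiben'        # contains no newline
i = word.find('\n')
print(i)                  # -1
print(word[:i])           # Schreibe  (the last letter is silently lost)

# Guarding against -1 keeps the word intact.
safe = word[:i] if i != -1 else word
print(safe)               # Schreiben
```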
>>> wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']
>>> [ i.split("\n")[0] for i in wordlist ]
['Schreiben', 'Schreiben', 'Schreiben', 'Schreiben']
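The split approach needs no special case for words without a newline, because split always returns at least one element. Two close variants are sketched here for comparison; neither the maxsplit argument nor str.partition appears in the thread, so treat them as suggested alternatives rather than part of the answer.

```python
wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']

# maxsplit=1 stops after the first newline instead of splitting the whole string
by_split = [w.split("\n", 1)[0] for w in wordlist]

# partition returns (head, separator, tail); head is the whole word if no "\n"
by_partition = [w.partition("\n")[0] for w in wordlist]

print(by_split)
print(by_partition)
```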