Python: strip '\n\t\t\t' elements from a list

python regex

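I have the following list of "phone numbers". I am having trouble removing the elements that consist of '\n\t\t\t' and '\n\t\t\t\t'. I tried a try/except approach and remove('\n\t\t\t\t'), but could not get it to work. Any suggestions?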

['(02271) 6 79', ' 70', '\n\t\t\t', '(02271) 6 79', '\n\t\t\t\t', ' 70', '\n\t\t\t', '\n\t\t\t', '(02181) 27 0', '\n\t\t\t\t', '3-0', '\n\t\t\t', '\n\t\t\t', '(02181) 27 0', '\n\t\t\t\t', '3-0', '\n\t\t\t', '\n\t\t\t', '(02161) 24 19', '\n\t\t\t\t', ' 40', '\n\t\t\t', '\n\t\t\t', '(02161) 24 19', '\n\t\t\t\t', ' 40', '\n\t\t\t', '\n\t\t\t', '(02131) 66 67', '\n\t\t\t\t', ' 10', '\n\t\t\t', '\n\t\t\t', '(02131) 66 67', '\n\t\t\t\t', ' 10', '\n\t\t\t', '\n\t\t\t', '(02103) 39 00', '\n\t\t\t\t', ' 93', '\n\t\t\t', '\n\t\t\t', '(02103) 39 00', '\n\t\t\t\t', ' 93', '\n\t\t\t', '\n\t\t\t', '(02173) 2 04 7', '\n\t\t\t\t', '3-0', '\n\t\t\t', '\n\t\t\t', '(02173) 2 04 7', '\n\t\t\t\t', '3-0', '\n\t\t\t', '\n\t\t\t', '(02235) 9 23 04', '\n\t\t\t\t', ' 30', '\n\t\t\t', '\n\t\t\t', '(02235) 9 23 04', '\n\t\t\t\t', ' 30', '\n\t\t\t', '\n\t\t\t', '\n\t\t\t\t', '(0221) 3 46 79 40', '\n\t\t\t', '\n\t\t\t', '\n\t\t\t\t', '(0221) 3 46 79 40', '\n\t\t\t', '\n\t\t\t', '(02232) 4 23', '\n\t\t\t\t', ' 05', '\n\t\t\t', '\n\t\t\t', '(02232) 4 23', '\n\t\t\t\t', ' 05', '\n\t\t\t', '\n\t\t\t', '(0157) 86 85 74', '\n\t\t\t\t', ' 43', '\n\t\t\t', '\n\t\t\t', '(0157) 86 85 74', '\n\t\t\t\t', ' 43', '\n\t\t\t', '\n\t\t\t', '(02181) 2 78 11', '\n\t\t\t\t', ' 47', '\n\t\t\t', '\n\t\t\t', '(02181) 2 78 11', '\n\t\t\t\t', ' 47', '\n\t\t\t', '\n\t\t\t', '(02181) 47 49 0', '\n\t\t\t\t', '0-0', '\n\t\t\t', '\n\t\t\t', '(02181) 47 49 0', '\n\t\t\t\t', '0-0', '\n\t\t\t', '\n\t\t\t', '(02202) 1 88', '\n\t\t\t\t', ' 60', '\n\t\t\t', '\n\t\t\t', '(02202) 1 88', '\n\t\t\t\t', ' 60', '\n\t\t\t', '\n\t\t\t', '(0211) 23 80', '\n\t\t\t\t', ' 70', '\n\t\t\t', '\n\t\t\t', '(0211) 23 80', '\n\t\t\t\t', ' 70', '\n\t\t\t', '\n\t\t\t', '(02235) 9 23 0', '\n\t\t\t\t', '4-0', '\n\t\t\t', '\n\t\t\t', '(02235) 9 23 0', '\n\t\t\t\t', '4-0', '\n\t\t\t']
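
A likely reason the remove('\n\t\t\t\t') attempt fell short: list.remove() deletes only the first element equal to its argument and raises ValueError when there is none. It also compares whole strings, so the three-tab variant is left alone. A minimal sketch of that behaviour, using a shortened, hypothetical sample list (the name `sample` and its contents are illustrative, not the full list from the question):

```python
# Shortened stand-in for the question's list (hypothetical sample data).
sample = ['(02271) 6 79', '\n\t\t\t\t', ' 70', '\n\t\t\t', '\n\t\t\t\t']

# list.remove() deletes only the FIRST element equal to its argument.
sample.remove('\n\t\t\t\t')
print(sample)  # the second '\n\t\t\t\t' and the '\n\t\t\t' entry are still there

# Calling remove() in a loop until ValueError removes every exact match,
# but each distinct whitespace string needs its own pass.
for junk in ('\n\t\t\t', '\n\t\t\t\t'):
    while True:
        try:
            sample.remove(junk)
        except ValueError:
            break  # no more occurrences of this string

print(sample)
```

This works, but it has to know every unwanted string in advance, which is why the answers filter on a property of the string instead.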

Try it like this:

result = [i for i in lst if not i.endswith('\t\t')]
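
To see what this filter keeps and drops, here is a small sketch on a hypothetical sample (the list `sample` is made up for illustration). It relies on every unwanted element ending in at least two tabs. That holds for the list in the question, but it is an assumption about the data in general.

```python
# Hypothetical sample; the last element is a made-up edge case.
sample = ['(02271) 6 79', '\n\t\t\t', ' 70', '\n\t\t\t\t', '3-0\t\t']

# Keep only the strings that do not end in two tab characters.
result = [i for i in sample if not i.endswith('\t\t')]
print(result)

# Caveat: '3-0\t\t' is dropped as well, although it contains a number,
# because str.endswith() only looks at the tail of the string.
```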

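You can use a list comprehension to build a list of strings in which every string must pass the test that none of its characters (c) is in '\t\n'. Any string that contains a tab or a newline is dropped. I think this is the most efficient general solution for strings that contain only tabs and newlines, and it is also very readable in Python: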
[i for i in lst if all(c not in '\t\n' for c in i)]

This gives the correct result:
['(02271) 6 79', ' 70', '(02271) 6 79', ' 70', '(02181) 27 0', '3-0', '(02181) 27 0', '3-0', '(02161) 24 19', ' 40', '(02161) 24 19', ' 40', '(02131) 66 67', ' 10', '(02131) 66 67', ' 10', '(02103) 39 00', ' 93', '(02103) 39 00', ' 93', '(02173) 2 04 7', '3-0', '(02173) 2 04 7', '3-0', '(02235) 9 23 04', ' 30', '(02235) 9 23 04', ' 30', '(0221) 3 46 79 40', '(0221) 3 46 79 40', '(02232) 4 23', ' 05', '(02232) 4 23', ' 05', '(0157) 86 85 74', ' 43', '(0157) 86 85 74', ' 43', '(02181) 2 78 11', ' 47', '(02181) 2 78 11', ' 47', '(02181) 47 49 0', '0-0', '(02181) 47 49 0', '0-0', '(02202) 1 88', ' 60', '(02202) 1 88', ' 60', '(0211) 23 80', ' 70', '(0211) 23 80', ' 70', '(02235) 9 23 0', '4-0', '(02235) 9 23 0', '4-0']


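You could also use the shorter version below, although it might (I'm not 100% sure) be slightly slower because it checks for all whitespace characters: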
[i for i in lst if not i.isspace()]

This gives the same result.
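
The two comprehensions agree on the list from the question, but they are not equivalent in general. The sketch below runs both on a hypothetical sample with a few made-up edge cases (a spaces-only string, an empty string, and a number with an embedded tab) to show where they differ.

```python
# Hypothetical sample with edge cases that are not in the original list.
sample = ['(0221) 3 46 79 40', '\n\t\t\t', ' 40', '   ', '', '(0211)\t23 80']

# Variant 1: drop any string that contains a tab or a newline anywhere.
no_tab_or_newline = [i for i in sample if all(c not in '\t\n' for c in i)]

# Variant 2: drop strings that consist only of whitespace characters.
not_only_space = [i for i in sample if not i.isspace()]

print(no_tab_or_newline)
print(not_only_space)
```

Variant 1 keeps the spaces-only string and drops the number with the embedded tab. Variant 2 does the opposite. Both keep the empty string, because all() over no characters is True and ''.isspace() is False.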
You can use a simple regular expression such as ^\s+$ in Python:
import re

lst = ['(02271) 6 79', ' 70', '\n\t\t\t', '(02271) 6 79', '\n\t\t\t\t', ' 70', '\n\t\t\t', '\n\t\t\t', '(02181) 27 0', '\n\t\t\t\t', '3-0', '\n\t\t\t', '\n\t\t\t', '(02181) 27 0', '\n\t\t\t\t', '3-0', '\n\t\t\t', '\n\t\t\t', '(02161) 24 19', '\n\t\t\t\t', ' 40', '\n\t\t\t', '\n\t\t\t', '(02161) 24 19', '\n\t\t\t\t', ' 40', '\n\t\t\t', '\n\t\t\t', '(02131) 66 67', '\n\t\t\t\t', ' 10', '\n\t\t\t', '\n\t\t\t', '(02131) 66 67', '\n\t\t\t\t', ' 10', '\n\t\t\t', '\n\t\t\t', '(02103) 39 00', '\n\t\t\t\t', ' 93', '\n\t\t\t', '\n\t\t\t', '(02103) 39 00', '\n\t\t\t\t', ' 93', '\n\t\t\t', '\n\t\t\t', '(02173) 2 04 7', '\n\t\t\t\t', '3-0', '\n\t\t\t', '\n\t\t\t', '(02173) 2 04 7', '\n\t\t\t\t', '3-0', '\n\t\t\t', '\n\t\t\t', '(02235) 9 23 04', '\n\t\t\t\t', ' 30', '\n\t\t\t', '\n\t\t\t', '(02235) 9 23 04', '\n\t\t\t\t', ' 30', '\n\t\t\t', '\n\t\t\t', '\n\t\t\t\t', '(0221) 3 46 79 40', '\n\t\t\t', '\n\t\t\t', '\n\t\t\t\t', '(0221) 3 46 79 40', '\n\t\t\t', '\n\t\t\t', '(02232) 4 23', '\n\t\t\t\t', ' 05', '\n\t\t\t', '\n\t\t\t', '(02232) 4 23', '\n\t\t\t\t', ' 05', '\n\t\t\t', '\n\t\t\t', '(0157) 86 85 74', '\n\t\t\t\t', ' 43', '\n\t\t\t', '\n\t\t\t', '(0157) 86 85 74', '\n\t\t\t\t', ' 43', '\n\t\t\t', '\n\t\t\t', '(02181) 2 78 11', '\n\t\t\t\t', ' 47', '\n\t\t\t', '\n\t\t\t', '(02181) 2 78 11', '\n\t\t\t\t', ' 47', '\n\t\t\t', '\n\t\t\t', '(02181) 47 49 0', '\n\t\t\t\t', '0-0', '\n\t\t\t', '\n\t\t\t', '(02181) 47 49 0', '\n\t\t\t\t', '0-0', '\n\t\t\t', '\n\t\t\t', '(02202) 1 88', '\n\t\t\t\t', ' 60', '\n\t\t\t', '\n\t\t\t', '(02202) 1 88', '\n\t\t\t\t', ' 60', '\n\t\t\t', '\n\t\t\t', '(0211) 23 80', '\n\t\t\t\t', ' 70', '\n\t\t\t', '\n\t\t\t', '(0211) 23 80', '\n\t\t\t\t', ' 70', '\n\t\t\t', '\n\t\t\t', '(02235) 9 23 0', '\n\t\t\t\t', '4-0', '\n\t\t\t', '\n\t\t\t', '(02235) 9 23 0', '\n\t\t\t\t', '4-0', '\n\t\t\t']

rx = re.compile(r'^\s+$')

lst = [item.strip() for item in lst if not rx.match(item)]
print(lst)

This yields every item that is not just whitespace, with leading and trailing whitespace stripped:
['(02271) 6 79', '70', '(02271) 6 79', '70', '(02181) 27 0', '3-0', '(02181) 27 0', '3-0', '(02161) 24 19', '40', '(02161) 24 19', '40', '(02131) 66 67', '10', '(02131) 66 67', '10', '(02103) 39 00', '93', '(02103) 39 00', '93', '(02173) 2 04 7', '3-0', '(02173) 2 04 7', '3-0', '(02235) 9 23 04', '30', '(02235) 9 23 04', '30', '(0221) 3 46 79 40', '(0221) 3 46 79 40', '(02232) 4 23', '05', '(02232) 4 23', '05', '(0157) 86 85 74', '43', '(0157) 86 85 74', '43', '(02181) 2 78 11', '47', '(02181) 2 78 11', '47', '(02181) 47 49 0', '0-0', '(02181) 47 49 0', '0-0', '(02202) 1 88', '60', '(02202) 1 88', '60', '(0211) 23 80', '70', '(0211) 23 80', '70', '(02235) 9 23 0', '4-0', '(02235) 9 23 0', '4-0']
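
As a variation on the same idea (not part of the original answer), re.fullmatch can express "the whole string is whitespace" without the ^ and $ anchors. A minimal sketch on a shortened, hypothetical sample list:

```python
import re

# Shortened stand-in for the question's list (hypothetical sample data).
sample = ['(02161) 24 19', '\n\t\t\t\t', ' 40', '\n\t\t\t']

# fullmatch() succeeds only if the entire string matches the pattern,
# so no ^ or $ anchors are needed.
whitespace_only = re.compile(r'\s+')

cleaned = [item.strip() for item in sample if not whitespace_only.fullmatch(item)]
print(cleaned)
```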


As @dawg pointed out, a regular expression is not actually needed:
lst = [number for item in lst for number in [item.strip()] if number]
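
The inner `for number in [item.strip()]` is a trick for binding the stripped value to a name so that it can be both tested and returned. A sketch of an equivalent spelling with map(), shown on a hypothetical sample list (`nested` and `mapped` are illustrative names):

```python
# Shortened stand-in for the question's list (hypothetical sample data).
sample = ['(02235) 9 23 0', '\n\t\t\t\t', '4-0', '\n\t\t\t', ' 70']

# Original form: the single-element inner list binds the stripped value to a name.
nested = [number for item in sample for number in [item.strip()] if number]

# Same idea: strip every item lazily with map(), then keep the non-empty results.
mapped = [number for number in map(str.strip, sample) if number]

print(nested)
print(mapped)
```

Both forms call strip() once per item. A whitespace-only string strips down to '', which is falsy and is therefore filtered out.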

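Post what you have tried, and someone may be able to help you fix it.

Perhaps you should not be removing these items at all, but should instead change the code that generates the list so that they are not inserted in the first place. How is the list generated?

@Bryan Oakley I first render the page with Qt and then use lxml to extract the list via tree.xpath: phonenumbers = tree.xpath('//span[@class="text numer_ganz"]//text()') -- the website is: [link missing]

str.strip() will remove all of the '\n\t\t\t\t'. So you can do [e for e in ur_lst if e.strip()] to filter out all the elements that are only whitespace. No regex needed.

@dawg: Good point, although I would even use lst = [number for item in lst for number in [item.strip()] if number] to strip the items in the resulting list as well. I have updated my answer.

Thanks for all your answers. I tried every one of them, but none worked for me. Maybe my list is not a "real" list?

@Jan It works when I use the list "lst", but when I write lst = phonenumbers it does not work... My list is created by first rendering the page with Qt and then using lxml via tree.xpath: phonenumbers = tree.xpath('//span[@class="text numer_ganz"]//text()'), and the website is: gelbeseiten.de/schluesselfertigbau/bergheim,,,,,,,umkreis-50000/s1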
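
The thread does not say why the answers failed on `phonenumbers`. One possibility, and it is only a guess, is that the result of the comprehension was never assigned: a list comprehension builds a new list and leaves the original untouched. The sketch below uses a hypothetical stand-in for the scraped list and shows two ways to make the change stick.

```python
# Hypothetical stand-in for the list returned by tree.xpath(...).
phonenumbers = ['(0211) 23 80', '\n\t\t\t\t', ' 70', '\n\t\t\t']

# This line alone changes nothing: the new list is built and then discarded.
[e.strip() for e in phonenumbers if e.strip()]
print(len(phonenumbers))  # still 4

# Bind the filtered list to a name ...
cleaned = [e.strip() for e in phonenumbers if e.strip()]

# ... or replace the contents in place with slice assignment, so that
# every other reference to the same list object sees the change.
phonenumbers[:] = cleaned
print(phonenumbers)
```

If the filtered list still looks wrong, printing repr() and type() of a few elements of the real list may help to check what the XPath query actually returned.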