Python: how to print only the items in a list that contain 3 to 5 digits and one letter


python regex


I have a list:

 ['Product', '98', '100K', 'fifa15-ps-100k', 'K', 'gold', 'Product', '99', '200K', 'fifa15-ps-200k', 'K', 'gold', 'Product', '197', '300K', 'fifa15-ps-300k', 'K', 'gold', 'Product', '198', '400K', 'fifa15-ps-400k', 'K', 'gold', 'Product', '100', '500K', 'fifa15-ps-500k', 'K', 'gold', 'Product', '199', '600K', 'fifa15-ps-600k', 'K', 'gold', 'Product', '200', '700K', 'fifa15-ps-700k', 'K', 'gold', 'Product', '201', '800K', 'fifa15-ps-800k', 'K', 'gold', 'Product', '202', '900K', 'fifa15-ps-900k', 'K', 'gold', 'Product', '122', '1000K', 'fifa15-ps-1000k', 'K', 'gold', 'Product', '235', '1500K', 'fifa15-ps-1500k', 'K', 'gold', 'Product', '125', '2000K', 'fifa15-ps-2000k', 'K', 'gold', 'Product', '208', '3000K', 'fifa15-ps-3000k', 'K', 'gold', 'Product', '209', '4000K', 'fifa15-ps-4000k', 'K', 'gold', 'Product', '126', '5000K', 'fifa15-ps-5000k', 'K', 'gold', 'Product', '216', '7000K', 'fifa15-ps-7000k', 'K', 'gold', 'Product', '215', '10000K', 'fifa15-ps-10000k', 'K', 'gold', 'Product']

I want to print only the items in the list that consist of 3 to 5 digits and one letter. For example:

[100k, 200k, 300k]

I tried using a regex but got nowhere; I only know really simple regular expressions. A pointer in the right direction would be enough.

import re

# bound method; same as
# wanted = lambda s: re.match(r"^\d{3,5}[a-z]$", s, re.I)
# (raw strings keep \d from being treated as a string escape)
wanted = re.compile(r"^\d{3,5}[a-z]$", re.I).match

# breaking down the regex:
#   ^        starts at the beginning of the string
#            (redundant because .match does that anyway,
#             but I like to make it explicit)
#   \d{3,5}  3 to 5 digits
#   [a-z]    any letter (re.I makes it case insensitive,
#              so it will also match A-Z)
#   $        goes right to the end of the string

data = ['Product', '98', '100K', 'fifa15-ps-100k', 'K', 'gold', 'Product', '99', '200K', 'fifa15-ps-200k', 'K', 'gold', 'Product', '197', '300K', 'fifa15-ps-300k', 'K', 'gold', 'Product', '198', '400K', 'fifa15-ps-400k', 'K', 'gold', 'Product', '100', '500K', 'fifa15-ps-500k', 'K', 'gold', 'Product', '199', '600K', 'fifa15-ps-600k', 'K', 'gold', 'Product', '200', '700K', 'fifa15-ps-700k', 'K', 'gold', 'Product', '201', '800K', 'fifa15-ps-800k', 'K', 'gold', 'Product', '202', '900K', 'fifa15-ps-900k', 'K', 'gold', 'Product', '122', '1000K', 'fifa15-ps-1000k', 'K', 'gold', 'Product', '235', '1500K', 'fifa15-ps-1500k', 'K', 'gold', 'Product', '125', '2000K', 'fifa15-ps-2000k', 'K', 'gold', 'Product', '208', '3000K', 'fifa15-ps-3000k', 'K', 'gold', 'Product', '209', '4000K', 'fifa15-ps-4000k', 'K', 'gold', 'Product', '126', '5000K', 'fifa15-ps-5000k', 'K', 'gold', 'Product', '216', '7000K', 'fifa15-ps-7000k', 'K', 'gold', 'Product', '215', '10000K', 'fifa15-ps-10000k', 'K', 'gold', 'Product']
res = [d for d in data if wanted(d)]

print(res)

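which gives

['100K', '200K', '300K', '400K', '500K', '600K', '700K', '800K', '900K', '1000K', '1500K', '2000K', '3000K', '4000K', '5000K', '7000K', '10000K']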
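As a variation on the anchored pattern, `re.fullmatch` (available since Python 3.4) requires the whole string to match, so the `^` and `$` anchors can be dropped. The following is only a sketch on a shortened, made-up sample list, not the answer's original code; `pattern` and `sample` are illustrative names.

```python
import re

# fullmatch must consume the entire string, so no ^ or $ is needed
pattern = re.compile(r"\d{3,5}[a-z]", re.IGNORECASE)

# Shortened, hypothetical sample (the last item has six digits)
sample = ['Product', '98', '100K', 'fifa15-ps-100k', 'K', 'gold', '10000K', '123456K']

res = [s for s in sample if pattern.fullmatch(s)]
print(res)
```

One practical difference: `$` also matches just before a trailing newline, so the anchored `re.match` version accepts `'100K\n'`, while `fullmatch` rejects it.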

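As the other answers note, you can use re. However, it looks like you have a flat list of 17 rows by 6 columns and are trying to retrieve the 3rd column of each row, so you can simply slice it, for example: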

>>> data[2::6]
['100K', '200K', '300K', '400K', '500K', '600K', '700K', '800K', '900K', '1000K', '1500K', '2000K', '3000K', '4000K', '5000K', '7000K', '10000K']
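To make the stride slice more concrete, here is a small sketch on a shortened sample with the same six-column layout. The names `flat`, `width`, and `rows` are illustrative, and rebuilding the rows is an optional extra step, not something the answer requires.

```python
# Shortened, hypothetical sample with the same 6-column layout
flat = ['Product', '98', '100K', 'fifa15-ps-100k', 'K', 'gold',
        'Product', '99', '200K', 'fifa15-ps-200k', 'K', 'gold',
        'Product']  # trailing partial row, as in the question's data

width = 6

# Start at index 2 (the 3rd column) and step one full row at a time
third_column = flat[2::width]
print(third_column)

# Optionally rebuild the complete rows, dropping the trailing partial row
full = len(flat) - len(flat) % width
rows = [flat[i:i + width] for i in range(0, full, width)]
print([row[2] for row in rows])
```

Slicing relies entirely on the layout being regular; if a row is ever missing a field, the pattern-based approaches are the safer choice.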

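If you really, really, really want only the items that contain 3 to 5 digits and exactly one letter, in any order, then you can classify each character, count the categories, and make the appropriate checks, for example: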

from collections import Counter
from unicodedata import category

def matches(text):
    counts = Counter(category(ch)[0] for ch in text)
    return (
        3 <= counts['N'] <= 5 # between 3-5 numbers
        and counts['L'] == 1 # has a single letter
        and set(counts) <= {'N', 'L'} # doesn't contain anything else
    )
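A short usage sketch may help show what the category-counting check accepts. It repeats the function so the block runs on its own, and it assumes a subset test (`set(counts) <= {'N', 'L'}`) for the "nothing else" condition; the `sample` list is made up for illustration.

```python
from collections import Counter
from unicodedata import category

def matches(text):
    # First letter of the Unicode category: 'N' = number, 'L' = letter, ...
    counts = Counter(category(ch)[0] for ch in text)
    return (
        3 <= counts['N'] <= 5              # 3 to 5 numeric characters
        and counts['L'] == 1               # exactly one letter
        and set(counts) <= {'N', 'L'}      # assumption: reject any other category
    )

# Hypothetical sample, including items where the letter is not last
sample = ['100K', 'K100', '12K34', '98', '98K3', 'fifa15-ps-100k', '100-K', '123456K']
print([s for s in sample if matches(s)])
```

Note that Unicode category `N` is broader than ASCII digits (it also covers characters such as superscript digits), so this check is more permissive than `[0-9]`.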

Without using re, in case items such as "98K3" and "K100" can occur:

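from string import ascii_letters, digits

# Python 3: str.translate takes a table; these tables delete digits / ASCII letters.
# l is the list from the question.
no_digits, no_letters = str.maketrans('', '', digits), str.maketrans('', '', ascii_letters)
print([ele for ele in l if 3 <= len(ele) - len(ele.translate(no_digits)) <= 5 and len(ele) - len(ele.translate(no_letters)) == 1])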

['100K', '200K', '300K', '400K', '500K', '600K', '700K', '800K', '900K', '1000K', '1500K', '2000K', '3000K', '4000K', '5000K', '7000K', '10000K']
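Because `str.translate` takes different arguments in Python 2 and Python 3, a version-neutral way to get the same counts is to tally characters directly. This is a sketch of that alternative, not the answer's method; `has_digits_and_one_letter` and its `low`/`high` parameters are hypothetical names.

```python
from string import ascii_letters, digits

def has_digits_and_one_letter(text, low=3, high=5):
    """Hypothetical helper: count ASCII digits and letters, in any order."""
    n_digits = sum(ch in digits for ch in text)
    n_letters = sum(ch in ascii_letters for ch in text)
    return low <= n_digits <= high and n_letters == 1

# Made-up sample covering the "98K3" and "K100" cases
sample = ['98', '100K', 'K100', '98K3', 'fifa15-ps-100k', 'gold', '10000K']
print([s for s in sample if has_digits_and_one_letter(s)])
```

Like the `translate` version, this only counts digits and letters; it does not reject items that also contain other characters such as hyphens.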

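What about "12K34"?

I don't see how this is unclear to you guys…

@AshwiniChaudhary Here it is: [x for x in lst if re.match(r'(?m)^(?=(?:\D*\d){3,5}\D*$)(?=[^a-zA-Z]*[a-zA-Z][^a-zA-Z]*$).*', x)]

@AvinashRaj I voted to close because, as you can see, people have posted two sets of answers: one assumes the letter always follows the digits, and the other works like my count-based comment. So that's why it's unclear. If you want me to reopen, I'm perfectly fine with that.

The regex might be better as (?i)\d{3,5}[a-z], and you may want to consider using re.match, so you don't have to anchor the start of the match, rather than re.findall.

Thanks for the tip. I'm still a bit shaky on regexes since I don't use them often; I know them just well enough to cobble them together.

Out of curiosity, what is the (?i) at the start of the pattern?

You could also use filter(p.match, l).

@Cyber (?i) is the same as passing flags=re.I to the re object: it makes the expression case-insensitive. There's nothing wrong with using [a-zA-Z] though, which might make more sense in this case.

@RushyPanchal You can, but a list comp produces the desired result in both 2.x and 3.x. In 3.x, filter no longer returns a list, so it could produce surprising results.

Probably the best way is matches = re.compile(r'\d{3,5}[a-zA-Z]').match, then [el for el in some_list if matches(el)], to save some name lookups etc.

If the question is interpreted literally, this is probably the best solution; the regexes won't work on K300 and the like.

You may be interested in my updated answer, where I do this as well, but in a different way.

No doubt this is faster. translate is fast, but unfortunately its usage differs between 2.x and 3.x.

@JonClements, I was going to use a Counter, but then got lazy. Looks neat, and you already have my +1 ;)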
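Several points from the comments can be checked in a few lines. The sketch below assumes Python 3 and a made-up `items` list: it compares the inline `(?i)` flag with `flags=re.I`, shows that `filter` needs `list()` to produce a list, and tries a lookahead-based pattern that counts digits and letters regardless of their order. The variable names are illustrative only.

```python
import re

# Hypothetical sample, including items where the letter is not last
items = ['Product', '98', '100K', 'fifa15-ps-100k', 'K100', '12K34', '200k']

# (?i) at the start of the pattern is equivalent to passing flags=re.I
inline = re.compile(r'(?i)\d{3,5}[a-z]$')
flagged = re.compile(r'\d{3,5}[a-z]$', flags=re.I)
same = all(bool(inline.match(s)) == bool(flagged.match(s)) for s in items)
print(same)

# In Python 3, filter() returns a lazy iterator, so wrap it in list()
ordered = list(filter(inline.match, items))
print(ordered)

# Two lookaheads: 3 to 5 digits anywhere, and exactly one ASCII letter
any_order = re.compile(
    r'(?=(?:\D*\d){3,5}\D*$)'
    r'(?=[^a-zA-Z]*[a-zA-Z][^a-zA-Z]*$)'
)
unordered = [s for s in items if any_order.match(s)]
print(unordered)
```

The lookahead pattern only counts digits and letters, so an item containing extra punctuation (for example `'100-K'`) would still pass; add a character-class check if that matters.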
