Python strings: find a specific word, then copy the word that follows it

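I ran into a problem with the following. Say we have a string:

teststring = "This is a test of number, number: 525, number: 585, number2: 559"

I want to store 525 and 585 in a list. How can I do that?

I used a really clumsy approach that works, but there must be a better way: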

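words = teststring.split()
templist = []
found = False
for word in words:
    if found:
        # The previous word was the key, so keep this one (minus any trailing comma).
        templist.append(word.rstrip(","))
        found = False
    # Compare strings with ==, not "is" (identity is not equality).
    if word == "number:":
        found = True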

Is there a solution using regular expressions?

Follow-up: what if I want to store 525, 585, and 559?

This isn't the most efficient code in the world, but it is still probably better than a regex:

tokens = teststring.split()
numlist = [val for key, val in zip(tokens, tokens[1:]) if key == 'number:']

For the follow-up and more general queries:

def find_next_tokens(teststring, test):
    tokens = teststring.split()
    return [val for key, val in zip(tokens, tokens[1:]) if test(key)]

Which can be called as:

find_next_tokens(teststring, lambda s: s.startswith('number') and s.endswith(':'))

This helps if the keys to search for come from user input:

find_next_tokens(teststring, lambda s: s in valid_keys)
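
One thing to watch with the split-based approach: the tokens keep their trailing commas, so the values come back as '525,' rather than '525'. Here is a minimal runnable sketch of the same idea with the comma stripped. The valid_keys set is a hypothetical stand-in for keys collected from user input, and the rstrip call is my addition, not part of the answer above.

```python
def find_next_tokens(text, test):
    """Return the token that follows each whitespace-separated token accepted by test."""
    tokens = text.split()
    # Pair every token with its successor; strip a trailing comma from the value.
    return [val.rstrip(",") for key, val in zip(tokens, tokens[1:]) if test(key)]


teststring = "This is a test of number, number: 525, number: 585, number2: 559"

# Hypothetical key set, standing in for keys supplied by a user.
valid_keys = {"number:", "number2:"}

only_number = find_next_tokens(teststring, lambda s: s == "number:")
all_keys = find_next_tokens(teststring, lambda s: s in valid_keys)

print(only_number)  # ['525', '585']
print(all_keys)     # ['525', '585', '559']
```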

You can use regex groups to do this. Here is some sample code:

import re
teststring = "This is a test of number, number: 525, number: 585, number2: 559"
groups = re.findall(r"number2?: (\d{3})", teststring)

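groups then contains the numbers as strings. This syntax uses a regex capturing group: findall returns only the part of each match inside the parentheses.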
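
Note that a pattern with 2? matches both number: and number2:, which covers the follow-up rather than the original question. A small sketch of the two variants side by side follows. Using \d+ instead of \d{3} and adding the \b word boundary are my assumptions, not part of the answer.

```python
import re

teststring = "This is a test of number, number: 525, number: 585, number2: 559"

# Original question: only the literal key "number:".
# "number2:" does not match because the colon must follow "number" directly.
only_number = re.findall(r"\bnumber: (\d+)", teststring)

# Follow-up: "number:" or "number2:"; 2? makes the digit 2 optional.
with_number2 = re.findall(r"\bnumber2?: (\d+)", teststring)

print(only_number)   # ['525', '585']
print(with_number2)  # ['525', '585', '559']
```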

You can try the following:

import re
[int(x) for x in re.findall(r' \d+', teststring)]

This will give you:

[525, 585, 559]

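Use the re module. In the pattern:

\d is any digit [0-9]

* means zero or more times

() marks what to capture

+ means one or more times

If you need to convert the resulting strings to ints, use map(int, ...) wrapped in list(), or a list comprehension.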
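
To make the pieces above concrete, here is a hedged sketch that combines them. The pattern number\d*: (\d+) is my own illustration of how \d, *, + and () fit together; it is an assumption, not necessarily the pattern the answer used.

```python
import re

teststring = "This is a test of number, number: 525, number: 585, number2: 559"

# number   literal text
# \d*      zero or more digits, so both "number" and "number2" are accepted
# ": "     literal colon and space
# (\d+)    capture one or more digits; findall returns only this group
matches = re.findall(r"number\d*: (\d+)", teststring)

# Convert the captured strings to ints.
numbers = [int(m) for m in matches]
print(numbers)  # [525, 585, 559]

# {3} instead of + demands exactly three digits (the \b stops partial matches).
exactly_three = re.findall(r": (\d{3})\b", "a: 12, b: 345, c: 6789")
print(exactly_three)  # ['345']
```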

I suggest:

teststring = "This is a test of number, number: 525, number: 585, number2: 559"
# Split on ": " and drop the text before the first key:
# -> ["525, number", "585, number2", "559"]
a = teststring.split(': ')[1:]
# Split each piece on the comma:
# -> [["525", " number"], ["585", " number2"], ["559"]]
b = [i.split(',') for i in a]
# Keep the first element of each piece:
# -> ["525", "585", "559"]
c = [i[0] for i in b]
print(c)
# ['525', '585', '559']

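This only answers half the question, unless he searched for all 3-digit numbers followed by a comma, which would also answer the original question. Only 525 and 585 are needed.

Thanks. Can you explain findall(r"number2?: (\d{3})", ...)?

@Guagua number is self-explanatory, 2? matches zero or one occurrence of 2, the colon and space match a colon and a space, and (\d{3}) matches and captures three digits.

Why use a regex here? A simple list comprehension would be faster and more readable.

@tobyodavies Wouldn't you need extra steps, like splitting and parsing the numbers?

Hi ovgolovin, thanks for your answer. Very clear. \d+ means it can capture any number of digits, right?

@Guagua Yes, + means from 1 to infinity. If you need exactly 3, use {3} instead of +.

I know this is a silly question, but what is the r in re.findall(r'...')?

@Guagua See the second and third paragraphs at the top of this documentation page, which explain what raw strings are and why they are needed.

Yes, yes, I read it a while ago but forgot. Very useful, thanks.

Please explain why this solution is better than a regex. Efficiency, or something else?

Regexes are hard to understand and debug. If you don't need them, I suggest you don't use them. I find this much easier than using a regex and then interpreting the match data. If you want to take a list of keys from the user and parse with a regex, good luck.

What does the [1:] do?

@Guagua If a = [a, b, c, d], then a[1:] = [b, c, d].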
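
Two points from the comments, raw strings and [1:] slicing, can be checked quickly in code. This is a small illustrative sketch; the sample strings and variable names are my own.

```python
import re

# A raw string keeps backslashes as-is, so r"\n" is two characters, not a newline.
lengths = (len("\n"), len(r"\n"))
print(lengths)  # (1, 2)

# In a normal string "\b" is a backspace character, so the pattern silently fails;
# in a raw string it reaches the regex engine as a word boundary.
raw_hits = re.findall(r"\bnumber\b", "number, number2")
plain_hits = re.findall("\bnumber\b", "number, number2")
print(raw_hits)    # ['number']
print(plain_hits)  # []

# Slicing with [1:] returns everything except the first element.
items = ["a", "b", "c", "d"]
rest = items[1:]
print(rest)  # ['b', 'c', 'd']
```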
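The int-conversion snippets referred to above (on Python 3, map returns an iterator, so it is wrapped in list()):

>>> list(map(int, ['525', '585', '559']))
[525, 585, 559]

>>> [int(s) for s in ['525', '585', '559']]
[525, 585, 559]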