Match lines containing a specific string to extract values with a Python regex

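I'm having trouble finding the right regex for this task, so please excuse my beginner skills. All I want is the id value from each line that has "available":true, not "available":false. I can already get the ids of all lines with re.findall(r'"id":\d{13}', line, re.DOTALL). The {13} matches exactly 13 digits, because the data also contains other ids with fewer than 13 digits that I don't need.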

{"id":1351572979731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1351572329731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1351572943231,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1651572973431,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
So the final result needs to be ['1651572973431','1351572943231']


Thanks for your help.

Here, we can simply use "id": as the left boundary and collect the digits we want in a capturing group:

"id":([0-9]+)
Then we can keep adding boundaries. For example, if exactly 13 digits are required, we can simply use:

\"id\":([0-9]{13})
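To also honor the "available":true condition from the question, the same capturing group can be followed by a check that stays inside the current object. The snippet below is only a sketch: it assumes "available" always comes after "id" within each object and that objects are not nested, and the names text, PATTERN and available_ids are made up for illustration.

```python
import re

# Sample rows copied from the question, joined into one string.
text = "\n".join([
    '{"id":1351572979731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},',
    '{"id":1351572329731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},',
    '{"id":1351572943231,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},',
    '{"id":1651572973431,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},',
])

# "id":(\d{13})\b    capture an id of exactly 13 digits
# [^{}]*?            move forward without leaving the current {...} object
# "available":true   require the flag on that same object
PATTERN = re.compile(r'"id":(\d{13})\b[^{}]*?"available":true')

available_ids = PATTERN.findall(text)
print(available_ids)  # ['1351572943231', '1651572973431']
```

Because [^{}] cannot cross a brace, a row with "available":false never borrows the flag from a later row. The ids come back as strings, in the order they appear in the input.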

This matches what you want.


This should do what you need.


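This may not be a good answer; it all depends on what you have. It looks like you have a list of strings and want the id from some of them. If that's the case, it will be cleaner and easier to read if you parse the JSON rather than write a byzantine regex. For example: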

import json

# lines is a list of strings:

lines = ['{"id":1351572979731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}',
'{"id":1351572329731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}',
'{"id":1351572943231,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}',
'{"id":1651572973431,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}',
]

# parse it and you can use regular python to get what you want:
[line['id'] for line in map(json.loads, lines) if line['available']]
Result:

[1351572943231, 1651572973431]

If the code you posted is one long string, you can wrap it in [] and parse it as an array, with the same result:

import json

line = r'{"id":1351572979731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}, {"id":1351572329731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}, {"id":1351572943231,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},{"id":1651572973431,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}'

lines = json.loads('[' + line + ']')
[line['id'] for line in lines if line['available']]
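If the data is closer to what the question shows, with one object per line and a trailing comma on each, a line-by-line variant may be more forgiving. This is only a sketch under that assumption: the helper name ids_of_available_rows is hypothetical, and it silently skips any line that does not parse as a JSON object.

```python
import json

raw = """{"id":1351572979731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1351572329731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1351572943231,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1651572973431,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
"""


def ids_of_available_rows(text, digits=13):
    """Return ids of rows with "available": true and an id of `digits` digits."""
    ids = []
    for row in text.splitlines():
        row = row.strip().rstrip(",")  # drop the trailing comma after each object
        if not row:
            continue
        try:
            obj = json.loads(row)
        except json.JSONDecodeError:
            continue  # not valid JSON on its own; ignore this line
        if not isinstance(obj, dict):
            continue  # only JSON objects can carry the keys we need
        if obj.get("available") is True and len(str(obj.get("id"))) == digits:
            ids.append(obj["id"])
    return ids


print(ids_of_available_rows(raw))  # [1351572943231, 1651572973431]
```

Unlike the regex approach, this returns the ids as integers and does not depend on the order of the keys inside each object.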


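Why use a regex at all? Is there a reason you aren't parsing the JSON?

@ggorlen please put the original code back, because the result is not the same as what is in the code.

If I inadvertently conflicted with your intent, you're welcome to roll it back, but if your original structure is a string, please use quotes. Are you saying that, yes, this is the raw string? If you're asking for help with a string-parsing task, please post the exact string with quotes around it to avoid ambiguity.

@sakow0 I think there's some confusion because it isn't clear whether the code above represents a single string or a list of strings. Your regex looks like it operates on a variable named line. Is line one of these rows, or all of them?

The OP only needs the lines that match a specific condition.

Thanks for the response emma, I need to match the available:true condition.

Thanks for your effort @Mark Meyer. If the data were complete JSON this would be a perfect answer. It's my fault for not explaining it well, but I definitely learned something from it, thank you very much!

Can you explain what the third part of the regex is doing?

@PIG - I added more. What exactly are you having trouble understanding?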