Python: looping through values in a string

python python-2.7

I am trying to extract a number of values from a fairly complex string, which looks like this:

s = '04/03 23:50:06:242[76:Health]: (mem=188094936/17146904576) Queue Size[=:+:-] : Core[Compiler:0:0:0,HighPriority:0:74:74,Default:6:1872:1874,LowPriority:0:2:2]:Special[Special:0:2:2]:Event[Event:0:0:0]:Comm[CommHigh:0:1134:1152,CommDefault:0:4:4]'

These are the names I need to scan for:

list = ['Compiler', 'HighPriority', 'Default', 'LowPriority', 'Special', 'Event', 'CommHigh', 'CommDefault']

My goal is to get the 3 numbers that follow each name, so for HighPriority I would get [0,74,74], and then I can run some operation on each item.

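I used the function below, but it does not handle the cases where the numbers are followed by something other than a comma (for example a closing bracket).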

def find_between( s, first, last ):
    try:
        start = s.index( first ) + len( first )
        end = s.index( last, start )
        return s[start:end]
    except ValueError:
        return ""


for l in list:
    print l
    print find_between( s, l + ':', ',' ).split(':')
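
To make the failure concrete, here is a small reproduction. The shortened sample string is an assumption made for brevity; it follows the same layout as the real one. It shows both ways the comma delimiter goes wrong: the slice runs on to a later comma, or no comma is found at all.

```python
def find_between(s, first, last):
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""

# Shortened sample (assumed for illustration, same layout as the real string).
sample = 'Core[Default:6:1872:1874,LowPriority:0:2:2]:Comm[CommHigh:0:1134:1152,CommDefault:0:4:4]'

ok = find_between(sample, 'Default:', ',')          # followed by a comma: works
spill = find_between(sample, 'LowPriority:', ',')   # followed by ']': runs on to the next comma
missing = find_between(sample, 'CommDefault:', ',') # no comma after it: ValueError, so ""

print(ok.split(':'))
print(spill.split(':'))
print(missing.split(':'))
```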

This will get all the groups, provided the string is always well-formed:

re.findall('(\w+):(\d+):(\d+):(\d+)', s)

It also picks up the time (23:50:06:242), which you can easily remove from the list.

Alternatively, you can use a dictionary comprehension to organize the items:

matches = re.findall('(\w+):(\d+:\d+:\d+)', s)
my_dict = {k : v.split(':') for k, v in matches[1:]}

I used matches[1:] here to drop the spurious match on the time. You can do that if you know it will always be there.
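
If the timestamp might not always be the first match, one possible variation is to filter by the names of interest rather than by position. This is only a sketch: the names list and the values dict are illustrative, and the conversion to int is an addition that the question asked for but the snippet above does not perform.

```python
import re

s = '04/03 23:50:06:242[76:Health]: (mem=188094936/17146904576) Queue Size[=:+:-] : Core[Compiler:0:0:0,HighPriority:0:74:74,Default:6:1872:1874,LowPriority:0:2:2]:Special[Special:0:2:2]:Event[Event:0:0:0]:Comm[CommHigh:0:1134:1152,CommDefault:0:4:4]'
names = ['Compiler', 'HighPriority', 'Default', 'LowPriority', 'Special', 'Event', 'CommHigh', 'CommDefault']

# Each match is a (name, "a:b:c") pair; the timestamp also matches, as ('23', '50:06:242').
matches = re.findall(r'(\w+):(\d+:\d+:\d+)', s)

# Keep only the wanted names instead of relying on where the timestamp match sits.
wanted = set(names)
values = {k: [int(n) for n in v.split(':')] for k, v in matches if k in wanted}

print(values['HighPriority'])
```

The dict comprehension and the single-argument print work on both Python 2.7 and Python 3.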

Check the following:
import re
s = '04/03 23:50:06:242[76:Health]: (mem=188094936/17146904576) Queue Size[=:+:-] : Core[Compiler:0:0:0,HighPriority:0:74:74,Default:6:1872:1874,LowPriority:0:2:2]:Special[Special:0:2:2]:Event[Event:0:0:0]:Comm[CommHigh:0:1134:1152,CommDefault:0:4:4]'
search = ['Compiler', 'HighPriority', 'Default', 'LowPriority', 'Special', 'Event', 'CommHigh', 'CommDefault']
data = []
for x in search:
    data.append(re.findall(x+':([0-9]+:[0-9]+:[0-9]+)', s))

data = [map(lambda x: x.split(':'), x) for x in data] # remove :
data = [x[0] for x in data] # remove unnecessary []
data = [map(int,x) for x in data] # convert to int
print data

>>>[[0, 0, 0], [0, 74, 74], [6, 1872, 1874], [0, 2, 2], [0, 2, 2], [0, 0, 0], [0, 1134, 1152], [0, 4, 4]]
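
Note that the pattern for Default also matches inside CommDefault; the code above still works because it keeps only the first match, and Default happens to appear first in the string. A sketch of a stricter variant, assuming a word boundary (\b) and re.escape are acceptable additions, could look like this:

```python
import re

s = '04/03 23:50:06:242[76:Health]: (mem=188094936/17146904576) Queue Size[=:+:-] : Core[Compiler:0:0:0,HighPriority:0:74:74,Default:6:1872:1874,LowPriority:0:2:2]:Special[Special:0:2:2]:Event[Event:0:0:0]:Comm[CommHigh:0:1134:1152,CommDefault:0:4:4]'
search = ['Compiler', 'HighPriority', 'Default', 'LowPriority', 'Special', 'Event', 'CommHigh', 'CommDefault']

data = []
for name in search:
    # \b stops 'Default' from matching inside 'CommDefault';
    # re.escape protects against regex metacharacters in a name.
    m = re.search(r'\b' + re.escape(name) + r':(\d+):(\d+):(\d+)', s)
    # None marks a name that does not occur in the string.
    data.append([int(g) for g in m.groups()] if m else None)

print(data)
```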

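Edit: if you really want to avoid regular expressions, your approach only needs a small tweak (I renamed list to l to avoid shadowing the built-in type):
This prints: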
Compiler
['0', '0', '0']
HighPriority
['0', '74', '74']
Default
['6', '1872', '1874']
LowPriority
['0', '2', '2']
Special
['0', '2', '2']
Event
['0', '0', '0']
CommHigh
['0', '1134', '1152']
CommDefault
['0', '4', '4']
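
A regex-free tweak of this kind can be sketched with itertools.takewhile and ''.join, the combination mentioned in the comments. The helper name find_after and its exact shape are assumptions, not necessarily the code this answer originally used. It takes characters after the name for as long as they are digits or colons, so it stops at a comma or a closing bracket alike.

```python
from itertools import takewhile

s = '04/03 23:50:06:242[76:Health]: (mem=188094936/17146904576) Queue Size[=:+:-] : Core[Compiler:0:0:0,HighPriority:0:74:74,Default:6:1872:1874,LowPriority:0:2:2]:Special[Special:0:2:2]:Event[Event:0:0:0]:Comm[CommHigh:0:1134:1152,CommDefault:0:4:4]'
l = ['Compiler', 'HighPriority', 'Default', 'LowPriority', 'Special', 'Event', 'CommHigh', 'CommDefault']

def find_after(s, word):
    # Hypothetical helper: return the run of digits and colons that follows 'word:'.
    try:
        start = s.index(word + ':') + len(word) + 1
    except ValueError:
        return ""
    return ''.join(takewhile(lambda c: c.isdigit() or c == ':', s[start:]))

for word in l:
    print(word)
    print(find_after(s, word).split(':'))
```

One caveat of this sketch: if the numbers were ever followed directly by a colon, that colon would be included in the run, so it relies on the bracket and comma delimiters seen in the sample string.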

However, this really is a job for regular expressions, and you should try to learn the basics:
import re

def find_between(s, word):
    # Search for your (word followed by ((:a_digit) repeated three times))
    x = re.search("(%s(:\d+){3})" % word, s)
    return x.groups()[0]

for word in l:
    print find_between(s, word).split(':', 1)[-1].split(':')

This prints:
['0', '0', '0']
['0', '74', '74']
['6', '1872', '1874']
['0', '2', '2']
['0', '2', '2']
['0', '0', '0']
['0', '1134', '1152']
['0', '4', '4']
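
The same "name followed by three :digits groups" idea can also be applied to all names in a single pass. The following is a sketch under assumptions: the combined alternation pattern, the \b boundary, and the result dict are illustrative additions, not part of the answer above.

```python
import re

s = '04/03 23:50:06:242[76:Health]: (mem=188094936/17146904576) Queue Size[=:+:-] : Core[Compiler:0:0:0,HighPriority:0:74:74,Default:6:1872:1874,LowPriority:0:2:2]:Special[Special:0:2:2]:Event[Event:0:0:0]:Comm[CommHigh:0:1134:1152,CommDefault:0:4:4]'
l = ['Compiler', 'HighPriority', 'Default', 'LowPriority', 'Special', 'Event', 'CommHigh', 'CommDefault']

# One alternation of all names, followed by exactly three ':digits' groups.
# (?:...) groups without capturing, so only the name and the whole tail are captured.
pattern = re.compile(r'\b(%s)((?::\d+){3})' % '|'.join(re.escape(w) for w in l))

result = {}
for m in pattern.finditer(s):
    word, tail = m.groups()             # tail looks like ':0:74:74'
    result[word] = tail.split(':')[1:]  # drop the empty string before the first ':'

print(result['CommHigh'])
```

Because only the listed names can match, the timestamp at the start of the string is skipped without any special handling.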

I think the best way to solve this is to learn to use the standard library module re.

Yes, my re-fu is terrible. I have tried using re, but when I see a chunk of code like \d\w\++?\(\) I freeze, because it is not easy for me to read :(

Something like r = re.search('Compiler:([0-9]+):([0-9]+):([0-9]+)', s) should get you started. Use r.groups() to get the three substrings containing the numbers.

@mkiever I think you meant re.search rather than re.find

@whoiseearth If you desperately want to avoid regular expressions, which I really don't recommend, you can use takewhile plus ''.join (see my edited answer).