Python: using startswith and index to find a substring in a structured string


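I am trying to write code that finds a substring (a number or anything else) in a structured string. The string has one of two possible structures:

string = "1x substring 3x 4x"

string = "4x 3x substring 1x"

`x` can be any character, and `substring` has a format like 'pos.2'.

The code below works for the normal case, but now I also need to handle special cases. I have already tried `i.startswith(('3', '4'))`, but it did not help. Strings 1-8 are simple examples meant to explain the logic.

Strings 9-10 show complex examples. The code should extract the substrings at positions 2, 5 and 7.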

I hope you can help me find a solution for all strings/special cases, so that the result in every case is

clean: 80

:-)

String 9:

clean: pos2='$80' pos5='75.000 kg' pos7='22 sec'


# str1-8: easy example strings

# normal string
str1 = '1x 80 3x 4x'
str2 = '4x 3x 80 1x'

# missing number/pos. 3
str3 = '1x 80 4x '
str4 = '4x 80 1x'
str3a = '1. A  $67  4. A  69.000kg  6. A  12sec  8. B  9. B'
# expected: clean: pos2 = '$67'   pos5 = '69.000kg'    pos7 = '12sec'
str4a = '9B 8B 22 sec 6A 75.000kg 4b  $80 1b'
# expected: clean: pos2 = '$80'   pos5 = '75.000 kg'   pos7 = '22 sec'

# missing number/pos. 1 => the value is at the start or end of the string
str5 = '80 3x 4x'
str6 = '4x 3x 80'
str5a = '10 Mrd 3: A 4: A  50 .379 6: A   7:19   8: B 9: D '
# expected: clean: pos2= 10 Mrd, pos5= 50,379 pos7=7:19 (or just 19, without the "7:" of the raw string, if that is easier)
str6a = '  9a 8b 10 6b 60000 4a 3 b 50 '
# expected: clean: pos2= 50, pos5= 60000 pos7=10

# Optional (rare case)
# missing number/pos. 1 and 3
str7 = '80 4x'
str8 = '4x 80'
str7a = '10 Mrd 4: A  50 .379 6: A   7:19   8: B 9: D '
# expected: clean: pos2= 10 Mrd, pos5= 50,379 pos7=7:19 (or just 19, without the "7:" of the raw string, if that is easier)
str8a = ' 9a 8b 10 6b 60000 4a  50 '
# expected: clean: pos2= 50, pos5= 60000 pos7=10

# complex realistic strings
str9 = '9B 8B 22 sec 6A 75.000kg 4b 3b  $80 1b'
str10 = '1. A  $67  3. A  4. A  69.000kg  6. A  12sec  8. B  9. B'

# missing number/pos. 4 or 6 (pos. 6 is optional, because that is difficult I guess)
str11 = '1. A  $67  3. A    69.000kg  6. A  12sec  8. B  9. B'
# expected: clean: pos2 = '$67'   pos5 = '69.000kg'    pos7 = '12sec'
str12 = '1. A  $67  3. A  4a  69.000kg   12sec  8. B  9. B'
# expected: clean: pos2 = '$67'   pos5 = '69.000kg'    pos7 = '12sec'

x_list = [str1, str2, str3, str4, str5, str6, str7, str8, str9, str10, str11, str12]

for x in x_list:
    print("raw             " + x)

    values = ['1x', '3x', '4x']
    try:
        # pick the markers for position 3 and position 1 from the extracted values
        for i in values:
            if i.startswith('3'):
                foo = i
            if i.startswith('1'):
                baa = i

        # the substring sits between marker 3 and marker 1, in either order
        start = x.index(foo) + len(foo)
        end = x.index(baa)

        if start >= end:
            start = x.index(baa) + len(baa)
            end = x.index(foo)

        number = x[start:end].strip(' ')
    except ValueError:
        # str.index raises ValueError when a marker is not in the string
        number = '0'

    print("clean           " + number)

String 10:

clean: pos2='$67' pos5='69.000kg' pos7='12sec'
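One possible way to cover the simple strings 1-8 while staying with `startswith` and `index` is sketched below. The function name `extract_pos2` is hypothetical, and the sketch assumes the markers in `values` appear verbatim in the string and never occur inside the data itself, so it does not handle the realistic strings 9-12.

```python
def extract_pos2(x, values):
    """Return the text at position 2, or None when no usable marker is found."""
    # Only consider markers that actually occur in this string.
    present = [v for v in values if v in x]
    one = next((v for v in present if v.startswith('1')), None)
    # Try marker 3 first, then marker 4. A plain startswith(('3', '4'))
    # inside a loop over values would keep whichever match comes last.
    nxt = next((v for d in ('3', '4') for v in present if v.startswith(d)), None)
    if nxt is None:
        return None
    i_nxt = x.index(nxt)
    if one is None:
        # Marker 1 is missing: the value touches the start or end of the string.
        if x.strip().startswith(tuple(present)):
            return x[i_nxt + len(nxt):].strip()  # markers first, value last
        return x[:i_nxt].strip()                 # value first, markers after
    i_one = x.index(one)
    if i_one < i_nxt:
        return x[i_one + len(one):i_nxt].strip()
    return x[i_nxt + len(nxt):i_one].strip()


values = ['1x', '3x', '4x']
for s in ['1x 80 3x 4x', '4x 3x 80 1x', '1x 80 4x ', '80 3x 4x', '4x 80']:
    print(repr(s), '->', extract_pos2(s, values))
```

The priority order `('3', '4')` is what makes marker 4 act only as a fallback for a missing marker 3.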



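If I understand your goal correctly, it seems to me that you are overcomplicating this. I wrote a function that splits the input string into a list and checks whether each segment matches the format `1x`, `3x` or `4x`. Take a look and let me know if it is not what you need.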

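# we use regex to check for a match against the marker format
import re

def find_substr(x):
    # split on whitespace into a list
    seg = x.split()
    # check each word against the expected format
    for word in seg:
        # the pattern str(j) + "." (digit followed by any one character) is assumed
        if not any(re.fullmatch(str(j) + ".", word) for j in [1, 3, 4]):
            # this word does not fit the marker format, so it is the substring
            return word
    return None

# list of strings
x_list = ["1x 80 3x 4x", "4x 3x 80 1x"]
for x in x_list:
    print(find_substr(x))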

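I modified the code a little to improve readability. I don't know whether this is what you want, but it works. If you have any questions, let me know. I am happy to help.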

for x in x_list:
    print("raw             " + x)

    # split the string into a list on spaces (e.g. ['1x', '80', '3x', '4x'])
    y = x.split(" ")

    # a is the substring that you are looking for in the list
    a = '80'
    if a in y:
        index = y.index(a)
        number = y[index]
    else:
        # fall back to '0' when the substring is not present
        number = '0'

    print("clean           " + number)

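This looks like a job for regular expressions (the `re` module).

The main thing here is to identify the position markers, whose structure is known. For the regular expression, we check for a single digit (`[1-9]`), then an optional separator such as `.` or `:` (`(?:[\.\:]?)`), then a letter (`[a-zA-Z]`), and then another space or the end of the string (`(?: |$)`).

`(?:...)` means that the group is a non-capturing group; for more details about these groups, see the `re` documentation.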
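To make the pieces concrete, here is a minimal sketch of a pattern assembled from the fragments described above. The name `MARKER_RE` is hypothetical, and the optional space before the letter (needed for markers such as `1. A` or `3 b`) is an assumption, so this is not necessarily the exact expression used in the answer.

```python
import re

# Hypothetical pattern built from the described pieces:
# digit, optional "." or ":", optional space (assumed), letter, then space or end.
MARKER_RE = re.compile(r"[1-9](?:[\.\:]?) ?[a-zA-Z](?: |$)")


def find_markers(text):
    """Return every position marker found in text, without surrounding spaces."""
    return [m.group().strip() for m in MARKER_RE.finditer(text)]


print(find_markers('1. A  $67  4. A  69.000kg  6. A  12sec  8. B  9. B'))
print(find_markers('9B 8B 22 sec 6A 75.000kg 4b  $80 1b'))
```

Values such as `22 sec` or `12sec` are not picked up because the letter after the digit is not followed by a space or the end of the string.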

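We use it in `re.split` to split the text into matching and non-matching parts, then strip the surrounding whitespace from each part and filter out the ones that turn out to be empty.

Each part is then tagged with its position number if it matches the marker pattern, or with None if it does not.

After that come a few simple checks, such as looking at the order in which the markers appear and reversing the list if needed, so that we always return results in the same order. In `final` we extract what we need, check the edge cases, and adjust accordingly.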
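The split, strip, filter and tag steps can be sketched as follows. `MARKER_RE` and `tag_parts` are hypothetical names, and the pattern is an assumed reconstruction from the description; the capturing group is there so that `re.split` keeps the markers in its result.

```python
import re

# Assumed marker pattern; the outer parentheses make re.split return the markers too.
MARKER_RE = re.compile(r"([1-9](?:[\.\:]?) ?[a-zA-Z](?: |$))")


def tag_parts(text):
    """Split text into markers and data, tagged as (part, position or None)."""
    parts = filter(None, map(str.strip, MARKER_RE.split(text)))
    tagged = [(p, int(p[0]) if MARKER_RE.fullmatch(p) else None) for p in parts]
    positions = [n for _, n in tagged if n is not None]
    if positions != sorted(positions):
        # Markers run from high to low, so flip to ascending order.
        tagged.reverse()
    return tagged


tagged = tag_parts('9B 8B 22 sec 6A 75.000kg 4b  $80 1b')
print([p for p, n in tagged if n is None])
```

After this normalisation both string layouts yield the data in the same order, which is what lets later steps assign positions 2, 5 and 7 by index.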

And a small test:

text="""1. A  $67  4. A  69.000kg  6. A  12sec  8. B  9. B
9B 8B 22 sec 6A 75.000kg 4b  $80 1b
10 Mrd 3: A 4: A  50 .379 6: A   7:19   8: B 9: D
9a 8b 10 6b 60000 4a 3 b 50
10 Mrd 4: A  50 .379 6: A   7:19   8: B 9: D
9a 8b 10 6b 60000 4a  50
9B 8B 22 sec 6A 75.000kg 4b 3b  $80 1b
1. A  $67  3. A    69.000kg  6. A  12sec  8. B  9. B
1. A  $67  3. A  4a  69.000kg   12sec  8. B  9. B
9a 8b 6b 4a 3 b 50 1b""".splitlines()

for t in text:
    print(f"raw: {t!r}\nresult: ",extrator(t) )
    print()

which gives us

raw: '1. A  $67  4. A  69.000kg  6. A  12sec  8. B  9. B'
result:  ['$67', '69.000kg', '12sec']

raw: '9B 8B 22 sec 6A 75.000kg 4b  $80 1b'
result:  ['$80', '75.000kg', '22 sec']

raw: '10 Mrd 3: A 4: A  50 .379 6: A   7:19   8: B 9: D'
result:  ['10 Mrd', '50 .379', '7:19']

raw: '9a 8b 10 6b 60000 4a 3 b 50'
result:  ['50', '60000', '10']

raw: '10 Mrd 4: A  50 .379 6: A   7:19   8: B 9: D'
result:  ['10 Mrd', '50 .379', '7:19']

raw: '9a 8b 10 6b 60000 4a  50'
result:  ['50', '60000', '10']

raw: '9B 8B 22 sec 6A 75.000kg 4b 3b  $80 1b'
result:  ['$80', '75.000kg', '22 sec']

raw: '1. A  $67  3. A    69.000kg  6. A  12sec  8. B  9. B'
result:  ['$67', '69.000kg', '12sec']

raw: '1. A  $67  3. A  4a  69.000kg   12sec  8. B  9. B'
result:  ['$67', '69.000kg', '12sec']

raw: '9a 8b 6b 4a 3 b 50 1b'
result:  ['50', None, None]

Update 2

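Here is a version that identifies which data points we actually got, given a few assumptions:

there are only position markers and data, and data occurs only at positions 2, 5 and 7
the position markers can be recognized by the previous regular expression
any of them may be missing
the data contains no space characters, so if a relevant position marker is missing and fewer data points than expected are found, two of them may have been grouped into a single extracted data point, which can then be split safely; if that is not the case, adjust those parts accordingly

This leads to a rather long case-by-case check, which I hope is self-explanatory, and it returns a dictionary saying which value is which.

Of course this can be improved, but nothing better came to mind.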
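The whitespace assumption can be illustrated in isolation with a small sketch. The helper name `split_merged` is hypothetical; padding the tail with None mirrors the "picked the later position as missing" guesses, which is a heuristic rather than something the data can confirm.

```python
def split_merged(chunk, expected):
    """Split a merged data chunk on whitespace into exactly `expected` slots.

    Missing values are padded with None at the end; more pieces than
    expected means the no-spaces assumption does not hold for this chunk.
    """
    parts = chunk.split()
    if len(parts) > expected:
        raise ValueError(f"cannot split {chunk!r} into {expected} values")
    return parts + [None] * (expected - len(parts))


# Marker 6 is missing, so positions 5 and 7 arrive as one chunk.
print(split_merged('69.000kg   12sec', 2))
# Only one value present: the second slot is reported as missing.
print(split_merged('69.000kg', 2))
```

A value such as `22 sec` breaks the assumption, which is why the long function raises an error when a chunk splits into more pieces than expected.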

import re

# POSRE is the position-marker regular expression described above
def extrator(rawtext):
    fil  = filter(None,map(str.strip,re.split(POSRE,rawtext)))
    proc = [(x,int(x[0]) if re.match(POSRE,x) else None) for x in fil] #process raw data
    pos  = [p for x,p in proc if p is not None ] #position markers presents
    if sorted(pos)!=pos:
        proc = list(reversed(proc))        
    data = [x for x,p in proc if p is None]
    pos = {p:i for i,(x,p) in enumerate(proc) if p is not None } #pos marker:index of it
    #print(f"{proc=}")
    if len(data)==3:
        return dict(zip((2,5,7),data))
    #from here, a,b,c will represent data in position 2,5 and 7 respectively
    elif len(data)==2:
        a,b = data
        #c = None
        if 3 in pos or 4 in pos:
            if 6 in pos:
                #one of 2, 5 or 7 is missing
                i = proc.index( (a,None) )
                i34 = pos[3] if 3 in pos else pos[4]
                if i < i34:
                    #a is 2, b is 5 or 7
                    j = proc.index( (b,None) )
                    if j < pos[6]:
                        #7 is missing
                        c = None
                    else:
                        #5 is missing
                        b,c = None,b
                else:
                    #2 is missing, a is 5 thus b is 7
                    a,b,c = None,a,b
            else:
                #a is 2, b may be 5 or 7 or both
                t = b.split()
                if len(t) == 2:
                    #b was both
                    b,c = t
                elif len(t) == 1:
                    #b is 5 or 7
                    print("either 5 or 7 is missing, picked 7 as missing")
                    c = None
                else:
                    #b was split into more than 2 parts
                    raise RuntimeError("unknown case 1")
        else:
            #3 and 4 are missing
            if 6 in pos:
                #a may be 2 or 5 or both, b is 7
                c = b
                t = a.split()
                if len(t) == 2:
                    #a was both
                    a,b = t
                elif len(t) == 1:
                    print("either 2 or 5 is missing, picked 5 as missing")
                    b = None
                else:
                    #a was split into more than 2 parts
                    raise RuntimeError("unknown case 2")
            else:
                raise RuntimeError("Fatal error: 2 data points with no marker in between")
        return dict(zip((2,5,7),(a,b,c)))
    elif len(data)==1:
        a = data[0]
        i = proc.index( (a,None) )
        #b,c = None, None
        if 3 in pos or 4 in pos:
            i34 = pos[3] if 3 in pos else pos[4]
            if 6 in pos:
                #only one of 2,5 or 7 are present
                if i < i34:
                    #a is 2 the rest is missing
                    b,c = None, None
                elif i < pos[6]:
                    #a is 5
                    a,b,c = None, a, None
                else:
                    #a is 7
                    a,b,c = None, None, a 
            else:
                #a is 2 or a is 5 or 7 or both
                if i < i34:
                    #a is 2, the rest is missing
                    b,c = None, None
                else:
                    #2 is missing, a is 5 or 7 or both 
                    a,b = None, a
                    t = b.split()
                    if len(t) == 2:
                        b,c = t
                    elif len(t) == 1:
                        print("either 5 or 7 is missing, picked 7 as missing")
                        c = None
                    else:
                        raise RuntimeError("unknown case 3")
        else:
            #3 and 4 are missing
            if 6 in pos:
                if pos[6] < i:
                    #a is 7, the rest is missing
                    a,b,c = None, None, a
                else:
                    #7 is missing, a is 2 or 5 or both
                    c = None
                    t = a.split()
                    if len(t) == 2:
                        a,b = t
                    elif len(t) == 1:
                        print("either 2 or 5 is missing, picked 5 as missing")
                        b = None
                    else:
                        raise RuntimeError("unknown case 4")
            else:
                #a is 2, 5 or 7 or any combination of them
                t = a.split()
                if len(t) == 3:
                    a,b,c = t
                elif len(t) == 2:
                    print("one of 2, 5 or 7 is missing, picked 7 as missing")
                    a,b = t
                    c = None
                elif len(t) == 1:
                    print("only one of 2, 5 or 7 is present, picked 2 as present")
                    b,c = None, None
                else:
                    raise RuntimeError("unknown case 5")
        return dict(zip((2,5,7),(a,b,c)))
    elif len(data) == 0:
        return dict.fromkeys( (2,5,7) )
    else:
        raise RuntimeError("unknown case 6: more than 3 data points")


def test():
    text="""1. A  $67  4. A  69.000kg  6. A  12sec  8. B  9. B
9B 8B 22 sec 6A 75.000kg 4b  $80 1b
10 Mrd 3: A 4: A  50 .379 6: A   7:19   8: B 9: D
9a 8b 10 6b 60000 4a 3 b 50
10 Mrd 4: A  50 .379 6: A   7:19   8: B 9: D
9a 8b 10 6b 60000 4a  50
9B 8B 22 sec 6A 75.000kg 4b 3b  $80 1b
1. A  $67  3. A    69.000kg  6. A  12sec  8. B  9. B
1. A  $67  3. A  4a  69.000kg   12sec  8. B  9. B
9a 8b 6b 4a 3 b 50 1b
9 a 8b 6 b 55 4a 3 b 1b
9a 8:b 777 6 b 4.a 3 b 1b
9a 8:b 777 6 b 4.a 3 b 55 1b
""".splitlines()

    for t in text:
        print(f"raw: {t!r}\nresult: ",extrator(t) )
        print()

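Thanks for the quick reply. That is a fair objection. But I made the code this complicated because the real strings are much more complex as well. That is, there are several substrings at different but defined positions. Also, the structure of the markers varies, i.e. 1x can be "1a" or "1:a". Only the number stays the same.

Thanks for the quick reply. Please see my comment on the other answer. Your code works fine for this example. Unfortunately, my real strings are more complex. I hope you can still find a solution. Thanks for your effort.

Thanks for your work. My example apparently made what I need look too simple. Sorry. I have added strings 9 and 10 to the question as realistic strings to make it clearer. I extracted `values = ['1x', '3x', '4x']` in an earlier step. That is why I use startswith() to compare them with the string. Maybe that is not the best approach, but for me it comes close to the final solution, considering the realistic strings. :-)

OK, what result do you want from the realistic examples?

OK, I have added the results for the realistic examples. Keep in mind that it should work like this. For the string at position 2: if position 3 is missing, use position 4 as the marker; or, if position 1 is missing, use the start/end of the string. Optional (if possible): for the string at position 5, similarly: if position 4 is missing, use position 3 as the marker, and if position 6 is missing, use position 7. (But I guess that is hard to solve.) Thanks for your time.

Could you add realistic examples for each of the other cases, along with their expected results? I will try again later.

Yes, you are right. I need to determine which value is which. Something like the result ($50 0 75.000kg 0) if values are missing. But I see your point. I will have to analyze it.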


>>> test()
raw: '1. A  $67  4. A  69.000kg  6. A  12sec  8. B  9. B'
result:  {2: '$67', 5: '69.000kg', 7: '12sec'}

raw: '9B 8B 22 sec 6A 75.000kg 4b  $80 1b'
result:  {2: '$80', 5: '75.000kg', 7: '22 sec'}

raw: '10 Mrd 3: A 4: A  50 .379 6: A   7:19   8: B 9: D'
result:  {2: '10 Mrd', 5: '50 .379', 7: '7:19'}

raw: '9a 8b 10 6b 60000 4a 3 b 50'
result:  {2: '50', 5: '60000', 7: '10'}

raw: '10 Mrd 4: A  50 .379 6: A   7:19   8: B 9: D'
result:  {2: '10 Mrd', 5: '50 .379', 7: '7:19'}

raw: '9a 8b 10 6b 60000 4a  50'
result:  {2: '50', 5: '60000', 7: '10'}

raw: '9B 8B 22 sec 6A 75.000kg 4b 3b  $80 1b'
result:  {2: '$80', 5: '75.000kg', 7: '22 sec'}

raw: '1. A  $67  3. A    69.000kg  6. A  12sec  8. B  9. B'
result:  {2: '$67', 5: '69.000kg', 7: '12sec'}

raw: '1. A  $67  3. A  4a  69.000kg   12sec  8. B  9. B'
result:  {2: '$67', 5: '69.000kg', 7: '12sec'}

raw: '9a 8b 6b 4a 3 b 50 1b'
result:  {2: '50', 5: None, 7: None}

raw: '9 a 8b 6 b 55 4a 3 b 1b'
result:  {2: None, 5: '55', 7: None}

raw: '9a 8:b 777 6 b 4.a 3 b 1b'
result:  {2: None, 5: None, 7: '777'}

raw: '9a 8:b 777 6 b 4.a 3 b 55 1b'
result:  {2: '55', 5: None, 7: '777'}

>>>
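If the dictionary needs to be turned back into the `clean: ...` line from the question, with a placeholder for missing values as mentioned in the comments, a small formatter along these lines might do. The name `format_clean` and the default placeholder `'0'` are assumptions taken from the asker's examples, not part of the answer's code.

```python
def format_clean(result, missing='0'):
    """Render a {position: value} dict as a single 'clean: ...' line."""
    parts = [
        f"pos{pos}='{missing if value is None else value}'"
        for pos, value in sorted(result.items())
    ]
    return "clean: " + " ".join(parts)


print(format_clean({2: '$80', 5: '75.000kg', 7: '22 sec'}))
print(format_clean({2: '55', 5: None, 7: '777'}))
```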