Python regular expression repeating pattern

I am trying to use a regular expression to capture groups of data from the log below. The pattern is

<item> : <key> = <value> , <key> = <value>, ..., <key> = <value>
20141207,07:15:52,0,>>RATIO: casher#=30, Value=2.579, Units=ratio, Error=N
20141207,07:15:52,0,>>RATIO: casher#=31, Value=4.509, Units=ratio, Error=N
20141207,07:15:52,0,>>RATIO: casher#=32, Value=3.735, Units=ratio, Error=N
20141207,07:15:52,0,>>RATIO: casher#=33, Value=2.401, Units=ratio, Error=N

20141207,07:15:52,0,>>CUSTOMER: casher#=30, Value=50, Units=count
20141207,07:15:52,0,>>CUSTOMER: casher#=31, Value=6, Units=count
20141207,07:15:52,0,>>CUSTOMER: casher#=32, Value=88, Units=count
20141207,07:15:52,0,>>CUSTOMER: casher#=33, Value=33, Units=count

Obviously, the result is not what I expected. Can someone give me some hints? I will eventually port the pattern to Python. Thanks.

(?<=>>)(\w+):|([\w#]+)\s*=\s*(\S+?)(?:,|\s)
Try this. Grab the captures. See demo.

NODE                     EXPLANATION
--------------------------------------------------------------------------------
(?<=                     look behind to see if there is:
--------------------------------------------------------------------------------
  >>                       '>>'
--------------------------------------------------------------------------------
)                        end of look-behind
--------------------------------------------------------------------------------
(                        group and capture to \1:
--------------------------------------------------------------------------------
  \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
)                        end of \1
--------------------------------------------------------------------------------
:                        ':'
--------------------------------------------------------------------------------
|                        OR
--------------------------------------------------------------------------------
(                        group and capture to \2:
--------------------------------------------------------------------------------
  [\w#]+                   any character of: word characters (a-z,
                           A-Z, 0-9, _), '#' (1 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
)                        end of \2
--------------------------------------------------------------------------------
\s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                         more times (matching the most amount
                         possible))
--------------------------------------------------------------------------------
=                        '='
--------------------------------------------------------------------------------
\s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                         more times (matching the most amount
                         possible))
--------------------------------------------------------------------------------
(                        group and capture to \3:
--------------------------------------------------------------------------------
  \S+?                     non-whitespace (all but \n, \r, \t, \f,
                           and " ") (1 or more times (matching the
                           least amount possible))
--------------------------------------------------------------------------------
)                        end of \3
--------------------------------------------------------------------------------
(?:                      group, but do not capture:
--------------------------------------------------------------------------------
  ,                        ','
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
)                        end of grouping
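For readers who want to see the captures collected in Python, here is a minimal sketch. It uses the pattern above unchanged; the names `PATTERN`, `LOG` and `parse` are made up for this illustration, and the sample text is two lines in the question's format. Because the pattern is an alternation, each match fills either group 1 (an item) or groups 2 and 3 (a key and its value), so the loop starts a new record whenever group 1 is set.

```python
import re

# The pattern from the answer above, unchanged.
PATTERN = re.compile(r'(?<=>>)(\w+):|([\w#]+)\s*=\s*(\S+?)(?:,|\s)')

# Two sample lines in the question's format (illustrative data).
LOG = (
    "20141207,07:15:52,0,>>RATIO: casher#=30, Value=2.579, Units=ratio, Error=N\n"
    "20141207,07:15:52,0,>>CUSTOMER: casher#=30, Value=50, Units=count\n"
)

def parse(text):
    """Return a list of (item, {key: value}) tuples, in order of appearance."""
    records = []
    for m in PATTERN.finditer(text):
        if m.group(1) is not None:
            # First alternative matched: a new '>>item:' starts a new record.
            records.append((m.group(1), {}))
        elif records:
            # Second alternative matched: attach key/value to the latest item.
            records[-1][1][m.group(2)] = m.group(3)
    return records

records = parse(LOG)
for item, pairs in records:
    print(item, pairs)
```

One thing to watch: the pattern requires a comma or a whitespace character after every value, so the last pair of a text that does not end in a newline is not matched. Allowing end of input as a third option, for example `(?:,|\s|$)`, is one possible adjustment; that variant is a suggestion here, not part of the original answer.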

Your file is a CSV file, so it is easier to use the csv module:

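import csv

# newline='' is what the csv module expects for text files in Python 3
with open('data.txt', newline='') as f:
    for row in csv.reader(f, delimiter=','):
        if row:  # skip blank lines
            # row[3] looks like '>>RATIO: casher#=30'
            item, key_and_val = row[3].split(':', 1)
            item = item[2:]  # drop the leading '>>'
            key, val = key_and_val.split('=')

            print(item)
            print('    {} => {}'.format(key.strip(), val.strip()))

            # the remaining fields are plain 'key=value' pairs
            for key_and_val in row[4:]:
                key, val = key_and_val.split('=')
                print('    {} => {}'.format(key.strip(), val.strip()))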

--output:--
RATIO
    casher# => 30
    Value => 2.579
    Units => ratio
    Error => N
RATIO
    casher# => 31
    Value => 4.509
    Units => ratio
    Error => N
RATIO
    casher# => 32
    Value => 3.735
    Units => ratio
    Error => N
RATIO
    casher# => 33
    Value => 2.401
    Units => ratio
    Error => N
CUSTOMER
    casher# => 30
    Value => 50
    Units => count
CUSTOMER
    casher# => 31
    Value => 6
    Units => count
CUSTOMER
    casher# => 32
    Value => 88
    Units => count
CUSTOMER
    casher# => 33
    Value => 33
    Units => count
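Your matching pattern also matches key=value even when "item:" is not present. Is there any advanced technique to exclude those key=value lines?

The following will skip lines that do not have an item: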

import csv

with open('data.txt', newline='') as f:
    for row in csv.reader(f, delimiter=','):
        # Skip blank or short rows, then check if there is an item
        if len(row) > 3 and row[3].startswith('>>'):
            item, key_and_val = row[3].split(':', 1)
            item = item[2:]  # drop the leading '>>'
            key, val = key_and_val.split('=')
            print(item)
            print('    {} => {}'.format(key.strip(), val.strip()))

            for key_and_val in row[4:]:
                key, val = key_and_val.split('=')
                print('    {} => {}'.format(key.strip(), val.strip()))

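I'm not sure what the problem is, but you can't capture all of the key=value pairs with one regular expression. Not with groups, anyway.

I love you @vks. I'm weak at regex; could you please explain this expression? Thank you very much.

I accepted the answer. Thanks a lot, and thanks for your explanation and teaching. Your matching pattern also matches key=value even when "item:" is not present; is there any advanced technique to exclude those key=value lines? In any case, your expression is good enough already.

Thanks @7stud, I'll use your code after I extract them to a csv file ^^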
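To illustrate the comment about groups: in Python's `re` module a capture group that repeats only keeps its last repetition, which is why one pattern with a repeated key=value group cannot return every pair. A common workaround, sketched below, is two steps: one pattern picks out the `>>item:` part of a line, and a second pattern is run with `findall` over the remainder. Lines without an item are rejected in the first step, which also addresses the follow-up question. `LINE_RE`, `PAIR_RE` and `parse_line` are hypothetical names for this sketch, not taken from either answer.

```python
import re

line = "20141207,07:15:52,0,>>RATIO: casher#=30, Value=2.579, Units=ratio, Error=N"

# One pattern with a repeated group: groups 2 and 3 are overwritten on every
# repetition, so only the final key=value pair survives.
single = re.search(r'>>(\w+):(?:\s*([\w#]+)=([^,\s]+),?)+', line)
print(single.groups())

# Two-step alternative (hypothetical helper patterns for this sketch).
LINE_RE = re.compile(r'>>(\w+):\s*(.*)')            # item, then the rest of the line
PAIR_RE = re.compile(r'([\w#]+)\s*=\s*([^,\s]+)')   # a single key = value pair

def parse_line(text):
    """Return (item, {key: value}), or None when the line has no '>>item:'."""
    m = LINE_RE.search(text)
    if m is None:
        return None
    item, rest = m.groups()
    return item, dict(PAIR_RE.findall(rest))

print(parse_line(line))
print(parse_line("20141207,07:15:52,0,casher#=99, Value=1"))
```

The first `print` shows only the item and the last pair, while `parse_line` returns all four pairs for the same line and `None` for the line that has no item.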