Python regex help
Sorry for the generic title. I have the following text:
----------------------------------------------- One Errors ------------------------------------------------------
VALUES1 64 0 0.00
VALUES2 0 0 0.00
VALUES3 2535 0 0.00
VALUES4 0 0 0.00
ALL 2674 2674 100.00 5.31 3.60 4.70 5.70 8.30 10.80 20.90 27.50 31.10 36.53 [Free Text]
-----------------------------------------------Two Errors ------------------------------------------------------
VALUES1 64 0 0.00
VALUES2 0 0 0.00
VALUES3 2535 0 0.00
VALUES4 0 0 0.00
ALL 2674 0 0.00
-----------------------------------------------Three Errors ------------------------------------------------------
VALUES1 64 0 0.00
VALUES2 0 0 0.00
VALUES3 2535 0 0.00
VALUES4 0 0 0.00
ALL 2674 2674 100.00 1.51 0.70 1.10 1.60 3.30 4.50 5.40 6.40 9.50 12.17 [Free Text]
-----------------------------------------------Four Errors ------------------------------------------------------
VALUES1 64 0 0.00
VALUES2 0 0 0.00
VALUES3 2535 0 0.00
VALUES4 0 0 0.00
ALL 2674 2674 100.00 0.34 0.10 0.17 0.27 0.67 1.10 1.48 1.97 2.32 3.12 [Free Text]
-----------------------------------------------Five Errors ------------------------------------------------------
VALUES1 64 0 0.00
VALUES2 0 0 0.00
VALUES3 2535 0 0.00
VALUES4 0 0 0.00
ALL 2674 0 0.00
VALUES1 64 0 0.00
VALUES2 0 0 0.00
VALUES3 2535 0 0.00
VALUES4 0 0 0.00
ALL 2674 2674 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 [Free Text]
VALUES1 64 0 0.00
VALUES2 0 0 0.00
VALUES3 2535 0 0.00
VALUES4 0 0 0.00
ALL 2674 0 0.00
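As you can see, in some cases the line that starts with ALL has 14 columns,
and sometimes it has only 3 (not counting the "ALL" value itself).
I need to get columns 3, 6, 8, 10 and 13 from every line that starts with "ALL", for each section.
For example:
In the "One Errors" section: 100.00, 4.70, 8.30, 20.90, 36.53
In the "Two Errors" section: 0.00, None, None, None, None
In the "Five Errors" section: 100.00, 0.00, 0.00, 0.00, 0.00
I tried to use this regex: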
Tow Errors[\s\S]*?ALL\s+\S+\s+\S+\s+(\S+)\s+\S+\s+\S+\s+(\S+)\s+\S+\s+(\S+)\s+\S+\s+(\S+)\s+\S+\s+\S+\s+(\S+).*?$
and also:
Tow Errors[\s\S]*?ALL\s+[0-9\.]+?+\s+[0-9\.]+?\s+([0-9\.]+?)\s+[0-9\.]+?\s+[0-9\.]+?\s+([0-9\.]+?)\s+[0-9\.]+?\s+([0-9\.]+?)\s+[0-9\.]+?\s+([0-9\.]+?)\s+[0-9\.]+?\s+[0-9\.]+?\s+([0-9\.]+?).*?$
Obviously I am doing something wrong and need your advice.
Thanks :)
You don't need a regex for this. Simple iteration is enough. Ex:
result = []
with open(filename) as infile:  # filename: path to the text file shown above
    for line in infile:  # Iterate over each line
        line = line.strip()  # Strip leading and trailing whitespace
        if line.startswith("ALL"):  # Check if line starts with "ALL"
            temp = []
            val = line.split()  # Split on whitespace
            temp.append(val[3])  # Column 3 (present in every sample line)
            for i in [6, 8, 10, 13]:
                try:
                    temp.append(val[i])
                except IndexError:  # Short "ALL" line: this column is missing
                    temp.append(None)
            result.append(temp)
print(result)
Output:
[['100.00', '4.70', '8.30', '20.90', '36.53'],
 ['0.00', None, None, None, None],
 ['100.00', '1.10', '3.30', '5.40', '12.17'],
 ['100.00', '0.17', '0.67', '1.48', '3.12'],
 ['0.00', None, None, None, None],
 ['100.00', '0.00', '0.00', '0.00', '0.00'],
 ['0.00', None, None, None, None]]
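The question also asks for the values per section, which the flat list does not record. A minimal sketch of one way to keep the section title next to each row, assuming headers are lines made of dashes around a title, as in the sample. The names `HEADER`, `WANTED` and `extract_by_section` are hypothetical, and resetting the title after each "ALL" line is an assumption, so rows without their own header are reported with `None`.

```python
import re

# Assumed header shape: dashes, a title, dashes (e.g. "-----Two Errors -----").
HEADER = re.compile(r"^-+\s*(.*?)\s*-+$")
WANTED = (3, 6, 8, 10, 13)  # token indexes, counting "ALL" as index 0


def extract_by_section(lines):
    """Return (section_title, values) pairs, one per line starting with "ALL"."""
    results = []
    section = None  # stays None until a header line is seen
    for raw in lines:
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            section = header.group(1)
        elif line.startswith("ALL"):
            tokens = line.split()
            # Missing columns on short "ALL" lines become None.
            values = [tokens[i] if i < len(tokens) else None for i in WANTED]
            results.append((section, values))
            section = None  # assumption: an "ALL" line closes its section
    return results
```

Any iterable of lines works as input, so an open file object can be passed directly, for example `extract_by_section(open(filename))`.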
Try this:
import re
regex = r'ALL(?:[ \t]+\S+){2}([ \t]+\S+)?(?:(?:[ \t]+\S+){2}([ \t]+\S+)?)?(?:(?:[ \t]+\S+)([ \t]+\S+)?)?(?:(?:[ \t]+\S+)([ \t]+\S+)?)?(?:(?:[ \t]+\S+){2}([ \t]+\S+)?)?'
re.findall(regex, string)
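Two details are worth knowing when using this pattern: each captured group keeps its leading whitespace, and `re.findall` returns an empty string for a group that did not take part in the match. A minimal sketch of a cleanup step, using the same pattern split across lines for readability. The names `ALL_ROW` and `extract_with_regex` are hypothetical. Note that the pattern is not anchored, so it would also match "ALL" appearing elsewhere in the text.

```python
import re

# The pattern from the answer, written as raw strings and compiled once.
ALL_ROW = re.compile(
    r"ALL(?:[ \t]+\S+){2}([ \t]+\S+)?"       # skip 2 tokens, capture column 3
    r"(?:(?:[ \t]+\S+){2}([ \t]+\S+)?)?"     # skip 2 tokens, capture column 6
    r"(?:(?:[ \t]+\S+)([ \t]+\S+)?)?"        # skip 1 token, capture column 8
    r"(?:(?:[ \t]+\S+)([ \t]+\S+)?)?"        # skip 1 token, capture column 10
    r"(?:(?:[ \t]+\S+){2}([ \t]+\S+)?)?"     # skip 2 tokens, capture column 13
)


def extract_with_regex(text):
    """Return one list per "ALL" match, with missing columns as None."""
    rows = []
    for groups in ALL_ROW.findall(text):
        # Strip the captured leading whitespace; '' (unmatched group) becomes None.
        rows.append([g.strip() or None for g in groups])
    return rows
```

Because the pattern only allows spaces and tabs between tokens, a match never runs past the end of its line, which is what lets short "ALL" lines come back with `None` in the trailing positions.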