Python: sorting strings by a word obtained from splitting

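The program processes a log file containing text like the sample below. Please help me understand how to print the list of components (the field after the date and time), ordered by the importance of their messages in the log (the first word of each line).

For example, component A should come before component B in the list if it has more messages of the most important level.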

   ERROR - 2015 Dec 28 14:48:30 - unfulminating_deacon - 55 - airtightly unintelligently appropriable arlen
   INFO - 2015 Dec 28 02:02:56 - mangiest_ima - 144 - overrealistically decadently unfierce edris
   CRITICAL - 2015 Dec 27 20:04:02 - unanticipated_konnor - 44 - amusively sensationally turbanlike rico
   INFO - 2015 Dec 28 08:12:06 - unfulminating_deacon - 123 - eruptively nonmodally sebacic shavonda
   CRITICAL - 2015 Dec 28 08:04:27 - unanticipated_konnor - 1213 - unchastely priorly monophyletic cullen
   ERROR - 2015 Dec 28 07:39:36 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia
   DEBUG - 2015 Dec 27 16:44:47 - mangiest_ima - 144 - questingly substitutionally uncompensative jen
   ERROR - 2015 Dec 26 17:49:26 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia

EXPECTED OUTPUT:
unanticipated_konnor 
furnacelike_marlene 
unfulminating_deacon 
mangiest_ima 
I have written some code that counts the message frequency per component, but I am not sure whether it helps:

from collections import Counter

# Collect the component name (third ' - '-separated field) of every line.
warnList = []
with open('C:\\Users\\User\\Downloads\\tasks\\logs\\1.txt', 'r') as file:
    for line in file:
        warnList.append(line.split(' - ')[2])
res1 = dict(Counter(warnList))
print("Frequency of messages for components: {} \n".format(res1))
Every suggestion would be highly appreciated; I hope to get your help or advice.

Thanks in advance,

Regards

I am not sure I understood your question correctly, but if you want to sort the log file by importance, try the following:

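from __future__ import print_function

import re
import operator
import collections

# Lower number = more important level.
importance = {
    'CRITICAL': 0,
    'ERROR': 100,
    'INFO': 200,
    'DEBUG': 300
}

with open('log.log', 'r') as f:
    data = f.read().splitlines()

# Map each log line to the importance of its level (the first field).
# Note: identical lines share a key, so they collapse into one entry.
parsed = collections.OrderedDict()

for line in data:
    cols = re.split(r'\s+\-\s+', line)
    parsed[line] = importance[cols[0]]

# sorted() is stable, so lines of equal importance keep their file order.
for k, v in sorted(parsed.items(), key=operator.itemgetter(1)):
    print(k)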
Output:

CRITICAL - 2015 Dec 27 20:04:02 - unanticipated_konnor - 44 - amusively sensationally turbanlike rico
CRITICAL - 2015 Dec 28 08:04:27 - unanticipated_konnor - 1213 - unchastely priorly monophyletic cullen
ERROR - 2015 Dec 28 14:48:30 - unfulminating_deacon - 55 - airtightly unintelligently appropriable arlen
ERROR - 2015 Dec 28 07:39:36 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia
ERROR - 2015 Dec 26 17:49:26 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia
INFO - 2015 Dec 28 02:02:56 - mangiest_ima - 144 - overrealistically decadently unfierce edris
INFO - 2015 Dec 28 08:12:06 - unfulminating_deacon - 123 - eruptively nonmodally sebacic shavonda
DEBUG - 2015 Dec 27 16:44:47 - mangiest_ima - 144 - questingly substitutionally uncompensative jen
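
As a side note, the same line ordering can be sketched without the intermediate dictionary by passing a key function straight to sorted(). This is only a minimal sketch under the assumption that every non-blank line starts with one of the four levels from the table above; sort_by_level is a hypothetical helper name, not part of the original answer. Unlike the dictionary version, it keeps repeated identical lines.

```python
# Lower number = more important, mirroring the table in the answer above.
importance = {'CRITICAL': 0, 'ERROR': 100, 'INFO': 200, 'DEBUG': 300}


def sort_by_level(lines):
    """Return log lines ordered by level, keeping file order within a level."""
    def key(line):
        # The level is the text before the first ' - ' separator.
        level = line.strip().split(' - ', 1)[0]
        return importance[level]

    # Blank lines are skipped; sorted() is stable, so ties keep their order.
    return sorted((line for line in lines if line.strip()), key=key)
```

With the file from the answer, this could be called as sort_by_level(open('log.log').read().splitlines()).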
If this is not what you want, please clarify what you need.

If you only need the third column:

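from __future__ import print_function

import re
import operator
import collections

# Lower number = more important level.
importance = {
    'CRITICAL': 0,
    'ERROR': 100,
    'INFO': 200,
    'DEBUG': 300
}

with open('log.log', 'r') as f:
    data = f.read().splitlines()

# Map each component (third field) to the importance of its level.
# Note: a component that appears on several lines keeps the level of
# its last line in the file, because later assignments overwrite earlier ones.
parsed = collections.OrderedDict()

for line in data:
    cols = re.split(r'\s+\-\s+', line)
    parsed[cols[2]] = importance[cols[0]]

# Print the component names, most important level first.
for k, v in sorted(parsed.items(), key=operator.itemgetter(1)):
    print(k)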
Output:

unanticipated_konnor
furnacelike_marlene
unfulminating_deacon
mangiest_ima
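
The question asks that a component come first when it has more messages of the most important level. One way to sketch that criterion literally is to count messages per component and per level, then compare the counts level by level. This is a minimal sketch, not the method used in the answer above; the LEVELS ordering is taken from the sample log, rank_components is a hypothetical helper name, and levels outside LEVELS are assumed not to affect the ranking.

```python
from collections import Counter, defaultdict

# Levels from most to least important (assumed from the sample log).
LEVELS = ['CRITICAL', 'ERROR', 'INFO', 'DEBUG']


def rank_components(lines):
    """Return component names ordered by their per-level message counts."""
    counts = defaultdict(Counter)  # component -> Counter of levels
    for line in lines:
        fields = line.strip().split(' - ')
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        level, component = fields[0], fields[2]
        counts[component][level] += 1

    def key(component):
        # Negated counts make larger counts sort first; tuples compare
        # level by level, starting with the most important one.
        return tuple(-counts[component][level] for level in LEVELS)

    return sorted(counts, key=key)


sample = """\
ERROR - 2015 Dec 28 14:48:30 - unfulminating_deacon - 55 - airtightly unintelligently appropriable arlen
INFO - 2015 Dec 28 02:02:56 - mangiest_ima - 144 - overrealistically decadently unfierce edris
CRITICAL - 2015 Dec 27 20:04:02 - unanticipated_konnor - 44 - amusively sensationally turbanlike rico
INFO - 2015 Dec 28 08:12:06 - unfulminating_deacon - 123 - eruptively nonmodally sebacic shavonda
CRITICAL - 2015 Dec 28 08:04:27 - unanticipated_konnor - 1213 - unchastely priorly monophyletic cullen
ERROR - 2015 Dec 28 07:39:36 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia
DEBUG - 2015 Dec 27 16:44:47 - mangiest_ima - 144 - questingly substitutionally uncompensative jen
ERROR - 2015 Dec 26 17:49:26 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia
"""

for name in rank_components(sample.splitlines()):
    print(name)
```

On the sample log this prints the same four names in the same order as the expected output in the question: furnacelike_marlene ranks above unfulminating_deacon because it has two ERROR messages against one.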

This is exactly what I needed. Thank you very much for your help!