Java: Improving a logfile parser with multi-line match criteria
Tags: java, java-8, text-parsing, linkedhashmap, logfile-analysis


Given a somewhat odd logfile, represented by the following snippet:

FILE (insert): file=Templates\xyz_EN_0615.pdf key=KEY_EN_AP_PAID
FILE (insert): file=Templates\xyz_DE_0615.pdf key=KEY_DE_STD_PAID
FILE (insert): file=Templates\xyz_DE_0615_free.pdf key=KEY_DE_STD_FREE
FILE (insert): file=Templates\xyz_IT_0615.pdf key=KEY_IT_STD_PAID
FILE (insert): file=Templates\xyz_IT_0615_free.pdf key=KEY_IT_STD_FREE
DEBUG: Opening Migration\abc_1.pdf
DEBUG: Opening Templates\xyz_DE_0615_kostenlos.pdf
Jul 31, 2015 5:07:54 PM java.util.prefs.WindowsPreferences <init>
WARNUNG: Could not open/create prefs root node Software\JavaSoft\Prefs at root 0x80000002. Windows RegCreateKeyEx(...) returned error code 5.
Jul 31, 2015 5:07:55 PM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNUNG: Using fallback font ArialMT for base font ZapfDingbats
DEBUG: Writing Migration\abc_1-migrated.pdf
PERFORMANCE: [OVERALL completed in 2303ms]
DEBUG: Opening Migration\abc_2_DE.pdf
DEBUG: Opening Templates\xyz_DE_0615_free.pdf
Field not available: Reset_1
Field not available: Print
DEBUG: Writing Migration\abc_2_DE-migrated.pdf
PERFORMANCE: [OVERALL completed in 756ms]
DEBUG: Opening Migration\abc_3_DE.pdf
DEBUG: Opening Templates\xyz_DE_0615_free.pdf
DEBUG: Writing Migration\abc_3-migrated.pdf
PERFORMANCE: [OVERALL completed in 660ms]
DEBUG: Opening Migration\abc_4.pdf
DEBUG: Opening Templates\xyz_EN_0615_free.pdf
null
DEBUG: Opening Migration\abc_5.pdf
DEBUG: Opening Templates\xyz_EN_0615_free.pdf
null
DEBUG: Opening Migration\abc_6_DE.pdf
DEBUG: Opening Templates\xyz_DE_0615_free.pdf
Field not available: Text6
Field not available: Text7
Field not available: Text8
Field not available: Text9
Field not available: Text10
Field not available: Text11
DEBUG: Writing Migration\abc_6-migrated.pdf
PERFORMANCE: [OVERALL completed in 686ms]
null
%EOF
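Between the lines that make up a final 4-tuple there may be any number of other lines, which can either be skipped or added to a list of invalid log entries. As the code below checks it, a valid 4-tuple is four consecutive non-skipped lines that each start with "DEBUG: Opening", "DEBUG: Writing" or "PERFORMANCE:", with the "PERFORMANCE:" line coming last. These simple selection criteria are hard-coded in the code below.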

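The logfile should then be split into valid and invalid entries, including their line numbers. Run against the sample above, the current program prints: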

Statistics: Valid[tuples]=4 Valid[lines]=16 Invalid[lines]=8 Skipped[lines]=17 Total[lines]=41
----------------------[VALID]----------------------
key=6 value=DEBUG: Opening Migration\abc_1.pdf
key=7 value=DEBUG: Opening Templates\xyz_DE_0615_kostenlos.pdf
key=12 value=DEBUG: Writing Migration\abc_1-migrated.pdf
key=13 value=PERFORMANCE: [OVERALL completed in 2303ms]
key=14 value=DEBUG: Opening Migration\abc_2_DE.pdf
key=15 value=DEBUG: Opening Templates\xyz_DE_0615_free.pdf
key=18 value=DEBUG: Writing Migration\abc_2_DE-migrated.pdf
key=19 value=PERFORMANCE: [OVERALL completed in 756ms]
key=20 value=DEBUG: Opening Migration\abc_3_DE.pdf
key=21 value=DEBUG: Opening Templates\xyz_DE_0615_free.pdf
key=22 value=DEBUG: Writing Migration\abc_3-migrated.pdf
key=23 value=PERFORMANCE: [OVERALL completed in 660ms]
key=30 value=DEBUG: Opening Migration\abc_6_DE.pdf
key=31 value=DEBUG: Opening Templates\xyz_DE_0615_free.pdf
key=38 value=DEBUG: Writing Migration\abc_6-migrated.pdf
key=39 value=PERFORMANCE: [OVERALL completed in 686ms]
----------------------[VALID]----------------------
----------------------[INVALID]----------------------
key=24 value=DEBUG: Opening Migration\abc_4.pdf
key=25 value=DEBUG: Opening Templates\xyz_EN_0615_free.pdf
key=26 value=null
key=27 value=DEBUG: Opening Migration\abc_5.pdf
key=28 value=DEBUG: Opening Templates\xyz_EN_0615_free.pdf
key=29 value=null
key=40 value=null
key=41 value=%EOF
----------------------[INVALID]----------------------
My approach is as follows:

import org.testng.annotations.Test;

import java.io.*;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AnalyseMigrationLog {

    public class RingMap<K, V> extends LinkedHashMap<K, V> {
        private int cacheSize;

        public RingMap(int cacheSize) {
            super(cacheSize);
            this.cacheSize = cacheSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > cacheSize;
        }
    }

    @Test
    public void doAnalysis() throws IOException {
        final String logfile = "./run-simple.log";
        final int ringSize = 4;
        int lc = 0;
        int skipped = 0;
        Long count;
        String line;
        Map<Integer, String> circularFifo = new RingMap<>(ringSize);
        Map<Integer, String> validTuples = new LinkedHashMap<>();
        Map<Integer, String> invalidTuples = new LinkedHashMap<>();

        FileReader     fre = new FileReader(logfile);
        BufferedReader bre = new BufferedReader(fre);
        while ((line = bre.readLine ()) != null) {
            lc++;
            if (line.matches("^(FILE \\(insert\\):|WARNUNG|Field not available).*") || line.endsWith("<init>")) {
                skipped++;
                continue;
            }
            circularFifo.put(lc, line);
            if (circularFifo.size() < ringSize)
                continue;

            count = circularFifo.values().stream().
                    filter(p -> p.matches("^(DEBUG: Opening|DEBUG: Writing|PERFORMANCE:).*")).count();

            // Get the LRU entry in the circular fifo
            List<Map.Entry<Integer, String>> entryList = new ArrayList<>(circularFifo.entrySet());
            Map.Entry<Integer, String> lastEntry = entryList.get(entryList.size() - 1);

            if (count == ringSize && lastEntry.getValue().startsWith("PERFORMANCE:")) {
                validTuples.putAll(circularFifo);
                // Remove already pushed entries from invalidTuples list to avoid duplicate entries
                circularFifo.forEach((key, value) -> invalidTuples.remove(key));
                circularFifo.clear();
            } else {
                invalidTuples.putAll(circularFifo);
            }
        }
        // Put in the last entries that didn't fill up the circular fifo anymore.
        invalidTuples.putAll(circularFifo);
        bre.close();
        fre.close();

        System.out.printf("Statistics: Valid[tuples]=%s Valid[lines]=%s Invalid[lines]=%s Skipped[lines]=%s Total[lines]=%s%n",
                validTuples.size()/ringSize, validTuples.size(), invalidTuples.size(), skipped, lc);

        System.out.printf("----------------------[VALID]----------------------%n");
        validTuples.forEach((key, value) -> System.out.printf("key=%s value=%s%n", key, value));
        System.out.printf("----------------------[VALID]----------------------%n");

        System.out.printf("----------------------[INVALID]----------------------%n");
        invalidTuples.forEach((key, value) -> System.out.printf("key=%s value=%s%n", key, value));
        System.out.printf("----------------------[INVALID]----------------------%n");
    }
}
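The basic trick is to introduce a circular FIFO for this task. It is short, fast and works well, but I wonder whether it could be converted more fully to Java 8 features, for example by using NIO2 and proper stream techniques. I don't want to use Guava or any other over-engineered library for such a simple task.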

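Right now, I particularly dislike the way the LRU entry (that is, the last entry of the circular FIFO) is obtained above. How would I extend the inner class and use something like the following instead: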

public class RingMap<K, V> extends LinkedHashMap<K, V> {
    private int cacheSize;

    public RingMap(int cacheSize) {
        super(cacheSize);
        this.cacheSize = cacheSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > cacheSize;
    }

    //TODO: how exactly would this work?
    public <K, V> Map.Entry<K,V> getLast(LinkedHashMap<K, V> map) {
        Map.Entry<K, V> result = null;
        for (Map.Entry<K, V> kvEntry : map.entrySet()) {
            result = kvEntry;
        }
        return result;
    }
}
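One possible answer to the TODO above, as a minimal sketch rather than a definitive design: the method can drop both its own `<K, V>` type parameters (which shadow the class's) and the `map` argument, and simply walk the map's own `entrySet()`. The no-argument `getLast()` signature is an assumption made for illustration; the class is declared package-private here only so the snippet compiles on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Bounded, insertion-ordered map: once more than cacheSize entries are
// present, the eldest (first inserted) entry is evicted automatically.
class RingMap<K, V> extends LinkedHashMap<K, V> {
    private final int cacheSize;

    RingMap(int cacheSize) {
        super(cacheSize);
        this.cacheSize = cacheSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > cacheSize;
    }

    // Returns the most recently inserted entry, or null if the map is empty.
    // Assumption: no method-level <K, V>, so the class's type parameters apply.
    Map.Entry<K, V> getLast() {
        Map.Entry<K, V> last = null;
        for (Map.Entry<K, V> entry : entrySet()) {
            last = entry; // iteration follows insertion order, so the final one wins
        }
        return last;
    }
}
```

With this in place, the two lines that copy the entry set into an `ArrayList` could become `Map.Entry<Integer, String> lastEntry = circularFifo.getLast();`, provided `circularFifo` is declared as `RingMap<Integer, String>` instead of `Map<Integer, String>`. The loop is still O(n), which is negligible for a ring of four entries.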
Next, I would really like to make use of NIO2 features, but I don't see how best to integrate them into my solution. Roughly along these lines:

@Test
public void doAnalysisNIO2() throws IOException {
    final String logfile = "./run-simple.log";

    Path path = Paths.get(logfile);
    try (Stream<String> filteredLines = Files.lines(path, StandardCharsets.UTF_8)
            .onClose(() -> System.out.println("Stream has been closed!"))
            .filter(s -> !(s.matches("^(FILE \\(insert\\):|WARNUNG|Field not available).*") ||
                           s.endsWith("<init>")))) {
        // Do the same thing as in the other code
        filteredLines.forEach((l) -> System.out.printf("line = %s%n", l));
    }
}
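To make the "do the same thing as in the other code" step concrete, here is a hedged sketch of how the ring logic might be fed from `Files.lines`. The class `TupleCollector` and its methods are hypothetical names, not part of the original post, and an `ArrayDeque` is swapped in for the `LinkedHashMap`-based ring. Instead of adding whole windows to the invalid list and removing entries again later, a line is recorded as invalid only when it falls out of the window without having been part of a valid tuple.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper (not from the original post): collects valid 4-tuples.
class TupleCollector {
    private static final Pattern SKIP =
            Pattern.compile("^(FILE \\(insert\\):|WARNUNG|Field not available).*");
    private static final Pattern WANTED =
            Pattern.compile("^(DEBUG: Opening|DEBUG: Writing|PERFORMANCE:).*");
    private static final int RING_SIZE = 4;

    final Map<Integer, String> valid = new LinkedHashMap<>();
    final Map<Integer, String> invalid = new LinkedHashMap<>();
    int skipped = 0;

    // Sliding window over the last RING_SIZE non-skipped lines (line number -> text).
    private final Deque<Map.Entry<Integer, String>> window = new ArrayDeque<>(RING_SIZE);
    private int lineNo = 0;

    void accept(String line) {
        lineNo++;
        if (SKIP.matcher(line).matches() || line.endsWith("<init>")) {
            skipped++;
            return;
        }
        if (window.size() == RING_SIZE) {
            // The oldest line leaves the window without having completed a tuple.
            Map.Entry<Integer, String> evicted = window.removeFirst();
            invalid.put(evicted.getKey(), evicted.getValue());
        }
        window.addLast(new SimpleImmutableEntry<>(lineNo, line));

        boolean complete = window.size() == RING_SIZE
                && window.peekLast().getValue().startsWith("PERFORMANCE:")
                && window.stream().allMatch(e -> WANTED.matcher(e.getValue()).matches());
        if (complete) {
            window.forEach(e -> valid.put(e.getKey(), e.getValue()));
            window.clear();
        }
    }

    // Lines still in the window at end of input never completed a tuple.
    void finish() {
        window.forEach(e -> invalid.put(e.getKey(), e.getValue()));
        window.clear();
    }

    // Line numbering depends on encounter order, so the stream must be sequential.
    static TupleCollector scan(Stream<String> lines) {
        TupleCollector collector = new TupleCollector();
        lines.forEachOrdered(collector::accept);
        collector.finish();
        return collector;
    }

    static TupleCollector scan(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return scan(lines);
        }
    }
}
```

A call such as `TupleCollector.scan(Paths.get("./run-simple.log"))` would then replace the manual `FileReader`/`BufferedReader` handling, and the `valid`, `invalid` and `skipped` fields can be printed as before. Because the classification depends on neighbouring lines, the per-line state cannot be expressed as a pure `filter`/`map` pipeline, which is why this sketch keeps a small stateful consumer.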