Packet counting in Hadoop (using MapReduce)


What I have done so far:

Installed Hadoop from the following link:

Installed hping3 to generate flood requests with the following command:

sudo hping3 -c 10000 -d 120 -S -w 64 -p 8000 --flood --rand-source 192.168.1.12

Installed Snort to log the above requests with the following command:

sudo snort -ved -h 192.168.1.0/24 -l .
This generates the log file snort.log.1427021231

which I can read with

sudo snort -r snort.log.1427021231
This gives output of the form:

=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

03/22-16:17:14.259633 192.168.1.12:8000 -> 117.247.194.105:46639
TCP TTL:64 TOS:0x0 ID:0 IpLen:20 DgmLen:44 DF
***A**S* Seq: 0x6EEE4A6B  Ack: 0x6DF6015B  Win: 0x7210  TcpLen: 24
TCP Options (1) => MSS: 1460
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+


I used

hdfs dfs -put <localsrc> ... <dst>


Thanks.

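After a quick search, it looks like you may need a custom MapReduce job to accomplish this.

The algorithm would look something like the following pseudocode: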

Parse the file line by line (or parse every n lines if a log record spans more than one line).

In the mapper, use a regex to figure out whether something is a source IP, destination IP, etc.

Output these with a key-value structure of <type, count>
    type is the type of text that was matched (e.g. source IP)
    count is the number of times it was matched in the record

Have the reducer sum all of the values from the mappers to get global totals for each type of information you want.

Write to file in the desired format.
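
The pseudocode above could be sketched in Python roughly as follows. This is only an illustration under some assumptions: the decoded text output of `snort -r` has been saved as a plain text file (the snort.log.* file itself is a binary capture), each record starts with a header line shaped like the sample in the question, and the names `HEADER`, `mapper`, `reducer` and `run_local` are made up for this example. Instead of a generic "source IP" type, it keys on each individual address (prefixed with `src:` or `dst:`) so the totals are per IP.

```python
import re
from itertools import groupby

# Assumed shape of a record header line, taken from the sample in the question:
# "03/22-16:17:14.259633 192.168.1.12:8000 -> 117.247.194.105:46639"
HEADER = re.compile(
    r"^\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d+\s+"
    r"(?P<src>\d{1,3}(?:\.\d{1,3}){3})(?::\d+)?\s*->\s*"
    r"(?P<dst>\d{1,3}(?:\.\d{1,3}){3})(?::\d+)?"
)


def mapper(lines):
    """Emit (key, 1) pairs for every record header line; skip all other lines."""
    for line in lines:
        match = HEADER.match(line.strip())
        if match:
            yield ("src:" + match.group("src"), 1)
            yield ("dst:" + match.group("dst"), 1)
            yield ("packets", 1)


def reducer(pairs):
    """Sum the counts per key. Input must be sorted by key, as after the shuffle."""
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield (key, sum(count for _, count in group))


def run_local(lines):
    """Simulate map -> sort -> reduce in a single process, for testing."""
    return dict(reducer(sorted(mapper(lines))))
```

To run something like this on the cluster, one option is Hadoop Streaming: the mapper would read lines from `sys.stdin` and print `key<TAB>count`, and the reducer would read those sorted lines back and print the totals. The local `run_local` helper is just a way to check the regex against a few lines of your own log before submitting a job.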