sed or awk: group by paragraph, where each sub-paragraph consists of lines 2 through n+1


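I need to count identical sub-paragraphs in a thread dump. I have not been able to use sed to extract lines 2 through n+1 of each paragraph. An awk solution would be fine too.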

For example, given the following sample threaddump.txt:

"RMI TCP Accept-0" Id=11 RUNNABLE (in native)
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
    at java.net.ServerSocket.implAccept(ServerSocket.java:545)
    at java.net.ServerSocket.accept(ServerSocket.java:513)
    at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:52)
    at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:400)
    at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:372)
    at java.lang.Thread.run(Thread.java:745)

"AMQP Connection 10.170.10.128:5672" Id=227 RUNNABLE (in native)
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:288)
    at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95)
    at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:139)
    at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:536)
    at java.lang.Thread.run(Thread.java:745)

"http-bio-10.104.42.237-16210-exec-12" Id=90 RUNNABLE (in native)
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:534)
    at org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:519)
    at org.apache.coyote.http11.Http11Processor.setRequestLineReadTimeout(Http11Processor.java:174)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1048)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:637)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:318)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:745)

"Signal Dispatcher" Id=6 RUNNABLE

"kafcli-poller-10" Id=277 RUNNABLE (in native)
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at org.apache.kafka.common.network.Selector.select(Selector.java:686)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:408)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:460)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:261)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1171)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1115)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

"localhost-startStop-1-SendThread(zk0007.svc.prod.wd1.wd:2181)" Id=59 RUNNABLE (in native)
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:345)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214)
If n=3, the output would be as follows. Note the count at the start of each sub-stack:

2   at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)

2   at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)

1   at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
    at java.net.ServerSocket.implAccept(ServerSocket.java:545)
because

at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
appears twice in the thread dump, and so on for the other sub-stacks.

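This is a three-step process:

1. Extract all RUNNABLE paragraphs, i.e. the runnable stacks. This works with:
   cat threaddump.txt | sed -e '/./{H;$!d;}' -e 'x;/RUNNABLE/!d;' > RUNNABLE.txt

2. For each stack (paragraph), extract lines 2 through n+1. I tried many variations of the following, using sed's q command to select lines, without success. I won't list all the other attempts based on it. awk would also be fine, but I could not translate the hold-space approach from sed to awk.
   cat RUNNABLE.txt | sed -e '/./{H;$!d;}' -e 'x/{2q}/!d;'

3. Finally, group by sub-paragraph. I haven't gotten that far yet, but my plan is to collapse each sub-stack into a single line by removing the newlines, then use sort followed by uniq -c.

The following should do it: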

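# extract lines 2..n+1 of each paragraph, ending each group with a NUL byte
# (input file assumed to be RUNNABLE.txt from step 1)
awk -v RS='' -v FS='\n' -v n=3 'NF > n { for (i = 2; i <= n + 1; ++i) print $i; printf "%c", "\0" }' RUNNABLE.txt |
# count identical groups, most frequent first (-z: NUL-delimited records, GNU tools)
sort -z | uniq -zc | sort -zrnk1 |
# some messy output formatting (GNU sed, util-linux column)
sed 's/\x00//g; s/^ *\([0-9]\+\) */#\n\1#/; 1s/^#\n//; s/^ *at/#at/' | column -t -s'#' -o '   '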
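The record separator is set to the empty string, which makes awk read one paragraph at a time, since paragraphs are separated by blank lines. The field separator is set to a newline, so within each paragraph every line is available as its own $num variable. I then print only lines 2 through n+1 of each paragraph and terminate each group with a zero byte.
sort -z | uniq -zc then computes the counts on those NUL-delimited groups.
sort -zrnk1 then sorts them in descending order by the number that uniq prepended.
Finally, a messy sed piped into column produces nicely aligned output.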
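If NUL-delimited sort and uniq are not available, the same grouping can probably be done inside awk alone. The sketch below is an alternative technique, not the pipeline above: it joins lines 2 through n+1 into one multi-line key and counts keys in an associative array. The function name count_substacks and its output layout (count on its own line, then the sub-stack, then a blank line) are assumptions made for illustration.

```shell
# Hypothetical helper: count identical sub-stacks (lines 2..n+1 of each paragraph).
# Usage: count_substacks N FILE
count_substacks() {
    awk -v RS='' -v FS='\n' -v n="$1" '
        NF > n {
            # Join lines 2..n+1 into a single multi-line key.
            key = $2
            for (i = 3; i <= n + 1; ++i) key = key "\n" $i
            count[key]++
        }
        END {
            # Note: the iteration order of "for (k in count)" is unspecified,
            # so the groups are not sorted by count here.
            for (k in count) printf "%d\n%s\n\n", count[k], k
        }
    ' "$2"
}
```

For example, count_substacks 3 RUNNABLE.txt would print each distinct three-line sub-stack once, preceded by how often it occurs. Sorting by count would still need an extra step, which is one reason the sort/uniq pipeline may be the more convenient choice when GNU tools are at hand.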
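I tried awk, but I could not carry sed's hold-space approach over to awk.
Do you have GNU sed/awk/tools?
Yes, I do. By the way, the grouping "key" in this case is multi-line, so the count is the number of times that multi-line pattern occurs.
Thanks! It works just as you said. I'm going through it now to understand what you did.

Output of the pipeline for n=3:
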
2   at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)

2   at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)

1   at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
    at java.net.ServerSocket.implAccept(ServerSocket.java:545)
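
Regarding the difficulty of porting the sed hold-space idiom to awk: the first step of the question (keeping only the RUNNABLE paragraphs) can likely be written in awk's paragraph mode as well. This is a minimal sketch; the function name runnable_paragraphs is hypothetical, and like the sed version it matches RUNNABLE anywhere in the paragraph, not only in the header line.

```shell
# Hypothetical helper: print only the paragraphs that contain RUNNABLE.
# Usage: runnable_paragraphs FILE
runnable_paragraphs() {
    # RS=''      reads one blank-line-separated paragraph per record.
    # ORS='\n\n' writes a blank line after every printed paragraph.
    awk -v RS='' -v ORS='\n\n' '/RUNNABLE/' "$1"
}
```

Something like runnable_paragraphs threaddump.txt > RUNNABLE.txt would then take the place of the sed command in step 1.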