Gnuplot: clustered histograms (bar charts) with one data-file row per category

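I am trying to generate clustered histograms with gnuplot from the following data file, in which each category occupies its own row for each year:

# datafile
year category num_of_events
2011 "Category 1" 213
2011 "Category 2" 240
2011 "Category 3" 220
2012 "Category 1" 222
2012 "Category 2" 238
...

But I don't know how to do this with one row per category. I would be glad if anybody knows how to achieve this with gnuplot.

Stacked histogram clusters / stacked bar charts

Even better would be stacked histogram clusters, where the stacked subcategories are represented by separate columns in the data file:

# datafile
year category num_of_events_for_A num_of_events_for_B
2011 "Category 1" 213 30
2011 "Category 2" 240 28
2011 "Category 3" 220 25
2012 "Category 1" 222 13
2012 "Category 2" 238 42
...

Thanks a lot in advance!
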
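After some research, I came up with two different solutions.

Required: splitting the data file

Both solutions require splitting the data file into several files, grouped by the values of one column. I therefore wrote a short Ruby script for this, which can be found in this gist:

The script is used as follows. To split the data file data.csv into data.Category1.csv and data.Category2.csv, call: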

# bash
ruby categorize_csv.rb --column 2 data.csv

# data.csv
# year category num_of_events_for_A num_of_events_for_B
"2011";"Category1";"213";"30"
"2011";"Category2";"240";"28"
"2012";"Category1";"222";"13"
"2012";"Category2";"238";"42"
...

# data.Category1.csv
# year category num_of_events_for_A num_of_events_for_B
"2011";"Category1";"213";"30"
"2012";"Category1";"222";"13"
...

# data.Category2.csv
# year category num_of_events_for_A num_of_events_for_B
"2011";"Category2";"240";"28"
"2012";"Category2";"238";"42"
...

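The original categorize_csv.rb is not reproduced here, so the following is only a minimal sketch of what such a splitter might look like. The function name `split_by_column`, the positional command-line arguments (instead of the original's `--column` flag), and the assumption that no `;` occurs inside a quoted field are all hypothetical choices made for illustration.

```ruby
# Hypothetical stand-in for categorize_csv.rb; NOT the original script.
# Assumes ';'-separated fields with no ';' inside quoted values.

# Split `path` into one file per distinct value found in the 1-based `column`.
# Comment lines (starting with '#') are copied into every output file.
# Returns the list of files written, in order of first appearance.
def split_by_column(path, column, sep: ";")
  ext  = File.extname(path)   # e.g. ".csv"
  base = path.chomp(ext)      # e.g. "data"
  comments = []
  groups = Hash.new { |hash, key| hash[key] = [] }

  File.foreach(path) do |raw|
    line = raw.chomp
    next if line.strip.empty?
    if line.start_with?("#")
      comments << line
      next
    end
    # Strip surrounding double quotes so "Category1" becomes Category1.
    fields = line.split(sep).map { |f| f.strip.gsub(/\A"|"\z/, "") }
    groups[fields[column - 1]] << line
  end

  groups.map do |value, lines|
    out = "#{base}.#{value}#{ext}"
    File.write(out, (comments + lines).join("\n") + "\n")
    out
  end
end

# Command-line use (positional arguments, unlike the original's --column flag):
#   ruby split_by_column.rb 2 data.csv
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  puts split_by_column(ARGV[1], Integer(ARGV[0]))
end
```

Splitting on column 2 would give the per-category files (data.Category1.csv, data.Category2.csv); splitting on column 1 would presumably give per-year files such as data.2011.csv.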
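Solution 1: stacked box plot

Strategy: one data file per category, one column per stack. The bars of the histogram are drawn "manually" using gnuplot's "with boxes" plot style.
Upside: full flexibility regarding bar sizes, caps, colors, etc.
Downside: the bars have to be positioned manually.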

# solution1.gnuplot
reset
set terminal postscript eps enhanced 14

set datafile separator ";"

set output 'stacked_boxes.eps'

set auto x
set yrange [0:300]
set xtics 1

set style fill solid border -1

num_of_categories=2
set boxwidth 0.3/num_of_categories
dx=0.5/num_of_categories
offset=-0.1

# For each category, draw the full stack height first, then the lower box over it.
plot 'data.Category1.csv' using ($1+offset):($3+$4) title "Category 1 A" linecolor rgb "#cc0000" with boxes, \
     '' using ($1+offset):3 title "Category 1 B" linecolor rgb "#ff0000" with boxes, \
     'data.Category2.csv' using ($1+offset+dx):($3+$4) title "Category 2 A" linecolor rgb "#00cc00" with boxes, \
     '' using ($1+offset+dx):3 title "Category 2 B" linecolor rgb "#00ff00" with boxes

The result looks like this:

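Solution 2: native gnuplot histogram

Strategy: one data file per year, one column per stack. The histogram is produced with gnuplot's regular histogram mechanism.
Upside: easier to use, since the positioning does not have to be done manually.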

Downside: since all categories are in one file, every category has the same color.

# solution2.gnuplot
reset
set terminal postscript eps enhanced 14

set datafile separator ";"

set output 'histo.eps'
set yrange [0:300]

set style data histogram
set style histogram rowstack gap 1
set style fill solid border -1
set boxwidth 0.5 relative

plot newhistogram "2011", \
     'data.2011.csv' using 3:xticlabels(2) title "A" linecolor rgb "red", \
     '' using 4:xticlabels(2) title "B" linecolor rgb "green", \
     newhistogram "2012", \
     'data.2012.csv' using 3:xticlabels(2) title "" linecolor rgb "red", \
     '' using 4:xticlabels(2) title "" linecolor rgb "green", \
     newhistogram "2013", \
     'data.2013.csv' using 3:xticlabels(2) title "" linecolor rgb "red", \
     '' using 4:xticlabels(2) title "" linecolor rgb "green"

The result looks like this:

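Thank you very much, @fiedl! Based on your solution #1, I was able to build my own stacked/clustered histogram with more than two stacked subcategories.

Here is my code: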

set terminal pngcairo transparent enhanced font "arial,10" fontscale 1.0 size 600, 400
set output 'runtimes.png'

set xtics("1" 1, "2" 2, "4" 3, "8" 4)
set yrange [0:100]

set style fill solid border -1
set key invert
set grid

num_of_ksptypes=2
set boxwidth 0.5/num_of_ksptypes
dx=0.5/num_of_ksptypes
offset=-0.12

set xlabel "threads"
set ylabel "seconds"

# Each stack is drawn tallest box first, so the shorter boxes cover its lower part.
plot 'data1.dat' using ($1+offset):($2+$3+$4+$5) title "SDO" linecolor rgb "#006400" with boxes, \
     '' using ($1+offset):($3+$4+$5) title "BGM" linecolor rgb "#FFFF00" with boxes, \
     '' using ($1+offset):($4+$5) title "TSQR" linecolor rgb "#FFA500" with boxes, \
     '' using ($1+offset):5 title "SpMV" linecolor rgb "#FF0000" with boxes, \
     'data2.dat' using ($1+offset+dx):($2+$3) title "MGS" linecolor rgb "#8B008B" with boxes, \
     '' using ($1+offset+dx):3 title "SpMV" linecolor rgb "#0000FF" with boxes

data1.dat:

nr SDO BGM TSQR SpMV
1 10 15 20 25
2 10 10 10 10
3 10 10 10 10
4 10 10 10 10

data2.dat:

nr MGS SpMV
1 23 13
2 23 13
3 23 13
4 23 13

Resulting plot:

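These are really just vertical bar charts. I wouldn't call something a histogram unless I have an array of items and the code has to bin and count them. Possibly related:

Thanks @Paul, I have modified the title accordingly.

This might interest you. I also think that splitting the data file is the only feasible way to create these histograms. By the way: you can also accept your own answer :)

Thanks @Christoph, I thought I would wait a few days; maybe someone can add a better approach.

Have a look at this: maybe the colors in solution 1 can be made different.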