R: Error in draw.quad.venn, "Impossible: produces negative area", even though the numbers are correct

I am trying to generate a four-way Venn diagram with draw.quad.venn from the VennDiagram package in R, but it keeps throwing this error message:

ERROR [2019-05-14 11:28:24] Impossible: a7  <- n234 - a6 produces negative area
Error in draw.quad.venn(length(gene_lists[[1]]), length(gene_lists[[2]]),  : 
  Impossible: a7  <- n234 - a6 produces negative area

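Nothing I tried resolved the error from draw.quad.venn in the VennDiagram package. There is something wrong with the way it is written. A Venn diagram is valid as long as all the numbers inside each of the 4 ellipses add up to the total number of elements in that particular list. For some reason, VennDiagram only accepts data in which intersections of fewer sets have larger counts; for example, the intersection of groups 1, 2 and 3 must be larger than the intersection of all 4 groups. This does not represent real-world data. It is entirely possible for groups 1, 2 and 3 to share nothing on their own while all 4 groups do intersect. In a Venn diagram, all the numbers are independent and give the number of elements common to each intersection. They do not need to have any relationship to one another.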
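To make the "numbers in each ellipse add up to the list size" check concrete, here is a minimal base R sketch. The mock sets and the names `lists`, `member`, `signature` and `region_counts` are my own assumptions for illustration; the counts it produces are exclusive region counts (elements belonging to exactly that combination of sets).

```r
# Mock data: four sets drawn without replacement, so no set has duplicates
set.seed(1)
lists <- list(
  G1 = sample(1:200, 50),
  G2 = sample(1:200, 40),
  G3 = sample(1:200, 30),
  G4 = sample(1:200, 20)
)

# One row per distinct element, one logical column per set
universe <- unique(unlist(lists))
member <- sapply(lists, function(s) universe %in% s)

# Encode each element's membership as a string such as "1011"
# (in G1, not in G2, in G3, in G4) and count elements per pattern
signature <- apply(member, 1, function(r) paste(as.integer(r), collapse = ""))
region_counts <- table(signature)
print(region_counts)

# Sum of the exclusive regions that lie inside each set
set_totals <- sapply(seq_along(lists), function(i) {
  inside <- substr(names(region_counts), i, i) == "1"
  sum(region_counts[inside])
})
names(set_totals) <- names(lists)
print(set_totals)
```

Each entry of `set_totals` should equal the length of the corresponding list, which is the validity check described above.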

I looked at the eulerr package, but then found a very simple way to draw the Venn diagram using venn from gplots, as follows:

# simple 4-way Venn diagram using gplots
library(gplots)
# get some mock data
values <- 1:20000
list_1 <- sample(values, size = 5000, replace = FALSE)
list_2 <- sample(values, size = 4000, replace = FALSE)
list_3 <- sample(values, size = 3000, replace = FALSE)
list_4 <- sample(values, size = 2000, replace = FALSE)
lists <- list(list_1, list_2, list_3, list_4)
# name the list (required for gplots)
names(lists) <- c("G1", "G2", "G3", "G4")
# get the venn table
v.table <- venn(lists)
# show venn table
print(v.table)
# plot Venn diagram
plot(v.table)

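I have looked at the package's source code. In case you are still interested in the cause of the error: there are two ways to pass data to venn.diagram. One is the nxxx form (e.g., n134) and the other is the an form (e.g., a5). In the example, n134 means "the elements that belong to at least groups 1, 3 and 4". a5, on the other hand, means "the elements that belong only to groups 1, 3 and 4". The relationship between the two forms is really convoluted; for instance, a6 corresponds to n1234. This means that n134 = a5 + a6. The problem is that calculate.overlap gives its numbers in the an form, while by default draw.quad.venn expects numbers in the nxxx form. To use the values from calculate.overlap, you can set direct.area to TRUE and supply the result of calculate.overlap in the area.vector parameter. For example,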

library(VennDiagram)
# elements of each exclusive region (a1 ... a15) for the four sets
tmp <- calculate.overlap(list(a = c(1, 2, 3, 4, 10), b = c(3, 4, 5, 6), c = c(4, 6, 7, 8, 9), d = c(4, 8, 1, 9)))
# size of each region
overlap_values <- lapply(tmp, length)
# pass the region sizes directly, in the order a1 ... a15
draw.quad.venn(area.vector = c(overlap_values$a1, overlap_values$a2, overlap_values$a3, overlap_values$a4,
                               overlap_values$a5, overlap_values$a6, overlap_values$a7, overlap_values$a8,
                               overlap_values$a9, overlap_values$a10, overlap_values$a11, overlap_values$a12,
                               overlap_values$a13, overlap_values$a14, overlap_values$a15),
               direct.area = TRUE, category = c('a', 'b', 'c', 'd'))
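The relationship between the two forms can be checked by hand in base R. This is only a sketch using the same toy sets as above; the names `s1` to `s4` and `bad_a7` are mine, and only the three relations stated in this thread (a6 = n1234, n134 = a5 + a6, a7 = n234 - a6) are used. It also shows how a negative area arises when an exclusive count is passed where an nxxx count is expected, as with the n234 = 0 and a6 = 1 mentioned in the comments.

```r
# Same toy sets as in the calculate.overlap example
s1 <- c(1, 2, 3, 4, 10)
s2 <- c(3, 4, 5, 6)
s3 <- c(4, 6, 7, 8, 9)
s4 <- c(4, 8, 1, 9)

# nxxx form: size of the plain intersection ("at least these groups")
n134  <- length(Reduce(intersect, list(s1, s3, s4)))
n234  <- length(Reduce(intersect, list(s2, s3, s4)))
n1234 <- length(Reduce(intersect, list(s1, s2, s3, s4)))

# an form: exclusive regions derived from the nxxx counts
a6 <- n1234        # elements in all four groups
a5 <- n134 - a6    # elements only in groups 1, 3 and 4
a7 <- n234 - a6    # elements only in groups 2, 3 and 4

cat("n134 =", n134, " a5 =", a5, " a6 =", a6, "\n")
cat("n234 =", n234, " a7 =", a7, "\n")

# Passing an exclusive count (0 elements only in groups 2, 3 and 4)
# where the nxxx count is expected makes the same subtraction go negative
bad_a7 <- 0 - a6
cat("a7 computed from an exclusive count:", bad_a7, "\n")
```

Because every element of the four-way intersection also lies in the intersection of groups 2, 3 and 4, a true n234 can never be smaller than n1234, so a negative a7 suggests that exclusive counts were supplied in place of nxxx counts.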
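With nVennR (see the code at the end of this post), the last command, plotVenn(nVennObj = myV), is repeated on purpose. The result is a diagram of the four sets.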

You can then explore the intersections:

> getVennRegion(myV, c('g1', 'g2', 'g4'))
[1] "NM_000139" "NM_000173" "NM_000208" "NM_000316" "NM_000318" "NM_000450" "NM_000539"

More information is available in the package documentation.

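Try another package to check whether something is wrong with the data. I prefer eulerr, see the example:
library(eulerr); f1 …

According to the error, n234 - a6 produces a negative value. Can you check your n234 and a6 values?

It is hard to reproduce the error exactly without using my gene lists. As for the numbers, n234 is 0 and a6 (n1234) is 1. This simply means that no genes are found only in groups 2, 3 and 4, while 1 gene is found in all 4 groups. The regions should not affect one another, though. The totals in each circle should add up to the size of each gene list, and they all do; I have added these to the picture.

As mentioned before, if we cannot get the same error, there is not much we can do. I suggest trying another package and, if that works, perhaps filing an issue with the author.

I have added a working example to the question that uses only the numbers 1:10000, and it also throws this error, albeit for a different group. Could this be a problem with the package itself? I will try the eulerr package on my data.
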
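Since the eulerr example in this thread is cut off, here is a minimal sketch of what a four-set plot with eulerr might look like. It is not the original poster's code: the mock data and the names `lists` and `fit` are assumptions, and it relies only on `euler()` accepting a named list of sets and on the `quantities` argument of its plot method.

```r
library(eulerr)

# Mock data, similar in spirit to the gplots example above
set.seed(42)
values <- 1:20000
lists <- list(
  G1 = sample(values, 5000),
  G2 = sample(values, 4000),
  G3 = sample(values, 3000),
  G4 = sample(values, 2000)
)

# Fit an area-proportional diagram from the raw sets;
# eulerr works out the overlaps itself, so no nxxx or an counts are needed
fit <- eulerr::euler(lists)

# Draw the diagram with the count of each region printed inside it
plot(fit, quantities = TRUE)
```

An area-proportional layout with four sets cannot always show every region, so it is worth comparing the printed quantities against counts computed directly from the lists.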
library(nVennR)
g1 <- c('AF029684', 'M28825', 'M32074', 'NM_000139', 'NM_000173', 'NM_000208', 'NM_000316', 'NM_000318', 'NM_000450', 'NM_000539', 'NM_000587', 'NM_000593', 'NM_000638', 'NM_000655', 'NM_000789', 'NM_000873', 'NM_000955', 'NM_000956', 'NM_000958', 'NM_000959', 'NM_001060', 'NM_001078', 'NM_001495', 'NM_001627', 'NM_001710', 'NM_001716')
g2 <- c('NM_001728', 'NM_001835', 'NM_001877', 'NM_001954', 'NM_001992', 'NM_002001', 'NM_002160', 'NM_002162', 'NM_002258', 'NM_002262', 'NM_002303', 'NM_002332', 'NM_002346', 'NM_002347', 'NM_002349', 'NM_002432', 'NM_002644', 'NM_002659', 'NM_002997', 'NM_003032', 'NM_003246', 'NM_003247', 'NM_003248', 'NM_003259', 'NM_003332', 'NM_003383', 'NM_003734', 'NM_003830', 'NM_003890', 'NM_004106', 'AF029684', 'M28825', 'M32074', 'NM_000139', 'NM_000173', 'NM_000208', 'NM_000316', 'NM_000318', 'NM_000450', 'NM_000539')
g3 <- c('NM_000655', 'NM_000789', 'NM_004107', 'NM_004119', 'NM_004332', 'NM_004334', 'NM_004335', 'NM_004441', 'NM_004444', 'NM_004488', 'NM_004828', 'NM_005214', 'NM_005242', 'NM_005475', 'NM_005561', 'NM_005565', 'AF029684', 'M28825', 'M32074', 'NM_005567', 'NM_003734', 'NM_003830', 'NM_003890', 'NM_004106', 'AF029684', 'NM_005582', 'NM_005711', 'NM_005816', 'NM_005849', 'NM_005959', 'NM_006138', 'NM_006288', 'NM_006378', 'NM_006500', 'NM_006770', 'NM_012070', 'NM_012329', 'NM_013269', 'NM_016155', 'NM_018965', 'NM_021950', 'S69200', 'U01351', 'U08839', 'U59302')
g4 <- c('NM_001728', 'NM_001835', 'NM_001877', 'NM_001954', 'NM_005214', 'NM_005242', 'NM_005475', 'NM_005561', 'NM_005565', 'ex1', 'ex2', 'NM_003890', 'NM_004106', 'AF029684', 'M28825', 'M32074', 'NM_000139', 'NM_000173', 'NM_000208', 'NM_000316', 'NM_000318', 'NM_000450', 'NM_000539')
myV <- plotVenn(list(g1=g1, g2=g2, g3=g3, g4=g4))
myV <- plotVenn(nVennObj = myV)
myV <- plotVenn(nVennObj = myV)