R: pairwise t.test for multiple combinations in a data.table

I have a data.table with three columns: seller, product and price.

Sample data:

      seller product price
  1:      A  banana    56
  2:      A   lemon    94
  3:      A  orange    84
  4:      A  banana    11
  5:      A   lemon    86
---                     
166:      C  orange   162
167:      C  banana   109
168:      C  orange    61
169:      C  banana   141
170:      C  orange    22

Code for the data:

library(data.table)
DT <- data.table(seller = c(rep(c("A"),60),rep(c("B"),62),rep(c("C"),48)), product = c(rep(c("banana", "lemon", "orange"), 20), rep(c("banana", "lemon"), 31), rep(c("banana", "orange"), 24)), 
             price = c(56, 94, 84, 11, 86, 103, 151, 51, 117, 71, 63, 101, 45, 147, 135, 93, 26, 164, 90, 67, 12, 34, 14, 131, 92, 145, 48, 74, 62, 57, 20, 80, 113, 46, 88, 102, 134, 98, 137, 123, 169, 133, 146, 
                       160, 58, 42, 52, 158, 170, 2, 152, 10, 130, 30, 33, 144, 73, 41, 139, 107, 163, 9, 66, 81, 79, 127, 40, 165, 106, 161, 16, 1, 112, 70, 115, 138, 76, 105, 17, 118, 114, 121, 25, 39, 15, 155, 50, 166, 
                       100, 159, 5, 19, 29, 24, 64, 149, 120, 35, 119, 53, 21, 7, 72, 132, 154, 168, 156, 38, 3, 148, 69, 44, 6, 28, 140, 77, 104, 153, 59, 142, 116, 150, 97, 31, 91, 43, 47, 27, 143, 99, 37, 54, 49, 4, 111, 
                       32, 23, 85, 167, 136, 78, 129, 83, 124, 36, 96, 110, 13, 65, 108, 8, 18, 157, 87, 82, 60, 122, 89, 125, 68, 75, 126, 128, 55, 95, 162, 109, 61, 141, 22))

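You first need to group by product. Then, in the j argument, compute the combinations of seller for that product, and for each pair get the p.value of a t.test on price between seller.x and seller.y: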

DT[
  , {
    # all unordered pairs of sellers that offer this product
    sellercomb <- data.table(t(combn(unique(seller), 2)))
    names(sellercomb) <- c("seller.x", "seller.y")
    sellercomb[
      , {
        # seller and price are the columns of the current product group
        data.table(p.value = t.test(price[seller == seller.x], price[seller == seller.y])$p.value)
      }
      , by = .(seller.x, seller.y)
    ]
  }
  , by = .(product)
]
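
To make the combination step easier to follow, here is a minimal sketch of what combn() produces on its own. The vector sellers is an illustrative stand-in for unique(seller) within one product group; it is not taken from the question's data.

```r
library(data.table)

# Illustrative stand-in for unique(seller) inside one product group
sellers <- c("A", "B", "C")

# combn() returns one pair per column; t() turns each pair into a row
sellercomb <- data.table(t(combn(sellers, 2)))
setnames(sellercomb, c("seller.x", "seller.y"))

print(sellercomb)
```

This gives one row per unordered pair (A-B, A-C, B-C). Note that combn() stops with an error when asked for pairs from a single element, so a product offered by only one seller would probably need to be filtered out first.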

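pairwise.t.test(df$price, interaction(df[, c("seller", "product"))))

@mtoto This is just an example of my real data. My real data has millions of rows and hundreds of providers, and we need to check the significance between groups.

@Zelazny7 Your suggestion is very interesting. However: 1) I corrected your syntax, because the call you suggested had a syntax error. The call I used is: pairwise.t.test(DT$price, interaction(DT[, seller], DT[, product])). 2) The output of that call is not correct. I updated the example to use more realistic data. Please check whether I made a mistake.
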
Output of the answer's code:

   product seller.x seller.y   p.value
1:  banana        A        B 0.9384329
2:  banana        A        C 0.2413946
3:  banana        B        C 0.2154216
4:   lemon        A        B 0.7282811
5:  orange        A        C 0.0354320
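
As a cross-check against the pairwise.t.test() call discussed in the comments, the same per-product comparison can be sketched by calling pairwise.t.test() inside each product group rather than on the seller-product interaction. This is only a sketch under assumptions: the table toy and its prices are made up for illustration, p.adjust.method = "none" is chosen so the values are comparable with plain t.test(), and pool.sd = FALSE makes each pair a separate Welch test.

```r
library(data.table)

# Toy data, made up for illustration (not the data from the question)
toy <- data.table(
  seller  = rep(c("A", "B", "C"), each = 4),
  product = "banana",
  price   = c(10, 12, 11, 13, 20, 22, 19, 21, 10, 14, 12, 16)
)

res <- toy[
  , {
    # one pairwise.t.test per product; unadjusted, unpooled (Welch) tests
    pt <- pairwise.t.test(price, seller,
                          p.adjust.method = "none", pool.sd = FALSE)
    pm <- pt$p.value                          # lower-triangular matrix
    idx <- which(!is.na(pm), arr.ind = TRUE)  # positions of the tested pairs
    data.table(
      seller.x = colnames(pm)[idx[, "col"]],
      seller.y = rownames(pm)[idx[, "row"]],
      p.value  = pm[idx]
    )
  }
  , by = product
]

print(res)
```

With these settings each p-value should match a direct t.test() on the two sellers' prices. For real data with hundreds of sellers, a different p.adjust.method may be more appropriate.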