R: error when using lapply?
I want to divide each column of my data frame "data" by each column of another data frame called "benchmark". However, I get different results when I use lapply than when I do the division manually. Where is the error in my code? The code I used is:
div <- data.frame(lapply(data, function(x) x[col(benchmark)]/benchmark))
...whereas manually dividing the first column of "data" by the first two columns of "benchmark" gives:
A.B1 A.B2
1 0.7200000 0.7200000
2 0.8247423 0.8163265
3 0.7653061 0.7575758
4 0.7525773 0.7604167
5 0.9473684 0.9574468
6 0.8709677 0.8804348
7 0.8804348 0.8617021
8 0.9347826 0.9247312
9 1.0989011 1.0989011
10 0.9090909 0.8791209
Sample data for "data" and "benchmark" is listed as dput output at the end of this page.
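One plausible explanation of the discrepancy, sketched with made-up three-row data rather than the original ten rows: `col(benchmark)` holds the column number of every cell, so `x[col(benchmark)]` only ever picks `x[1]` and `x[2]`, each repeated down a whole column. The names `idx`, `picked`, `wrong` and `right` are illustrative only.

```r
# Toy data (hypothetical values, three rows instead of ten)
data <- data.frame(A1 = c(72, 80, 75))
benchmark <- data.frame(B1 = c(100, 97, 98), B2 = c(100, 98, 99))
x <- data$A1

# col() gives the column number of each cell: 1, 1, 1, 2, 2, 2
idx <- as.vector(col(benchmark))

# Indexing with it repeats x[1] for column 1 and x[2] for column 2
picked <- x[col(benchmark)]

# So the division computes x[1] / B1 and x[2] / B2
wrong <- picked / benchmark

# Indexing by row number instead lines the values up as x / B1 and x / B2
right <- x[row(benchmark)] / benchmark
```

Under this reading, swapping `col()` for `row()` inside the lapply call should reproduce the manual result.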
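I think you might want to try purrr, which has functions that let you map over several lists at once, and that helps in this example. In this case you can use
map2_df(data, benchmark, ~ .x / .y)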
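Note that a two-input map pairs the columns by position, so it yields A1/B1 and A2/B2 rather than all four combinations. The sketch below uses base R's `Map()` as a stand-in for the purrr call so that no package is needed. The three-row data and the name `pairwise` are made up for illustration.

```r
# Toy data (hypothetical values)
data <- data.frame(A1 = c(72, 80, 75), A2 = c(11, 20, 15))
benchmark <- data.frame(B1 = c(100, 97, 98), B2 = c(100, 98, 99))

# Map() walks both data frames in parallel: A1 with B1, then A2 with B2
pairwise <- as.data.frame(Map(function(x, y) x / y, data, benchmark))

# The result has one column per pair, named after the first argument
names(pairwise)
```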
You can use outer:
data <- read.table(text = " A1 A2
1 72 11
2 80 20
3 75 15
4 73 17
5 90 13
6 81 18
7 81 22
8 86 30
9 100 20
10 80 22", header = TRUE)
benchmark <- read.table(text = " B1 B2
1 100 100
2 97 98
3 98 99
4 97 96
5 95 94
6 93 92
7 92 94
8 92 93
9 91 91
10 88 91", header = TRUE)
res <- outer(seq_along(data), seq_along(benchmark),
function(i, j, DF1, DF2) DF1[,i] / DF2[, j],
DF1 = data, DF2 = benchmark)
names(res) <- outer(names(data), names(benchmark), paste, sep = ".")
# A1.B1 A2.B1 A1.B2 A2.B2
#1 0.7200000 0.1100000 0.7200000 0.1100000
#2 0.8247423 0.2061856 0.8163265 0.2040816
#3 0.7653061 0.1530612 0.7575758 0.1515152
#4 0.7525773 0.1752577 0.7604167 0.1770833
#5 0.9473684 0.1368421 0.9574468 0.1382979
#6 0.8709677 0.1935484 0.8804348 0.1956522
#7 0.8804348 0.2391304 0.8617021 0.2340426
#8 0.9347826 0.3260870 0.9247312 0.3225806
#9 1.0989011 0.2197802 1.0989011 0.2197802
#10 0.9090909 0.2500000 0.8791209 0.2417582
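To see why the columns come out in the order A1.B1, A2.B1, A1.B2, A2.B2, it may help to look at how `outer()` expands its inputs: it calls the function once, with the first argument cycling fastest. A small sketch, where the names `labels` and `calls` are illustrative only:

```r
# outer() builds every (first, second) combination in column-major order
labels <- outer(c("A1", "A2"), c("B1", "B2"), paste, sep = ".")
as.vector(labels)

# The function is called a single time with the fully expanded index vectors
calls <- 0
sums <- outer(1:2, 1:2, function(i, j) {
  calls <<- calls + 1
  i + j
})
calls
```

This is why the function passed to `outer()` has to accept whole index vectors instead of one pair at a time.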
You can try:
A=data; B=benchmark
matrix(apply(A, 2, function(x, y) apply(y, 2, function(z, x) x/z, x), B), nrow(A), ncol(A)*ncol(B), byrow = F)
[,1] [,2]
[1,] 0.7200000 0.7200000
[2,] 0.8247423 0.8163265
[3,] 0.7653061 0.7575758
[4,] 0.7525773 0.7604167
[5,] 0.9473684 0.9574468
[6,] 0.8709677 0.8804348
[7,] 0.8804348 0.8617021
[8,] 0.9347826 0.9247312
[9,] 1.0989011 1.0989011
[10,] 0.9090909 0.8791209
The idea behind this is two nested apply calls. The matrix() function then reshapes the result appropriately.
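The same nesting idea can be written with lapply so that the column names are kept. This is only a sketch on made-up three-row data, not the answerer's code, and `nested` and `res` are illustrative names.

```r
# Toy data (hypothetical values)
data <- data.frame(A1 = c(72, 80, 75), A2 = c(11, 20, 15))
benchmark <- data.frame(B1 = c(100, 97, 98), B2 = c(100, 98, 99))

# Outer loop over the columns of data, inner loop over the columns of benchmark
nested <- lapply(data, function(a) lapply(benchmark, function(b) a / b))

# Flatten one level; the list names are joined as "A1.B1", "A1.B2", ...
res <- as.data.frame(unlist(nested, recursive = FALSE))
names(res)
```

Because the outer loop runs over `data`, the columns are grouped by data column first: A1.B1, A1.B2, A2.B1, A2.B2.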
Or with Roland's data. Note that the column order is A1B1, A1B2, A2B1, A2B2:
matrix(apply(data, 2, function(x,y) apply(y, 2, function(z,x) x/z, x), benchmark), nrow(data) , ncol(data)*ncol(benchmark), byrow = F)
[,1] [,2] [,3] [,4]
[1,] 0.7200000 0.7200000 0.1100000 0.1100000
[2,] 0.8247423 0.8163265 0.2061856 0.2040816
[3,] 0.7653061 0.7575758 0.1530612 0.1515152
[4,] 0.7525773 0.7604167 0.1752577 0.1770833
[5,] 0.9473684 0.9574468 0.1368421 0.1382979
[6,] 0.8709677 0.8804348 0.1935484 0.1956522
[7,] 0.8804348 0.8617021 0.2391304 0.2340426
[8,] 0.9347826 0.9247312 0.3260870 0.3225806
[9,] 1.0989011 1.0989011 0.2197802 0.2197802
[10,] 0.9090909 0.8791209 0.2500000 0.2417582
Or, building on zx8754's answer, you can get a list of quotients that can be bound together with do.call:
do.call("cbind", apply(data, 2, function(x,y) x/y, benchmark))
How about using df1/df2? See the example:
#dummy data
df1 <- mtcars[1:5, 1, drop = FALSE]
df2 <- mtcars[1:5, 4:6]
df1; df2
# mpg
# Mazda RX4 21.0
# Mazda RX4 Wag 21.0
# Datsun 710 22.8
# Hornet 4 Drive 21.4
# Hornet Sportabout 18.7
# hp drat wt
# Mazda RX4 110 3.90 2.620
# Mazda RX4 Wag 110 3.90 2.875
# Datsun 710 93 3.85 2.320
# Hornet 4 Drive 110 3.08 3.215
# Hornet Sportabout 175 3.15 3.440
df1$mpg/df2
# hp drat wt
# Mazda RX4 0.1909091 5.384615 8.015267
# Mazda RX4 Wag 0.1909091 5.384615 7.304348
# Datsun 710 0.2451613 5.922078 9.827586
# Hornet 4 Drive 0.1945455 6.948052 6.656299
# Hornet Sportabout 0.1068571 5.936508 5.436047
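Applied to the question's setup, the same vector-by-data-frame division can sit inside lapply with no extra indexing, because a vector with one value per row is divided into each column of the data frame in turn. A sketch on made-up three-row data:

```r
# Toy data (hypothetical values)
data <- data.frame(A1 = c(72, 80, 75), A2 = c(11, 20, 15))
benchmark <- data.frame(B1 = c(100, 97, 98), B2 = c(100, 98, 99))

# Each column x of data is divided by every column of benchmark
div <- data.frame(lapply(data, function(x) x / benchmark))
names(div)
```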
Here is a solution using expand.grid:
e <- do.call(expand.grid, list(1:ncol(data),1:ncol(benchmark)))
# e will give you all possible permutations of columns on which you can apply division
# Var1 Var2
# 1 1 1
# 2 2 1
# 3 1 2
# 4 2 2
r <- apply(e, 1, function(x) data[,x[1]]/benchmark[,x[2]])
# to make descriptive column names for r
colnames(r) <- apply(expand.grid(names(data), names(benchmark)), 1, paste, collapse="/")
# A1/B1 A2/B1 A1/B2 A2/B2
# [1,] 0.7200000 0.1100000 0.7200000 0.1100000
# [2,] 0.8247423 0.2061856 0.8163265 0.2040816
# [3,] 0.7653061 0.1530612 0.7575758 0.1515152
# [4,] 0.7525773 0.1752577 0.7604167 0.1770833
# [5,] 0.9473684 0.1368421 0.9574468 0.1382979
# [6,] 0.8709677 0.1935484 0.8804348 0.1956522
# [7,] 0.8804348 0.2391304 0.8617021 0.2340426
# [8,] 0.9347826 0.3260870 0.9247312 0.3225806
# [9,] 1.0989011 0.2197802 1.0989011 0.2197802
# [10,] 0.9090909 0.2500000 0.8791209 0.2417582
Sample data (dput output):
data <- structure(list(A1 = c(72L, 80L, 75L, 73L, 90L, 81L, 81L, 86L,
100L, 80L), A2 = c(11L, 20L, 15L, 17L, 13L, 18L, 22L, 30L, 20L,
22L)), .Names = c("A1", "A2"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))
benchmark <- structure(list(B1 = c(100L, 97L, 98L, 97L, 95L, 93L, 92L, 92L,
91L, 88L), B2 = c(100L, 98L, 99L, 96L, 94L, 92L, 94L, 93L, 91L,
91L)), .Names = c("B1", "B2"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))