tidyr: reshape a tibble without expand.grid
I want to reshape a tibble without using expand.grid. The approach of expand.grid + deleting missing observations + deleting "flipped duplicates" (i.e. a,b is the same as b,a) works, but it becomes very slow when there are many combinations. Here is a dummy version of what I want to achieve:
library(dplyr)
library(tidyr)
initial_data <- tibble(x = c("east","east","east"), y = c("a","b","c"), z = c(0.1,0.2,0.3))
> initial_data
# A tibble: 3 x 3
x y z
<chr> <chr> <dbl>
1 east a 0.1
2 east b 0.2
3 east c 0.3
final_data <- tibble(x = c("east","east","east"), y1 = c("a","a","b"), y2 = c("b","c","c"), z1 = c(0.1,0.1,0.2), z2 = c(0.2,0.3,0.3))
> final_data
# A tibble: 3 x 5
x y1 y2 z1 z2
<chr> <chr> <chr> <dbl> <dbl>
1 east a b 0.1 0.2
2 east a c 0.1 0.3
3 east b c 0.2 0.3
This works, but it is extremely inefficient:
expand_data <- as_tibble(expand.grid(initial_data$x, initial_data$y, initial_data$y)) %>%
filter(Var2 != Var3) %>%
distinct()
index <- !duplicated(t(apply(expand_data, 1, sort)))
expand_data <- expand_data[index, ] %>%
left_join(initial_data, by = c("Var1" = "x", "Var2" = "y")) %>%
left_join(initial_data, by = c("Var1" = "x", "Var3" = "y"))
> expand_data
# A tibble: 3 x 5
Var1 Var2 Var3 z.x z.y
<chr> <chr> <chr> <dbl> <dbl>
1 east b a 0.2 0.1
2 east c a 0.3 0.1
3 east c b 0.3 0.2
Thank you very much!
How about an inner_join, followed by filtering for the unique combinations?
library(dplyr)
inner_join(initial_data, initial_data,
suffix = c('1', '2'), by = 'x') %>%
filter(y1 < y2) %>%
select(x, y1, y2, z1, z2)
# x y1 y2 z1 z2
# 1 east a b 0.1 0.2
# 2 east a c 0.1 0.3
# 3 east b c 0.2 0.3
Does this base R solution work for you?
data.frame(x = rep("east", 3),
matrix(rep(initial_data$y, each = 2), 3),
matrix(rep(initial_data$z, each = 2), 3))
# x X1 X2 X1.1 X2.1
# 1 east a b 0.1 0.2
# 2 east a c 0.1 0.3
# 3 east b c 0.2 0.3
I would give combn and purrr::map a try.
Your data
initial_data <- tibble(x = c("east","east","east"), y = c("a","b","c"), z = c(0.1,0.2,0.3))
Output
# A tibble: 3 x 5
# x y1 y2 z1 z2
# <chr> <chr> <chr> <dbl> <dbl>
# 1 east a b 0.1 0.2
# 2 east a c 0.1 0.3
# 3 east b c 0.2 0.3
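I'll check whether this is feasible as soon as I can :)
Could you explain your problem in more detail? I don't understand what final format you want.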