Subsetting with %in% inside an R function
I have the following data frame:
head(graph_data, n = 15)
source target
1 Ohrid СКОПЈЕ
2 Ohrid СКОПЈЕ
3 Ohrid СКОПЈЕ
4 Ohrid СКОПЈЕ
5 Ohrid СКОПЈЕ
6 Ohrid СКОПЈЕ
7 Ohrid СКОПЈЕ
8 Ohrid СКОПЈЕ
9 Ohrid СКОПЈЕ
10 Ohrid СКОПЈЕ
11 Ohrid СКОПЈЕ
12 Ohrid СКОПЈЕ
13 Ohrid СКОПЈЕ
14 Ohrid СКОПЈЕ
15 Ohrid СКОПЈЕ
I wrote the following function to automate filtering for the most frequent matches of a given source:
top_connections <- function(data, city, top_n) {
  temp <- filter(data, source == city)
  temp2 <- as.data.frame(table(temp$target))
  temp2 <- arrange(temp2, desc(Freq))
  temp2 <- temp2[1:top_n, ]
  temp3 <- as.data.frame(unique(temp2$Var1))
  colnames(temp3)[1] <- "top_connecitons"
  # works fine until here
  temp4 <- subset(temp, source %in% temp3[, "top_connecitons"])
  return(temp4)
}
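Are you using additional packages? It looks like you are using functions from dplyr. Don't forget the library() calls, so that this is clear. In your example, temp3 contains values from data$target, and you then compare them with data$source. Is it possible that unique(data$target) != unique(data$source)?
@Istrel Yes, unique(data$target) != unique(data$source), but we are not looking for equality. We compute the frequency of each unique value in data$target, and for the top n most frequent values we subset the data. I still get the same result: zero rows. In any case, the problematic part is the temp4 subset line. Everything before that line looks fine to me.
Can you post or link to your data? How do you call the function, and with which arguments?
I updated the original post with the data, environment and function call. Thanks for your help.
The encoding is garbled: "1949", "Skopje"
I found my mistake, it should be temp4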
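To illustrate the point raised in the comments: inside the function, temp only holds rows where source == city, while temp3 holds values taken from target. So source %in% temp3[, "top_connecitons"] can only be TRUE when the city itself is one of its own top targets. Here is a minimal sketch in base R with made-up data (the data frame toy and its values are hypothetical, not the asker's data):

```r
# Hypothetical data: "Skopje" is a source here but never one of its own targets.
toy <- data.frame(
  source = c("Skopje", "Skopje", "Skopje", "Ohrid"),
  target = c("Ohrid", "Ohrid", "Bitola", "Skopje"),
  stringsAsFactors = FALSE
)

# Rows whose source is the chosen city.
temp <- toy[toy$source == "Skopje", ]

# Targets of that city, ordered from most to least frequent.
top_targets <- names(sort(table(temp$target), decreasing = TRUE))

# Comparing source against values derived from target: nothing matches.
wrong <- subset(temp, source %in% top_targets)

# Comparing target against those same values: every row of temp matches here.
right <- subset(temp, target %in% top_targets)

nrow(wrong)  # 0
nrow(right)  # 3
```

In this sketch the only difference between the two subset() calls is the column on the left-hand side of %in%.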
test1 <- top_connections(graph_data, "Skopje", top_n = 15)
search()
[1] ".GlobalEnv" "package:networkD3"
[3] "package:data.table" "package:DT"
[5] "package:corrplot" "package:scales"
[7] "package:dplyr" "package:purrr"
[9] "package:readr" "package:tidyr"
[11] "package:tibble" "package:tidyverse"
[13] "package:ggthemes" "package:ggplot2"
[15] "package:readxl" "package:lubridate"
[17] "tools:rstudio" "package:stats"
[19] "package:graphics" "package:grDevices"
[21] "package:utils" "package:datasets"
[23] "package:methods" "Autoloads"
[25] "package:base"
graph_data <- data.frame(source = c("Paris", "Berlin", "Paris", "London", "Munich"), target = c("Amsterdam", "Paris", "Paris", "Brighton", "Paris"), stringsAsFactors = FALSE)
top_connections <- function(data, city, top_n) {
  temp <- dplyr::filter(data, source == city)
  temp2 <- as.data.frame(table(temp$target))
  temp2 <- dplyr::arrange(temp2, desc(Freq))
  temp2 <- temp2[1:top_n, ]
  temp3 <- as.data.frame(unique(temp2$Var1))
  colnames(temp3)[1] <- "top_connecitons"
  temp4 <- subset(temp, source %in% temp3[, "top_connecitons"])
  return(temp4)
}
top_connections(graph_data,"Paris",2)
source target
1 Paris Amsterdam
2 Paris Paris
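Note that this call returns rows only because "Paris" happens to be one of its own top targets in this small data set, so source %in% ... is TRUE for every row of temp. When the city never appears among its own targets, the same line yields zero rows. Below is a sketch of one possible correction in base R, assuming the intent is to keep the rows for the chosen city whose target is among its top_n most frequent targets. The names top_connections_fixed and example_data are hypothetical and only serve the illustration:

```r
# Sketch of a corrected version; top_connections_fixed is a hypothetical name.
top_connections_fixed <- function(data, city, top_n) {
  # Keep only the rows that start from the requested city.
  temp <- data[data$source == city, , drop = FALSE]
  # Count how often each target occurs, most frequent first.
  freq <- sort(table(temp$target), decreasing = TRUE)
  # Take at most top_n names, so no NA entries appear when fewer targets exist.
  top_targets <- names(freq)[seq_len(min(top_n, length(freq)))]
  # Filter on target (not source), because top_targets was derived from target.
  temp[temp$target %in% top_targets, , drop = FALSE]
}

# Hypothetical example data, made up for this sketch.
example_data <- data.frame(
  source = c("Paris", "Berlin", "Paris", "London", "Munich", "Paris"),
  target = c("Amsterdam", "Paris", "Berlin", "Brighton", "Paris", "Amsterdam"),
  stringsAsFactors = FALSE
)

# Rows from Paris going to its single most frequent target.
result <- top_connections_fixed(example_data, "Paris", 1)
print(result)
```

With this data the most frequent target of "Paris" is "Amsterdam", so result holds the two Paris to Amsterdam rows.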