R - Find the maximum-length "chain" of number pairs with increasing numbers

I have a two-column data frame of number pairs:

ODD <- c(1,1,1,3,3,3,5,7,7,9,9)
EVEN <- c(10,8,2,2,6,4,2,6,8,4,8)

dfPairs <- data.frame(ODD, EVEN)

> dfPairs
   ODD EVEN
1    1   10
2    1    8
3    1    2
4    3    2
5    3    6
6    3    4
7    5    2
8    7    6
9    7    8
10   9    4
11   9    8
Edited to handle a df that does not start at 1 and to return the maximum chains rather than the chain lengths.

Using igraph

Your data, dfPairs:

ODD <- c(1,1,1,3,3,3,5,7,7,9,9)
EVEN <- c(10,8,2,2,6,4,2,6,8,4,8)
dfPairs <- data.frame(ODD, EVEN)
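Make a graph of your data. The key to my solution is to rbind the reversed data frame (rev(dfPairs)) to the original data frame. This allows building directional edges from odd numbers to even numbers. Graphs can be used to construct directional paths quite easily.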

library(igraph)
library(dplyr)
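# Bind every pair to its reverse so each link can be walked in both directions.
GPairs <- graph_from_data_frame(dplyr::arrange(rbind(setNames(dfPairs, c("X1", "X2")), setNames(rev(dfPairs), c("X1", "X2"))), X1))
# dfTest is not defined in this post. Assumption, based on the comments and the dfTest output further down:
# it is dfPairs without its first three pairs, so the data no longer starts at 1.
dfTest <- dfPairs[-(1:3), ]
GTest <- graph_from_data_frame(dplyr::arrange(rbind(setNames(dfTest, c("X1", "X2")), setNames(rev(dfTest), c("X1", "X2"))), X1))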
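To see what the bound edge list looks like before it goes into igraph, here is a minimal base R sketch. It needs neither igraph nor dplyr, and the names forward, backward and edges are illustrative only, not taken from the answer. Each of the 11 pairs ends up listed twice, once per direction.

```r
ODD  <- c(1, 1, 1, 3, 3, 3, 5, 7, 7, 9, 9)
EVEN <- c(10, 8, 2, 2, 6, 4, 2, 6, 8, 4, 8)
dfPairs <- data.frame(ODD, EVEN)

forward  <- setNames(dfPairs, c("X1", "X2"))        # ODD -> EVEN
backward <- setNames(rev(dfPairs), c("X1", "X2"))   # EVEN -> ODD (columns swapped)
edges <- rbind(forward, backward)
edges <- edges[order(edges$X1), ]                   # plays the role of dplyr::arrange(..., X1)

nrow(edges)      # 22: every pair once in each direction
head(edges, 3)
```

Sorting by X1 also means every value shows up in the first column in ascending order, which is presumably why the vertex index i used later lines up with the i-th sorted starting value.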
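I created a function that 1) converts all simple paths to a list of numeric vectors, 2) trims each numeric vector to only the elements that are increasing left->right, and 3) returns the maximum chain(s) among the left->right increasing numeric vectors.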

max_chain_only_increasing <- function(gpath) {
  # convert each path to a numeric vector of vertex names
  list_vec <- lapply(gpath, function(v) as.numeric(names(unclass(v))))
  # keep only the leading run of each vector that is increasing left->right
  only_increasing <- lapply(list_vec, function(v) v[1:min(which(v >= dplyr::lead(v, default = tail(v, 1))))])
  # return the longest chain(s) themselves, not their length
  return(unique(only_increasing[lengths(only_increasing) == max(lengths(only_increasing))]))
}
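Step 2 of the function, trimming a path to its increasing prefix, can be tried on a plain vector. The helper below is a base R sketch of the same idea; increasing_prefix is a hypothetical name, and it uses diff() instead of dplyr::lead().

```r
# Keep the leading run of v that is strictly increasing left->right.
increasing_prefix <- function(v) {
  # TRUE at position i when v[i] >= v[i + 1]; the trailing TRUE guarantees a stop at the end
  stops <- c(diff(v) <= 0, TRUE)
  v[seq_len(which(stops)[1])]
}

increasing_prefix(c(1, 2, 3, 6, 4, 9))   # 1 2 3 6
increasing_prefix(c(5, 2, 1))            # 5
increasing_prefix(7)                     # 7
```

A path such as 5 -> 2 -> 1 therefore contributes only its starting value, which is why a start with no larger neighbour yields a chain of length one.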
Now I'll output (the header of) the maximum chains starting from each unique element in dfPairs, the original data:

start_vals <- sort(unique(unlist(dfPairs)))
# [1]  1  2  3  4  5  6  7  8  9 10
max_chains <- sapply(seq_len(length(start_vals)), function(i) max_chain_only_increasing(all_simple_paths(GPairs, i)))
names(max_chains) <- start_vals
# $`1`
# [1] 1 2 3 6 7 8 9

# $`2`
# [1] 2 3 6 7 8 9

# $`3`
# [1] 3 6 7 8 9

# $`4`
# [1] 4 9

# $`5`
# [1] 5
# etc
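Enumerating every simple path can get expensive on larger data. Because a chain must be strictly increasing, each link can only be used from its smaller value to its larger one, so one longest chain per start can also be found with a single pass from the largest value down. The sketch below is a different technique from the igraph answer (a small dynamic program in base R), and longest_increasing_chain is a hypothetical name. It returns one longest chain per starting value; on ties it keeps the first one found.

```r
ODD  <- c(1, 1, 1, 3, 3, 3, 5, 7, 7, 9, 9)
EVEN <- c(10, 8, 2, 2, 6, 4, 2, 6, 8, 4, 8)

longest_increasing_chain <- function(a, b) {
  lo <- pmin(a, b)                       # smaller end of each link
  hi <- pmax(a, b)                       # larger end of each link
  vals <- sort(unique(c(a, b)), decreasing = TRUE)
  best_len <- setNames(rep(1L, length(vals)), vals)        # longest chain length from each value
  next_val <- setNames(rep(NA_real_, length(vals)), vals)  # next value on that chain
  for (v in vals) {                      # larger values are already solved
    for (w in hi[lo == v & hi > v]) {
      cand <- best_len[[as.character(w)]] + 1L
      if (cand > best_len[[as.character(v)]]) {
        best_len[[as.character(v)]] <- cand
        next_val[[as.character(v)]] <- w
      }
    }
  }
  # Follow the stored next values to rebuild a chain from a given start.
  chain_from <- function(v) {
    out <- v
    while (!is.na(next_val[[as.character(out[length(out)])]])) {
      out <- c(out, next_val[[as.character(out[length(out)])]])
    }
    out
  }
  lapply(setNames(rev(vals), rev(vals)), chain_from)
}

chains <- longest_increasing_chain(ODD, EVEN)
chains[["1"]]   # 1 2 3 6 7 8 9
chains[["4"]]   # 4 9
```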

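Despite CPak's efforts, I ended up writing my own function to solve this. Essentially, I realised I could turn the right-to-left chain links into left-to-right ones by using this code from CPak's answer:

output <- arrange(rbind(setNames(dfPairs, c("X1", "X2")), setNames(rev(dfPairs), c("X1", "X2"))), X1)
Overall, I'm happy with this. I imagine it could be a bit more elegant, but it works on anything, and it works on some fairly large and complex data. It produces about 241,700 solutions from a dataset of 700 pairs.

I also used a moveme function that I found on Stack Overflow (see below). I use it to move the NA values around, which gives the join_shift function its shifting behaviour.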

moveme <- function (invec, movecommand) {
  movecommand <- lapply(strsplit(strsplit(movecommand, ";")[[1]], 
                                 ",|\\s+"), function(x) x[x != ""])
  movelist <- lapply(movecommand, function(x) {
    Where <- x[which(x %in% c("before", "after", "first", 
                              "last")):length(x)]
    ToMove <- setdiff(x, Where)
    list(ToMove, Where)
  })
  myVec <- invec
  for (i in seq_along(movelist)) {
    temp <- setdiff(myVec, movelist[[i]][[1]])
    A <- movelist[[i]][[2]][1]
    if (A %in% c("before", "after")) {
      ba <- movelist[[i]][[2]][2]
      if (A == "before") {
        after <- match(ba, temp) - 1
      }
      else if (A == "after") {
        after <- match(ba, temp)
      }
    }
    else if (A == "first") {
      after <- 0
    }
    else if (A == "last") {
      after <- length(myVec)
    }
    myVec <- append(temp, values = movelist[[i]][[1]], after = after)
  }
  myVec
}
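For the single command used in this answer, "end first", the effect of moveme can be reproduced with base R. The sketch below uses a made-up two-row data frame (rows, new_order and shifted are illustrative names) to show how moving the end column to the front and then restoring the original names shifts the values one column to the right.

```r
# Two chains that could not be extended: their "end" column is 0.
rows <- data.frame(X1 = c(4, 8), X2 = c(9, 9), end = c(0, 0))

# Same ordering that moveme(names(rows), "end first") produces.
new_order <- c("end", setdiff(names(rows), "end"))

shifted <- rows[new_order]      # columns are now end, X1, X2
names(shifted) <- names(rows)   # relabel as X1, X2, end
shifted                         # values moved one column to the right
```

After the shift the empty slot sits at the front of the row, so the last value of each chain lines up in the last column.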

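Wouldn't the "chain" 1->8 (row 2) move on to 9?
@42-, technically yes, but that's not what the question is after. I want as many links (i.e. pairs) as possible in one chain.
As the rest of my comment implied, it's difficult to determine what the question is.
@42-, sorry about that. I'm looking for the longest chain as measured by the greatest number of links, not the longest chain as measured by the longest link.
Where is the final data frame containing the links of the longest chain stored?
I commented out the line beginning with "return" in the max_chainlength_only_increasing function, which then returns a list of vectors containing the links, named max_chainlength. I can work with this. Brilliant. Thanks very much.
Please correct my last comment. It is actually a list. The first list is the list of starting points. Each starting point has a list of possible vectors that can be created from that point.
Another very relevant point is that the max_chainlength_only_increasing function only works when starting from 1. If you remove the first three pairs from the example data, you get an "invalid starting vector" error.
Make sure you've modified your data accordingly.
Despite my efforts... lulz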
Applying the igraph approach to dfTest gives:
start_vals <- sort(unique(unlist(dfTest)))
max_chains <- sapply(seq_len(length(start_vals)), function(i) max_chain_only_increasing(all_simple_paths(GTest, i)))
names(max_chains) <- start_vals
# $`2`
# [1] 2 3 6 7 8 9

# $`3`
# [1] 3 6 7 8 9

# $`4`
# [1] 4 9

# $`5`
# [1] 5

# $`6`
# [1] 6 7 8 9
# Pipeline for the join_shift approach; it relies on the moveme function shown earlier.
library(dplyr)

# Join two link tables where the last column of dtf1 meets the first column of dtf2.
# Chains from dtf1 that find no continuation in dtf2 are shifted one column to the right.
join_shift <- function(dtf1, dtf2) {
  abcd <- full_join(dtf1, dtf2, by = setNames(colnames(dtf2)[1], colnames(dtf1)[ncol(dtf1)]))
  abcd[is.na(abcd)] <- 0
  colnames(abcd)[ncol(abcd)] <- "end"
  abcd_na <- filter(abcd, end == 0)   # chains with no continuation
  abcd <- filter(abcd, end != 0)
  abcd_na <- abcd_na[moveme(names(abcd_na), "end first")]   # move the empty column to the front
  names(abcd_na) <- names(abcd)
  abcd <- rbind(abcd, abcd_na)
  colnames(abcd) <- paste0("X", seq_len(ncol(abcd)))
  abcd
}

# List every link in both directions, then keep only the increasing ones.
output <- arrange(rbind(setNames(dfPairs, c("X1", "X2")), setNames(rev(dfPairs), c("X1", "X2"))), X1)
output$increase <- with(output, ifelse(X2 > X1, "Greater", "Less"))
output <- filter(output, increase == "Greater")
output <- select(output, -increase)
out_split <- split(output, output$X1)   # one table of links per starting value
df_final <- Reduce(join_shift, out_split)
df_final_trim <- df_final[, colSums(df_final) > 0]   # drop columns that are all zero
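The first half of that pipeline (keep only increasing links, then group them by starting value) can be checked on the example data without dplyr. This is a base R sketch under that assumption; both, increasing and links_by_start are illustrative names, not part of the answer.

```r
ODD  <- c(1, 1, 1, 3, 3, 3, 5, 7, 7, 9, 9)
EVEN <- c(10, 8, 2, 2, 6, 4, 2, 6, 8, 4, 8)
dfPairs <- data.frame(ODD, EVEN)

# Every link in both directions.
both <- rbind(setNames(dfPairs, c("X1", "X2")), setNames(rev(dfPairs), c("X1", "X2")))

# Keep the direction in which the link increases, sorted by start then end.
increasing <- both[both$X2 > both$X1, ]
increasing <- increasing[order(increasing$X1, increasing$X2), ]

# For each starting value, the values it can step up to.
links_by_start <- split(increasing$X2, increasing$X1)
links_by_start[["1"]]   # 2 8 10
links_by_start[["3"]]   # 4 6
```

Values such as 5, 9 and 10 do not appear as names in links_by_start because no link leads upward from them.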