R: Convert a single row containing an interval into multiple rows covering that interval


Suppose you have something like this:

Col1 Col2
a    odd from 1 to 9
b    even from 2 to 14
c    even from 30 to 50
...
I want to expand each row by splitting its interval into separate rows, like so:

Col1 Col2
a    1
a    3
a    5
...
b    2
b    4
b    6
...
c    30
c    32
c    34
...

Note that when it says "even from", the lower and upper bounds are also even, and likewise for odd.
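The assumption above (the bounds share the stated parity) can be checked before expanding anything. The following is only a minimal base R sketch: `bounds_match_parity` is a hypothetical helper name that does not appear in the original thread, and it assumes every string has exactly the form "<parity> from <lower> to <upper>".

```r
# Hypothetical helper (not from the original post): checks that the bounds
# in strings such as "odd from 1 to 9" agree with the stated parity.
bounds_match_parity <- function(x) {
  words  <- strsplit(x, " ", fixed = TRUE)
  parity <- vapply(words, `[`, character(1), 1)              # "odd" or "even"
  from   <- as.numeric(vapply(words, `[`, character(1), 3))  # lower bound
  to     <- as.numeric(vapply(words, `[`, character(1), 5))  # upper bound
  remainder <- ifelse(parity == "odd", 1, 0)
  from %% 2 == remainder & to %% 2 == remainder
}

# The third string is deliberately inconsistent (even, but starts at 3).
bounds_match_parity(c("odd from 1 to 9", "even from 2 to 14", "even from 3 to 14"))
```

If the check fails for a row, stepping by 2 from the lower bound would produce values of the wrong parity, so such rows would need separate handling.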

Separate Col2 into individual columns, then create a sequence for each row:

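library(dplyr)
library(tidyr)
DF %>% 
   # split "odd from 1 to 9" into five columns; convert = TRUE parses from/to as numbers
   separate(Col2, into = c("parity", "X1", "from", "X2", "to"), convert = TRUE) %>% 
   group_by(Col1) %>% 
   # one sequence per group, stepping by 2 from the lower to the upper bound
   do(data.frame(Col2 = seq(.$from, .$to, 2))) %>%
   ungroup()

Note 1: The input DF is assumed to be the following, in reproducible form: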

DF <- structure(list(Col1 = c("a", "b", "c"), Col2 = c("odd from 1 to 9", 
"even from 2 to 14", "even from 30 to 50")), .Names = c("Col1", 
"Col2"), row.names = c(NA, -3L), class = "data.frame")

Using tidyverse:

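library(tidyverse)
# DF is the input data frame defined in the note above
DF %>% mutate(Col2 = map(str_split(Col2, " "),
                         # the 3rd and 5th words are the lower and upper bounds
                         ~seq(as.numeric(.[3]), as.numeric(.[5]), 2))) %>%
  unnest(Col2)

Or, borrowing separate from @g-grothendieck's solution to make it more readable: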

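DF %>%
  # name the five pieces "1".."5"; convert = TRUE makes the bounds numeric
  separate(Col2, as.character(1:5), convert = TRUE) %>%
  # seq(from, to, 2) for each row; the trailing 2 is passed on as the step
  transmute(Col1, Col2 = map2(`3`, `5`, seq, 2)) %>%
  unnest(Col2)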

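Here is an option using base R. We use gregexpr/regmatches to extract the numeric elements of 'Col2' into a list, build a sequence of the elements in steps of 2 with seq, and then stack the result into a data.frame: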

res <- stack(setNames(lapply(regmatches(DF$Col2, gregexpr("\\d+", DF$Col2)), function(x)
     seq(as.numeric(x[1]), as.numeric(x[2]), by = 2)), DF$Col1))[2:1]
colnames(res) <- colnames(DF)
head(res)
#  Col1 Col2
#1    a    1
#2    a    3
#3    a    5
#4    a    7
#5    a    9
#6    b    2
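The one-liner above packs three steps into a single expression. As an illustration only, here is the same computation broken into named intermediate steps; the names `bounds` and `seqs` are introduced here for readability and are not part of the original answer.

```r
# Same input as in the note above, repeated so the block is self-contained.
DF <- data.frame(Col1 = c("a", "b", "c"),
                 Col2 = c("odd from 1 to 9", "even from 2 to 14", "even from 30 to 50"),
                 stringsAsFactors = FALSE)

# Step 1: pull the two numbers out of each string (as character vectors).
bounds <- regmatches(DF$Col2, gregexpr("\\d+", DF$Col2))

# Step 2: build each sequence in steps of 2 and name it after its Col1 value.
seqs <- setNames(
  lapply(bounds, function(x) seq(as.numeric(x[1]), as.numeric(x[2]), by = 2)),
  DF$Col1
)

# Step 3: stack() turns the named list into the columns "values" and "ind";
# [2:1] swaps them so that the key column comes first.
res <- stack(seqs)[2:1]
colnames(res) <- colnames(DF)

nrow(res)  # 5 + 7 + 11 = 23 rows
```

Note that stack() returns the key column as a factor; wrap it in as.character() if a character column is needed.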

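unnest comes from tidyr; the pipe and mutate come from dplyr.

To be fully accurate, %>% is imported by dplyr from magrittr and then re-exported.

Yes, I meant to use strsplit :) I'll leave it as it is, since it is tidier this way.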