Extract the last component of a string only if it is numeric in R

Tags: r, data-cleaning

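I have a data frame whose values contain multiple . delimiters. I want to remove the characters after the last occurrence of . but only if they are numeric. So in the example below, a.b.c stays unchanged, but a.b.1 becomes two values: a.b and 1. I think I'm close, but I can't figure out the last piece that ties it together.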

have <- data.frame(x = c("a.b", "a.b.c", "a.b.1", "a.b.2", "9.a.b.c"))

want <- data.frame(x = c("a.b", "a.b.c", "a.b", "a.b", "9.a.b.c"),
                   y = c(0, 0, 1, 2, 0))
        
# attempt 1
have %>% mutate(y = sub('.*\\.', '', x))
        
# attempt 2
have %>% separate(x, c('y', 'z'), sep = '.*\\.', extra = 'merge', remove = FALSE)

Try this base R approach:

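# Data
have <- data.frame(x = c("a.b", "a.b.c", "a.b.1", "a.b.2", "9.a.b.c"), stringsAsFactors = FALSE)
# Last component as a number; non-numeric components become NA (coercion warning suppressed)
have$y <- suppressWarnings(as.numeric(sub('.*\\.', '', have$x)))
# Strip the last component from x only where it was numeric
have$x <- ifelse(!is.na(have$y), sub("^(.*)[.].*", "\\1", have$x), have$x)
# Replace NA with zero
have$y[is.na(have$y)] <- 0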

Here is a single tidyverse solution:

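library("tidyr")

# Split at the last "." only when nothing but digits follows it (lookahead).
# convert = TRUE makes y an integer column, so the 0L replacement matches its type.
have %>%
  separate(x, c("x", "y"), "\\.(?=\\d+$)", fill = "right", convert = TRUE) %>%
  replace_na(list(y = 0L))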

        x y
1     a.b 0
2   a.b.c 0
3     a.b 1
4     a.b 2
5 9.a.b.c 0
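
The pattern above matches a dot only when it is followed by digits that run to the end of the string, so the split never fires for a.b.c or 9.a.b.c. As a rough illustration of the same lookahead outside tidyr, here is a minimal base R sketch. The names parts, x_new, y_new and result are introduced here for illustration and are not part of the answer.

```r
x <- c("a.b", "a.b.c", "a.b.1", "a.b.2", "9.a.b.c")

# perl = TRUE enables the (?=...) lookahead; the matched dot is consumed,
# the digits after it are only inspected
parts <- strsplit(x, "\\.(?=\\d+$)", perl = TRUE)

# First piece is always the (possibly shortened) string
x_new <- vapply(parts, `[`, character(1), 1)

# Second piece exists only when a numeric suffix was split off; default to 0
y_new <- vapply(parts,
                function(p) if (length(p) == 2) as.numeric(p[2]) else 0,
                numeric(1))

result <- data.frame(x = x_new, y = y_new, stringsAsFactors = FALSE)
print(result)
```

Strings without a numeric suffix come back as a single piece, which is why the length check is enough to decide between the parsed number and the default of 0.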
You can try it like this:

library(tidyverse)
library(stringr)

# str_extract() returns character, so convert before filling NA with an integer 0
want2 <- have %>% 
  mutate(y = as.integer(str_extract(x, "\\d+$"))) %>% 
  mutate(y = replace_na(y, 0L))
#         x y
# 1     a.b 0
# 2   a.b.c 0
# 3   a.b.1 1
# 4   a.b.2 2
# 5 9.a.b.c 0
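
Note that the output above still carries the numeric suffix in x for rows 3 and 4, whereas want has it removed. One possible way to strip it is sketched below in base R, assuming the suffix is always a dot followed by digits at the very end of the string. The names suffix, has_suffix and cleaned are hypothetical and only used for this sketch.

```r
have <- data.frame(x = c("a.b", "a.b.c", "a.b.1", "a.b.2", "9.a.b.c"),
                   stringsAsFactors = FALSE)

# A dot followed by one or more digits at the end of the string
suffix <- "\\.\\d+$"
has_suffix <- grepl(suffix, have$x)

cleaned <- have
# Default y to 0, then fill in the parsed suffix where one exists
cleaned$y <- 0
cleaned$y[has_suffix] <- as.numeric(sub(".*\\.", "", have$x[has_suffix]))
# Drop the suffix from x; strings without one are left untouched
cleaned$x <- sub(suffix, "", have$x)

print(cleaned)
```

Converting only the rows flagged by has_suffix avoids the NA coercion warning that as.numeric() would raise on components such as "b" or "c".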

An option with stringi:

library(stringi)
have$y <- as.integer(stri_extract_last_regex(have$x, "\\d+$"))
have$y[is.na(have$y)] <- 0
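
This works for y, but it doesn't remove the numeric suffix from x in rows 3 and 4. See want for the desired output.
@pyll I've updated the solution. Sorry for the confusion. Please check and let me know if it works!
Yes, this works well. Upvoted, but I accepted the tidyverse answer because it fits my current workflow better. Thanks!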