
R: Using strsplit() to split a string between two characters

Suppose I have the following string:

s <- "ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705"
Can I use strsplit() with multiple split elements, so that I get the values that sit between each = and ;?

1) strsplit with matrix. Try this:

> matrix(strsplit(s, "[;=]")[[1]], 2)[2,]
[1] "MIMAT0027618"    "MIMAT0027618"    "hsa-miR-6859-5p" "MI0022705"   
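
To make the matrix trick less opaque, here is a minimal sketch of what happens step by step. It assumes every field has the form key=value with exactly one "=" and a non-empty value; the variable names `parts`, `m` and `fields` are only illustrative.

```r
s <- "ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705"

# Splitting on either ';' or '=' yields key, value, key, value, ...
parts <- strsplit(s, "[;=]")[[1]]

# matrix() fills column by column, so with 2 rows each column is one pair:
# row 1 holds the keys, row 2 holds the values.
m <- matrix(parts, nrow = 2)

# Name the values by their keys so a field can be looked up directly.
fields <- setNames(m[2, ], m[1, ])
fields[["Name"]]
```

Keeping the keys as names is optional, but it avoids relying on the position of a field within the string.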
2) strsplit with gsub. Or use strsplit with gsub:

> strsplit(gsub("[^=;]+=", "", s), ";")[[1]]
[1] "MIMAT0027618"    "MIMAT0027618"    "hsa-miR-6859-5p" "MI0022705"     
3) strsplit with sub. Or use strsplit with sub:

> sub(".*=", "", strsplit(s, ";")[[1]])
[1] "MIMAT0027618"    "MIMAT0027618"    "hsa-miR-6859-5p" "MI0022705"   
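
One detail of the sub() variant is worth a sketch: ".*=" is greedy, so it strips everything up to the last "=" in each field. That makes no difference for the string in the question, but it would if a value itself contained "=". The input `s2` below is a made-up example, not data from the question.

```r
s2 <- "ID=MIMAT0027618;Note=a=b"   # hypothetical input: second value contains '='

fields <- strsplit(s2, ";")[[1]]

# Greedy ".*=" removes everything up to the LAST '=' in each field.
greedy <- sub(".*=", "", fields)

# "^[^=]*=" removes only the key and the FIRST '='.
first_only <- sub("^[^=]*=", "", fields)

greedy      # "MIMAT0027618" "b"
first_only  # "MIMAT0027618" "a=b"
```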
4) strapplyc. Or extract the runs of consecutive non-semicolons that follow an equals sign:

> library(gsubfn)
> strapplyc(s, "=([^;]+)", simplify = unlist)
[1] "MIMAT0027618"    "MIMAT0027618"    "hsa-miR-6859-5p" "MI0022705"  
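
If installing gsubfn is not an option, the same extraction can probably be done in base R. This is a sketch of an alternative technique (regmatches() with gregexpr() and a Perl-style lookbehind), not what strapplyc does internally.

```r
s <- "ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705"

# gregexpr() locates every match; regmatches() extracts the matched text.
# "(?<==)" is a lookbehind for '=', so the '=' itself is not part of the match,
# and "[^;]+" then takes everything up to the next ';' or the end of the string.
values <- regmatches(s, gregexpr("(?<==)[^;]+", s, perl = TRUE))[[1]]
values
```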

Added additional strsplit solutions.

I know this is an old question, but I find lookaround regular expressions a very elegant fit for this problem:

library(stringr)
your_string <- '/this/file/name.txt'
result <- str_extract(string = your_string, pattern = "(?<=/)[^/]*(?=\\.)")
result
Or, in your case:

your_string <- "ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705" 
result <- str_extract(string = your_string, pattern = "(?<=;Alias=)[^;]*(?=;)") 
result # Outputs 'MIMAT0027618'
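
Note that the pattern above requires a ";" on both sides of the field, so it would not match the first field (ID) or the last one (Derives_from). A minimal sketch of a more general lookup follows, written in base R so it needs no packages. `extract_field` is a hypothetical helper, and it assumes the key is a plain word with no regex metacharacters and is not the tail of a longer key.

```r
s <- "ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705"

# Hypothetical helper: return the value stored under `key`, or NA if absent.
extract_field <- function(x, key) {
  # Lookbehind for "key=", then take everything up to the next ';' (or the end).
  pattern <- paste0("(?<=", key, "=)[^;]*")
  m <- regexpr(pattern, x, perl = TRUE)
  out <- rep(NA_character_, length(x))
  # regmatches() only returns the elements that matched, so fill those in.
  out[m != -1] <- regmatches(x, m)
  out
}

extract_field(s, "Alias")
extract_field(s, "Derives_from")
```

Dropping the trailing lookahead is what lets the last field match; the character class "[^;]*" already stops at the next separator.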

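Cool. Based on this, I found a way to extract the third element of the split for each row within a data frame, using:
unlist(apply(B, MARGIN = 1, FUN = function(x) matrix(strsplit(x[column_number], "[;=]")[[1]], 2)[2, ][3]))
Note that the code in your comment simplifies to
matrix(strsplit(x[column_number], "[;=]")[[1]], 2)[2, 3]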
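
For the per-row case discussed in these comments, apply() may not be needed at all, because strsplit() is already vectorised over a character column. The sketch below assumes a data frame `B` with a character column that I have called `attrs`; both the column name and the second row are made up for illustration.

```r
# Hypothetical data frame standing in for `B` from the comment.
B <- data.frame(
  attrs = c(
    "ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705",
    "ID=X1;Alias=X1;Name=example-name;Derives_from=Y1"
  ),
  stringsAsFactors = FALSE
)

# strsplit() returns one character vector per row of the column.
parts <- strsplit(B$attrs, "[;=]")

# In key, value, key, value, ... order the third value is element 6,
# the same element that matrix(..., 2)[2, 3] picks out.
third_value <- vapply(parts, function(p) p[6], character(1))
third_value
```

vapply() with character(1) returns a plain character vector, so no unlist() is required, and rows with fewer than three fields come back as NA.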
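# The same lookaround pattern applied to a data frame column
library(dplyr)
library(stringr)  # provides str_extract()
strings <- c('/this/file/name1.txt', 'tis/other/file/name2.csv')
df <- as.data.frame(strings) %>% 
  mutate(name = str_extract(string = strings, pattern = "(?<=/)[^/]*(?=\\.)"))
# Optional: pull the extracted names out as a character vector
file_names <- df %>% pull(name)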