How do I replace specific strings based on a condition in R?

Tags: r, conditional-statements, bioinformatics

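I have data in which I collapsed repeated gene results so that each gene sits on a single row. This left some rows filled with commas, and I am trying to replace the rows that contain only commas with NA. However, I also have rows that contain both commas and qualitative data, and I want to keep those. For example: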

Gene     Condition
Gene1    Name=Asymmetrical dimethylarginine level, Name=Bipolar disorder and schizophrenia, Name=3-hydroxypropylmercapturic acid levels in smoker
Gene2    Name=blood pressure, Name=diabetes
Gene3    Name=heart disease, , , , , 
Gene4    , , , , , , , , ,
Gene5    NA
Gene6    , , ,

Expected output:

Gene     Condition
Gene1    Name=Asymmetrical dimethylarginine level, Name=Bipolar disorder and schizophrenia, Name=3-hydroxypropylmercapturic acid levels in smoker
Gene2    Name=blood pressure, Name=diabetes
Gene3    Name=heart disease, , , , , 
Gene4    NA
Gene5    NA
Gene6    NA
#ideally I would get rid of Gene3's extra commas but this is not necessary
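
I tried to write code for a statement like "if the row has only commas in the Condition column, replace it with NA", using something like

data$condition[if(","&![a-Z]|[a-Z]|[=])]

One option is

grepl(pattern = "^[ ,]+$", x = DF$Condition)

This call returns TRUE for each row that contains only spaces and commas.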
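
To make the behaviour of this pattern concrete, here is a minimal sketch on a small made-up vector. The vector `x` and the name `only_commas` are assumptions for illustration and are not part of the asker's data. Note that `grepl()` returns FALSE for NA input and that the `+` quantifier means an empty string does not match.

```r
# Hypothetical test vector; not the asker's data
x <- c("Name=heart disease, , ,", ", , , ,", ",", NA, "")

# TRUE only where the whole string is a non-empty run of spaces and/or commas
only_commas <- grepl("^[ ,]+$", x)
print(only_commas)

# Replace those elements with NA, leaving everything else untouched
x[only_commas] <- NA
print(x)
```

Rows that mix text and commas are left alone because the anchors `^` and `$` force the character class to cover the entire string.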

DF <-
structure(list(Gene = c("Gene1", "Gene2", "Gene3", "Gene4", "Gene5", 
"Gene6"), Condition= c("Name=Asymmetrical dimethylarginine leve,l Name=Bipolar disorder and schizophrenia, Name=3-hydroxypropylmercapturic acid levels in smoker", 
"Name=blood pressure, Name=diabetes", "Name=heart disease, , , , ,", 
", , , , , , , , ,", NA, "Name=kidney disease, , ,")), 
row.names = c(NA, -6L), class = "data.frame")

# Set Condition to NA wherever the cell contains only spaces and commas
DF$Condition[grepl("^[ ,]+$", DF$Condition)] <- NA


DF

If I understand correctly, you can try gsub to replace a cell with NA when it contains only commas:

DF$Condition <- gsub('^(,\\s*)+$', NA, DF$Condition)

Second output, with the trailing commas also removed:

> gsub('^$', NA, gsub('(,\\s*)+$','', DF$Condition))
[1] "Name=Asymmetrical dimethylarginine leve,l Name=Bipolar disorder and schizophrenia, Name=3-hydroxypropylmercapturic acid levels in smoker"
[2] "Name=blood pressure, Name=diabetes"                                                                                                      
[3] "Name=heart disease"                                                                                                                      
[4] NA                                                                                                                                        
[5] NA                                                                                                                                        
[6] "Name=kidney disease" 
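
The two-step idea shown in this output (strip trailing commas first, then turn whatever is left empty into NA) could also be wrapped in a small function. This is only a sketch: the helper name `clean_condition` and the sample strings are hypothetical, and it uses `sub()` plus `nzchar()` rather than the nested `gsub()` call from the answer.

```r
# Hypothetical helper: strip trailing comma runs, then turn empty strings into NA
clean_condition <- function(x) {
  # Remove a run of commas (with optional surrounding whitespace) at the end
  trimmed <- sub("(\\s*,\\s*)+$", "", x)
  # Strings that were nothing but commas are now empty; mark them as missing
  trimmed[!is.na(trimmed) & !nzchar(trimmed)] <- NA
  trimmed
}

# Made-up sample values for illustration
result <- clean_condition(c("Name=heart disease, , , , ,",
                            ", , ,",
                            NA,
                            "Name=blood pressure, Name=diabetes"))
print(result)
```

Commas between two named conditions are kept, since the pattern is anchored to the end of the string and only removes a trailing run.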

You can try grepl as shown below, where rows that contain only commas will be set to NA

DF <- within(DF, Condition <- replace(Condition, !grepl("[[:alnum:]]", Condition), NA))
DF

Output of the first gsub call from the previous answer (trailing commas are kept on rows that also contain text):

> gsub('^(,\\s*)+$',NA, DF$Condition)
[1] "Name=Asymmetrical dimethylarginine leve,l Name=Bipolar disorder and schizophrenia, Name=3-hydroxypropylmercapturic acid levels in smoker"
[2] "Name=blood pressure, Name=diabetes"
[3] "Name=heart disease, , , , ,"
[4] NA
[5] NA
[6] "Name=kidney disease, , ,"
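
The `[[:alnum:]]` test used with within() and replace() is slightly broader than the `^[ ,]+$` pattern: it flags any string without a letter or digit, not just strings made of commas and spaces. The sketch below compares the two on a made-up vector; `x`, `no_alnum`, `only_commas` and `cleaned` are hypothetical names chosen for illustration.

```r
# Hypothetical test vector; not the asker's data
x <- c(", , ,", "", "--", "Name=diabetes, ,", NA)

# TRUE where the string has no letter or digit at all (NA also yields TRUE here,
# because grepl() returns FALSE for NA input)
no_alnum <- !grepl("[[:alnum:]]", x)

# TRUE only where the string is a non-empty run of spaces and commas
only_commas <- grepl("^[ ,]+$", x)

print(data.frame(x, no_alnum, only_commas))

# replace() returns a copy of x with the selected elements set to NA
cleaned <- replace(x, no_alnum, NA)
print(cleaned)
```

For the data in the question both tests select the same rows; they differ only when a cell is empty or holds other punctuation.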