Remove specific trailing characters from column names in R

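I need help removing the last few characters of a column name when they meet a certain condition, or adjusting my current code so that this happens from the start.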

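I'm working with student test data from Common Core assessments, and the column names are formatted inconsistently. The data frame is structured like this: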

>names(df)
[1] Student.ID
[2] State.ID
[3] "X2.MD.A.1.Select.and.Use.Appropriate.Tools.to.Measure.Length.Percent.Correct"
[4] "X2.MD.A.3.Estimate.Length.Percent.Correct"                                   
[5] "X2.MD.A.4.Measurement.Difference.Percent.Correct"                             
[6] "X2.MD.B.5.Addition.and.Subtraction.Word.Problems..Lengths.Percent.Correct"   
[7] "X2.NBT.A.1.Understand.Place.Value.Percent.Correct"                            
[8] "X2.NBT.A.1.a.Understand.Place.Value..Bundles.of.Tens.Percent.Correct"        
[9] "X2.NBT.A.1.b.Understand.Place.Value..Bundles.of.Hundreds.Percent.Correct"     
[10] "X2.NBT.A.3.Read.and.Write.Numbers.to.1.000.Percent.Correct"   
This is the result I want:

>names(df)
[1] Student.ID
[2] State.ID
[3] A1_2.MD.A.1
[4] A1_2.MD.A.3
[5] A1_2.MD.A.4
[6] A1_2.MD.B.5
[7] A1_2.NBT.A.1
[8] A1_2.NBT.A.1.a
[9] A1_2.NBT.A.1.b
[10] A1_2.NBT.A.3
This is the code I have so far, but it only gets me part of the way:

library(reshape2)
library(reshape)
library(stringr)
library(dplyr)
library(qdap)

for (column in c(3:ncol(df))) {
  colnames(df)[column] <- substr(colnames(df)[column], 4, nchar(colnames(df)[column]))
}

## reduce column names to only the letter and number (strip the description)
for (column in c(3:ncol(df))) {
if (nchar(beg2char(colnames(df)[column],".")) < 3) {
  colnames(df)[column] <- substr(colnames(df)[column],1,8)
  } else if (nchar(beg2char(colnames(df)[column],".")) > 2){
  colnames(df)[column] <- substr(colnames(df)[column],1,9)
  }
}
## add screening number indicator to start of percent scores
for (column in c(3:ncol(df))) {
  colnames(df)[column] <- paste("A1_2", colnames(df)[column], sep=".")
}
Thanks in advance for your help.

You can use:

names <- c(your_col_names_here)
names <- gsub("^X2\\.((?:[^.]+\\.){2}[^.]+(?:\\.[a-z])?).*",
              "A1_2.\\1", names)
names(df) <- names
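
To make the pattern easier to follow, here is a small sketch that rebuilds the same regular expression piece by piece and applies it to two of the column names from the question. The variable names "pattern", "x" and "result" are only for illustration, and perl = TRUE is an assumption added here so that [a-z] is treated as plain ASCII lowercase; the answer itself does not set it.

```r
# Pattern from the answer above, split into commented pieces.
pattern <- paste0(
  "^X2\\.",            # literal "X2." at the start of the name
  "(",                 # start of capture group 1 (the part to keep)
  "(?:[^.]+\\.){2}",   # two dot-terminated pieces, e.g. "MD." and "A."
  "[^.]+",             # a third piece without its dot, e.g. "3"
  "(?:\\.[a-z])?",     # optional sub-item such as ".a" or ".b"
  ")",                 # end of capture group 1
  ".*"                 # the rest of the name (the description), discarded
)

# Two column names taken from the question
x <- c("X2.MD.A.3.Estimate.Length.Percent.Correct",
       "X2.NBT.A.1.a.Understand.Place.Value..Bundles.of.Tens.Percent.Correct")

# perl = TRUE is an assumption of this sketch, not part of the answer
result <- sub(pattern, "A1_2.\\1", x, perl = TRUE)
print(result)
```

Names that do not start with "X2." (such as Student.ID) do not match the pattern, so they come back unchanged.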

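Please consider accepting and/or upvoting it (the green check mark on the left).

So I'm still pretty new to regex. If I wanted to adapt this for columns in the format "NBT.2.3.Using.math" and "MD.2.3.a.applicating.functions", could you tell me what's wrong with the following code: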
name
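
The code from that comment is cut off here, so its actual error cannot be diagnosed. As a rough sketch only, assuming the new columns have no "X2." prefix and that the goal is to keep just the leading code (for example "MD.2.3.a"), the same idea might be adapted as follows. The names "cols" and "short", the choice to add no new prefix, and the use of perl = TRUE are all assumptions made for illustration.

```r
# Hypothetical column names in the format the comment describes
cols <- c("NBT.2.3.Using.math", "MD.2.3.a.applicating.functions")

# No "X2." prefix here, so the pattern starts directly with the code part.
# The extra "\\." after the optional sub-item requires a dot right after the
# single lowercase letter, so the first letter of a lowercase description
# word is not mistaken for a sub-item.
pattern <- "^((?:[^.]+\\.){2}[^.]+(?:\\.[a-z])?)\\..*"

# perl = TRUE (an assumption of this sketch) keeps [a-z] strictly lowercase ASCII
short <- sub(pattern, "\\1", cols, perl = TRUE)
print(short)
```

If a prefix such as "A1_2." is wanted, it could be put in front of "\\1" in the replacement string, as in the answer above.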
# create a dummy df to test with
df <- as.data.frame(matrix(0, ncol = 10, nrow = 1))
names <- c("Student.ID", "State.ID",
           "X2.MD.A.1.Select.and.Use.Appropriate.Tools.to.Measure.Length.Percent.Correct",
           "X2.MD.A.3.Estimate.Length.Percent.Correct",
           "X2.MD.A.4.Measurement.Difference.Percent.Correct",
           "X2.MD.B.5.Addition.and.Subtraction.Word.Problems..Lengths.Percent.Correct",
           "X2.NBT.A.1.Understand.Place.Value.Percent.Correct",
           "X2.NBT.A.1.a.Understand.Place.Value..Bundles.of.Tens.Percent.Correct",
           "X2.NBT.A.1.b.Understand.Place.Value..Bundles.of.Hundreds.Percent.Correct",
           "X2.NBT.A.3.Read.and.Write.Numbers.to.1.000.Percent.Correct")

names(df) <- gsub(pattern = "^X2\\.((?:[^.]+\\.){2}[^.]+(?:\\.[a-z])?).*", "A1_2.\\1", names)
df