How do I drop and keep certain columns of a txt file in R?


I have a tab-separated file in txt format. Here is an excerpt:

    id    1    2    4   15   18   20
    1_at  100  200  89  189  299  788
    2_at  8    78   33  89   90   99
    3_xt  300  45   53  234  89   34
    4_dx  49   34   88  8    9    15

I also have a second file, again in txt format but comma-separated, with the following data:

    18,1,4,20
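Based on that second file, I want to read the first file and extract only the listed columns (the id column followed by 18, 1, 4, 20) so that I can store them in another file.
(Important: the stored data must keep the column order given in the csv file.)
If the data I wanted to extract were rows, it would be easier, because I could read the file line by line and compare against my txt file (I have already done that), but I am stuck on doing the same for columns.
I would like to know whether there is a way to extract the columns directly with a subsetting/indexing function.

Any help would be greatly appreciated.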
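If you don't want to read the whole file and filter afterwards, here is one way to do what you want. It also works when you don't know the class of each column in advance. Note that syntactically valid R names cannot start with a digit, so by default read.table alters such column names. The example keeps the numeric names by passing check.names=F, but be careful: names like these may fail in other operations.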

    > txt = 'id 1 2 4 15 18 20
    + 1_at 100 200 89 189 299 788
    + 2_at 8 78 33 89 90 99
    + 3_xt 300 45 53 234 89 34
    + 4_dx 49 34 88 8 9 15'
    >
    > df <- read.table(textConnection(txt), header=T, nrows=1, check.names=F)
    >
    > df
        id   1   2  4  15  18  20
    1 1_at 100 200 89 189 299 788
    >
    > #Lets say f is the column filter you read from other file
    > f <- c("id", "18", "1", "4", "20")
    >
    > f
    [1] "id" "18" "1"  "4"  "20"
    >
    > #Get Column Classes
    > CC <- sapply(df, class)
    > CC
           id         1         2         4        15        18        20
     "factor" "integer" "integer" "integer" "integer" "integer" "integer"
    > #Specify columns that you don't want to read as "NULL"
    > CC[!names(CC) %in% f] <- "NULL"
    > CC
           id         1         2         4        15        18        20
     "factor" "integer"    "NULL" "integer"    "NULL" "integer" "integer"
    >
    > #Read whole data frame again
    > df <- read.table(textConnection(txt), header=T, colClasses=CC, check.names=F)
    >
    > #get columns in desired order
    > df[,f]
        id  18   1  4  20
    1 1_at 299 100 89 788
    2 2_at  90   8 33  99
    3 3_xt  89 300 53  34
    4 4_dx   9  49 88  15
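The same colClasses idea can be applied to files on disk rather than an inline string. The following is only a sketch under some assumptions: the file names are hypothetical (temporary files stand in for the real inputs), the data file is tab-separated with a header row, and the id column is always kept in front of the requested columns.

```r
# Hypothetical inputs: temporary files stand in for the real data and filter files.
data_file   <- tempfile(fileext = ".txt")
filter_file <- tempfile(fileext = ".txt")
out_file    <- tempfile(fileext = ".txt")

writeLines(c(
  "id\t1\t2\t4\t15\t18\t20",
  "1_at\t100\t200\t89\t189\t299\t788",
  "2_at\t8\t78\t33\t89\t90\t99",
  "3_xt\t300\t45\t53\t234\t89\t34",
  "4_dx\t49\t34\t88\t8\t9\t15"
), data_file)
writeLines("18,1,4,20", filter_file)

# Column labels to keep, in the order given by the comma-separated file.
wanted <- scan(filter_file, what = character(), sep = ",", quiet = TRUE)
f <- c("id", wanted)  # assumption: the id column is always kept first

# Read only the header line to learn the column names.
header <- scan(data_file, what = character(), sep = "\t",
               nlines = 1, quiet = TRUE)

# NA lets read.table guess the column type; "NULL" skips the column entirely.
cc <- ifelse(header %in% f, NA_character_, "NULL")

df <- read.table(data_file, header = TRUE, sep = "\t",
                 colClasses = cc, check.names = FALSE,
                 stringsAsFactors = FALSE)

# Reorder to match the filter file, then write a tab-separated result.
result <- df[, f]
write.table(result, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Only the header line is read before the main read.table call, so skipped columns are never converted or stored, which may help when the file is large.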

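Comments:

Column names cannot start with a number.

@Jack Maney I had a similar problem where what I wanted to extract were rows, so I used two loops, one reading the txt file and the other reading the table. The comparison was done sequentially, keeping the data whenever it matched what I had stored in the other file. With columns, I don't know how to do it. Maybe I could transpose the data first, process it as rows, and then transpose it back, but the file is huge, so I don't think that is a good solution.

@geektrader That is the format of the file I need to extract from.

@Manuel: the file format is irrelevant. R will not honor column names that start with a number.

Thanks, but if the data is in a file, I can't use read.table; how do I handle that?

Yes @geektrader, I tried the solution, but I need to preserve the order of the data. As you can see in my example, I need to store the data according to labels like id 18 1 4 20, not in sequential fashion.

@Manuel You can use df[, f]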
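The column-name point raised in the comments can be checked directly. This sketch reuses the sample data from the question and is illustrative only. It shows how read.table treats numeric headers with and without check.names, and why the filter should be a character vector. The missing-column check is an addition of mine, not part of the original answer.

```r
txt <- "id 1 2 4 15 18 20
1_at 100 200 89 189 299 788
2_at 8 78 33 89 90 99
3_xt 300 45 53 234 89 34
4_dx 49 34 88 8 9 15"

# Default behaviour: names are passed through make.names(), so "1" becomes "X1".
default_names <- names(read.table(text = txt, header = TRUE))

# check.names = FALSE keeps the headers exactly as they appear in the file.
df <- read.table(text = txt, header = TRUE, check.names = FALSE)

# Labels must be character: df[, 18] would mean "the 18th column" by position.
f <- c("id", as.character(c(18, 1, 4, 20)))

# Added safeguard (not in the original answer): fail early if a label is absent.
missing_cols <- setdiff(f, names(df))
if (length(missing_cols) > 0) {
  stop("columns not found: ", paste(missing_cols, collapse = ", "))
}

# Character indexing returns the columns in the order given by f.
subset_df <- df[, f]
print(subset_df)
```

Reading the whole table and subsetting afterwards is simpler than the colClasses approach, at the cost of loading every column into memory first.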
