Looping over and comparing a vector and a data frame (in R)
If I understand your question correctly, you can drop the for loop, because R's %in% is vectorized over your list of instruments. Using tidyverse, your code could look like this:
# load tidyverse
library(tidyverse)

# set vector of instruments
instru = c("Accordian", "Clarinet", "Trumpet", "DoubleBass", "Oboe", "Piano", "Saxophone", "Violin", "Cello", "Tuba", "Viola", "Bassoon", "EnglishHorn", "French horn", "Flute", "Piccolo", "SynthBass", "Trombone")

# create dummy train data.frame (more exactly a "tibble")
train <- tibble(mix1_instrument = c("a", "b", "Clarinet"),
                mix2_instrument = c("a", "Clarinet", "c"),
                xxx = c("Clarinet", "b", "c"))
#> train
## A tibble: 3 x 3
#  mix1_instrument mix2_instrument xxx
#  <chr>           <chr>           <chr>
#1 a               a               Clarinet
#2 b               Clarinet        b
#3 Clarinet        c               c

# add column "instruments" to train
train <- train %>%
  mutate(instruments = case_when(
    mix1_instrument %in% instru ~ "1",
    mix2_instrument %in% instru ~ "1",
    TRUE ~ "0"
  ))
#> train
## A tibble: 3 x 4
#  mix1_instrument mix2_instrument xxx      instruments
#  <chr>           <chr>           <chr>    <chr>
#1 a               a               Clarinet 0
#2 b               Clarinet        b        1
#3 Clarinet        c               c        1
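To make the "no for loop needed" point concrete, here is a minimal base R sketch that computes the flag both ways and checks that the results agree. The shortened instru vector and the names looped and vectorized are illustrative assumptions, not part of the original answer.

```r
# Shortened, hypothetical instrument list (the full list works the same way)
instru <- c("Clarinet", "Trumpet", "Oboe")

train <- data.frame(
  mix1_instrument = c("a", "b", "Clarinet"),
  mix2_instrument = c("a", "Clarinet", "c"),
  stringsAsFactors = FALSE
)

# Loop version: test one row at a time
looped <- character(nrow(train))
for (i in seq_len(nrow(train))) {
  hit <- train$mix1_instrument[i] %in% instru ||
    train$mix2_instrument[i] %in% instru
  looped[i] <- if (hit) "1" else "0"
}

# Vectorized version: %in% and | operate on whole columns at once
vectorized <- ifelse(
  train$mix1_instrument %in% instru | train$mix2_instrument %in% instru,
  "1", "0"
)

# Both approaches should give the same flags
stopifnot(identical(looped, vectorized))
```

The vectorized form is shorter and avoids growing or indexing a result vector by hand, which is why the loop can be dropped.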
If you are familiar with dplyr, you can do this with mutate:
library(dplyr) # provides mutate() and %>%
instru = c("Accordian", "Clarinet", "Trumpet", "DoubleBass", "Oboe", "Piano", "Saxophone", "Violin", "Cello", "Tuba", "Viola",
"Bassoon", "EnglishHorn", "French horn", "Flute", "Piccolo", "SynthBass", "Trombone")
mix1_instruments = c("Accordion", "Trumpet", "Violin", "Cello", "Triangle")
mix2_instruments = c("Bassoon", "Saxophone", "Flute", "French horn", "Washboard")
train = data.frame(mix1_instruments, mix2_instruments)
train <- train %>%
mutate(instruments = (mix1_instruments %in% instru) | (mix2_instruments %in% instru))
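Edit: I just saw that I was beaten to it while writing my reply (by a much better answer than this one!), but that answer has a scalability problem.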
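The following adds, for every column, a new column with the suffix _instruments that records whether each entry in that column is in instru, then unites those new columns into a single column holding a flag for whether any value in any of the columns matched an entry in instru: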
library(dplyr)    # mutate(), mutate_all(), ends_with()
library(tidyr)    # unite()
library(magrittr) # %<>%
instru = c("Accordian", "Clarinet", "Trumpet", "DoubleBass", "Oboe", "Piano", "Saxophone", "Violin", "Cello", "Tuba", "Viola",
"Bassoon", "EnglishHorn", "French horn", "Flute", "Piccolo", "SynthBass", "Trombone")
mix1_instruments = c("Clarinet", "Flute", "Clarinet", "English Horn", "Washboard", "Saxophone", "Washboard")
mix2_instruments = c("French Horn", "French Horn", "French Horn", "Flute", "Flute", "Triangle", "Triangle")
train = data.frame(mix1_instruments, mix2_instruments)
train %<>%
  # funs() is deprecated in current dplyr; a named list of formulas yields the same <column>_instruments names
  mutate_all(list(instruments = ~ .x %in% instru)) %>%
  unite(col = instruments,
        ends_with('_instruments_instruments'), # optional, selects only the columns added by mutate_all in this particular dataset
        remove = TRUE) %>%
  mutate(instruments = as.numeric(grepl('TRUE', instruments)))
Note: %<>% comes from magrittr and is simply shorthand for the x <- x %>% ... syntax.
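For the many-columns case discussed in this thread, a more compact variant is sketched below. It assumes dplyr 1.0.4 or later, where if_any() is available, and it is an alternative to the mutate_all/unite approach above rather than what the answer's author wrote. The data are copied from that answer.

```r
library(dplyr)

instru <- c("Accordian", "Clarinet", "Trumpet", "DoubleBass", "Oboe", "Piano",
            "Saxophone", "Violin", "Cello", "Tuba", "Viola", "Bassoon",
            "EnglishHorn", "French horn", "Flute", "Piccolo", "SynthBass",
            "Trombone")

train <- data.frame(
  mix1_instruments = c("Clarinet", "Flute", "Clarinet", "English Horn",
                       "Washboard", "Saxophone", "Washboard"),
  mix2_instruments = c("French Horn", "French Horn", "French Horn", "Flute",
                       "Flute", "Triangle", "Triangle"),
  stringsAsFactors = FALSE
)

# if_any() is TRUE for a row when the test holds in at least one selected column;
# everything() selects all existing columns, however many there are
train <- train %>%
  mutate(instruments = as.numeric(if_any(everything(), ~ .x %in% instru)))
```

Because the column selection is done with everything(), the same call works unchanged for 2 or 100 columns; a narrower selector such as ends_with("_instruments") could be used if the data frame also holds unrelated columns.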
To output the result as a CSV file, you can use:
write.csv(train, "/path/to/dir/filename.csv", row.names=F)
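A quick way to confirm that the file was written as expected is to read it back. The sketch below uses tempfile() as a stand-in for the real output path, and the small train data frame is a made-up example, so both are assumptions for illustration only.

```r
# Small, hypothetical result table standing in for the real train data
train <- data.frame(
  mix1_instruments = c("Clarinet", "Washboard"),
  mix2_instruments = c("French Horn", "Triangle"),
  instruments = c(1, 0),
  stringsAsFactors = FALSE
)

# tempfile() is a placeholder; substitute your own path
out_path <- tempfile(fileext = ".csv")

# row.names = FALSE keeps R's row numbers out of the file
write.csv(train, out_path, row.names = FALSE)

# Read the file back to check its shape and column names
check <- read.csv(out_path, stringsAsFactors = FALSE)
```

Without row.names = FALSE, the file gains an extra unnamed first column containing the row numbers.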
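Can you provide an example of train$mix1_instrument? I think, just from editing your code, that you need grepl() rather than %in%, and then you can also eliminate the condition. But without knowing what is in train$mix1_instrument, it is hard to assess. You could try dput().
Sorry, yes, I'll add it now :)
Just edit your code and provide an example data.frame.
Just added it to the bottom for reference.
Hmm, if you have, say, 100 columns, this won't scale easily.
You're right, Sotos. But I didn't understand scalability to be a requirement. Is it?
It always is. Our answers to these questions are never overkill. Since other people (who may have more columns in their datasets) will check this answer to solve the same problem, we generalize answers as much as possible.
Hi, thanks for your help, but how do I run this against the 18 different instruments? I've added some extra details to the question above to help. Thanks in advance :)
I assume you no longer need an answer, since you deleted your question. Or was that a mistake?
Hi, thanks for your help, but I can't seem to figure out how to write this to a csv file. I've added some extra details to the question above. Thanks in advance :)
I've added how to write the data frame to a file at the end.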
train
#  mix1_instruments mix2_instruments instruments
#1         Clarinet      French Horn           1
#2            Flute      French Horn           1
#3         Clarinet      French Horn           1
#4     English Horn            Flute           1
#5        Washboard            Flute           1
#6        Saxophone         Triangle           1
#7        Washboard         Triangle           0