Looping over and comparing a vector against a data frame (in R)

Tags: r, dataframe, vector

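If I understand your question correctly, you can drop the for loop, because R is vectorized over your list of instruments. Using tidyverse, your code could look like this: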

# load tidyverse
library(tidyverse)

# set vector of instruments
instru = c("Accordian", "Clarinet", "Trumpet", "DoubleBass", "Oboe", "Piano", "Saxophone", "Violin", "Cello", "Tuba", "Viola", "Bassoon", "EnglishHorn", "French horn", "Flute", "Piccolo", "SynthBass", "Trombone")

# create dummy train data.frame (more exactly a "tibble")
train <- tibble(mix1_instrument = c("a", "b", "Clarinet"),
                mix2_instrument = c("a", "Clarinet", "c"),
                xxx = c("Clarinet", "b", "c"))

#> train
## A tibble: 3 x 3
#mix1_instrument mix2_instrument xxx     
#<chr>           <chr>           <chr>   
#1 a               a               Clarinet
#2 b               Clarinet        b       
#3 Clarinet        c               c       


# add column "instruments" to train
train <- train %>% 
  mutate(instruments = case_when(
    mix1_instrument %in% instru ~ "1",
    mix2_instrument %in% instru ~ "1",
    TRUE ~"0"
  ))

#>     train
## A tibble: 3 x 4
# mix1_instrument mix2_instrument xxx      instruments
# <chr>           <chr>           <chr>    <chr>      
#1 a               a               Clarinet 0          
#2 b               Clarinet        b        1          
#3 Clarinet        c               c        1       
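
Why the loop is unnecessary can be seen in base R alone: %in% takes a whole vector on its left-hand side and returns one logical per element. A minimal sketch, where the shortened lookup vector instru_small and the toy column col are made up for illustration and are not the asker's data:

```r
# Hypothetical, shortened lookup vector and column (assumptions)
instru_small <- c("Clarinet", "Oboe", "Piano")
col <- c("a", "Clarinet", "Piano", "b")

# Loop version: test one element at a time
flag_loop <- logical(length(col))
for (i in seq_along(col)) {
  flag_loop[i] <- col[i] %in% instru_small
}

# Vectorized version: a single call covers the whole column
flag_vec <- col %in% instru_small

# Both approaches give the same result
identical(flag_loop, flag_vec)
```

If numeric flags are preferred over the character "1"/"0" used above, as.integer(flag_vec) converts the logicals to 1 and 0.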

If you are familiar with dplyr, you can use mutate to do this:

library(dplyr)  # provides mutate() and %>%

instru = c("Accordian", "Clarinet", "Trumpet", "DoubleBass", "Oboe", "Piano", "Saxophone", "Violin", "Cello", "Tuba", "Viola",
           "Bassoon", "EnglishHorn", "French horn", "Flute", "Piccolo", "SynthBass", "Trombone")

mix1_instruments = c("Accordion", "Trumpet", "Violin", "Cello", "Triangle")
mix2_instruments = c("Bassoon", "Saxophone", "Flute", "French horn", "Washboard")

train = data.frame(mix1_instruments, mix2_instruments)

train <- train %>%
  mutate(instruments = (mix1_instruments %in% instru) | (mix2_instruments %in% instru))
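
Edit: I just saw that I was beaten to it while writing my reply (by a much better answer than this one!), but there are scalability issues.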
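
One way the scalability concern could be addressed without naming each column is to select the columns by pattern and OR the per-column results together. This is a base R sketch of that idea, not the method used in either answer; the third column mix3_instruments, the shortened instru vector, and the name pattern are assumptions for illustration:

```r
# Shortened, hypothetical lookup vector
instru <- c("Clarinet", "Flute", "Saxophone")

# Toy data with an extra, made-up third column
train <- data.frame(
  mix1_instruments = c("Clarinet", "Washboard", "Washboard"),
  mix2_instruments = c("Triangle", "Flute", "Triangle"),
  mix3_instruments = c("Kazoo", "Kazoo", "Kazoo"),
  stringsAsFactors = FALSE
)

# Pick every column whose name follows the mixN_instruments pattern
cols <- grep("^mix[0-9]+_instruments$", names(train), value = TRUE)

# One logical vector per column, then element-wise OR across all of them
hits <- lapply(train[cols], function(x) x %in% instru)
train$instruments <- as.integer(Reduce(`|`, hits))

train
```

The same code works unchanged for 2 or 100 matching columns, since only the pattern decides which columns are checked.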

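The following adds, for each existing column, a new column with the suffix _instruments holding a logical for whether each entry of that column is in instru, then unites them into a single column holding a logical for whether any value in any of the columns matched an entry in instru: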

library(dplyr)     # mutate_all(), mutate()
library(tidyr)     # unite()
library(magrittr)  # %<>%

instru = c("Accordian", "Clarinet", "Trumpet", "DoubleBass", "Oboe", "Piano", "Saxophone", "Violin", "Cello", "Tuba", "Viola",
           "Bassoon", "EnglishHorn", "French horn", "Flute", "Piccolo", "SynthBass", "Trombone")

mix1_instruments = c("Clarinet", "Flute", "Clarinet", "English Horn", "Washboard", "Saxophone", "Washboard")
mix2_instruments = c("French Horn", "French Horn", "French Horn", "Flute", "Flute", "Triangle", "Triangle")

train = data.frame(mix1_instruments, mix2_instruments)

train %<>%
  # funs() is deprecated; a named list of lambdas creates the same <column>_instruments columns
  mutate_all(list(instruments = ~ . %in% instru)) %>%
  unite(col = instruments,
        ends_with('_instruments_instruments'), # optional, selects only the columns added by mutate_all in this particular dataset
        remove = TRUE) %>%
  mutate(instruments = as.numeric(grepl('TRUE', instruments)))

Note: %<>% comes from magrittr and is simply shorthand for the x <- x %>% ... syntax.
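
To make the equivalence concrete, here is a small sketch comparing the two forms on a throwaway numeric vector (the names x and y are made up for illustration):

```r
library(magrittr)

x <- c(3, 1, 2)
y <- x

# Explicit form: pipe, then assign the result back to the same name
x <- x %>% sort()

# Compound assignment pipe: pipes y into sort() and overwrites y
y %<>% sort()

# Both variables now hold the sorted vector
identical(x, y)
```

Because %<>% overwrites its left-hand side, the explicit form may be easier to follow when a script is read later.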

To output in csv format, you can:

write.csv(train, "/path/to/dir/filename.csv", row.names=F)
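
A quick way to check the export is to write to a temporary file and read it back. In this sketch tempfile() stands in for the placeholder path above, and the toy data frame train_small is made up for illustration:

```r
# Toy data frame; column names mirror the answer, values are made up
train_small <- data.frame(
  mix1_instruments = c("Clarinet", "Washboard"),
  instruments = c(1L, 0L),
  stringsAsFactors = FALSE
)

# tempfile() replaces "/path/to/dir/filename.csv" for this illustration
out_file <- tempfile(fileext = ".csv")
write.csv(train_small, out_file, row.names = FALSE)

# Read the file back to confirm columns and rows survived the round trip
check <- read.csv(out_file, stringsAsFactors = FALSE)
names(check)
nrow(check)
```

With row.names = FALSE the file contains only the data columns; leaving it at the default would add a leading column of row numbers.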

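Could you provide a sample of train$mix1_instrument? Just from reading your code, I think you need grepl() instead of %in%, in which case part of the condition could also be eliminated. But without knowing what is in train$mix1_instrument it is hard to judge. You could try dput().
Sorry, yes, I'll add it now :)
Just edit your code and provide an example data.frame.
Just added it at the bottom for reference.
Hmm, if you have, say, 100 columns, this will not scale easily.
You are right, Sotos. But I did not understand scalability to be a requirement. Is it?
It always is. Our answers to these questions are never overdone. Since other people (who may have more columns in their data sets) will check this answer to solve the same problem, we generalize them as much as possible.
Hi, thanks for your help, but how would I run this against 18 different instruments? I have added some extra detail to the question above to help. Thanks in advance :)
I guess you no longer need an answer, since you deleted your question. Or was that a mistake?
Hi, thanks for your help, but I can't seem to work out how to write this to a csv file? I have added some extra detail to the question above. Thanks in advance :)
I have added how to write the data frame to a file at the end.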
train
#  mix1_instruments mix2_instruments instruments
#1         Clarinet      French Horn           1
#2            Flute      French Horn           1
#3         Clarinet      French Horn           1
#4     English Horn            Flute           1
#5        Washboard            Flute           1
#6        Saxophone         Triangle           1
#7        Washboard         Triangle           0
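
The comments raise grepl() as an alternative to %in%. The difference matters when a cell holds more than the bare instrument name: %in% matches whole values exactly, while grepl() finds a pattern anywhere in the string. A sketch with made-up cell values (cells) and a shortened lookup vector (instru_small), both assumptions for illustration:

```r
# Hypothetical lookup vector and cell contents
instru_small <- c("Clarinet", "Flute")
cells <- c("Clarinet", "Clarinet solo", "Triangle")

# Exact matching: only a cell equal to an instrument name counts
exact <- cells %in% instru_small

# Pattern matching: build one alternation pattern and search inside each cell
pattern <- paste(instru_small, collapse = "|")
partial <- grepl(pattern, cells)

data.frame(cells, exact, partial)
```

Here "Clarinet solo" is missed by %in% but found by grepl(). Instrument names that contain regular expression metacharacters would need escaping before being pasted into the pattern.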