Ruby on Rails: word frequency counting is very slow

This is my code for counting word frequencies:

  word_arr= ["I", "received", "this", "in", "email", "and", "found", "it", "a", "good", "read", "to", "share......", "Yes,", "Dr", "M.", "Bakri", "Musa", "seems", "to", "know", "what", "is", "happening", "in", "Malaysia.", "Some", "of", "you", "may", "know.", "He", "is", "a", "Malay",  "extra horny", "horny nor", "nor their", "their babes", "babes are", "are extra", "extra SEXY..", "SEXY.. .", ". .", ". .It's", ".It's because", "because their", "their CONDOMS", "CONDOMS are", "are Made", "Made In", "In China........;)", "China........;) &&"]

arr_stop_kwd = ["a", "and"]

frequencies = Hash.new(0)
word_arr.each { |word|
  if !arr_stop_kwd.include?(word.downcase) && !word.match('&&')
    frequencies["#{word.downcase}"] += 1
  end
}
With 100k words this takes 9.03 seconds. Is there another way to compute the frequencies that would be faster?

Thanks in advance.

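Take a look at the facets gem.

You can use its frequency method, as in the require 'facets' snippet at the end of this page.

Note that the stop words can be subtracted from the word array with the array difference operator (-).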
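
One caveat worth checking before relying on the subtraction: array difference compares elements by exact equality, so it is case-sensitive, whereas the code in the question downcases each word before testing it. A minimal sketch, using a small made-up sample array (not the data from the question), of normalizing case before subtracting:

```ruby
# Hypothetical sample data, for illustration only.
sample_words = ["A", "good", "read", "and", "And", "a"]
stop_words   = ["a", "and"]

# Plain difference is case-sensitive: "A" and "And" survive.
case_sensitive = sample_words - stop_words

# Downcasing first removes every case variant of the stop words.
normalized = sample_words.map { |w| w.downcase } - stop_words
```

Whether the extra map pass is acceptable depends on the size of the input; it builds one additional array of the same length.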

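Sir, I am using Ruby 1.8.7, and when I require "facets" I get the error "stack level too deep". How can I solve this?

You need to install the gem. Try running
gem install facets
or add "facets" to your Gemfile if you are using Bundler.

Strange, the facets documentation says: "The 2.x versions of Facets are designed for Ruby 1.8.6 and above. Facets is fully compatible with Ruby 1.9."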
require 'facets'
frequencies = (word_arr-arr_stop_kwd).frequency
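
If facets cannot be loaded (as in the Ruby 1.8.7 comment above), the same count can be sketched in plain Ruby. Note that the facets one-liner does not downcase words or skip tokens containing "&&", which the original loop does. The version below keeps both behaviors and makes three small changes that may help on large inputs: stop words go into a Set so each lookup does not scan the list, each word is downcased once, and the "&&" check is a plain substring test. The helper name word_frequencies and the sample input are assumptions for illustration, and this has not been timed against the 100k-word data set.

```ruby
require 'set'

# Hypothetical helper; not part of facets or of the original code.
def word_frequencies(words, stop_words)
  # A Set gives constant-time membership checks on average.
  stop_set = Set.new(stop_words.map { |w| w.downcase })
  frequencies = Hash.new(0)

  words.each do |word|
    next if word.include?('&&')   # substring test, no regexp involved
    key = word.downcase           # downcase once and reuse the result
    next if stop_set.include?(key)
    frequencies[key] += 1
  end

  frequencies
end

# Made-up sample input in the style of the question's data.
counts = word_frequencies(["I", "a", "to", "To", "and", "China........;) &&"],
                          ["a", "and"])
```

With only two stop words the Set makes little difference; it matters more once the stop list grows to hundreds of entries.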