Ruby regex performance

ruby regex performance search

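I want to see if there is a better way to find an exact match for a word in a string. I am looking for a word in the "title" field of a database table. The number of records varies widely, and the performance I am seeing is pretty dismal.

Here are the 3 methods I benchmarked: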

title.split.include?(search_string)
/\b#{search_string}\b/ =~ title
title.include?(search_string)

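The best performance comes from

title.include?(search_string)

but it does not do an exact word search, and an exact word search is what I am looking for.

Is there any way to get better performance together with exact string matching?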

Split the string on whitespace, then check each word in the resulting array against the search term with the equality operator.
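A minimal sketch of this split-and-compare idea follows. The helper name `exact_word?` and the sample title are hypothetical, made up for illustration. One limitation worth noting: `String#split` breaks the title into single words, so a multi-word term such as "red ferrari" can never equal any one of them.

```ruby
# Hypothetical helper: true when `word` equals one whitespace-separated
# token of `title`.
def exact_word?(title, word)
  title.split.any? { |token| token == word }
end

title = "bored of my red ferrari" # made-up sample title

substring_hit = title.include?("ore")               # true: "ore" is inside "bored"
word_hit      = exact_word?(title, "red")           # true: "red" is a whole token
partial_hit   = exact_word?(title, "ore")           # false: no token equals "ore"
phrase_hit    = exact_word?(title, "red ferrari")   # false: tokens are single words
```

For multi-word terms, a word-boundary regex or a check per word of the term would be needed instead.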

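You need full-text search on your model field. That is done not by scanning with regular expressions but with a dedicated full-text index. I suggest you use one of the following options rather than rolling your own: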

acts_as_indexed
Sphinx
Ferret
Xapian
Lucene/Solr
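
To illustrate why an index helps, here is a toy inverted index in plain Ruby. It is only a sketch of the general idea, not how any of the libraries listed above is implemented; the method names `build_index` and `search` and the sample titles are made up. The index is built once, and each lookup then costs a few hash reads instead of a scan over every title. This version matches titles that contain all words of the query in any order, so it is not a phrase search.

```ruby
# Toy inverted index: maps each lowercased word to the ids of the titles
# that contain it. Illustration only.
def build_index(titles_by_id)
  index = Hash.new { |hash, word| hash[word] = [] }
  titles_by_id.each do |id, title|
    title.downcase.scan(/\w+/).uniq.each { |word| index[word] << id }
  end
  index
end

# Ids of the titles that contain every word of the query as a whole word.
def search(index, query)
  words = query.downcase.scan(/\w+/)
  return [] if words.empty?

  # fetch does not trigger the default block, so unknown words yield [].
  words.map { |word| index.fetch(word, []) }.reduce(:&)
end

# Made-up sample data.
titles = {
  1 => "Red Ferrari for sale",
  2 => "Bored of my old car",
  3 => "Ferrari in red"
}

index = build_index(titles)
both_words = search(index, "red ferrari") # ids 1 and 3
partial    = search(index, "ore")         # empty: "ore" is not a whole word
```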

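Here are some links with more details on these options:

This is basically what "hello world".split.include?('foo') does.

Thanks for the pointers, Mark. I did a quick and dirty hack, but it clearly does not hold up under load. I will check out the links you posted.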

  require 'benchmark'

  # Runs each matching strategy n times over the title of every record.
  def do_benchmark(search_results, search_string)
    n = 1000

    # Word-boundary regex; the interpolated literal is rebuilt on every evaluation.
    Benchmark.bm do |x|
      x.report("word search:") {
        n.times {
          search_results.each { |search_result|
            title = search_result.title
            /\b#{search_string}\b/ =~ title
          }
        }
      }
    end

    # Split on whitespace and look for a token equal to the search string.
    Benchmark.bm do |x|
      x.report("split.include? search:") {
        n.times {
          search_results.each { |search_result|
            title = search_result.title
            title.split.include?(search_string)
          }
        }
      }
    end

    # Plain substring test; fast, but not an exact word match.
    Benchmark.bm do |x|
      x.report("string include? search:") {
        n.times {
          search_results.each { |search_result|
            title = search_result.title
            title.include?(search_string)
          }
        }
      }
    end
  end

"processing: 6234 records"
"Looking for term: red ferrari"
 user     system      total        real
 word search: 50.380000   2.600000  52.980000 ( 57.019927)
 user     system      total        real
 split.include? search: 54.600000   0.260000  54.860000 ( 57.854837)
 user     system      total        real
 string include? search: 21.600000   0.060000  21.660000 ( 21.949715)
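
One variation not measured above is to build the pattern once, outside the loop, because a regex literal with interpolation is rebuilt each time it is evaluated. The sketch below assumes the titles are available as plain strings; `titles` holds made-up sample data, and no timing is claimed for it.

```ruby
term = "red ferrari" # the search term from the run above

# Made-up sample titles.
titles = ["Red ferrari", "my red ferrari!", "bored ferraris", "a red ferrari"]

# Regexp.escape keeps characters such as "." or "+" in the term from being
# treated as regex syntax. The pattern is built once and reused.
pattern = /\b#{Regexp.escape(term)}\b/

# match? returns a boolean without setting $~ (available since Ruby 2.4).
exact_matches = titles.select { |title| pattern.match?(title) }

# For comparison: the substring test also accepts "bored ferraris".
substring_matches = titles.select { |title| title.include?(term) }
```

The match is case-sensitive as written; adding the `i` flag to the pattern would make it ignore case.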