JavaScript fuzzy search


I'm working on a filtering feature for a list of about 50-100 items. Each item has markup like this:

<li>
  <input type="checkbox" name="services[]" value="service_id" />
  <span class="name">Restaurant in NY</span>
  <span class="filters"><!-- hidden area -->
    <span class="city">@city: new york</span>
    <span class="region">@reg: ny</span>
    <span class="date">@start: 02/05/2012</span>
    <span class="price">@price: 100</span>
  </span>
</li>
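
I created the markup like this because I initially used …

So, as you have probably guessed, what I want is to run searches like this:

    @region:LA@price:124

and so on. The catch is that I also want to display multiple matching items, so that more than one can be selected :)

I think this calls for fuzzy search, but I haven't found any function that does it.

Any ideas or starting points?

// Edit: since I have fairly few items, I'd like a client-side solution.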
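One possible starting point, sketched under assumptions: the items are held as plain objects whose keys mirror the hidden filter spans (`city`, `reg`, `start`, `price`), and a query is a run of `@key:value` pairs. The helper names `parseQuery` and `filterItems` are hypothetical, and the match is a plain case-insensitive substring test rather than a true fuzzy match.

```javascript
// Parse a query such as "@reg:LA@price:124" into { reg: "la", price: "124" }.
// Hypothetical helper; the "@key:value" syntax mirrors the hidden filter spans.
function parseQuery(query) {
  var filters = {};
  var pattern = /@(\w+):\s*([^@]*)/g;
  var match;
  while ((match = pattern.exec(query)) !== null) {
    filters[match[1].toLowerCase()] = match[2].trim().toLowerCase();
  }
  return filters;
}

// Keep every item whose fields contain all requested values (substring match).
// Returning an array means several items can be shown and selected at once.
function filterItems(items, query) {
  var filters = parseQuery(query);
  var keys = Object.keys(filters);
  return items.filter(function (item) {
    return keys.every(function (key) {
      var value = String(item[key] === undefined ? '' : item[key]).toLowerCase();
      return value.indexOf(filters[key]) > -1;
    });
  });
}

// Sample data (assumed for illustration only).
var items = [
  { name: 'Restaurant in NY', city: 'new york', reg: 'ny', start: '02/05/2012', price: '100' },
  { name: 'Diner in NY', city: 'new york', reg: 'ny', start: '03/05/2012', price: '80' },
  { name: 'Cafe in LA', city: 'los angeles', reg: 'la', start: '02/05/2012', price: '124' }
];

var nyItems = filterItems(items, '@reg:ny');           // both NY entries
var laItems = filterItems(items, '@reg:LA@price:124'); // only the LA entry
```

In a page, the same objects could be built once from the `.filters` spans of each `<li>`, and the non-matching list items hidden after each keystroke.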

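I have a small function that searches for a string in an array (at least for me it gives better results than Levenshtein):

    function fuzzy(item, arr) {
      function oc(a) {
        var o = {}; for (var i = 0; i …

I was looking for "fuzzy search" in JavaScript but didn't find a solution here, so I wrote my own function that does what I need.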

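The algorithm is very simple: loop through the letters of the needle and check whether they occur in the same order in the haystack:

    String.prototype.fuzzy = function (s) {
        var hay = this.toLowerCase(), i = 0, n = -1, l;
        s = s.toLowerCase();
        // take the next needle letter and look for it after the previous hit
        for (; l = s[i++] ;) if (!~(n = hay.indexOf(l, n + 1))) return false;
        return true;
    };

E.g.: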
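A minimal sketch of the same in-order check written as a standalone function, which avoids extending `String.prototype`. The name `fuzzyInOrder` and the sample strings are assumptions made for illustration.

```javascript
// Returns true when every character of `needle` occurs in `haystack`
// in the same order (not necessarily adjacent). Case-insensitive.
function fuzzyInOrder(haystack, needle) {
  var hay = haystack.toLowerCase();
  var chars = needle.toLowerCase();
  var position = -1;
  for (var i = 0; i < chars.length; i++) {
    // search only after the position of the previous matched character
    position = hay.indexOf(chars[i], position + 1);
    if (position === -1) return false;
  }
  return true;
}

var inOrder = fuzzyInOrder('a haystack with a needle', 'sack hand');    // true
var outOfOrder = fuzzyInOrder('a haystack with a needle', 'hay sucks'); // false
```

The second call fails because no "s" occurs after the space that follows "haystack".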


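A year later, List.js got a nice plugin that works very well.

I made a plugin of my own. It uses … and serves more as a proof of concept, since it hasn't been stress-tested at all.

Enjoy JavaScript fuzzy search / fuzzy matching.

I wasn't satisfied with list.js, so I created my own. It may not be fuzzy search exactly, but I don't know what else to call it. I simply wanted it to match a query regardless of the order of the words in the query.

Consider the following scenario:

  • A collection of items exists in memory
  • The order in which the query words appear doesn't matter (e.g. "hello world" vs. "world hello")
  • The code should be easy to read

Here is an example: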

    var articles = [{
      title: '2014 Javascript MVC Frameworks Comparison',
      author: 'Guybrush Treepwood'
    }, {
      title: 'Javascript in the year 2014',
      author: 'Herman Toothrot'
    },
    {
      title: 'Javascript in the year 2013',
      author: 'Rapp Scallion'
    }];
    
    var fuzzy = function(items, key) {
      // Returns a method that you can use to create your own reusable fuzzy search.
    
      return function(query) {
        var words  = query.toLowerCase().split(' ');
    
        return items.filter(function(item) {
          var normalizedTerm = item[key].toLowerCase();
    
          return words.every(function(word) {
            return (normalizedTerm.indexOf(word) > -1);
          });
        });
      };
    };
    
    
    var searchByTitle = fuzzy(articles, 'title');
    
    searchByTitle('javascript 2014') // returns the 1st and 2nd items
    
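Well, I hope this helps someone out there.

Another (simple) solution. It is case-insensitive and ignores letter order.

It runs a check for every letter of the search term: if the original string contains that letter it counts up, and if not it counts down. Based on the ratio of matches to the string length, it returns true or false.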

    String.prototype.fuzzy = function(term, ratio) {
        var string = this.toLowerCase();
        var compare = term.toLowerCase();
        var matches = 0;
        if (string.indexOf(compare) > -1) return true; // covers basic partial matches
        for (var i = 0; i < compare.length; i++) {
            string.indexOf(compare[i]) > -1 ? matches += 1 : matches -=1;
        }
        return (matches/this.length >= ratio || term == "")
    };
    

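The solutions provided here return true/false and give no information about which part matched and which didn't.

In some cases you may need to know that, e.g. to make the matched parts of the input bold in the search results.

I've created my own solution in TypeScript (I've published it here, in case you want to use it), with a demo here.

It works like this: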

    fuzzyString('liolor', 'lorem ipsum dolor sit');
    
    // returns
    {
      parts: [
        { content: 'l', type: 'input' },
        { content: 'orem ', type: 'fuzzy' },
        { content: 'i', type: 'input' },
        { content: 'psum d', type: 'fuzzy' },
        { content: 'olor', type: 'input' },
        { content: ' sit', type: 'suggestion' },
      ],
      score: 0.82, // (6 + 11 * 0.7 + 4 * 0.9) / 21 with the weights in calculateFuzzyMatchPartsScore
    }
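As a rough sketch of the bolding use case, the `parts` array can be turned into an HTML string by wrapping every part of type 'input' in `<b>` tags. The helpers `escapeHtml` and `highlightParts` are hypothetical and not part of the published solution; the sample `parts` array is copied from the result shown above.

```javascript
// Escape text before placing it in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: wrap the parts the user actually typed in <b> tags
// and leave the 'fuzzy' and 'suggestion' parts as plain text.
function highlightParts(parts) {
  return parts.map(function (part) {
    var safe = escapeHtml(part.content);
    return part.type === 'input' ? '<b>' + safe + '</b>' : safe;
  }).join('');
}

var parts = [
  { content: 'l', type: 'input' },
  { content: 'orem ', type: 'fuzzy' },
  { content: 'i', type: 'input' },
  { content: 'psum d', type: 'fuzzy' },
  { content: 'olor', type: 'input' },
  { content: ' sit', type: 'suggestion' }
];

var html = highlightParts(parts);
// html is '<b>l</b>orem <b>i</b>psum d<b>olor</b> sit'
```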
    
Below is the full implementation (TypeScript):


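Comments:

  • Check … - it might help. Also see whether your requirements allow you to move the fuzzy search part to the server side (using AJAX); if so, using Solr would be the simplest option. You could also search thousands of items in a short time that way.
  • Techfoobar: thanks, but yeti seems to be more Java than JavaScript. I don't know how to use it with my existing code. Solr also seems to be Java. I need client-side or PHP.
  • I can't see where fuzzy search applies. You haven't mentioned anything that makes me think you need fuzzy search. Or am I missing something? Fuzzy search works with fuzzy "categories" that have no strictly defined "boundaries". What I see in your case is a strict search that matches multiple attributes.
  • @Matjaz I wasn't quite sure what this is called. It was just my assumption :) Thanks for clarifying, hopefully I can now search in a more focused way.
  • It does search, but it isn't really "fuzzy search". For example, in their demo the "fuzzy" query "bruwo" doesn't find "Guybrush Threepwood"…
  • You can specify how fuzzy the search is. You can type bruw and "Guybrush Treepwood" is shown. The result is only filtered out once you type two characters from the second word.
  • Why do you call arr[i].toLowerCase().indexOf(test[r].toLowerCase().split("*") over and over again?!
  • This is nice (apart from the String prototype manipulation) and will suit some use cases, but a more sophisticated fuzzy search also needs to return the most relevant results. I guess that would be based on the number of consecutive characters.
  • This is awesome. I find myself using it in almost everything I build. Thanks for sharing @tborychowski!
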
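Usage examples for the ratio-based String.prototype.fuzzy(term, ratio):

    ("Test").fuzzy("st", 0.5) // returns true (partial match)
    ("Test").fuzzy("tes", 0.8) // returns true: "tes" is a partial match, so the ratio (0.75) is never checked
    ("Test").fuzzy("stet", 1) // returns true (all four letters occur, ratio 4/4)
    ("Test").fuzzy("zzzzzest", 0.75) // returns false because of too many alien characters ("z")
    ("Test").fuzzy("es", 1) // returns true because of the partial match (despite the ratio being only 0.5)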
    
    type MatchRoleType = 'input' | 'fuzzy' | 'suggestion';
    
    interface FuzzyMatchPart {
      content: string;
      type: MatchRoleType;
    }
    
    interface FuzzyMatchData {
      parts: FuzzyMatchPart[];
      score: number;
    }
    
    interface FuzzyMatchOptions {
      truncateTooLongInput?: boolean;
      isCaseSesitive?: boolean;
    }
    
    function calculateFuzzyMatchPartsScore(fuzzyMatchParts: FuzzyMatchPart[]) {
      const getRoleLength = (role: MatchRoleType) =>
        fuzzyMatchParts
          .filter((part) => part.type === role)
          .map((part) => part.content)
          .join('').length;
    
      const fullLength = fuzzyMatchParts.map((part) => part.content).join('')
        .length;
      const fuzzyLength = getRoleLength('fuzzy');
      const inputLength = getRoleLength('input');
      const suggestionLength = getRoleLength('suggestion');
    
      return (
        (inputLength + fuzzyLength * 0.7 + suggestionLength * 0.9) / fullLength
      );
    }
    
    function compareLetters(a: string, b: string, isCaseSensitive = false) {
      if (isCaseSensitive) {
        return a === b;
      }
      return a.toLowerCase() === b.toLowerCase();
    }
    
    function fuzzyString(
      input: string,
      stringToBeFound: string,
      { truncateTooLongInput, isCaseSesitive }: FuzzyMatchOptions = {},
    ): FuzzyMatchData | false {
      // validate the input first
    
      // if the input is longer than the string to be found and we don't truncate it, there is no match
      if (input.length > stringToBeFound.length && !truncateTooLongInput) {
        return false;
      }
    
      // if truncation is enabled, cut the input down to the length of the string to be found
      if (input.length > stringToBeFound.length && truncateTooLongInput) {
        input = input.slice(0, stringToBeFound.length);
      }
    
      // if the input equals the string to be found, no fuzzy matching is needed: return it as a full match
      if (input === stringToBeFound) {
        return {
          parts: [{ content: input, type: 'input' }],
          score: 1,
        };
      }
    
      const matchParts: FuzzyMatchPart[] = [];
    
      const remainingInputLetters = input.split('');
    
      // letter buffers: matching is done letter by letter, but runs of consecutive
      // matching (or non-matching) letters are collected here
      // so that each run is added to the match as a single part
      let ommitedLettersBuffer: string[] = [];
      let matchedLettersBuffer: string[] = [];
    
      // helper functions to clear the buffers and add them to match
      function addOmmitedLettersAsFuzzy() {
        if (ommitedLettersBuffer.length > 0) {
          matchParts.push({
            content: ommitedLettersBuffer.join(''),
            type: 'fuzzy',
          });
          ommitedLettersBuffer = [];
        }
      }
    
      function addMatchedLettersAsInput() {
        if (matchedLettersBuffer.length > 0) {
          matchParts.push({
            content: matchedLettersBuffer.join(''),
            type: 'input',
          });
          matchedLettersBuffer = [];
        }
      }
    
      for (let anotherStringToBeFoundLetter of stringToBeFound) {
        const inputLetterToMatch = remainingInputLetters[0];
    
        // no more input - finish fuzzy matching
        if (!inputLetterToMatch) {
          break;
        }
    
        const isMatching = compareLetters(
          anotherStringToBeFoundLetter,
          inputLetterToMatch,
          isCaseSesitive,
        );
    
        // if the input letter doesn't match, move on to the next letter of the string to be found
        if (!isMatching) {
          // add this letter to the buffer of omitted letters
          ommitedLettersBuffer.push(anotherStringToBeFoundLetter);
          // flush the matched letters buffer, since the run of matching letters has ended
          addMatchedLettersAsInput();
          // continue with the next letter of the string to be found
          continue;
        }
    
        // we have input letter matching!
    
        // remove it from remaining input letters
        remainingInputLetters.shift();
    
        // add it to matched letters buffer
        matchedLettersBuffer.push(anotherStringToBeFoundLetter);
        // flush the omitted letters buffer (if it holds anything) into the match as a fuzzy part
        addOmmitedLettersAsFuzzy();
    
        // if no input letters remain, flush the matched letters into the match as well
        if (!remainingInputLetters.length) {
          addMatchedLettersAsInput();
        }
      }
    
      // if input letters remain, not all of the input occurs in the string to be found: no match
      if (remainingInputLetters.length > 0) {
        return false;
      }
    
      // get the entire matched part (from the start to the last matched input letter)
      const matchedPart = matchParts.map((match) => match.content).join('');
    
      // get remaining part of string to be found
      const suggestionPart = stringToBeFound.replace(matchedPart, '');
    
      // if we have remaining part - add it as suggestion
      if (suggestionPart) {
        matchParts.push({ content: suggestionPart, type: 'suggestion' });
      }
      const score = calculateFuzzyMatchPartsScore(matchParts);
    
      return {
        score,
        parts: matchParts,
      };
    }