Java: Using the stem operator with App Engine full-text document index search

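I am evaluating App Engine document index full-text search and have run into a problem with the stem operator "~". I created an index with a few test documents, each of which has a title field. Some sample values for that field: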

"Houses Desks Tables"
"referer image vod event"
"events with cats and dogs and"
"names very interesting days"
I am using Java, and the code that builds and indexes the documents looks like this:

Document doc = Document.newBuilder().setId(key)
    .addField(Field.newBuilder().setName("title").setText(title))
    .addField(Field.newBuilder().setName("type").setText(type))            
    .addField(Field.newBuilder().setName("username").setText(username))
    .build();
DocumentSearchIndexService.getInstance().indexDocument(indexName, doc);
However, the results only ever match the exact singular or plural form that was indexed:

query cat, return nothing
query dog, return nothing
query name, return nothing
query house, return nothing

query cats, return "events with cats and dogs and"
query dogs, return "events with cats and dogs and"
query names, return "names very interesting days"
query houses, return "Houses Desks Tables"
So I am confused about how to get these entries returned, or whether my query is constructed incorrectly.

Note that stemming is not supported when you run on the Java Development Server for Java 8 in the standard environment.

If you deploy the application to App Engine, use the Utils.java class from the samples so that documents are indexed correctly.
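For reference, the field query used in this answer has the shape `field = ~term`. Below is a minimal sketch of a helper that builds such a string in plain Java. The class `StemQuery` and the method `forField` are hypothetical names, not part of the Search API, and restricting the term to a single word is an assumption of this sketch.

```java
public class StemQuery {
    /**
     * Builds a field-restricted stem query such as "product = ~cat".
     * Hypothetical helper: it only formats the string, it does not call the Search API.
     */
    public static String forField(String field, String term) {
        String f = field.trim();
        String t = term.trim();
        if (f.isEmpty() || t.isEmpty()) {
            throw new IllegalArgumentException("field and term must be non-empty");
        }
        // Assumption of this sketch: the stem operator is applied to one word only.
        if (t.chars().anyMatch(Character::isWhitespace)) {
            throw new IllegalArgumentException("term must be a single word");
        }
        // The tilde is written directly in front of the term, as in "product = ~cat".
        return f + " = ~" + t;
    }

    public static void main(String[] args) {
        System.out.println(forField("product", "cat"));
        System.out.println(forField("title", "house"));
    }
}
```

The resulting string can then be passed to `index.search(...)` in the same way as the hard-coded query strings in the servlet.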

I cloned the Google Cloud Platform Java docs samples, went to the appengine-java8/search folder, and modified the class as follows so that it runs a query with the stem operator ~:

...
  @Override
  public void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
    PrintWriter out = resp.getWriter();
    Document doc =
        Document.newBuilder()
            .setId("theOnlyPiano")
            .addField(Field.newBuilder().setName("product").setText("cats and dogs"))
            .addField(Field.newBuilder().setName("maker").setText("Yamaha"))
            .addField(Field.newBuilder().setName("price").setNumber(4000))
            .build();
    try {
      Utils.indexADocument(SEARCH_INDEX, doc);
    } catch (InterruptedException e) {
      // ignore
    }
    // [START search_document]
    final int maxRetry = 3;
    int attempts = 0;
    int delay = 2;
    while (true) {
      try {
        String searchText = "cat";
        String queryString = "product = ~"+searchText;
        Results<ScoredDocument> results = getIndex().search(queryString);

        // Iterate over the documents in the results
        for (ScoredDocument document : results) {
          // handle results
          out.print("product: " + document.getOnlyField("product").getText());
          //out.println(", price: " + document.getOnlyField("price").getNumber());
        }
      } catch (SearchException e) {
        if (StatusCode.TRANSIENT_ERROR.equals(e.getOperationResult().getCode())
            && ++attempts < maxRetry) {
          // retry
          try {
            Thread.sleep(delay * 1000);
          } catch (InterruptedException e1) {
            // ignore
          }
          delay *= 2; // easy exponential backoff
          continue;
        } else {
          throw e;
        }
      }
      break;
    }
    // [END search_document]
    // We don't test the search result below, but we're fine if it runs without errors.
    out.println(" Search performed");
    Index index = getIndex();
    // [START simple_search_1]
    index.search("rose water");
    // [END simple_search_1]
    // [START simple_search_2]
    index.search("1776-07-04");
    // [END simple_search_2]
    // [START simple_search_3]
    // search for documents whose product matches the stem "cat" and that cost less than $5000
    index.search("product = ~cat AND price < 5000");
    // [END simple_search_3]
  }
}
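
The retry loop in the servlet doubles the delay after each transient failure and gives up after three attempts. The same pattern can be sketched without the Search API as a small generic helper. This is only an illustration under stated assumptions: `RetryWithBackoff`, `call`, and the `Sleeper` hook are hypothetical names introduced here so the example can run without actually sleeping.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

public class RetryWithBackoff {
    /** Hypothetical hook so callers (and tests) can replace Thread.sleep. */
    public interface Sleeper {
        void sleep(long millis) throws InterruptedException;
    }

    /**
     * Runs the task, retrying on errors that isTransient accepts.
     * The delay doubles after each failed attempt, as in the servlet loop.
     */
    public static <T> T call(Callable<T> task, Predicate<Exception> isTransient,
                             int maxAttempts, long initialDelayMillis,
                             Sleeper sleeper) throws Exception {
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                // Give up on non-transient errors or when attempts are exhausted.
                if (!isTransient.test(e) || attempt >= maxAttempts) {
                    throw e;
                }
                try {
                    sleeper.sleep(delay);
                } catch (InterruptedException ie) {
                    // Restore the interrupt flag instead of silently ignoring it.
                    Thread.currentThread().interrupt();
                    throw ie;
                }
                delay *= 2; // simple exponential backoff
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = call(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient");
            }
            return "ok";
        }, e -> e instanceof IllegalStateException, 3, 2000, millis -> { });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In the servlet, the task would be the `getIndex().search(queryString)` call and the predicate would be the `StatusCode.TRANSIENT_ERROR` check on the `SearchException`.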


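Can you share the part of your code where the title field is defined?

Thanks @DanielOcando, I have added the code that builds the document and adds it to the index.

Thanks a lot @DanielOcando, apparently I missed the part about stemming not working in the local development environment, which is why I was struggling to figure out why things were not behaving as expected. After deploying the code to App Engine, stemming now works correctly!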

The test class, with the assertions on the old response commented out:

...
  @After
  public void tearDown() {
    helper.tearDown();
  }

  @Test
  public void doGet_successfulyInvoked() throws Exception {
  //  servletUnderTest.doGet(mockRequest, mockResponse);
  //  String content = responseWriter.toString();
  //  assertWithMessage("SearchServlet response").that(content).contains("maker: Yamaha");
  //  assertWithMessage("SearchServlet response").that(content).contains("price: 4000.0");
  }
}