Is there a way to store a large JSON in Solr without tokenizing it, and fetch it in a complex Solr function?


I am trying to use this field:

<fieldType name="json" class="solr.TextField"
           positionIncrementGap="100">
</fieldType>
<field name='json_field' type="json" indexed="true" stored="true"
      omitNorms="false" required="false" multiValued="false"/>

Then I try to fetch the whole JSON:

json.strVal(doc)

However, I only get back some of the tokens.

If I try to use

<analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
</analyzer>

I get an error:

"Solr does not accept tokens larger than 32766"

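The reason is basically that if you use KeywordTokenizer on a large text, it tries to create one single huge token, and the length of a token is limited, as the following constant shows: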

/**
   * Absolute hard maximum length for a term, in bytes once
   * encoded as UTF8.  If a term arrives from the analyzer
   * longer than this length, an
   * <code>IllegalArgumentException</code>  is thrown
   * and a message is printed to infoStream, if set (see {@link
   * IndexWriterConfig#setInfoStream(InfoStream)}).
   */
  public final static int MAX_TERM_LENGTH = DocumentsWriterPerThread.MAX_TERM_LENGTH_UTF8;
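
Since the limit in the comment above is counted in bytes after UTF-8 encoding rather than in characters, it can help to check a document's JSON before sending it to a field that emits a single token. The following is only a minimal sketch: `TermLengthCheck` and its methods are hypothetical names and not part of Solr or Lucene, and the value 32766 is simply taken from the error message in the question.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Solr or Lucene.
final class TermLengthCheck {
    // Assumption: 32766 is the limit quoted in the error message in the question.
    static final int MAX_TERM_BYTES = 32766;

    // Length of the string once encoded as UTF-8, which is what the limit counts.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the whole string could be emitted as one term without exceeding the limit.
    static boolean fitsInSingleTerm(String text) {
        return utf8Length(text) <= MAX_TERM_BYTES;
    }

    public static void main(String[] args) {
        // Build a JSON string with a 40000-character payload as a stand-in for a large document.
        StringBuilder json = new StringBuilder("{\"payload\":\"");
        for (int i = 0; i < 40000; i++) {
            json.append('x');
        }
        json.append("\"}");

        String value = json.toString();
        System.out.println("UTF-8 bytes: " + utf8Length(value));
        System.out.println("fits in one term: " + fitsInSingleTerm(value));
    }
}
```

Non-ASCII characters take more than one byte each, so a JSON value can exceed the limit even when its character count is below 32766.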

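There is no way to get the full JSON back other than storing it unanalyzed, for example by setting indexed="false" on the field. In that case, however, you will not be able to search this JSON; it will only be stored as-is. Is that really what you need?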

Please give us an example of what you have in json_field (the JSON).
