Automatic language detection during indexing in Solr 4.5.1

I need your help. I want to detect Korean and English during indexing in Solr.

My Solr directory structure is:

/opt/tmocat7/webapps/solr (solr webapp)
/usr/share/solr/collection1 (solr core)
/usr/share/solr/lib/langid (lib for langid)
First, I copied some libraries (jsonic-1.2.7.jar, langdetect-1.1-20120112.jar, solr-langid-4.5.1.jar) into a specific directory (/usr/share/solr/lib/langid). My Solr is located at

My solrconfig.xml is:

<lib dir="../lib/langid/" regex=".*\.jar" />

<requestHandler name="/update" class="solr.UpdateRequestHandler">    
    <lst name="defaults">   
    <str name="update.chain">dedupe</str> 
    <str name="update.chain">uuid</str>
    <str name="update.chain">langid</str>
    </lst>
</requestHandler>

<updateRequestProcessorChain name="langid">
    <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
        <bool name="langid">true</bool>
        <str name="langid.fl">title,content,comment</str>
        <str name="langid.langField">lang</str>
        <str name="langid.langsField">langs</str>
        <str name="langid.lcmap">ko:ko kor:ko en_GB:en en_US:en</str>
        <str name="langid.whitelist">ko,en</str>
        <bool name="langid.map">true</bool>
        <str name="langid.map.fl">title,content,comment</str>
        <bool name="langid.map.keepOrig">true</bool>
        <bool name="langid.map.individual">true</bool> 
        <str name="langid.fallback">ko</str>         
        <str name="langid.map.lcmap">ko:ko kor:ko en_GB:en en_US:en</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
I can't find any other warnings or errors. I need your advice.
Thank you all.
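
One thing worth checking with the configuration above: because langid.map is true, the processor should write the detected text into per-language fields named after the source field plus the language code (title_ko, content_en, and so on), and it also writes to lang and langs. The schema.xml fragment below is only a sketch of fields that would accept that output. The field type names (string, text_en, text_cjk) are assumptions borrowed from the stock Solr 4.x example schema and may not exist in this core.

```xml
<!-- Sketch only: field types are assumed from the Solr 4.x example schema -->

<!-- Targets of langid.langField and langid.langsField -->
<field name="lang"  type="string" indexed="true" stored="true"/>
<field name="langs" type="string" indexed="true" stored="true" multiValued="true"/>

<!-- Targets of langid.map: title_en, content_en, comment_en, ... -->
<dynamicField name="*_en" type="text_en"  indexed="true" stored="true"/>

<!-- Targets of langid.map: title_ko, content_ko, comment_ko, ... -->
<dynamicField name="*_ko" type="text_cjk" indexed="true" stored="true"/>
```

If these fields are missing, a mapped document would normally be rejected with an unknown-field error, so the absence of any error suggests the langid processor is not running at all.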

I think you are using /update/extract rather than /update.

In Solr 5.3.1, it works fine when I use it with /update/extract.

Here is the full configuration:

<requestHandler name="/update/extract" 
              startup="lazy"
              class="solr.extraction.ExtractingRequestHandler" >
<lst name="defaults">
  <str name="lowernames">true</str>
  <str name="uprefix">ignored_</str>

  <!-- capture link hrefs but ignore div attributes -->
  <str name="captureAttr">true</str>
  <str name="fmap.a">links</str>
  <str name="fmap.div">ignored_</str>

  <str name="update.chain">langid</str>
</lst>
</requestHandler>

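Thanks for the question and the great answer; they helped me configure my system correctly. I don't know how the JAR file solr-langdetect.*.jar ended up in my lib directory, but every time I started Solr it showed the following error: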

org.apache.solr.common.SolrException: com.cybozu.labs.langdetect.DetectorFactory.loadProfile(Ljava/util/List;)V

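After removing that JAR file, everything worked. However, the other three JAR files mentioned in the question (jsonic-*.*.jar, langdetect-*.*.jar, solr-langid-*.*.jar) are required.

But with the config above, you may be getting an exception, because the lib directive above has regex misspelled as regx. So you are probably not getting your libraries.

Thank you, Alexandre. The misspelling was my mistake. After fixing it, I still have the problem. I have edited my question. I get no errors, no warnings, and no LangDetectLanguageIdentifierUpdateProcessorFactory logs. If the detector were working properly, would I see LangDetectLanguageIdentifierUpdateProcessorFactory logs?

Could you please explain why this code answers the question?
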
70634079 [http-bio-7070-exec-38] TRACE org.apache.solr.handler.UpdateRequestHandler  – body
70634079 [http-bio-7070-exec-38] DEBUG org.apache.solr.update.processor.LogUpdateProcessor  – PRE_UPDATE add{,id=2f2323f4f7966e0d} {{params({params(),defaults(update.chain=dedupe&update.chain=uuid&update.chain=langid)}),defaults(wt=xml)}}
70634125 [http-bio-7070-exec-38] TRACE org.apache.solr.update.UpdateLog  – TLOG: added id 2f2323f4f7966e0d to tlog{file=/usr/share/solr/collection1/data/tlog/tlog.0000000000000000129 refcount=1} LogPtr(29407) map=614254179
70634125 [http-bio-7070-exec-38] DEBUG org.apache.solr.update.processor.LogUpdateProcessor  – PRE_UPDATE FINISH {{params({params(),defaults(update.chain=dedupe&update.chain=uuid&update.chain=langid)}),defaults(wt=xml)}}
70634126 [http-bio-7070-exec-38] INFO  org.apache.solr.update.processor.LogUpdateProcessor  – [collection1] webapp=/solr path=/update params={} {add=[2f2323f4f7966e0d (1473490520171872256)]} 0 68
70634146 [http-bio-7070-exec-33] TRACE org.apache.solr.handler.UpdateRequestHandler  – body
70634146 [http-bio-7070-exec-33] DEBUG org.apache.solr.update.processor.LogUpdateProcessor  – PRE_UPDATE add{,id=329ee20831e1a0c7} {{params({params(),defaults(update.chain=dedupe&update.chain=uuid&update.chain=langid)}),defaults(wt=xml)}}
70634148 [http-bio-7070-exec-33] TRACE org.apache.solr.update.UpdateLog  – TLOG: added id 329ee20831e1a0c7 to tlog{file=/usr/share/solr/collection1/data/tlog/tlog.0000000000000000129 refcount=1} LogPtr(46005) map=614254179
70634148 [http-bio-7070-exec-33] DEBUG org.apache.solr.update.processor.LogUpdateProcessor  – PRE_UPDATE FINISH {{params({params(),defaults(update.chain=dedupe&update.chain=uuid&update.chain=langid)}),defaults(wt=xml)}}
70634148 [http-bio-7070-exec-33] INFO  org.apache.solr.update.processor.LogUpdateProcessor  – [collection1] webapp=/solr path=/update params={} {add=[329ee20831e1a0c7 (1473490520241078272)]} 0 2
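
The log lines above show all three update.chain values arriving as handler defaults (update.chain=dedupe&update.chain=uuid&update.chain=langid). As far as I know, Solr resolves a single chain name per request, so only one of these would take effect, most likely the first. That is an assumption on my part, not a confirmed diagnosis, but it would explain why no language-detection output appears. A minimal sketch of one way to test it is to name a single chain and put every processor into it. The placeholder comment stands in for the processors of the dedupe and uuid chains, which are not shown in the question.

```xml
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <!-- Name exactly one chain here -->
    <str name="update.chain">langid</str>
  </lst>
</requestHandler>

<updateRequestProcessorChain name="langid">
  <!-- Assumption: copy the processors from the existing "dedupe" and "uuid"
       chains here, ahead of language detection -->

  <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <str name="langid.fl">title,content,comment</str>
    <str name="langid.langField">lang</str>
    <str name="langid.whitelist">ko,en</str>
    <str name="langid.fallback">ko</str>
  </processor>

  <!-- Logging and the actual index write stay last -->
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

If the lang field starts being populated after this change, the stacked update.chain defaults were the cause. The remaining langid.map.* options from the question can then be added back one at a time.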