Lucene: Indexing PDF documents in Solr, no unique key

I want to index PDF (and other rich) documents. I am using DataImportHandler.

Here is what my schema.xml looks like:

.........
.........
<field name="title" type="text" indexed="true" stored="true" multiValued="false"/>
<field name="description" type="text" indexed="true" stored="true" multiValued="false"/>
<field name="date_published" type="string" indexed="false" stored="true" multiValued="false"/>
<field name="link" type="string" indexed="true" stored="true" multiValued="false" required="false"/>
<dynamicField name="attr_*" type="textgen" indexed="true" stored="true" multiValued="false"/>
........
........
<uniqueKey>link</uniqueKey>

Do I need to create a field named link for the PDF documents?

This question has been asked before, but the solution offered there uses ExtractingRequestHandler, and I want to do this through DataImportHandler.

Try this:

<entity name="fileItems"  rootEntity="false" dataSource="dbSource" query="select path from file_paths">
  <field column="path" name="link"/>
  <entity name="tika-test" processor="TikaEntityProcessor" url="${fileItems.path}" dataSource="fileSource">
    <field column="title" name="title" meta="true"/>
    <field column="Creation-Date" name="date_published" meta="true"/>
  </entity>
</entity>
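
The entity above refers to two data sources, dbSource and fileSource, that have to be declared elsewhere in the DataImportHandler configuration. As a rough sketch of how the whole data-config.xml could fit together: the JDBC driver, URL, user, and password below are placeholders I made up for illustration, not values from the original answer, so substitute your own. The outer entity maps each row's path column to link, which is what gives every PDF document a value for the uniqueKey defined in schema.xml.

```xml
<dataConfig>
  <!-- Assumption: placeholder JDBC settings; replace with your own database details -->
  <dataSource name="dbSource" type="JdbcDataSource"
              driver="org.postgresql.Driver"
              url="jdbc:postgresql://localhost/mydb"
              user="solr" password="secret"/>
  <!-- BinFileDataSource hands each file to Tika as a binary stream -->
  <dataSource name="fileSource" type="BinFileDataSource"/>
  <document>
    <!-- Outer entity: one row per file path; rootEntity="false" means
         a Solr document is created per row of the nested entity instead -->
    <entity name="fileItems" rootEntity="false" dataSource="dbSource"
            query="select path from file_paths">
      <!-- The file path becomes the unique key field "link" -->
      <field column="path" name="link"/>
      <!-- Inner entity: Tika parses the file found at that path -->
      <entity name="tika-test" processor="TikaEntityProcessor"
              url="${fileItems.path}" dataSource="fileSource">
        <field column="title" name="title" meta="true"/>
        <field column="Creation-Date" name="date_published" meta="true"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

With this layout the link field does not need to exist inside the PDF itself; it is filled from the database row that points at the file.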
