Solr DIH has trouble fetching files from an FTP server in TikaEntityProcessor. How do I pass credentials to URLDataSource?

Tags: solr, apache-tika, dih

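When I try to fetch files from an FTP server with TikaEntityProcessor to extract metadata, I run into problems.

I need a way to pass credentials to the URLDataSource.

Can anyone tell me how to do this?

Sample values:

url

FTP user: alex

FTP password: pass

This is my data-config.xml: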

<dataConfig>  
    <dataSource type="BinURLDataSource" name="binSource" 
        baseUrl="ftp://localhost:21/" onError="skip" />     
     <dataSource type="JdbcDataSource"
                 driver="org.postgresql.Driver"
                 url="jdbc:postgresql://localhost:5432/files"
                 user="postgres"
                 password="admin" 
                 readOnly="true" 
                 autoCommit="false"
                 transactionIsolation="TRANSACTION_READ_COMMITTED"
                 holdability="CLOSE_CURSORS_AT_COMMIT"/>
    <document>
        <entity name="item" query="select * from filesfromftp"
                deltaQuery="select url from filesfromftp"
                rootEntity="false"
                transformer="RegexTransformer">            
                <field column="url" name="id" />            
                <entity name="tika-test" 
                        processor="TikaEntityProcessor" 
                        url="${item.url}" 
                        format="none"
                        dataSource="binSource"                          
                        onError="skip">                     
                  <field column="Author" name="author" meta="true"/>
                  <field column="title" name="title" meta="true"/>
                  <field column="pdf:docinfo:title" name="title" meta="true"/>
                  <field column="xmpTPg:NPages" name="numPages" meta="true"/>
                  <field column="Creation-Date" name="createdDate" meta="true"/>
                </entity>
        </entity>
    </document>
</dataConfig>
How can I establish a connection to the FTP server from within Solr DIH?


Is there a way to pass credentials to the URLDataSource?

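There is a patch for this purpose. It is quite old, but you could port it to a newer version. See the most recent comment on it, which shows how to create a custom URLDataSource with auth.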
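
Before porting the patch, it may be worth trying the credentials directly in the URL. The stack trace shows BinURLDataSource opening the file through the JDK's own FTP handler (sun.net.www.protocol.ftp.FtpURLConnection), and that handler is generally understood to read the user name and password from the user-info part of an ftp:// URL. The sketch below only builds such a URL. The class name FtpUrls, the method withCredentials and the file name report.pdf are made up for illustration, and whether your Solr version passes the URL through unchanged is an assumption you would need to verify.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Solr or DIH.
final class FtpUrls {
    private FtpUrls() {}

    /**
     * Builds ftp://user:password@host:port/path.
     * The multi-argument URI constructor percent-encodes characters that are
     * not legal in the user-info or path component (for example '@' or a space).
     * The path must start with '/'.
     */
    static String withCredentials(String user, String password,
                                  String host, int port, String path)
            throws URISyntaxException {
        String userInfo = user + ":" + password;
        return new URI("ftp", userInfo, host, port, path, null, null).toASCIIString();
    }

    public static void main(String[] args) throws URISyntaxException {
        // Sample values from the question; report.pdf is a placeholder file name.
        System.out.println(withCredentials("alex", "pass", "localhost", 21, "/report.pdf"));
    }
}
```

If this works in your setup, the same form could go into the baseUrl attribute of the BinURLDataSource or into the url column of the filesfromftp table. Keep in mind that the password then sits in plain text in the config or the database, which is one reason a custom data source with auth may still be the better long-term choice.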

Exception in entity : tika-test:org.apache.solr.handler.dataimport.DataImportHandlerException: Exception in invoking url ftp://localhost/jnioche-bristoljavameetup20150310-150311041443-conversion-gate01.pdf Processing Document # 1
    at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
    at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:89)
    at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:38)
    at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:128)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:244)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:516)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:475)
    at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:458)
    at java.lang.Thread.run(Thread.java:745)
Caused by: sun.net.ftp.FtpLoginException: Invalid username/password
    at sun.net.www.protocol.ftp.FtpURLConnection.connect(FtpURLConnection.java:308)
    at sun.net.www.protocol.ftp.FtpURLConnection.getInputStream(FtpURLConnection.java:393)
    at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:86)
    ... 12 more