ColdFusion PDF file search with cfsearch and Solr is very slow


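I have a working Adobe ColdFusion application that indexes about 2k PDF files through Solr search and returns the expected results, but each search query against the collection typically takes 25-30 seconds.

Here is how I index the 2k PDF files into Solr: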

<!--- query database files --->
<cfset getfiles = application.file.getfiles()>

<!--- create solr query set --->
<cfset filesQuery = QueryNew("
    fileUID
    , filepath
    , title
    , description
    , fileext
    , added
")>

<!--- copy each database row into filesQuery, adding the expanded file path used as the index key --->
<cfloop query="getfiles">
<cfset ext = trim(getfiles.fileext)>
<cfset path = expandpath('/docs/#getfiles.fileUID#.#ext#')>

<cfscript>
    newRow = QueryAddRow(filesQuery);
    QuerySetCell(filesQuery, "fileUID", getfiles.fileUID);
    QuerySetCell(filesQuery, "filepath", path);
    QuerySetCell(filesQuery, "title", getfiles.filename);
    QuerySetCell(filesQuery, "description", getfiles.description);
    // fileext was never populated, although cfindex reads it as custom1
    QuerySetCell(filesQuery, "fileext", ext);
    QuerySetCell(filesQuery, "added", getfiles.added);
</cfscript>

</cfloop>

<!--- index the bunch --->
<cfindex  
    query = "filesQuery" 
    collection = "resumes" 
    action = "update" 
    type = "file" 
    key = "filepath"     
    title = "title" 
    body = "title, description"
    custom1 = "fileext"
    custom2 = "added"
    category= "file"
    status = "filestatus">


Here is how the files are searched, and where the 25-30 second Solr search happens:

<!--- imagine form with (form.search) for terms --->

<cfsearch name = "results" 
    collection = "resumes" 
    criteria = "#form.search#"
    contextPassages = "1"
    contextBytes = "300"
    maxrows = "100"
    contextHighlightBegin = "<strong>"
    contextHighlightEnd = " </strong>">

<!--- show (results) query --->


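You could try using … . It uses the Solr API. By bypassing the … update, you might get a performance boost!

The problem turned out to be a combination of cfsearch and cfdump bottlenecks, which made each search take roughly 30 seconds. I had been using those two tags together to check that the expected results appeared. Once I queried the collection with cfsearch and output the returned results with cfloop instead, all the performance problems went away. The collection now handles nearly 3k records in about 1-2 seconds.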
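
The fix described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the original poster's actual code: the cfsearch attributes are copied from the question, while the timing variables (`searchStart`, `searchMs`), the list markup, and the use of `encodeForHTML` (available in ColdFusion 10 and later) are additions for illustration only.

```cfml
<!--- Assumption: form.search holds the user's search terms, as in the question --->

<!--- Time the search on its own, so rendering cost is not mistaken for Solr cost --->
<cfset searchStart = getTickCount()>

<cfsearch name = "results"
    collection = "resumes"
    criteria = "#form.search#"
    contextPassages = "1"
    contextBytes = "300"
    maxrows = "100"
    contextHighlightBegin = "<strong>"
    contextHighlightEnd = "</strong>">

<cfset searchMs = getTickCount() - searchStart>

<!--- Render the result query with cfloop instead of dumping it with cfdump --->
<cfoutput>
<p>#results.recordCount# results, search took #searchMs# ms</p>
<ul>
<cfloop query="results">
    <li>
        <!--- title and score are standard cfsearch result columns --->
        #encodeForHTML(results.title)# (score: #results.score#)
        <!--- context already contains the <strong> highlight markup, so it is output as-is --->
        <p>#results.context#</p>
    </li>
</cfloop>
</ul>
</cfoutput>
```

Measuring the cfsearch call separately makes it easier to tell whether the time goes to Solr or to rendering the result set. If `searchMs` is small while the page is still slow, the output step is the more likely culprit.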