Lucene: Iterating over all entries (Lucene, Loops) - Fatal编程技术网



lucene loops


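I have a Lucene index that I want to iterate over (for a one-off evaluation at the current stage of development). I have 4 documents, each with a few hundred thousand to a million entries. I want to iterate over them, count the number of words in each entry (~2-10), and compute the frequency distribution.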

What I am currently doing is:

    for (int i = 0; i < reader.maxDoc(); i++) {
        if (reader.isDeleted(i))
            continue;

        Document doc = reader.document(i);
        Field text = doc.getField("myDocName#1");

        String content = text.stringValue();

        int wordLen = countNumberOfWords(content);
        // store
    }
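The countNumberOfWords helper and the "store" step are not shown in the question. A minimal sketch of what they might look like, assuming words are separated by whitespace and that the distribution is a simple map from word count to number of entries (the class name WordLengthHistogram and the distribution method are hypothetical, and no Lucene classes are involved):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class WordLengthHistogram {

    /** Counts whitespace-separated words; null or blank content counts as 0. */
    static int countNumberOfWords(String content) {
        if (content == null) {
            return 0;
        }
        String trimmed = content.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        return trimmed.split("\\s+").length;
    }

    /** Maps each word count to the number of entries that have that count. */
    static Map<Integer, Integer> distribution(Iterable<String> entries) {
        Map<Integer, Integer> histogram = new TreeMap<>();
        for (String entry : entries) {
            // merge() adds 1 to the existing bucket or creates it with value 1
            histogram.merge(countNumberOfWords(entry), 1, Integer::sum);
        }
        return histogram;
    }

    public static void main(String[] args) {
        // Made-up sample entries standing in for the stored field values
        List<String> entries = Arrays.asList(
                "apache lucene", "full text search", "term vector", "index");
        System.out.println(distribution(entries)); // prints {1=1, 2=2, 3=1}
    }
}
```

In the loop above, each stored field value would be passed to countNumberOfWords and the result merged into the histogram in place of the "store" comment.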



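So far it does iterate over something. Debugging confirms that it at least runs over the terms stored in the documents, but for some reason it only processes a small fraction of the stored terms. What am I doing wrong? I just want to iterate over all documents and everything stored in them.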
First, make sure term vectors are enabled when indexing:

    doc.add(new Field(TITLE, page.getTitle(), Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS));

Then you can use IndexReader.getTermFreqVector to count the terms:

    // 1000 caps the number of hits returned by the search
    TopDocs res = indexSearcher.search(YOUR_QUERY, null, 1000);

    // iterate over the documents in res;
    // freqs is a caller-supplied counter that is not shown here
    for (ScoreDoc scoreDoc : res.scoreDocs) {
        reader.getTermFreqVector(scoreDoc.doc, YOUR_FIELD, new TermVectorMapper() {
            @Override
            public void map(String termval, int freq, TermVectorOffsetInfo[] offsets, int[] positions) {
                // increment frequency count of termval by freq
                freqs.increment(termval, freq);
            }

            @Override
            public void setExpectations(String field, int numTerms, boolean storeOffsets, boolean storePositions) {
                // nothing to prepare
            }
        });
    }
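
The snippet above calls freqs.increment(termval, freq) without showing what freqs is. One possible shape for it, as a minimal sketch assuming a plain map-backed counter (the class name TermFrequencyCounter and its methods are hypothetical and not part of Lucene):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class TermFrequencyCounter {

    private final Map<String, Integer> counts = new HashMap<>();

    /** Adds freq to the running total for term. */
    void increment(String term, int freq) {
        counts.merge(term, freq, Integer::sum);
    }

    /** Returns the accumulated count for term, or 0 if it was never seen. */
    int get(String term) {
        return counts.getOrDefault(term, 0);
    }

    /** Read-only view of all accumulated counts. */
    Map<String, Integer> asMap() {
        return Collections.unmodifiableMap(counts);
    }

    public static void main(String[] args) {
        TermFrequencyCounter freqs = new TermFrequencyCounter();
        // Simulates what the mapper would report for two documents
        freqs.increment("lucene", 3);
        freqs.increment("index", 1);
        freqs.increment("lucene", 2);
        System.out.println(freqs.get("lucene")); // prints 5
    }
}
```

Because the counter accumulates across calls, a single instance can be shared by the mapper for every document in the result set.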




                                                                Copyright © 2024. All Rights Reserved by  - Fatal编程技术网