"Exception in thread "main" java.lang.NullPointerException" - HBase indexing data
I am parsing a PDF and storing the title, author, etc. in variables, and I need to index those values in HBase. So I take the data for the HBase table from the variables I created in my project. When I use the variables for indexing in the HBase table, the program shows me a NullPointerException error:
Exception in thread "main" java.lang.NullPointerException
at java.lang.String.<init>(String.java:154)
at testSolr.Testt.Parsing(Testt.java:50)
at testSolr.Testt.main(Testt.java:94)
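Here is the relevant part of my code (I wrote only the important parts):
Should I set the variables to null at the start of parsing? I don't think that makes any sense. What should I do to fix the error?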
Update:
Full code:
public static String location = "/home/alican/Downloads/solr-4.10.2/example/solr/senior/PDFs/solr word.pdf";
Output for the title, author, number of pages and content:
Title: solr-word
Number of Page(s): 1
Author(s): Grant Ingersoll
Content of the PDF :
This is a test of PDF and Word extraction in Solr, it is only a test. Do not panic.
The HBase part treats the nPage variable as null. It is not. The value of nPage is 1.
p.add(Bytes.toBytes("book"),
Bytes.toBytes("pageNumber"),Bytes.toBytes(nPage));
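One way to keep a single missing value from aborting the whole write is to add a column only when its value is present. The following is a minimal sketch under stated assumptions: a plain Map stands in for the HBase Put, String.getBytes stands in for Bytes.toBytes, and the class and method names (NullSafeColumns, addIfPresent) are hypothetical and not part of the HBase client.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: a plain Map stands in for the columns of an HBase Put.
final class NullSafeColumns {

    // Adds the column only when a value is present; returns true if it was added.
    static boolean addIfPresent(Map<String, byte[]> columns, String qualifier, String value) {
        if (value == null) {
            return false; // skip the column instead of converting a null value to bytes
        }
        columns.put(qualifier, value.getBytes(StandardCharsets.UTF_8));
        return true;
    }

    public static void main(String[] args) {
        Map<String, byte[]> columns = new LinkedHashMap<>();
        addIfPresent(columns, "title", "solr-word");
        addIfPresent(columns, "pageNumber", null); // skipped, nothing is thrown
        System.out.println(columns.keySet());
    }
}
```

Skipping the column hides the missing value rather than explaining it, so logging the skipped qualifier is probably worthwhile while debugging.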
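Solution:
For some reason, metadata.get("xmpTPg:NPage") returned null when it was assigned to a variable. I realized this was because of the parser. I changed the parser and there are no longer any null variables.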
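- Apache PDFBox (my new parser) is better than Apache Tika (my old parser).

Your metadata.get("title") returns null, so a NullPointerException is thrown.

What is metadata? Please update your question so that we can suggest a proper solution.

Nothing is null, but the HBase part assumes nPage is null even though it is not.

String title = new String(metadata.get("title")); what value do you get for title? Looking at the stack trace, it is very clear where the problem is: testSolr.Testt.Parsing(Testt.java:50)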
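The stack trace points at java.lang.String.<init>, which is consistent with a null being passed to new String(...). A minimal sketch of that behaviour, using only the JDK (the class name NullStringDemo and the helper copyThrows are hypothetical names for illustration):

```java
// Sketch only: shows how the String copy constructor reacts to a null argument.
final class NullStringDemo {

    // Returns true when copying the value with new String(...) throws NullPointerException.
    static boolean copyThrows(String value) {
        try {
            new String(value);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(copyThrows(null));        // null argument
        System.out.println(copyThrows("solr-word")); // ordinary value
    }
}
```

Since metadata.get(...) is assigned to String variables in the posted code, the extra new String(...) wrapper looks unnecessary; assigning the result directly and checking it for null would avoid the exception at this line.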
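The value of title is "solr-word". I did the same for nPage, and when I try to print nPage it returns null. But if I write System.out.println("Number of Page(s): " + metadata.get("xmpTPg:NPages")); I can see the output is 1, which is correct (the PDF has only 1 page). Number of Page(s): 1

I changed the parser and now there are no null variables. I was using Apache Tika, now I am using Apache PDFBox. Thanks for your help.

You're welcome. Please mark it as the correct answer; it can help others in the future.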
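One detail in the posted code may be worth checking before blaming the parser: the assignment reads the key "xmpTPg:NPage", while the println that shows 1 reads "xmpTPg:NPages". If only the plural key is set, a lookup with the singular key would return null. A minimal sketch of that situation, assuming a plain Map as a stand-in for Tika's Metadata (the class MetadataKeyCheck and the helper getOrDefault are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a plain Map stands in for Tika's Metadata object.
final class MetadataKeyCheck {

    // Returns the stored value, or the given fallback when the key is absent.
    static String getOrDefault(Map<String, String> metadata, String key, String fallback) {
        String value = metadata.get(key);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("xmpTPg:NPages", "1"); // the key used by the println in the question

        System.out.println(metadata.get("xmpTPg:NPage"));  // singular key: not present
        System.out.println(metadata.get("xmpTPg:NPages")); // plural key: present
        System.out.println(getOrDefault(metadata, "xmpTPg:NPage", "unknown"));
    }
}
```

If this is what happened, using the same key in both places would likely have been enough with the original parser as well.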
Random rand = new Random();
int min = 1, max = 5000;
int randomNumber = rand.nextInt((max - min) + 1) + min;

// parsing part
String title = new String(metadata.get("title"));
String nPage = new String(metadata.get("xmpTPg:NPage"));
String author = new String(metadata.get("Author"));
String content = new String(handler.toString());

// hbase part (the part where I am getting the error)
Put p = new Put(Bytes.toBytes(randomNumber));
p.add(Bytes.toBytes("book"), Bytes.toBytes("title"), Bytes.toBytes(title));
p.add(Bytes.toBytes("book"), Bytes.toBytes("author"), Bytes.toBytes(author));
p.add(Bytes.toBytes("book"), Bytes.toBytes("pageNumber"), Bytes.toBytes(nPage));
p.add(Bytes.toBytes("book"), Bytes.toBytes("content"), Bytes.toBytes(content));
hTable.put(p);
public static void Parsing(String location) throws IOException, SAXException, TikaException, SolrServerException {
    // random number generator for ids
    Random rand = new Random();
    int min = 1, max = 5000;
    int randomNumber = rand.nextInt((max - min) + 1) + min;
    // random number generator for ids ends

    // pdf Parser
    BodyContentHandler handler = new BodyContentHandler(-1);
    FileInputStream inputstream = new FileInputStream(location);
    Metadata metadata = new Metadata();
    ParseContext pcontext = new ParseContext();
    PDFParser pdfparser = new PDFParser();
    pdfparser.parse(inputstream, handler, metadata, pcontext);
    String title = new String(metadata.get("title"));
    String nPage = metadata.get("xmpTPg:NPage");
    String author = new String(metadata.get("Author"));
    String content = new String(handler.toString());
    System.out.println("Title: " + metadata.get("title"));
    System.out.println("Number of Page(s): " + metadata.get("xmpTPg:NPages"));
    System.out.println("Author(s): " + metadata.get("Author"));
    System.out.println("Content of the PDF :" + handler.toString());
    // pdf Parser ends

    // solr Indexing
    SolrClient server = new HttpSolrClient(url);
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", randomNumber);
    doc.addField("author", author);
    doc.addField("title", title);
    doc.addField("pageNumber", nPage);
    doc.addField("content", content);
    server.add(doc);
    System.out.println("solr commiiitt......");
    server.commit();
    // solr Indexing ends

    // hbase Indexing
    Configuration config = HBaseConfiguration.create();
    HTable hTable = new HTable(config, "books");
    Put p = new Put(Bytes.toBytes(randomNumber));
    p.add(Bytes.toBytes("book"), Bytes.toBytes("title"), Bytes.toBytes(title));
    p.add(Bytes.toBytes("book"), Bytes.toBytes("author"), Bytes.toBytes(author));
    p.add(Bytes.toBytes("book"), Bytes.toBytes("pageNumber"), Bytes.toBytes(nPage));
    p.add(Bytes.toBytes("book"), Bytes.toBytes("content"), Bytes.toBytes(content));
    hTable.put(p);
    System.out.println("hbase commiiitttt..");
    hTable.close();
    // hbase Indexing ends
}