Java Apache POI: writing a large Excel file to disk
I need to open, modify, and write an Excel file. The modification step is somewhat complex, so I will leave it out, because the problem is not there. The problem occurs when the file is written to disk. It starts with files of about 7MB (regardless of whether they contain formulas or values). I prepared the following MCVE:
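import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.poi.EncryptedDocumentException;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.openxml4j.util.ZipSecureFile;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class TestWriteOriginalWorkbook {
    public static void main(String[] args) throws EncryptedDocumentException, IOException, InvalidFormatException {
        String filePath = "C:\\temp";
        String outputFilePath = "C:\\temp\\test";
        // Disable the zip-bomb ratio check for this test file.
        ZipSecureFile.setMinInflateRatio(0);
        File f = new File(filePath, "Test.xlsx");
        // try-with-resources closes the workbook, so no explicit close() is needed.
        try (XSSFWorkbook workBook = new XSSFWorkbook(f)) {
            System.out.println("writing file");
            File outputFile = new File(outputFilePath, f.getName());
            try (FileOutputStream fos = new FileOutputStream(outputFile)) {
                workBook.write(fos);
            }
        }
        System.out.println("fin");
    }
}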
This code already reproduces the problem. The exact stack trace is:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3236)
at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:191)
at org.apache.poi.openxml4j.opc.internal.MemoryPackagePartOutputStream.flush(MemoryPackagePartOutputStream.java:76)
at org.apache.poi.openxml4j.opc.internal.MemoryPackagePartOutputStream.close(MemoryPackagePartOutputStream.java:51)
at org.apache.poi.xssf.usermodel.XSSFSheet.commit(XSSFSheet.java:3575)
at org.apache.poi.ooxml.POIXMLDocumentPart.onSave(POIXMLDocumentPart.java:462)
at org.apache.poi.ooxml.POIXMLDocumentPart.onSave(POIXMLDocumentPart.java:467)
at org.apache.poi.ooxml.POIXMLDocument.write(POIXMLDocument.java:236)
at test.TestWriteOriginalWorkbook.main(TestWriteOriginalWorkbook.java:25)
Although the trace varies from file to file, the exception itself is always the same.
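As far as I know, the only way to solve this is to increase the memory available to the application. I would like to avoid that if possible. I just checked my MaxHeapSize, and its default is about 260MB, which looks rather low. So if necessary, it could be raised to 1GB.
I ran the code with -Xmx1g and got the same exception, but with a different cause: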
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3045)
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3065)
at org.apache.xmlbeans.impl.store.Locale$SaxHandler.startElement(Locale.java:3198)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:509)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:374)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2784)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:841)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:770)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3414)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1272)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1259)
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.WorksheetDocument$Factory.parse(Unknown Source)
at org.apache.poi.xssf.usermodel.XSSFSheet.read(XSSFSheet.java:227)
at org.apache.poi.xssf.usermodel.XSSFSheet.onDocumentRead(XSSFSheet.java:219)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.parseSheet(XSSFWorkbook.java:452)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbook.java:417)
at org.apache.poi.ooxml.POIXMLDocument.load(POIXMLDocument.java:184)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:286)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:323)
at test.TestWriteOriginalWorkbook.main(TestWriteOriginalWorkbook.java:21)
Cleaning up unclosed ZipFile for archive C:\temp\Test.xlsx
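When comparing runs with and without -Xmx1g, it can help to confirm which heap limit the JVM actually picked up. The following is a minimal sketch that uses only java.lang.Runtime; the class name HeapInfo and the method maxHeapMiB are made up for illustration and are not part of the original program.

```java
class HeapInfo {

    /** Maximum heap the JVM will attempt to use, in mebibytes. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Heap currently in use = heap reserved so far minus the free part of it.
        long usedMiB = (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
        System.out.println("max heap MiB: " + maxHeapMiB());
        System.out.println("used heap MiB: " + usedMiB);
    }
}
```

Printing the same figures right before the XSSFWorkbook constructor and right before workBook.write(fos) would show how much of the limit is already used up by loading alone.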
To summarize the question: what can I do to improve the performance of workBook.write(fos) so that it can handle files larger than 7MB (ideally 15MB)?
Test.xlsx is an Excel file that contains only values. I created it by putting 1 in every cell of the first and second columns, down to row 443269.

If even POI's SXSSF API doesn't work for you, the only solution is to look for…

Creating an XSSFWorkbook requires reading all the different parts of the *.xlsx ZIP archive into objects in memory. Apache POI does this using at least two different kinds of objects for each Excel object. There are the low-level CT* objects of ooxml-schemas, which are based on the XML read from the file, and Apache POI then puts its high-level XSSF* objects on top of them. That is not really memory-efficient. The alternative would be to work only with the low-level CT* objects, or to manipulate the XML directly, which is not easy to program.
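To give a rough idea of what working on the XML directly can look like, here is a minimal read-only sketch that uses only the JDK (java.util.zip and StAX). It is not the answer's code and does not use Apache POI. The class name, the entry name xl/worksheets/sheet1.xml, and the simplified sheet markup (no namespaces, no shared strings) are assumptions made for illustration. The sketch builds a tiny ZIP in memory, then streams the sheet part and counts rows and cells without ever holding a full object tree.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SheetXmlStreamDemo {

    // Assumed entry name; a real workbook declares its sheet parts in the package relationships.
    static final String SHEET_ENTRY = "xl/worksheets/sheet1.xml";

    /** Builds a tiny in-memory ZIP with one simplified sheet part, two cells per row. */
    static byte[] buildSampleArchive(int rows) throws IOException {
        StringBuilder xml = new StringBuilder("<worksheet><sheetData>");
        for (int r = 1; r <= rows; r++) {
            xml.append("<row r=\"").append(r).append("\">")
               .append("<c r=\"A").append(r).append("\"><v>1</v></c>")
               .append("<c r=\"B").append(r).append("\"><v>1</v></c>")
               .append("</row>");
        }
        xml.append("</sheetData></worksheet>");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(SHEET_ENTRY));
            zip.write(xml.toString().getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Streams the sheet part and returns {rowCount, cellCount}; {0, 0} if the part is missing. */
    static long[] countRowsAndCells(byte[] archive) throws IOException, XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // no DTDs expected in sheet parts
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!SHEET_ENTRY.equals(entry.getName())) {
                    continue;
                }
                long rows = 0;
                long cells = 0;
                XMLStreamReader reader = factory.createXMLStreamReader(zip);
                while (reader.hasNext()) {
                    // Only one parse event is held at a time, so memory use stays flat.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        String name = reader.getLocalName();
                        if ("row".equals(name)) {
                            rows++;
                        } else if ("c".equals(name)) {
                            cells++;
                        }
                    }
                }
                reader.close();
                // Return right away instead of asking the ZIP stream for further entries.
                return new long[] {rows, cells};
            }
        }
        return new long[] {0, 0};
    }

    public static void main(String[] args) throws Exception {
        long[] counts = countRowsAndCells(buildSampleArchive(1000));
        System.out.println("rows=" + counts[0] + ", cells=" + counts[1]);
    }
}
```

This only reads. Modifying a sheet this way would mean copying every ZIP entry to a new archive and rewriting the sheet XML event by event, which is the part the answer describes as not easy to program.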