Bug #238
Microsoft .xlsx files cause Java heap space error (OutOfMemoryError)
| Status: | New | Start: | 11/04/2009 |
| Priority: | Normal | Due date: | |
| Assigned to: | - | % Done: | 0% |
| Category: | - | | |
| Target version: | - | | |
Description
While scanning a quite large repository, it repeatedly produces the error below for Excel .xlsx files which are larger than about 10 MB.
It runs on Windows XP with 3 GB memory and there seems to be physical memory left. It seems to make no difference when I close all unnecessary files and processes. If the memory management is difficult to change, at least a trap for the error that simply skips these large files should make it easy to continue scanning the repository.
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3039
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3060
at org.apache.xmlbeans.impl.store.Locale$SaxHandler.startElement(Locale
java:3250)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.reportStartTag(Piccolo.
ava:1082)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseAttributesNS(
iccoloLexer.java:1822)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseOpenTagNS(Pic
oloLexer.java:1521)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseTagNS(Piccolo
exer.java:1362)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.yylex(PiccoloLexer
java:4678)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yylex(Piccolo.java:1290
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yyparse(Piccolo.java:14
0)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.parse(Piccolo.java:714)
at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:343
)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1
70)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1
57)
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTyp
LoaderBase.java:345)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.WorksheetDocumen
$Factory.parse(Unknown Source)
at org.apache.poi.xssf.usermodel.XSSFSheet.read(XSSFSheet.java:126)
at org.apache.poi.xssf.usermodel.XSSFSheet.onDocumentRead(XSSFSheet.jav
:118)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbo
k.java:201)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:
64)
at org.apache.poi.xssf.extractor.XSSFExcelExtractor.<init>(XSSFExcelExt
actor.java:48)
at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorF
ctory.java:100)
at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorF
ctory.java:86)
at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser
java:47)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:10
)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:
8)
at com.soebes.supose.scan.document.ScanExcelDocument.scan(ScanExcelDocu
ent.java:78)
at com.soebes.supose.scan.document.ScanExcelDocument.indexDocument(Scan
xcelDocument.java:64)
at com.soebes.supose.scan.FileExtensionHandler.execute(FileExtensionHan
ler.java:61)
at com.soebes.supose.scan.ScanRepository.indexFile(ScanRepository.java:
56)
at com.soebes.supose.scan.ScanRepository.workOnChangeSet(ScanRepository
java:204)
at com.soebes.supose.scan.ScanRepository.scan(ScanRepository.java:136)
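The "trap for the error" requested in the description could look roughly like the sketch below. It is only an illustration, not SupoSE's actual code: the class `SkipOnOutOfMemory`, the method `scanAll`, and the file names are hypothetical, and the out-of-memory condition is simulated rather than caused by a real workbook. Catching `OutOfMemoryError` is generally a last resort, since the JVM may be left in a degraded state, so this assumes the per-file scan releases its memory once it fails.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of SupoSE: keeps scanning after one item runs out of heap.
class SkipOnOutOfMemory {

    /** Runs the scanner on every item and returns the items that had to be skipped. */
    static <T> List<T> scanAll(List<T> items, Consumer<T> scanner) {
        List<T> skipped = new ArrayList<>();
        for (T item : items) {
            try {
                scanner.accept(item);
            } catch (OutOfMemoryError e) {
                // Record the failure and move on to the next item instead of aborting the scan.
                skipped.add(item);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("a.txt", "big.xlsx", "b.txt");
        // Stand-in scanner: pretends that every .xlsx file exhausts the heap.
        List<String> skipped = scanAll(files, name -> {
            if (name.endsWith(".xlsx")) {
                throw new OutOfMemoryError("simulated");
            }
        });
        System.out.println("Skipped: " + skipped);
    }
}
```

With a wrapper of this kind, the remaining files in the repository would still be indexed and the skipped ones could be logged for later inspection.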
History
Updated by Karl Heinz Marbaise 822 days ago
Hi Ulrik,
what is your configuration? Changes made in the batch file in bin folder? Did you use -Xms or -Xmx options in any way?
Kind regards
Karl Heinz Marbaise
Updated by Ulrik Kautsky 820 days ago
what is your configuration? Changes made in the batch file in bin folder? Did you use -Xms or -Xmx options in any way?
When I filed the issue it was the default -Xmx of 1024.
Then I increased this to 1536. I didn't dare to go further, as I don't really know the upper limits of Windows XP 32-bit. But it didn't change the result. I looked at the memory usage and realised that Java at least used all the space.
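To check whether a changed -Xmx value in the batch file actually reaches the JVM, a small test program like the following can be started with the same options. This is only a sketch; the class name `HeapInfo` is made up for illustration. `Runtime.maxMemory()` reports the maximum heap the JVM will try to use, which may differ slightly from the -Xmx value.

```java
// Hypothetical diagnostic class: prints the maximum heap size the running JVM will attempt to use.
class HeapInfo {

    /** Maximum heap in MiB as reported by the JVM. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

Running it once with `-Xmx1024m` and once with `-Xmx1536m` should show whether the larger setting is being picked up.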
Updated by Ulrik Kautsky 803 days ago
what is your configuration? Changes made in the batch file in bin folder? Did you use -Xms or -Xmx options in any way?
We also tested on Solaris with heap 2008kb, same trouble. If there is no current fix for this, maybe just add an option to skip to the next file for xlsx files above a certain size.
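Such a size option might be sketched as below. Everything here is an assumption for illustration: the class `SizeLimitFilter` and its method do not exist in SupoSE, and the 10 MB limit is only taken from the size at which the problem was observed in the description.

```java
import java.util.Locale;

// Hypothetical filter: decides whether an .xlsx file is too large to be scanned.
class SizeLimitFilter {
    private final long maxBytes;

    SizeLimitFilter(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Returns true only for .xlsx files whose size exceeds the configured limit. */
    boolean shouldSkip(String fileName, long sizeInBytes) {
        boolean isXlsx = fileName.toLowerCase(Locale.ROOT).endsWith(".xlsx");
        return isXlsx && sizeInBytes > maxBytes;
    }

    public static void main(String[] args) {
        SizeLimitFilter filter = new SizeLimitFilter(10L * 1024 * 1024); // assumed 10 MB limit
        System.out.println(filter.shouldSkip("report.xlsx", 11L * 1024 * 1024));
        System.out.println(filter.shouldSkip("report.xlsx", 1024));
    }
}
```

The file size is passed in as a parameter because the scanned files come from a repository rather than the local file system, so how the size is obtained is left open.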