jackrabbit-dev mailing list archives

From "P.C.Sun (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (JCR-2395) Text Extractor: Image parser throws exception (jpeg)
Date Mon, 29 Oct 2012 08:29:12 GMT

    [ https://issues.apache.org/jira/browse/JCR-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485893#comment-13485893 ]

P.C.Sun commented on JCR-2395:
------------------------------

Hi Jukka,

We are using LR, which also uses JCR for storage, and we got the following error. May I know
whether it's the same issue?

07:49:50,812 ERROR [FileImpl:247] org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.video.FLVParser@6824f26a
org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.video.FLVParser@6824f26a
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:138)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
at org.apache.tika.Tika.parseToString(Tika.java:267)
at org.apache.tika.Tika.parseToString(Tika.java:296)
at com.liferay.portal.util.FileImpl.extractText(FileImpl.java:244)
at com.liferay.portal.kernel.util.FileUtil.extractText(FileUtil.java:118)
at com.liferay.portal.kernel.search.DocumentImpl.addFile(DocumentImpl.java:73)
at com.liferay.documentlibrary.util.DLIndexer.doGetDocument(DLIndexer.java:224)
at com.liferay.portal.kernel.search.BaseIndexer.getDocument(BaseIndexer.java:64)
at com.liferay.documentlibrary.util.JCRHook.reindex(JCRHook.java:627)
at com.liferay.documentlibrary.util.DLIndexer.doReindex(DLIndexer.java:283)
at com.liferay.portal.kernel.search.BaseIndexer.reindex(BaseIndexer.java:133)
at com.liferay.portlet.documentlibrary.util.DLIndexer.reindexFolders(DLIndexer.java:232)
at com.liferay.portlet.documentlibrary.util.DLIndexer.reindexFolders(DLIndexer.java:209)
at com.liferay.portlet.documentlibrary.util.DLIndexer.doReindex(DLIndexer.java:121)
at com.liferay.portal.kernel.search.BaseIndexer.reindex(BaseIndexer.java:133)
at com.liferay.portal.search.lucene.LuceneIndexer.doReIndex(LuceneIndexer.java:130)
at com.liferay.portal.search.lucene.LuceneIndexer.reindex(LuceneIndexer.java:61)
at com.liferay.portal.search.lucene.LuceneIndexer.run(LuceneIndexer.java:50)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException
at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:323)
at org.apache.tika.parser.video.FLVParser.readAMFString(FLVParser.java:127)
at org.apache.tika.parser.video.FLVParser.readAMFEcmaArray(FLVParser.java:150)
at org.apache.tika.parser.video.FLVParser.readAMFData(FLVParser.java:102)
at org.apache.tika.parser.video.FLVParser.parse(FLVParser.java:231)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:132)
... 19 more 

Thanks a lot.

JACK
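
The root cause in the trace above is a java.io.EOFException raised by DataInputStream.readUnsignedShort while the FLV parser reads a string. The following is a minimal JDK-only sketch of how a truncated length-prefixed string produces that exception. It assumes a 2-byte length prefix followed by the string bytes; the class and method names are hypothetical and are not Tika's FLVParser code.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class TruncatedAmfString {

    /**
     * Reads a 2-byte unsigned length followed by that many bytes.
     * Hypothetical helper for illustration only.
     */
    static String readLengthPrefixedString(DataInputStream in) throws IOException {
        // Throws EOFException if fewer than 2 bytes remain in the stream.
        int length = in.readUnsignedShort();
        byte[] body = new byte[length];
        // Throws EOFException if the body is shorter than the declared length.
        in.readFully(body);
        return new String(body, StandardCharsets.UTF_8);
    }

    /** Returns true if reading the given bytes ends in an EOFException. */
    static boolean failsWithEof(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            readLengthPrefixedString(in);
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] complete = {0x00, 0x02, 'o', 'k'}; // length 2, then "ok"
        byte[] truncated = {0x00};                // cut off inside the length prefix
        System.out.println(failsWithEof(complete));  // prints "false"
        System.out.println(failsWithEof(truncated)); // prints "true"
    }
}
```

If the sketch matches what happens here, the stored FLV binary may be truncated or malformed, which would be a different trigger from the JPEG header error in the quoted issue even though both surface as TIKA-198.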
                
> Text Extractor: Image parser throws exception (jpeg)
> ----------------------------------------------------
>
>                 Key: JCR-2395
>                 URL: https://issues.apache.org/jira/browse/JCR-2395
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>          Components: jackrabbit-text-extractors
>    Affects Versions: 2.0-beta1
>            Reporter: Philipp Koch
>             Fix For: 2.0-beta3
>
>
> The exception below is thrown over and over while uploading JPEG images:
> 16.11.2009 17:20:42 *WARN * LazyTextExtractorField: Failed to extract text from a binary property (LazyTextExtractorField.java, line 165)
> org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.image.ImageParser@c7bc3
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:125)
> 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:105)
> 	at org.apache.jackrabbit.core.query.lucene.LazyTextExtractorField$ParsingTask.run(LazyTextExtractorField.java:160)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:123)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:65)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:168)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
> 	at java.lang.Thread.run(Thread.java:613)
> Caused by: javax.imageio.IIOException: Not a JPEG file: starts with 0x00 0x05
> 	at com.sun.imageio.plugins.jpeg.JPEGImageReader.readImageHeader(Native Method)
> 	at com.sun.imageio.plugins.jpeg.JPEGImageReader.readNativeHeader(JPEGImageReader.java:554)
> 	at com.sun.imageio.plugins.jpeg.JPEGImageReader.checkTablesOnly(JPEGImageReader.java:309)
> 	at com.sun.imageio.plugins.jpeg.JPEGImageReader.gotoImage(JPEGImageReader.java:431)
> 	at com.sun.imageio.plugins.jpeg.JPEGImageReader.readHeader(JPEGImageReader.java:547)
> 	at com.sun.imageio.plugins.jpeg.JPEGImageReader.getHeight(JPEGImageReader.java:609)
> 	at org.apache.tika.parser.image.ImageParser.parse(ImageParser.java:47)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:119)
> 	... 10 more
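
The "Caused by" line in the quoted trace says the binary starts with 0x00 0x05, whereas a JPEG stream begins with the SOI marker 0xFF 0xD8. The following is a minimal sketch of checking those first two bytes before handing a binary to an image parser. The class and method names are hypothetical, and this is not presented as the fix that was applied in Jackrabbit or Tika.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class JpegSignatureCheck {

    /**
     * Returns true if the stream starts with the JPEG SOI marker (0xFF 0xD8).
     * Note: this consumes up to two bytes from the stream.
     */
    static boolean startsWithJpegMarker(InputStream in) throws IOException {
        int first = in.read();   // -1 if the stream is empty
        int second = in.read();
        return first == 0xFF && second == 0xD8;
    }

    public static void main(String[] args) throws IOException {
        byte[] fromTrace = {0x00, 0x05};             // bytes reported in the exception
        byte[] jpegStart = {(byte) 0xFF, (byte) 0xD8};
        System.out.println(startsWithJpegMarker(new ByteArrayInputStream(fromTrace))); // prints "false"
        System.out.println(startsWithJpegMarker(new ByteArrayInputStream(jpegStart))); // prints "true"
    }
}
```

A caller that needs to reuse the stream afterwards would have to wrap it in something that supports mark/reset, such as a BufferedInputStream.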

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
