lucene-solr-user mailing list archives

From Andrea Gazzarini <a.gazzar...@gmail.com>
Subject Re: Solr hanging when extracting a some broken .doc files
Date Tue, 17 Dec 2013 19:43:26 GMT
Hi Augusto,
I don't believe the mailing list allows attachments. Could you please post
the complete stack trace? In addition, set the logging level of the Tika
classes to FINEST in the Solr console; that may be helpful.
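
For a standalone test outside the Solr console, the same level change can be sketched in Java. This is only an illustration under assumptions: it assumes the messages are routed through java.util.logging, it takes the logger name org.apache.tika from the package prefix of the logged exception, and the class name TikaLogLevel is hypothetical.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of Solr or Tika.
final class TikaLogLevel {
    // Keep a strong reference: java.util.logging holds loggers weakly,
    // so a level set on an unreferenced logger could be lost.
    private static final Logger TIKA_LOGGER = Logger.getLogger("org.apache.tika");

    static Logger enableFinestTikaLogging() {
        // Assumption: Tika classes log under the "org.apache.tika" name prefix.
        TIKA_LOGGER.setLevel(Level.FINEST);
        // Handlers filter too (the default console handler stops at INFO),
        // so lower the threshold of the root handlers as well.
        for (Handler handler : Logger.getLogger("").getHandlers()) {
            handler.setLevel(Level.FINEST);
        }
        return TIKA_LOGGER;
    }

    public static void main(String[] args) {
        Logger logger = enableFinestTikaLogging();
        logger.finest("Tika logging is now at FINEST");
    }
}
```

If Solr is configured with a different logging backend, the equivalent change belongs in that backend's own configuration instead.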

Best,
Andrea
On 17 Dec 2013 16:30, "Augusto Camarotti" <augusto@prpb.mpf.gov.br> wrote:

>  Hi guys,
>
>    I'm having a problem with Solr when trying to index some broken .doc
> files.
>    I have set up a test case using Solr to index all the files that users
> save in the shared directories of the company I work for, and Solr
> hangs when trying to index one file in particular (the one I'm attaching
> to this e-mail). There are other broken .doc files that Solr indexes by
> name without a problem, even though it logs some Tika errors during the
> process, but when it reaches this particular file, it hangs and I have
> to cancel the upload.
>    I cannot guarantee that the directories will never hold a broken .doc
> file, or a broken file with some other extension, so I guess Solr could
> just return a failure message, or something like that.
>    These are the log messages Solr is recording:
>
>
>   03:38:23 ERROR SolrCore org.apache.solr.common.SolrException:
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from
> org.apache.tika.parser.microsoft.OfficeParser@386f9474
>
>   03:38:25 ERROR SolrDispatchFilter null:org.apache.solr.common.SolrException:
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from
> org.apache.tika.parser.microsoft.OfficeParser@386f9474
>
> So, how do I prevent Solr from hanging when trying to index broken files?
>
> Regards,
>
> Augusto Camarotti
>
