Date: Tue, 17 Dec 2013 20:43:26 +0100
Subject: Re: Solr hanging when extracting a some broken .doc files
From: Andrea Gazzarini
To: solr-user@lucene.apache.org

Hi Augusto,

I don't believe the mailing list allows attachments. Could you please post the complete stack trace? In addition, set the logging level of the Tika classes to FINEST in the Solr console; that may be helpful.

Best,
Andrea

On 17 Dec 2013 16:30, "Augusto Camarotti" wrote:

> Hi guys,
>
> I'm having a problem with Solr when trying to index some broken .doc
> files.
> I have set up a test case using Solr to index all the files that users
> save in the shared directories of the company I work for, and Solr
> hangs when trying to index one file in particular (the one I'm attaching
> to this e-mail). There are other broken .doc files that Solr indexes by
> name without a problem, even though it logs some Tika errors during the
> process, but when it reaches this particular file, it hangs and I have
> to cancel the upload.
> I cannot guarantee that the directories will never hold a broken .doc
> file, or a broken file with some other extension, so I think Solr should
> just return a failure message, or something like that.
> These are the log messages Solr is recording:
>
> 03:38:23 ERROR SolrCore org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@386f9474
> 03:38:25 ERROR SolrDispatchFilter null:org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@386f9474
>
> So, how do I prevent Solr from hanging when trying to index broken files?
>
> Regards,
>
> Augusto Camarotti
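
On the closing question, one possible client-side approach is to run the handling of each file under a time limit, so that a file which never finishes is skipped instead of blocking the whole upload. The following is only a minimal sketch using the standard java.util.concurrent classes; it was not proposed in this thread. The class TimedExtraction and the method runWithTimeout are hypothetical names, and the task passed in stands for whatever call sends or parses one file. It limits how long the caller waits; it does not stop work that is already running inside Solr or Tika.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration; not part of Solr or Tika. */
final class TimedExtraction {

    private TimedExtraction() {}

    /**
     * Runs one task on a daemon worker thread and waits at most timeoutMillis.
     * Returns the task's result, or Optional.empty() if the task timed out,
     * threw an exception, or returned null.
     */
    static <T> Optional<T> runWithTimeout(Callable<T> task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread worker = new Thread(runnable, "extraction-worker");
            worker.setDaemon(true); // a stuck worker must not keep the JVM alive
            return worker;
        });
        Future<T> future = executor.submit(task);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // Interrupts the worker; code that ignores interrupts keeps running.
            future.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            // The task itself failed, for example on a broken file.
            return Optional.empty();
        } catch (InterruptedException e) {
            // Restore the caller's interrupt flag and give up on this task.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return Optional.empty();
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A caller would wrap the per-file step, for example TimedExtraction.runWithTimeout(() -> uploadOneFile(path), 60000), where uploadOneFile is a placeholder for the existing upload code, and would log and skip the file when the result is empty. Creating a new executor per call keeps the sketch short; a real indexer would more likely share one pool.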