Date: Thu, 26 May 2011 09:22:45 -0400
Subject: Issue while extracting content from MS Excel 2007 file using TikaEntityProcessor
From: Rahul Warawdekar
To: solr-user@lucene.apache.org

Hi All,

I am using Solr 3.1 for one of our search-based applications. We use DIH to index our data and TikaEntityProcessor to index attachments.

We are currently running into an issue while extracting content from one of our MS Excel 2007 files using TikaEntityProcessor. TikaEntityProcessor hangs without throwing any exception, which in turn causes indexing to hang on the server.

Has anyone faced a similar issue with TikaEntityProcessor in the past? Also, does anyone know of a way to skip this behaviour for that file and move on to the next document to be indexed?

--
Thanks and Regards
Rahul A. Warawdekar
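To make the "skip that file and move on" behaviour concrete, here is a minimal sketch of the general idea: run each extraction on a worker thread and give up after a timeout. This is not a DIH or TikaEntityProcessor feature; the class name `TimedExtraction`, the method `extractOrSkip`, and the timeout values are all hypothetical, and the `Callable` stands in for whatever code actually calls the parser. It assumes Java 8 or later.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Solr or Tika.
public class TimedExtraction {

    // Runs one extraction task on a worker thread and waits at most timeoutMillis.
    // Returns Optional.empty() if the task times out or fails, so the caller can skip the file.
    public static Optional<String> extractOrSkip(Callable<String> task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "extraction-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<String> future = executor.submit(task);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // cancel(true) only interrupts the worker; code that ignores interrupts keeps running
            future.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            // the task itself threw an exception
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return Optional.empty();
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A task that finishes quickly, and one that simulates a hung extraction.
        Optional<String> ok = extractOrSkip(() -> "cell text", 1000);
        Optional<String> hung = extractOrSkip(() -> { Thread.sleep(60_000); return "never"; }, 200);
        System.out.println(ok.isPresent());   // true
        System.out.println(hung.isPresent()); // false
    }
}
```

One caveat with this approach: a timeout only stops the caller from waiting. If the parser is stuck in a loop that never checks for interruption, the worker thread keeps consuming CPU in the background, so running the extraction in a separate process may be the more robust variant.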