From: Tilman Hausherr
Date: Mon, 27 Apr 2015 21:43:51 +0200
To: users@pdfbox.apache.org
Subject: Re: pdfbox gives ArrayIndexOutOfBounds in PDFTextStripper

On 27.04.2015 at 00:33, Andrew Munn wrote:
> On Mon, 27 Apr 2015, Tilman Hausherr wrote:
>> try
>> stripper.setShouldSeparateByBeads(false)
>> do you get what you need?
> Thanks. I will check it out. That was just one of several PDFs I was
> doing some testing with and that one happened to generate that out of
> bounds exception.

I have researched this a bit more and disabled ShouldSeparateByBeads in
the area stripping class.

https://issues.apache.org/jira/browse/PDFBOX-2775