Message-ID: <24560696.post@talk.nabble.com>
Date: Sun, 19 Jul 2009 13:35:46 -0700 (PDT)
From: Vjger
To: users@jackrabbit.apache.org
Subject: Text extractors doesn't work correctly

Hi all,

I'm using Jackrabbit 1.5.5, and I have jackrabbit-text-extractors-1.5.0.jar on my classpath. I have noticed two problems.

1) The plain-text extractor seems to depend on the file extension. My workspace has two nt:file nodes, one with a .txt extension and the other with a .sql extension. A query using the SQL CONTAINS function finds only the first, even though the two files are identical apart from the extension.

2) The PDF extractor does not work correctly: with two different PDF files, a search did not find the text I was looking for.

Any suggestions?

Thanks in advance