From: Jukka Zitting
Date: Tue, 7 Apr 2009 23:29:06 +0200
Message-ID: <510143ac0904071429u73de9193oc0a0fb431d24eace@mail.gmail.com>
Subject: Getting rid of jackrabbit-text-extractors
To: Jackrabbit Developers

Hi,

JCR-1878 is now resolved and Jackrabbit trunk depends on Apache Tika for text extraction functionality. Thus there is little need left for jackrabbit-text-extractors as a standalone component. Anyone who needs that functionality separately from jackrabbit-core should just use Tika directly.

For backwards compatibility with existing configurations (and potential extensions) we still need the current org.apache.jackrabbit.extractor classes, but I'm thinking of simply moving the entire package to jackrabbit-core and deprecating everything except the new Tika-based extractor.

In fact I'd even go as far as changing the indexing code in jackrabbit-core to use the Tika Parser interface directly and only provide a backwards-compatibility layer for the TextExtractor classes we have.

Thus Jackrabbit 1.6 would no longer contain a separate text-extractors jar, but all the existing TextExtractor classes would still be included. In Jackrabbit 2.0 we'd drop all the TextExtractors and only use Tika Parsers.

BR,

Jukka Zitting
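
A rough sketch of what such a backwards-compatibility layer might look like, assuming the indexing code talks only to a parser-style interface and legacy extractors are wrapped to fit it. The interfaces here are simplified, hypothetical stand-ins so that the example is self-contained: SimpleParser is not the real Tika Parser API, and LegacyTextExtractor only approximates the existing TextExtractor contract. All class names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExtractorCompat {

    /** Hypothetical stand-in for a parser-style interface (NOT the real Tika Parser). */
    interface SimpleParser {
        String parseToText(InputStream stream, String contentType) throws IOException;
    }

    /** Simplified stand-in for the legacy extractor contract (an assumption). */
    interface LegacyTextExtractor {
        String[] getContentTypes();
        Reader extractText(InputStream stream, String type, String encoding)
                throws IOException;
    }

    /** Compatibility layer: lets parser-based code call a legacy extractor. */
    static final class LegacyExtractorParser implements SimpleParser {
        private final LegacyTextExtractor extractor;

        LegacyExtractorParser(LegacyTextExtractor extractor) {
            this.extractor = extractor;
        }

        private boolean supports(String contentType) {
            for (String type : extractor.getContentTypes()) {
                if (type.equalsIgnoreCase(contentType)) {
                    return true;
                }
            }
            return false;
        }

        @Override
        public String parseToText(InputStream stream, String contentType)
                throws IOException {
            if (!supports(contentType)) {
                return ""; // unsupported types yield no indexable text
            }
            // Drain the Reader returned by the legacy extractor into a String.
            try (Reader reader = extractor.extractText(stream, contentType, null)) {
                StringBuilder text = new StringBuilder();
                char[] buffer = new char[4096];
                int n;
                while ((n = reader.read(buffer)) != -1) {
                    text.append(buffer, 0, n);
                }
                return text.toString();
            }
        }
    }

    /** Toy legacy extractor handling text/plain only, for demonstration. */
    static final class PlainTextExtractor implements LegacyTextExtractor {
        @Override
        public String[] getContentTypes() {
            return new String[] {"text/plain"};
        }

        @Override
        public Reader extractText(InputStream stream, String type, String encoding)
                throws IOException {
            // Fall back to UTF-8 when no encoding is given (an assumption).
            Charset charset = encoding == null
                    ? StandardCharsets.UTF_8 : Charset.forName(encoding);
            return new StringReader(new String(stream.readAllBytes(), charset));
        }
    }

    public static void main(String[] args) throws IOException {
        SimpleParser parser = new LegacyExtractorParser(new PlainTextExtractor());
        InputStream in = new ByteArrayInputStream(
                "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(parser.parseToText(in, "text/plain"));
    }
}
```

With a wrapper along these lines, existing configurations that name legacy extractor classes could keep working while the indexer itself only depends on the parser-style interface. The real adapter would of course have to target the actual Tika Parser signature.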