Message-ID: <510143ac0710020104p382b6ce3r393d91b3864b6f05@mail.gmail.com>
Date: Tue, 2 Oct 2007 11:04:24 +0300
From: "Jukka Zitting"
To: dev@jackrabbit.apache.org
Subject: Re: spellchecker
In-Reply-To: <4700E79E.7070602@gmx.net>
References: <4700E79E.7070602@gmx.net>

Hi,

On 10/1/07, Marcel Reutegger wrote:
> I'm about to write a spellchecker extension for the lucene query handler in
> jackrabbit.

Cool! Some concerns though, as I figure the spell checker would use the
search index as a dictionary. Can there be a case where this feature
could be used to circumvent access controls to retrieve isolated pieces
of content from read-protected documents?

I guess the threat is a bit theoretical, but how about a case where an
attacker just wants to know if a repository contains some specific
material (a list of specific names, etc.). The attacker could use the
spellchecker as a mechanism to find out if a workspace contains a
document with a specific name or keyword.

> I planned to use the lucene-spellchecker contrib, however I don't
> want to introduce another dependency in the jackrabbit-core. because the
> spellchecker contrib in lucene only includes a handful of classes I would prefer
> to copy the classes and refactor them into the jackrabbit package space.
>
> does anyone have a better idea how to handle this?

Would there be interest within the Lucene team to include the feature
in a future release of lucene-core? I see where Felix is going with
extra modules, but there's always a cost in complexity with such
modularity and I'm not sure if this feature is worth that overhead.

BR,

Jukka Zitting
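
To make the access control concern above concrete, here is a minimal
sketch, assuming a spellchecker whose dictionary is simply every term in
the search index. All class and method names are hypothetical and are
not Jackrabbit or Lucene API. The naive variant lets a user confirm that
a term exists in a document they cannot read; the filtered variant is
one conceivable mitigation for comparison, not something proposed in
this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch only; not Jackrabbit or Lucene API.
final class SpellcheckOracleSketch {

    // term -> ids of documents containing it (stand-in for the search index)
    private final Map<String, Set<String>> index = new HashMap<>();
    // document id -> users allowed to read that document
    private final Map<String, Set<String>> readers = new HashMap<>();

    void addDocument(String id, List<String> allowedUsers, String... terms) {
        readers.put(id, new HashSet<>(allowedUsers));
        for (String term : terms) {
            index.computeIfAbsent(term.toLowerCase(), k -> new HashSet<>()).add(id);
        }
    }

    // Naive: suggests any indexed term within edit distance 1,
    // regardless of who is allowed to read the source documents.
    List<String> suggestNaive(String word) {
        return suggest(word, null, false);
    }

    // Filtered: suggests a term only if the user can read at least
    // one document that contains it.
    List<String> suggestFiltered(String word, String user) {
        return suggest(word, user, true);
    }

    private List<String> suggest(String word, String user, boolean checkAccess) {
        Set<String> matches = new TreeSet<>(); // sorted for stable output
        String w = word.toLowerCase();
        for (Map.Entry<String, Set<String>> e : index.entrySet()) {
            if (distance(w, e.getKey()) > 1) {
                continue;
            }
            if (!checkAccess || readableBy(user, e.getValue())) {
                matches.add(e.getKey());
            }
        }
        return new ArrayList<>(matches);
    }

    private boolean readableBy(String user, Set<String> docIds) {
        for (String id : docIds) {
            if (readers.get(id).contains(user)) {
                return true;
            }
        }
        return false;
    }

    // Plain Levenshtein edit distance, computed row by row.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        SpellcheckOracleSketch s = new SpellcheckOracleSketch();
        s.addDocument("public-doc", Arrays.asList("alice", "mallory"), "jackrabbit", "lucene");
        s.addDocument("secret-doc", Arrays.asList("alice"), "acquisition");

        // mallory cannot read secret-doc, yet the naive suggestion
        // confirms that the term exists somewhere in the workspace.
        System.out.println(s.suggestNaive("acquisitian"));               // [acquisition]
        System.out.println(s.suggestFiltered("acquisitian", "mallory")); // []
    }
}
```

The probe word only has to be a near miss of the term the attacker is
guessing at; a non-empty suggestion is enough to confirm that the term
is indexed.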