To: dev@jackrabbit.apache.org
From: Christoph Kiehl
Subject: Re: Jackrabbits own FileSystem and unit tests
Date: Fri, 01 Sep 2006 16:17:01 +0200
In-Reply-To: <44F718A7.6080303@gmx.net>

Marcel Reutegger wrote:
> Christoph Kiehl wrote:
>> I like the idea of having a transactional index but I don't think it's
>> a good idea to read this index from a binary property in a database,
>> because in our case we've got a fairly large repository where we got
>> index files with a size of 40MB. As far as I understand you have to
>> transfer 40MB to the database on every index change that gets
>> committed. Am I right?
>
> In general, this is correct. but lucene is designed in a way that it
> never modifies an existing index file. if you have a 40 MB index segment
> file and you delete a document within that index, lucene will simply
> update a small other file which is kept along the index called
> .del. Adding a new document to an existing index segment
> is not possible, in that case lucene will create a new segment.

OK. To get this working, you have to create at least one segment per
transaction, right? And index merging could be done in the background?
Sounds really interesting.

But if the blob values are cached locally, they have to be downloaded on
startup before the index becomes fast. Or does the blob cache survive
restarts?

Lots of questions ;)

Cheers,
Christoph
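
To make the behaviour discussed above concrete, here is a minimal toy model of an append-only segment index. It is only a sketch under stated assumptions: `Segment`, `ToyIndex`, `commit`, `merge` and the 1 KB-per-document size are all hypothetical names and numbers invented for illustration, not Lucene's or Jackrabbit's API. It only shows why a commit would need to store a new small segment plus a small deletions list rather than the whole 40 MB file again.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Segment:
    """An immutable segment: once written, its contents never change."""
    name: str
    docs: tuple       # document ids stored in this segment
    size_bytes: int   # toy size, derived from the number of documents


@dataclass
class ToyIndex:
    """Hypothetical model of an index made of immutable segments."""
    bytes_per_doc: int = 1024                       # assumed size per document
    segments: list = field(default_factory=list)
    deletions: dict = field(default_factory=dict)   # segment name -> deleted doc ids
    _counter: int = 0

    def _new_segment(self, docs):
        docs = tuple(docs)
        seg = Segment(f"_{self._counter}", docs, len(docs) * self.bytes_per_doc)
        self._counter += 1
        return seg

    def commit(self, added=(), deleted=()):
        """Apply one transaction; return how many bytes had to be written."""
        written = 0
        # Deletes never touch the segment itself, only its small deletions list.
        for seg in self.segments:
            hits = set(deleted) & set(seg.docs)
            if hits:
                dels = self.deletions.setdefault(seg.name, set())
                dels.update(hits)
                written += len(dels)  # toy cost: 1 byte per deleted document
        # Added documents always go into a brand-new segment.
        if added:
            seg = self._new_segment(added)
            self.segments.append(seg)
            written += seg.size_bytes
        return written

    def merge(self):
        """Background merge: rewrite all live documents into one new segment."""
        live = [d for s in self.segments for d in s.docs
                if d not in self.deletions.get(s.name, set())]
        merged = self._new_segment(live)
        self.segments = [merged]
        self.deletions = {}
        return merged.size_bytes


index = ToyIndex()
first = index.commit(added=range(40_000))               # one large initial segment
second = index.commit(added=["new-doc"], deleted=[7])   # a small follow-up transaction
```

In this toy model the second commit writes one small segment and a one-entry deletions list, while the large first segment is left untouched. The cost of rewriting everything is paid only when `merge()` runs, which is the step that could be moved to the background.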