From: Marcel Reutegger
Date: Wed, 16 Jul 2008 11:46:03 +0200
To: users@jackrabbit.apache.org
Subject: Re: Question on scalability

Hi Paul,

Paul Kling wrote:
> I am currently working on a project that has the need to store 100
> million images every year. We tend to keep the images for around 18
> months. So at any one time we will have about 150 million images in the
> repository. The question I have is does this sound reasonable to store
> this quantity of images in JackRabbit or does this sound scary? I worry
> about retrieval of the items.

It doesn't sound scary to me, though I have to admit that I have never
worked with a repository of that size. I've seen repositories that contain
around 10 million nodes without any issues.

> I also noticed there is a clustering feature but the documentation
> seemed to point you to using the DB for file storage. We have been down
> the route of letting the DB store file data in the past and it has never
> turned out to be something that worked well and I don't think I can get
> people convinced of again.

Jackrabbit does not store the complete file in the database; it stores the
contents of a file in a data store [1]. The database will only contain a
reference into the data store.

[1] http://wiki.apache.org/jackrabbit/DataStore

regards
 marcel
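
As an illustration of the split described in the reply (file contents in a data store, only a reference in the database), here is a minimal conceptual sketch. It is not Jackrabbit's API: the class name SketchDataStore, the node_table dictionary standing in for the database, the example paths, and the choice of a SHA-256 content hash as the reference are all assumptions made for this example.

```python
import hashlib
import tempfile
from pathlib import Path


class SketchDataStore:
    """Toy data store: one file per distinct binary, named by its digest.

    Hypothetical class for illustration only, not a Jackrabbit type.
    """

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def add(self, content):
        # Assumption of this sketch: the reference is a hash of the
        # content, so identical binaries end up in a single file.
        identifier = hashlib.sha256(content).hexdigest()
        path = self.root / identifier
        if not path.exists():
            path.write_bytes(content)
        return identifier

    def get(self, identifier):
        # Resolve a reference back to the stored bytes.
        return (self.root / identifier).read_bytes()


# Stand-in for the database: it maps a node path to a short reference
# and never holds the image bytes themselves.
node_table = {}

store = SketchDataStore(Path(tempfile.mkdtemp()) / "datastore")
image = b"\x89PNG placeholder image bytes"
node_table["/images/2008/a.png"] = store.add(image)
node_table["/images/2008/copy-of-a.png"] = store.add(image)

# Reading an image means looking up the reference, then the file.
restored = store.get(node_table["/images/2008/a.png"])
```

In this sketch each database row stays a fixed, small size regardless of how large the image is, which is the property the reply relies on when it says the database only contains a reference.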