From: "Sean Dynan"
To: users@jackrabbit.apache.org
Date: Mon, 4 Sep 2006 09:55:45 +0100
Subject: Re: Newbie seeking a leg up

Thanks Michael.

Yeah, I'm gearing up to run a bunch of metrics (it's the only way, right? ;-) But I was also hoping to get some insight into some advice I'd read in the mailing list archives.

1. "Use separate Write and Read repositories for high volume applications."
   This doesn't sound like what I would want, since there is an implied lazy sync and I want my users to be able to see saved content immediately (not at some arbitrary time in the future).

2. "A deep-noded workspace may be more efficient/speedy for updates than a shallow, wide workspace."
   Again, this gives me pause for thought, since I envisage my repository 'schema' being wide rather than deep.

I guess I'm hunting for some Jackrabbit best practices, if such things exist.

Thanks.
--
Sean Dynan

On 9/4/06, Michael Neale wrote:
> Hi Sean.
>
> I was also looking into how Jackrabbit behaved for my needs (I had some
> specific numbers I was looking for) - the best thing I found was to just
> build up some basic test code to load up enough nodes to verify it would do
> roughly what I needed (which was a bit more effort than I thought, but I got
> there in the end).
>
> The moral of the tale: try it! Hack away and see how it behaves. For a
> large flat structure like you are suggesting, it seems that save operations
> can be a little slow (but slow is relative; in terms of user experience it's
> still fast) for large numbers of nodes under the same parent, but otherwise
> it seemed fine (I was also concerned with versioning a lot, which I am not
> sure concerns you).
>
> On 9/4/06, Sean Dynan wrote:
> >
> > Hi all
> >
> > I am just starting to investigate Jackrabbit as the content repository
> > for an application and some advice from the experts would seriously
> > speed up my evaluation.
> >
> > The problem domain in question is somewhat similar to a corporate mail
> > server:
> > - Large user base
> > - Frequent user reads and writes, mostly of text and
> >   images. By 'frequent' I mean usage on a par with an
> >   email client
> >
> > Where it differs from the mail server comparison:
> > - Users can query the repository by keyword and expect
> >   rapid results (think Google)
> > - Users can query each other's information stores
> > - No upper bound to the physical size of the repository
> >
> > Right now, I don't envisage a deep-noded store. Think of many Items,
> > each containing content and a little bunch of metadata (e.g. datetime,
> > list of keywords, etc.). The email analogy would be many email
> > messages, each with a body and a header.
> >
> > I am also thinking of implementing each user's content store as a
> > Workspace. Each workspace would have two top-level nodes hanging from
> > its root: Private and Public. Then each of those two nodes would
> > contain many, many content Items.
> >
> > Is there anything so far that wouldn't be well served by building on
> > top of Jackrabbit?
> >
> > Am I right in assuming (given my ideas above) that cross-workspace
> > queries are perfectly possible, and that all that is required is a
> > separate, logged-in Session to each one?
> >
> > Can you envisage any issues (performance or otherwise) with my
> > pitifully meagre outline? Any sage words of advice on how to best
> > start architecting my repository?
> >
> >
> > Many thanks!
> > --
> > Sean Dynan
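
For anyone trying out the "deep rather than wide" advice discussed in this thread, one possible approach is to spread Items across intermediate bucket nodes so that no single parent (such as Private or Public) holds every Item directly. The sketch below only shows the path arithmetic and uses no Jackrabbit or JCR calls. The names ItemPaths, bucketedPath and bucketSize are hypothetical and come neither from the thread nor from any library, and whether bucketing actually speeds up saves is an assumption you would still need to measure with your own test code.

```java
/**
 * Hypothetical helper, not part of Jackrabbit or the JCR API.
 * Maps a flat item sequence number to a bucketed relative path so that
 * each parent node holds at most bucketSize items.
 */
final class ItemPaths {

    private ItemPaths() {
        // static helper only
    }

    /**
     * @param area       top-level node name, e.g. "Private" or "Public"
     * @param itemNumber zero-based sequence number of the item
     * @param bucketSize maximum number of items per bucket (assumed tuning knob)
     * @return a relative path of the form area/bucketN/itemM
     */
    static String bucketedPath(String area, long itemNumber, int bucketSize) {
        if (itemNumber < 0) {
            throw new IllegalArgumentException("itemNumber must be >= 0");
        }
        if (bucketSize <= 0) {
            throw new IllegalArgumentException("bucketSize must be > 0");
        }
        // Integer division groups consecutive items into the same bucket.
        long bucket = itemNumber / bucketSize;
        return area + "/bucket" + bucket + "/item" + itemNumber;
    }
}
```

With a bucket size of 1000, ItemPaths.bucketedPath("Public", 12345, 1000) yields "Public/bucket12/item12345". In a real repository the intermediate bucket node would have to be created before the Item is added beneath it.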