From: "Jukka Zitting"
To: users@jackrabbit.apache.org
Date: Tue, 10 Jul 2007 11:25:39 +0300
Subject: Re: DM Rule #4: Beware of Same Name Siblings.
Message-ID: <510143ac0707100125m172ec5f7n172f453da4e2dc7@mail.gmail.com>
In-Reply-To: <8641fd7c0707100109y59f3965m7668553b09bf691@mail.gmail.com>

Hi,

On 7/10/07, Tako Schotanus wrote:
> I understand the recommendation, I just would like to add that
> sometimes finding a good identifier is pretty difficult. The example
> of using a person's email address is just not stable enough, there are
> lots of people who are changing email addresses continuously and many
> of the addresses don't contain any clue as to the person they belong
> to.

Note that names don't really need to be globally unique or stable. A
name doesn't even need to reflect any specific attribute (real name,
etc.) of the node.

> And since the 1.5 years that I live in Spain now I'm still amazed at
> the number of people that have EXACTLY the same name!
> Even taking into
> account that they have 2 last names (from both parents) and normally
> several first names as well! (Probably due to the fact that it was
> customary to name children after grandparents)

This is where the recommendation to avoid huge flat collections
helps. A repository that models the population of Spain could
(and should!) use some hierarchy. A geographic hierarchy would divide
people by the area, city, street, block, etc. where they live. A
genealogical hierarchy could follow either maternal or paternal lines,
so that the repository hierarchy reflects the actual parent/child
relations in the real world.

BR,

Jukka Zitting