From: "Gerard Sychay"
Date: Mon, 26 Apr 2004 16:36:22 -0400
List-Id: "Lucene Users List"
Subject: Re: Adding duplicate Fields to Documents

As two people have already stated, fields that are not tokenised are stored separately. So a single document with two fields of the same name can be retrieved by searching for either of the field values.

However, retrieving all of the field values seems to be impossible. That is, given ("field_name", "keyword1") and ("field_name", "keyword2"), using doc.get("field_name") always returns "keyword2", the last value added. Of course, I can't really think of a scenario where this would be a problem.

Thanks for the help!
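A toy model may help picture the behaviour reported above. It is only a sketch in plain Java: MiniDocument, getAll and matches are hypothetical names invented for illustration and are not part of Lucene. It assumes each added field is kept as its own entry and that a single-value lookup returns the most recently added value, as observed in the test.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: MiniDocument is a hypothetical stand-in, not Lucene's Document class.
public class DuplicateFieldsSketch {

    static class MiniDocument {
        // Each add() keeps its own (name, value) entry; nothing is concatenated.
        private final List<String[]> fields = new ArrayList<>();

        void add(String name, String value) {
            fields.add(new String[] {name, value});
        }

        // Models the observed single-value lookup: the last value added wins.
        String get(String name) {
            for (int i = fields.size() - 1; i >= 0; i--) {
                if (fields.get(i)[0].equals(name)) {
                    return fields.get(i)[1];
                }
            }
            return null;
        }

        // Hypothetical helper: every value stored under a name, in insertion order.
        List<String> getAll(String name) {
            List<String> values = new ArrayList<>();
            for (String[] f : fields) {
                if (f[0].equals(name)) {
                    values.add(f[1]);
                }
            }
            return values;
        }

        // Models an untokenised (keyword) match: either value matches on its own.
        boolean matches(String name, String value) {
            return getAll(name).contains(value);
        }
    }

    public static void main(String[] args) {
        MiniDocument doc = new MiniDocument();
        doc.add("field_name", "keyword1");
        doc.add("field_name", "keyword2");

        System.out.println(doc.get("field_name"));                  // keyword2
        System.out.println(doc.getAll("field_name"));               // [keyword1, keyword2]
        System.out.println(doc.matches("field_name", "keyword1"));  // true
    }
}
```

In this model "keyword1keyword2" never exists as a value, which matches the search results described above. Lucene's own Document class documents a getValues(String) method in at least some releases; whether it is available in the version tested here is not confirmed by this thread.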

>>> Gerard Sychay 04/26/04 01:57PM >>>
Luke is a good idea. I'll also just write a simple test program and play around with it (something I probably should have done before posting) and then post my findings here.

>>> Stephane James Vaucher 04/24/04 02:02PM >>>
From my experience (that is little experience;)), fields that are not tokenised, are stored separately. Someone more qualified can surely give you more details.

You can look at your index with Luke, it might be insightful.

sv

On Thu, 22 Apr 2004, Gerard Sychay wrote:

> Hello,
>
> I am wondering what happens when you add two Fields with same names to
> a Document. The API states that "if the fields are indexed, their text
> is treated as though appended." This much makes sense. But what about
> the following two cases:
>
> - Adding two fields with same name that are indexed, not tokenized
> (keywords)? E.g. given ("field_name", "keyword1") and ("field_name",
> "keyword2"), would the final keyword field be ("field_name",
> "keyword1keyword2")? Seems weird..
>
> - Adding two fields with same name that are stored, but not indexed and
> not tokenized (e.g. database keys)? Are they appended (which would mess
> up the database key when retrieved from the Hit)?