Date: Thu, 24 Aug 2006 13:22:08 -0700
From: "Dedian Guo"
To:
 java-user@lucene.apache.org, "Zhao, Xin"
Subject: Re: controlled library
In-Reply-To: <16ee01c6c790$2f7b36e0$963181a2@win.ad.jhu.edu>

I can think of a few options. You can create one doc for each MeSH term,
or use a different keyword field for each of your top 10 terms, such as
"mesh_1" ... "mesh_10". Or you can group your MeSH terms into one string
and add that to a single field, which requires a simple string parser for
the grouped string when you want to read the terms back.

Not sure if that works or answers your question.

On 8/24/06, Zhao, Xin wrote:
>
> Hi,
> I have a design question. Here is what we are trying to do for indexing:
> We designed an indexing tool to generate standard MeSH terms from medical
> citations, and then use Lucene to save the terms and citations for future
> search. The information we need to save is:
> a) the exact MeSH terms (top 10)
> b) the score for each term
> so the code looks like
> -----------------------------------
> for the top 10 MeSH Terms
>     myField = Field.Keyword("mesh", mesh.toLowerCase());
>     myField.setBoost(score);
>     doc.add(myField);
> end for
> ------------------------------------
> As you can see, we add all the terms under the same field name, "mesh".
> If I understand correctly, all the fields under the same name would
> eventually be saved into one field, with all the scores normalized into
> a single field boost. In this case, we wouldn't be able to save a separate
> score for each term, so the information is lost. Am I correct? Is there
> any way we could change it?
> I understand Lucene is for keyword search, and what we are trying to do
> is controlled vocabulary search. Is there any other tool we could use?
>
> Thank you,
> Xin
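
For the "group the terms into one string" option, here is a rough sketch of what the simple parser might look like. It is plain Java with no Lucene calls, so how the packed string is stored in the index is left out. The class name MeshTermCodec, the "|" and "^" separators, and the sample scores are all made up for illustration; the only real assumption is that neither separator appears inside a MeSH heading.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Lucene): packs term/score pairs into one string. */
final class MeshTermCodec {
    // Assumed separators. MeSH headings can contain commas and spaces,
    // so characters unlikely to occur in a heading are used instead.
    private static final String PAIR_SEP = "|";
    private static final String SCORE_SEP = "^";

    private MeshTermCodec() {}

    /** Joins the entries as "term^score|term^score|...", keeping map order. */
    static String pack(Map<String, Float> termScores) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Float> e : termScores.entrySet()) {
            if (sb.length() > 0) {
                sb.append(PAIR_SEP);
            }
            // Lowercase the term, as the original loop does before indexing.
            sb.append(e.getKey().toLowerCase(Locale.ROOT))
              .append(SCORE_SEP)
              .append(e.getValue());
        }
        return sb.toString();
    }

    /** Parses a packed string back into an ordered term-to-score map. */
    static Map<String, Float> unpack(String packed) {
        Map<String, Float> result = new LinkedHashMap<>();
        if (packed.isEmpty()) {
            return result;
        }
        for (String pair : packed.split(Pattern.quote(PAIR_SEP))) {
            // The score follows the last separator in each pair.
            int idx = pair.lastIndexOf(SCORE_SEP);
            String term = pair.substring(0, idx);
            float score = Float.parseFloat(pair.substring(idx + 1));
            result.put(term, score);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Float> top = new LinkedHashMap<>();
        top.put("Neoplasms", 0.92f);      // illustrative score, not real data
        top.put("Lung Diseases", 0.5f);   // illustrative score, not real data
        String packed = pack(top);
        System.out.println(packed);
        System.out.println(unpack(packed));
    }
}
```

The idea is that each term keeps its own score because the score travels inside the stored string rather than in the field boost. The terms themselves could still be added as "mesh" keyword fields for searching, with the packed string read back only when the per-term scores are needed.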