References: <9CABEBCD-664C-4110-97E6-B495D3FEA7F7@apache.org>
From: Ian Lea
Date: Tue, 6 Jul 2010 08:17:08 +0100
Subject: Re: Lucene Scoring
To: java-user@lucene.apache.org

You are calling the explain method incorrectly. You need something like

System.out.println(indexSearcher.explain(query, 0));

See the javadocs for details.

--
Ian.

On Tue, Jul 6, 2010 at 7:39 AM, manjula wijewickrema wrote:
> Dear Grant,
>
> Thanks a lot for your guidance. As you suggested, I tried to use the
> explain() method to get explanations of the relevance scoring. But once I
> call the explain() method, the system reports the following error.
>
> Error-
> 'The method explain(Query,int) in the type Searcher is not applicable for
> the arguments (String, int)'.
>
> In my code I call the explain() method as follows-
> Searcher.explain("rice",0);
>
> Possibly something is wrong with the way I am passing parameters. In my
> case, I have chosen "rice" as my query and indexed only one document.
>
> Could you please let me know what's wrong with this? I have also included
> the code with this.
>
> Thanx
> Manjula
>
> code-
>
> import org.apache.lucene.search.Searcher;
>
> public class LuceneDemo {
>
>     public static final String FILES_TO_INDEX_DIRECTORY = "filesToIndex";
>     public static final String INDEX_DIRECTORY = "indexDirectory";
>     public static final String FIELD_PATH = "path";
>     public static final String FIELD_CONTENTS = "contents";
>
>     public static void main(String[] args) throws Exception {
>         createIndex();
>         searchIndex("rice");
>     }
>
>     public static void createIndex() throws CorruptIndexException,
>             LockObtainFailedException, IOException {
>         SnowballAnalyzer analyzer = new SnowballAnalyzer("English",
>                 StopAnalyzer.ENGLISH_STOP_WORDS);
>         boolean recreateIndexIfExists = true;
>         IndexWriter indexWriter = new IndexWriter(INDEX_DIRECTORY, analyzer,
>                 recreateIndexIfExists);
>         File dir = new File(FILES_TO_INDEX_DIRECTORY);
>         File[] files = dir.listFiles();
>         for (File file : files) {
>             Document document = new Document();
>             String path = file.getCanonicalPath();
>             document.add(new Field(FIELD_PATH, path, Field.Store.YES,
>                     Field.Index.UN_TOKENIZED, Field.TermVector.YES));
>             Reader reader = new FileReader(file);
>             document.add(new Field(FIELD_CONTENTS, reader));
>             indexWriter.addDocument(document);
>         }
>         indexWriter.optimize();
>         indexWriter.close();
>     }
>
>     public static void searchIndex(String searchString)
>             throws IOException, ParseException {
>         System.out.println("Searching for '" + searchString + "'");
>         Directory directory = FSDirectory.getDirectory(INDEX_DIRECTORY);
>         IndexReader indexReader = IndexReader.open(directory);
>         IndexSearcher indexSearcher = new IndexSearcher(indexReader);
>         SnowballAnalyzer analyzer = new SnowballAnalyzer("English",
>                 StopAnalyzer.ENGLISH_STOP_WORDS);
>         QueryParser queryParser = new QueryParser(FIELD_CONTENTS, analyzer);
>         Query query = queryParser.parse(searchString);
>         Hits hits = indexSearcher.search(query);
>         System.out.println("Number of hits: " + hits.length());
>         TopDocs results = indexSearcher.search(query, 10);
>         ScoreDoc[] hits1 = results.scoreDocs;
>         for (ScoreDoc hit : hits1) {
>             Document doc = indexSearcher.doc(hit.doc);
>             //System.out.printf("%5.3f %s\n",hit.score,doc.get(FIELD_CONTENTS));
>             System.out.println(hit.score);
>             Searcher.explain("rice",0);
>         }
>         Iterator<Hit> it = hits.iterator();
>         while (it.hasNext()) {
>             Hit hit = it.next();
>             Document document = hit.getDocument();
>             String path = document.get(FIELD_PATH);
>             System.out.println("Hit: " + path);
>         }
>     }
> }
>
>
> On Mon, Jul 5, 2010 at 7:46 PM, Grant Ingersoll wrote:
>
>>
>> On Jul 5, 2010, at 5:02 AM, manjula wijewickrema wrote:
>>
>> > Hi,
>> >
>> > In my application, I input only a single-term query at a time and get
>> > back the corresponding scores for those queries. But I am struggling a
>> > little to understand Lucene scoring.
>> > I have referred to
>> > http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html
>> > and some other pages to resolve my questions, but some still remain.
>> >
>> > 1) Why does the score function take the square root of the frequency as
>> > the tf value, and the square of the idf value?
>>
>> Somewhat arbitrary, I suppose, but I think someone way back did some tests
>> and decided it performed "best" in general.  More importantly, the point of
>> the Similarity class is you can override these if you desire.
>>
>> >
>> > 2) If I enter a single-term query, what will coord(q,d) return?
>> > Since there is always one term in the query, I think it should always
>> > be 1! Am I correct?
>>
>> Should be.  You can run the explain() method to confirm.
>>
>> >
>> > 3) I am also struggling to understand sumOfSquaredWeights (in
>> > queryNorm(q)). As I understand it, this value depends on the nature of
>> > the query we input, and depending on that, it uses different query types
>> > such as TermQuery, MultiTermQuery, BooleanQuery, WildcardQuery,
>> > PhraseQuery, PrefixQuery, etc. But if I always use a single-term query,
>> > which of the above will the system select?
>>
>> The queryNorm is an attempt at making scores comparable across queries.
>> Again, I'd try the explain() method to see the practical aspects of how it
>> affects score.
>>
>> See http://lucene.apache.org/java/2_4_0/scoring.html for more info on
>> scoring.
>>
>> -Grant
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
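
For anyone following the three scoring questions above, here is a rough sketch of how the factors on the Similarity javadoc page combine for a single-term query. It is plain Java with no Lucene dependency, so it only approximates what explain() prints: the class name SingleTermScoreSketch and the sample numbers in main() are made up for illustration, fieldNorm is passed in as an assumed value rather than decoded from the index, the query boost is assumed to be 1, and Lucene itself works in float rather than double.

```java
/**
 * Illustration only: the default tf, idf, queryNorm and coord factors
 * as described on the Similarity javadoc page, applied to one query term.
 */
final class SingleTermScoreSketch {

    /** tf(t in d): square root of the term frequency in the document. */
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    /** idf(t): 1 + ln(numDocs / (docFreq + 1)). */
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    /** queryNorm(q): 1 / sqrt(sumOfSquaredWeights). */
    static double queryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    /**
     * Score of one document for a single-term query.
     * fieldNorm is an assumed input here (in a real index it is stored per field).
     */
    static double score(int freq, int docFreq, int numDocs, double fieldNorm) {
        double boost = 1.0;                       // assumption: no query boost
        double idf = idf(docFreq, numDocs);
        // With one term, the sum has a single entry: (idf * boost)^2.
        double sumOfSquaredWeights = (idf * boost) * (idf * boost);
        double coord = 1.0;                       // 1 of 1 query terms matched
        return coord * queryNorm(sumOfSquaredWeights)
                * tf(freq) * idf * idf * boost * fieldNorm;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: one indexed document, term occurs 4 times,
        // fieldNorm assumed to be 0.5.
        System.out.println("tf    = " + tf(4));
        System.out.println("idf   = " + idf(1, 1));
        System.out.println("score = " + score(4, 1, 1, 0.5));
    }
}
```

With a single term, queryNorm works out to 1/idf, so it cancels one of the two idf factors and the score reduces to tf * idf * fieldNorm. That is why the squared idf is not visible in the final number for single-term queries, and why coord plays no role.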