Mailing-List: contact lucene-user-help@jakarta.apache.org; run by ezmlm
Precedence: bulk
Reply-To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Received-SPF: pass (hermes.apache.org: local policy)
Message-ID: <41530200.4040403@getopt.org>
Date: Thu, 23 Sep 2004 19:04:00 +0200
From: Andrzej Bialecki <ab@getopt.org>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
 rv:1.7.2) Gecko/20040803
MIME-Version: 1.0
To: Lucene Users List <lucene-user@jakarta.apache.org>
Subject: Re: Clustering lucene's results
References: <BAY8-F35nXAwIvOTDlT00036f91@hotmail.com>
 <4152D0EF.1050404@cs.put.poznan.pl>
In-Reply-To: <4152D0EF.1050404@cs.put.poznan.pl>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Dawid Weiss wrote:
> Hi William,
> 
> No, I don't have examples because I never used Lucene directly. If you 
> provide me with a sample index and an API that executes a query on this 
> index (I need document titles, summaries, or snippets and an anchor 
> (identifier), can be an URL).

Hi Dawid :-)

I believe the approach to this component should be that you first 
initialize it by reading a mapping of Lucene index field names to 
"logical" names (metadata) like title, url, body, etc. The reason is 
that each index uses its own metadata schema, i.e. in Lucene-speak, the 
field names.

Moreover, when you execute a query you get just a document id plus its 
score. It's up to you to build a snippet. There is a code in the 
jakarta-lucene-sandbox CVS repo. (highlighter) to create snippets from 
the query and the hit list, take a look at this...

> 
> Send me such a snippet and I'll try to write the integration code with 
> Lucene. It is only a matter of writing a simple InputComponent instance 
> and this is really trivial (see Nutch's plugin code).

The basic usage scenario is that you open the IndexReader (either using 
directory name as a String or a Directory instance), and then create a 
Query instance, usually using QueryParser, and finally you search using 
IndexSearcher. You get a list of Hits, which you can use to get scores, 
and the contents of the documents. Take a look at the IndexFiles and 
SearchFiles classes in org.apache.lucene.demo package (under /src/demo).

-- 
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org