lucene-dev mailing list archives

From DM Smith
Subject Re: Basic Question on Documents and File Format
Date Fri, 11 Nov 2005 16:29:35 GMT
Ashwin Satyanarayana wrote:

>I am new to Lucene. I was trying to use Lucene with TREC-6 data. The TREC-6
>dataset, used in 1997, contains many input files. Each input file has multiple
>documents (some files contain over 200 documents) tagged by DOCNO. The result
>given by Lucene to a query is a list of files and not documents.
>Q1) Is there a way of getting the query results in terms of documents
>within the files rather than files (without modifying the code)?
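In Lucene, a Document object is the unit of search, storage, and indexing.
It may or may not correspond to a user's view of files or documents. A
query returns whatever was indexed as a Document, so if each file was added
as one Document, the results are files.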

>Q2) If the above is not possible, what would be the best way to modify
>the code?
To achieve what you want, I think you need to store and/or index each of
your documents (each DOCNO entry) as its own Lucene Document. You may also
want to store the file name and document identifier as Lucene fields in
that Document.
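
A minimal sketch of the splitting step, in plain Java with no Lucene calls.
It assumes each TREC entry is wrapped in <DOC>...</DOC> and carries its
identifier in <DOCNO>...</DOCNO>; the class and method names (TrecSplitter,
Entry, split) are made up for illustration. Each Entry it returns is what
you would then add to the index as one Lucene Document, with fileName and
docNo as stored fields and body as the indexed text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrecSplitter {
    /** One TREC entry: the unit that would become one Lucene Document. */
    public static final class Entry {
        public final String fileName; // source file, kept so a hit can be traced back
        public final String docNo;    // identifier taken from the DOCNO tag
        public final String body;     // remaining text of the entry

        Entry(String fileName, String docNo, String body) {
            this.fileName = fileName;
            this.docNo = docNo;
            this.body = body;
        }
    }

    // Assumed layout: <DOC> ... <DOCNO> id </DOCNO> ... </DOC>
    private static final Pattern DOC =
            Pattern.compile("<DOC>(.*?)</DOC>", Pattern.DOTALL);
    private static final Pattern DOCNO =
            Pattern.compile("<DOCNO>\\s*(.*?)\\s*</DOCNO>", Pattern.DOTALL);

    /** Splits the content of one input file into its individual entries. */
    public static List<Entry> split(String fileName, String content) {
        List<Entry> entries = new ArrayList<>();
        Matcher doc = DOC.matcher(content);
        while (doc.find()) {
            String block = doc.group(1);
            Matcher id = DOCNO.matcher(block);
            if (!id.find()) {
                continue; // skip entries that carry no DOCNO
            }
            // Drop the DOCNO element itself and keep the rest as the body.
            String body = (block.substring(0, id.start())
                    + block.substring(id.end())).trim();
            entries.add(new Entry(fileName, id.group(1), body));
        }
        return entries;
    }

    public static void main(String[] args) {
        String sample = "<DOC>\n<DOCNO> D-1 </DOCNO>\nfirst text\n</DOC>\n"
                + "<DOC>\n<DOCNO> D-2 </DOCNO>\nsecond text\n</DOC>\n";
        for (Entry e : split("sample-file", sample)) {
            System.out.println(e.fileName + " | " + e.docNo + " | " + e.body);
        }
    }
}
```

With one Document per Entry, a query hit identifies a DOCNO rather than a
whole file, and the stored file name tells you where it came from.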

>Thanks and Regards,
Questions on how to use Lucene should be addressed to the Lucene users
mailing list. This one is for developers working on Lucene itself.
