lucene-dev mailing list archives

From DM Smith <dmsmith...@gmail.com>
Subject Re: Basic Question on Documents and File Format
Date Fri, 11 Nov 2005 16:29:35 GMT
Ashwin Satyanarayana wrote:

>Hello,
> 
>I am new to Lucene. I was trying to use Lucene with TREC-6 data. The TREC-6
>dataset, used in 1997, contains many input files. Each input file has multiple
>documents (some files contain over 200 documents) tagged by DOCNO. The result
>given by Lucene to a query is a list of files and not documents.
> 
>Q1) Is there a way of getting the query results in terms of documents
>within the files rather than files (without modifying the code)?
>  
>
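In Lucene, a Document object is the unit of indexing, storage, and search. 
It may or may not correspond to a user's view of files or documents. 
Search results are lists of Documents, so if each input file was indexed 
as one Document, each hit will be a whole file.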

> 
>Q2) If the above is not possible, what would be the best way to modify
>the code?
>  
>
To achieve what you want, I think you need to index (and, if you want to 
get the text back, store) each of your documents as its own Lucene 
Document. You may also want to store the file name and the document 
identifier (DOCNO) as fields in that Document.
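
As a rough sketch of the splitting step only, the code below cuts one 
input file into per-document records using plain Java. It assumes each 
document is wrapped in <DOC>...</DOC> and carries a single 
<DOCNO>...</DOCNO>; check that against your files. TrecSplitter, 
TrecRecord, and split are made-up names for illustration, not part of 
Lucene. Each record would then be added to the index as one Lucene 
Document; the exact Field calls depend on your Lucene version, so they 
are left out here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrecSplitter {

    /** One logical document inside an input file (hypothetical holder class). */
    public static final class TrecRecord {
        public final String fileName; // source file, worth keeping as a stored field
        public final String docNo;    // contents of the DOCNO tag
        public final String body;     // the rest of the DOC block, to be indexed

        TrecRecord(String fileName, String docNo, String body) {
            this.fileName = fileName;
            this.docNo = docNo;
            this.body = body;
        }
    }

    // Assumption: documents are delimited by <DOC>...</DOC> with one <DOCNO> each.
    private static final Pattern DOC =
            Pattern.compile("<DOC>(.*?)</DOC>", Pattern.DOTALL);
    private static final Pattern DOCNO =
            Pattern.compile("<DOCNO>(.*?)</DOCNO>", Pattern.DOTALL);

    /** Splits the content of one input file into its individual documents. */
    public static List<TrecRecord> split(String fileName, String content) {
        List<TrecRecord> records = new ArrayList<>();
        Matcher doc = DOC.matcher(content);
        while (doc.find()) {
            String block = doc.group(1);
            Matcher no = DOCNO.matcher(block);
            if (!no.find()) {
                continue; // skip blocks that have no identifier
            }
            String docNo = no.group(1).trim();
            // Drop the DOCNO element so the identifier is not part of the body text.
            String body = (block.substring(0, no.start())
                    + block.substring(no.end())).trim();
            records.add(new TrecRecord(fileName, docNo, body));
        }
        return records;
    }

    public static void main(String[] args) {
        String sample =
                "<DOC>\n<DOCNO> A-1 </DOCNO>\n<TEXT>first</TEXT>\n</DOC>\n"
              + "<DOC>\n<DOCNO> A-2 </DOCNO>\n<TEXT>second</TEXT>\n</DOC>\n";
        for (TrecRecord r : split("sample.txt", sample)) {
            // This is the point where one Lucene Document per record would be
            // created, with fileName and docNo as stored fields and body indexed.
            System.out.println(r.fileName + " " + r.docNo);
        }
    }
}
```

With the index built this way, every hit is a single DOCNO-level document, 
and the stored file name tells you which input file it came from.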

> 
>Thanks and Regards,
>Ashwin
>
Questions on how to use Lucene should be addressed to the Lucene users 
mailing list. This one is for developers working on Lucene itself.
