lucene-java-user mailing list archives

From Dmitry Serebrennikov <dmit...@earthlink.net>
Subject Re: indexing size
Date Thu, 09 Sep 2004 00:50:43 GMT
Niraj Alok wrote:

>Hi PA,
>
>Thanks for the detail ! Since we are using lucene to store the data also, I
>guess I would not be able to use it.
>  
>
By the way, I could be wrong, but I think the 35% figure you referenced
in your first e-mail does not include any stored fields. The point of
that figure, I think, was to illustrate that the index data structures
Lucene uses for searching are efficient. But Lucene does nothing special
with stored content - no compression or anything like that. So you end
up with the full size of your stored data plus about 35% of the size of
the indexed data.
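
As a rough back-of-the-envelope sketch of that arithmetic (the class and
method names here are made up for illustration, and the 0.35 ratio is only
the figure quoted in this thread, not something Lucene guarantees):

```java
// Rough estimate only: real ratios depend on the analyzer and the data.
final class IndexSizeEstimate {
    // Assumed ratio of index structures to indexed text (figure from this thread).
    static final double INDEX_OVERHEAD_RATIO = 0.35;

    // Stored content is kept as-is; indexed content adds roughly 35% of its size.
    static long estimateBytes(long storedBytes, long indexedBytes) {
        return storedBytes + Math.round(indexedBytes * INDEX_OVERHEAD_RATIO);
    }

    public static void main(String[] args) {
        long textBytes = 100L * 1024 * 1024; // hypothetical 100 MB of text
        System.out.println("indexed and stored: " + estimateBytes(textBytes, textBytes));
        System.out.println("indexed only:       " + estimateBytes(0, textBytes));
    }
}
```

So if the same text is both stored and indexed, the estimate comes out at
about 1.35 times the original size, versus about 0.35 times for an index
with no stored fields.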

Cheers.
Dmitry.

>Regards,
>Niraj
>----- Original Message -----
>From: "petite_abeille" <petite_abeille@mac.com>
>To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>Sent: Wednesday, September 01, 2004 1:14 PM
>Subject: Re: indexing size
>
>
>  
>
>>Hi Niraj,
>>
>>On Sep 01, 2004, at 06:45, Niraj Alok wrote:
>>
>>    
>>
>>>If I make some of them Field.UnStored, I can see from the javadocs that
>>>it will be indexed and tokenized but not stored. If it is not stored,
>>>how can I use it while searching?
>>>      
>>>
>>The different types of fields don't affect how you do your search.
>>Searching always works the same way.
>>
>>Using Unstored fields simply means that you use Lucene as a pure index
>>for search purposes only, not for storing any data.
>>
>>Specifically, the assumption is that your original data lives somewhere
>>else, outside of Lucene. If this assumption is true, then you can index
>>everything as Unstored with the addition of one Keyword per document.
>>The Keyword field holds some sort of unique identifier which allows you
>>to retrieve the original data if necessary (e.g. a primary key, a URI,
>>or whatnot).
>>
>>Here is an example of this approach:
>>
>>(1) For indexing, check the indexValuesWithID() method
>>
>>http://cvs.sourceforge.net/viewcvs.py/zoe/ZOE/Frameworks/SZObject/SZIndex.java?view=markup
>>
>>Note the addition of a Field.Keyword for each document and the use of
>>Field.UnStored for everything else.
>>
>>(2) For fetching, check objectsWithSpecificationAndHitsInStore()
>>
>>http://cvs.sourceforge.net/viewcvs.py/zoe/ZOE/Frameworks/SZObject/SZFinder.java?view=markup
>>
>>HTH.
>>
>>Cheers,
>>
>>PA.
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>    
>>
>
>  
>
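
For what it's worth, the pattern PA describes in the quoted message (index
the text without storing it, keep one unique identifier per document, and
fetch the real data from wherever it lives) can be sketched in plain Java.
This is not Lucene code and not the ZOE implementation: IdOnlyIndex, the
store map and the document IDs are all made up for illustration, and a Map
stands in for the external data source.

```java
import java.util.*;

// Plain-Java illustration of the "unstored text + one ID per document" pattern.
// This is NOT Lucene's API; all names here are hypothetical.
final class IdOnlyIndex {
    // term -> IDs of the documents containing it; the text itself is never kept
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Tokenize and index the text, remembering only the document's ID.
    void add(String id, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
    }

    // Return the IDs of matching documents (empty set if the term is unknown).
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        // Hypothetical external store (a database, files, etc. in practice).
        Map<String, String> store = new HashMap<>();
        store.put("doc-1", "Lucene is a search library");
        store.put("doc-2", "Stored fields are kept verbatim");

        IdOnlyIndex index = new IdOnlyIndex();
        store.forEach(index::add);

        // The search yields IDs only; the original data comes from the store.
        for (String id : index.search("stored")) {
            System.out.println(id + " -> " + store.get(id));
        }
    }
}
```

The search side never depends on the text being stored: a hit only has to
carry the identifier, which is the role the Keyword field plays in PA's
description.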

