lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Daniel Cortes <dcor...@fib.upc.edu>
Subject Re: index question
Date Mon, 27 Dec 2004 11:49:15 GMT
A lot of thks Nader, I try now, and I tell you the results.
thks

Nader Henein wrote:

> ok, so you can index the whole document in one shot, but you should 
> store certain fields like what you display in the search results in 
> the index to avoid a round trip to the DB.
>
> so for example you would store "title" "synopsis" "link" "doc_id" 
> "date" and then just index what you want to be searchable, the reason 
> why you would have title stored in one field and indexed again in 
> another so if you stem that field it will become useless for display 
> purposes.  So the logical representation of your index would look 
> something like this:
>
> <document>
>    <id> stored/ indexed
>    <title> stored/ un-indexed
>    <synopsis> stored/ un-indexed
>    <date> stored / indexed
>    <full document stemmed>  indexed / un stored
> </document>
>
> Enjoy
>
> Nader Henein
>
>
> Daniel Cortes wrote:
>
>> thks nader
>> I need a general search of documents, it's for this that I ask yours 
>> recomendations, because fields are only for info in the search. 
>> Tipically search on Google for example
>>
>> search:casa
>>
>> La casa roja
>> ..habĂ­a una vez una casa roja que tenia ....
>> htttp:\\go.to\casa    Modification date:25-12-04
>>
>> for do this  what fields and options (keybord,text,unindex,unstored) 
>> do you should use?
>>
>> thks
>>
>> Nader Henein wrote:
>>
>>> It comes down to your searching needs, do you need to have your 
>>> documents searcheable by these fields or do you need a general 
>>> search of the whole document, your decisions will impact the size of 
>>> the index and the speed of indexing and searching so give it due 
>>> thought, start from your GUI requirement and design the index that 
>>> responds to your user needs best.
>>>
>>> Nader
>>>
>>> Daniel Cortes wrote:
>>>
>>>> I want to know In the case that you use Lucene for index files how 
>>>> a general searcher, what fields (or keys) do you use to index.
>>>> For example, in my case are html,pdf,doc,ppt and txt and I'm 
>>>> thinked to use Field Autor, Field title, field url, field content, 
>>>> field modification date.
>>>> Something more? some recommendation?
>>>> thks
>>>> and Merry Xmas for all.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>>> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>>>
>>>>
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message