Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: java-user@lucene.apache.org
Received-SPF: neutral (athena.apache.org: local policy)
Message-ID: <48CCC608.8070808@correobancomer.com>
Date: Sun, 14 Sep 2008 03:06:32 -0500
From: Gerardo Segura <gsegura@correobancomer.com>
User-Agent: Thunderbird 2.0.0.16 (Windows/20080708)
MIME-Version: 1.0
To: java-user@lucene.apache.org
Subject: Re: Frequently updated fields
References: <1221781428.20080912135749@gmail.com>
 <6E7487CD-2836-4B3C-9940-67F989F018E8@gmail.com>
 <1999039569.20080912145111@gmail.com>
In-Reply-To: <1999039569.20080912145111@gmail.com>
Content-Type: text/plain; charset=ISO-8859-2; format=flowed
Content-Transfer-Encoding: 8bit

I had similar requirements: some fields didn't require text processing;
they were just used as filters to focus the search on a subset of
documents in Solr. As Karl suggested, implementing a filter was the most
direct approach for me.

The issue was that, not being familiar with Solr myself, I couldn't
manage to integrate my filter without modifying SolrIndexSearcher. The
change was basically to replace every invocation of

          searcher.search(query, new HitCollector() { ... }) ;
with
          searcher.search(query, myCustomFilter, new HitCollector() { ... }) ;

myCustomFilter is an instance of TermsFilter with the documents' keys
added, based on a query against an external database. Minor changes were
also made in SolrCore.java to be able to declare the filter in
solrconfig.xml.
It worked OK, but I always wondered whether that was the best way to
integrate the filter.
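
To illustrate the idea without the Solr plumbing, here is a minimal
plain-Java sketch. It is not the Lucene or Solr API: the class and method
names (ExternalKeyFilterSketch, buildFilter, filterHits) and the sample
keys are hypothetical. It only models what the filter does, assuming the
keys fetched from the database can be mapped to document ids: set one bit
per allowed document, then pass on only the query hits whose bit is set.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a key-based filter; not a Lucene class.
final class ExternalKeyFilterSketch {

    // Sets one bit for every document whose key was returned by the
    // external database. Keys unknown to the index are ignored.
    static BitSet buildFilter(Map<String, Integer> docIdByKey,
                              Set<String> allowedKeys, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (String key : allowedKeys) {
            Integer docId = docIdByKey.get(key);
            if (docId != null) {
                bits.set(docId);
            }
        }
        return bits;
    }

    // Keeps only the query hits that the filter allows, in hit order.
    static List<Integer> filterHits(int[] queryHits, BitSet filter) {
        List<Integer> kept = new ArrayList<>();
        for (int docId : queryHits) {
            if (filter.get(docId)) {
                kept.add(docId);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed mapping from the document key field to document ids.
        Map<String, Integer> docIdByKey =
                Map.of("mail-1", 0, "mail-2", 1, "mail-3", 2);
        // Assumed result of a database query such as "all unread mails".
        Set<String> unreadKeys = Set.of("mail-1", "mail-3", "mail-9");

        BitSet filter = buildFilter(docIdByKey, unreadKeys, 3);
        int[] queryHits = {1, 2}; // documents matching the text query
        System.out.println(filterHits(queryHits, filter)); // prints [2]
    }
}
```

The point of this design is that a status change only touches the
database; the index is left alone and the bit set is rebuilt from a fresh
database query when needed.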

regards,

Wojciech Strzałka wrote:
> The most frequently changing fields will, I think, be:
>   Status (read/unread):  in fact this is what I'm most afraid of - any
>                          mail incoming to the system will need to be indexed at least twice
>   Flags:   0..n values from enum
>   Tags:    0..n values from enum
>
> Of course all the other fields can also change - even content in draft messages
> (it's live content, not archival) - but in such a case I'm ready to go
> with the re-indexing.
>   
>> Hi Wojciech,
>>     
>> can you please give us a bit more specific information about the
>> metadata fields that will change? I would recommend looking at
>> creating filters from your primary persistence store for query
>> clauses such as unread/read, mailbox folders, etc.
>>     
>>        karl
>>     
>> On 12 Sep 2008, at 13:57, Wojciech Strzałka wrote:
>>     
>>> Hi.
>>>
>>>   I'm new to Lucene and I would like to get a few answers (they can
>>>   be lame)
>>>
>>>   I want to index a large number of emails using Lucene (maybe Solr),
>>>   not only the contents but also some metadata like state or flags. The
>>>   problem is that the metadata will change during the mail lifecycle;
>>>   although it is much smaller, updating it will require reindexing
>>>   the whole mail content, which I see as a performance bottleneck.
>>>
>>>   I have the data in DB also so my first question is:
>>>
>>>   - are there any best practices to implement my needs (querying both
>>>   lucene & DB and then merging in memory?, close one eye and re-index
>>>   the whole content on every metadata change? others?)
>>>
>>>   - is Lucene a good solution for my problem at all?
>>>
>>>   - are there any plans to implement field updates in a more efficient
>>>   way than deleting/inserting the whole document? If yes, what's the
>>>   time horizon?
>>>
>>>
>>>                                        Best regards
>>>                                               Wojtek
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>

