lucene-java-user mailing list archives

From Wojciech Strzałka <>
Subject Re[2]: Frequently updated fields
Date Sat, 13 Sep 2008 10:22:18 GMT

   My strong requirement is that the search server runs on a different machine than
   the client, so I think I have two options: SOLR or Lucene via RMI.
   So far it looks like I have several options:

   1. TagIndex - as described here, it looks like
   exactly what I need. Unfortunately it's not yet committed, so
   patching the source code will be required.
   It probably will not work with SOLR soon.
   2. SOLR updatable fields - I'm not really sure whether this
   avoids reindexing the whole document, or whether the only benefit is that I don't
   need to push the content to the server while it's reindexed internally anyway.
   It's not yet committed either.
   3. Writing a custom filter - it looks like the most stable option at the
   moment, but I'm not sure about a few things:
   - will I be able to use this with SOLR, i.e. set up and then control
   the filter (which needs to be dynamic) via some URL params or the query itself?
   - I'll need to query the DB for metadata, which I think will be fast
   but complicates the architecture

   The question is how advanced 1 & 2 are - are they ready for
   production? Will further work change their behavior or data?
   The filter solution is also possible - I'd love to use it, if I can use it
   inside SOLR.
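
   To make option 3 concrete, here is a minimal sketch of the idea as I
   understand it: the DB metadata is turned into a bit set with one bit per
   index document, and the full-text hits are kept only where the bit is set.
   This is plain Java and does not use any Lucene or SOLR classes; the class
   name MetadataFilterSketch, the methods unreadBits and applyFilter, and the
   "unread" flag are all hypothetical, and it assumes the DB can give me the
   flag keyed by the index document id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: plain Java, no Lucene or SOLR classes involved.
class MetadataFilterSketch {

    // Build a bit set with one bit per index document id. A bit is set when
    // the (assumed) DB row for that document says the mail is unread.
    static BitSet unreadBits(Map<Integer, Boolean> unreadByDocId, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (Map.Entry<Integer, Boolean> e : unreadByDocId.entrySet()) {
            int docId = e.getKey();
            if (e.getValue() && docId >= 0 && docId < maxDoc) {
                bits.set(docId);
            }
        }
        return bits;
    }

    // Keep only the full-text hits whose bit is set, preserving rank order.
    static List<Integer> applyFilter(List<Integer> rankedHits, BitSet allowed) {
        List<Integer> kept = new ArrayList<>();
        for (int docId : rankedHits) {
            if (allowed.get(docId)) {
                kept.add(docId);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Stand-in for the metadata that would come from the DB.
        Map<Integer, Boolean> unread = new HashMap<>();
        unread.put(0, true);
        unread.put(1, false);
        unread.put(2, true);
        unread.put(3, false);

        BitSet allowed = unreadBits(unread, 4);
        // Stand-in for document ids returned by the full-text query, best first.
        List<Integer> hits = Arrays.asList(3, 2, 0);
        System.out.println(applyFilter(hits, allowed)); // prints [2, 0]
    }
}
```

   With this shape the mail content never needs reindexing when a flag
   changes; only the bit set has to be rebuilt from the DB, which is the
   extra architecture I mentioned above.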

> Yes Tag Index will work.  I have not had time to complete it however
> if you are interested in working on it please feel free to contact me.

> On Fri, Sep 12, 2008 at 3:48 PM, Mark Miller <> wrote:
>> You might check out the tagindex issue in jira as well. Haven't looked at it
>> myself, but I believe it's supposed to be an option for this.
>> Gerardo Segura wrote:
>>> I think the important question is: in general how to cope with frequently
>>> changing fields.
>>> Karl Wettin wrote:
>>>> Hi Wojciech,
>>>> can you please give us a bit more specific information about the meta
>>>> data fields that will change? I would recommend looking at creating
>>>> filters from your primary persistence store for query clauses such as unread/read,
>>>> mailbox folders, etc.
>>>>      karl
>>>> On 12 Sep 2008, at 13:57, Wojciech Strzałka wrote:
>>>>> Hi.
>>>>>  I'm new to Lucene and I would like to get a few answers (they can
>>>>>  be lame)
>>>>>  I want to index a large amount of emails using Lucene (maybe SOLR), not only
>>>>>  the contents but also some metadata like state or flags. The
>>>>>  problem is that the metadata will change during the mail lifecycle;
>>>>>  although it is much smaller, updating this information will require
>>>>>  reindexing the whole mail content, which I see as a performance bottleneck.
>>>>>  I have the data in a DB as well, so my first questions are:
>>>>>  - are there any best practices to implement my needs (querying both
>>>>>  lucene & DB and then merging in memory? close one eye and re-index
>>>>>  the whole content on every metadata change? others?)
>>>>>  - is Lucene a good solution for my problem at all?
>>>>>  - are there any plans to implement field updates in a more efficient way than
>>>>>  deleting/inserting the whole document? if yes, what's the time horizon?
>>>>>                                       Best regards
>>>>>                                              Wojtek
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail:
>>>>> For additional commands, e-mail:


 Wojciech Strzalka
