lucene-java-user mailing list archives

From selvakumar netaji <vvekselva...@gmail.com>
Subject Re: Combining The results from DB and Index Regd.,
Date Tue, 13 Nov 2012 07:29:27 GMT
Hi Arjen,

Thanks for the reply.

I have one question. If the index is updated often (every minute), will
search performance degrade? Is it a good approach to index the documents
often?


On Tue, Nov 13, 2012 at 12:43 PM, Arjen van der Meijden <
acmmailing@tweakers.net> wrote:

> On 13-11-2012 4:15 selvakumar netaji wrote:
>
>> Hi All,
>>
>>
>> We are using Lucene for searching data from the database in our enterprise
>> application.
>>
>> The searches would be in a single index, whose documents are indexed from
>> two different databases, A and B. Database A is updated continuously, i.e.
>> rows are inserted every minute, whereas database B is updated on a weekly
>> basis.
>>
>>
>> The problem is with the indexing of database A. For example, if indexing
>> completes at second t and a record (d1) is inserted at second t+1, then
>> a search for d1 will not find it in the index.
>>
>
> Is it really a problem that there is a window where an update from the
> database is not yet visible? Or do you just perceive it as a problem? I.e.
> is it something an end-user will (or did) complain about?
>
>
>> To avoid this data loss, the search can be performed in the index and in
>> the db (for records that are not yet in the index). The problem here is
>> that we won't be able to get score-based ordering from the database, and
>> there would be problems in combining the results from the db and from the
>> index. Is there any way to get the Lucene score for the search results
>> from the db?
>>
>> The other alternative would be to update the index every 30 seconds (or
>> more often) so that whenever the db gets updated, the index gets updated.
>> Is there any other solution to update the index directly whenever the db
>> gets updated? Can you please suggest one?
>>
>
> Perhaps you can convert it into something event-based, for instance with a
> message queue (JMS), which allows you to stream the updates as soon as you
> know they're made. Combined with NRT (near-real-time) search, you should
> be able to access the changes fairly quickly after they are made.
>
> But there will still be a window where the database is ahead of the search
> index.
>
>
>> The final solution I've thought of would be to have two indexes, one file
>> system index and an in-memory index. The file system index would be indexed
>> or updated on a daily basis and the in-memory index would be updated
>> whenever the db changes. So we'll search both indexes and combine the
>> results, since both have Lucene scores. That way there would not be any
>> data loss.
>>
>
> This scenario will also have a window where the database is ahead...
> Unless you start a transaction on the database and make that wait for the
> update on the search index.
>
> You could also try some tricks with the user interface, if the amount of
> results from the database is very small compared to the normal result. Say
> you have 100 results from the index and 3 from the db that are not yet in
> the index.
> You could present that as 'These 3 results are so new, they're not properly
> processed yet, and here are the 100 results that are fully processed.'
> That way, you leave the 'scoring' to the user.
>
> Best regards,
>
> Arjen
>
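
The two-index idea quoted above (search a file-system index and an in-memory
index, then combine by score) can be sketched without any Lucene calls. This
is only an illustration under stated assumptions: `Hit`, `ScoredMerge` and
`merge` are hypothetical names, not Lucene API, and the scores are assumed to
have been copied out of each index's search results. Note that scores
computed against two separate indexes are based on each index's own term
statistics, so they are not strictly comparable; treat the merged order as
approximate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ScoredMerge {
    /** Hypothetical hit type: a document key plus the score reported by one index. */
    static final class Hit {
        final String id;
        final float score;
        Hit(String id, float score) { this.id = id; this.score = score; }
    }

    /**
     * Merges hits from the in-memory (fresh) index and the on-disk index.
     * If the same id appears in both, the in-memory hit wins, on the
     * assumption that it reflects the newer database row.
     */
    static List<Hit> merge(List<Hit> fresh, List<Hit> onDisk) {
        Map<String, Hit> byId = new LinkedHashMap<>();
        for (Hit h : fresh) byId.put(h.id, h);
        for (Hit h : onDisk) byId.putIfAbsent(h.id, h);
        List<Hit> merged = new ArrayList<>(byId.values());
        // Highest score first; the sort is stable, so ties keep fresh hits ahead.
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return merged;
    }

    public static void main(String[] args) {
        List<Hit> fresh = Arrays.asList(new Hit("d1", 0.4f), new Hit("d2", 0.9f));
        List<Hit> onDisk = Arrays.asList(new Hit("d2", 0.5f), new Hit("d3", 0.7f));
        for (Hit h : merge(fresh, onDisk)) {
            System.out.println(h.id + " " + h.score);
        }
    }
}
```

Deduplicating by id matters here because a row that changed since the last
daily rebuild would otherwise show up twice, once from each index.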

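The user-interface trick suggested in the quoted reply (show the few
not-yet-indexed database matches separately from the ranked index results)
might look roughly like the sketch below. It assumes, purely for
illustration, that database rows carry a monotonically increasing id and that
the indexer records the highest id it has processed; `Row`, `PendingResults`,
`notYetIndexed` and `render` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PendingResults {
    /** Hypothetical database row matched by a plain SQL search. */
    static final class Row {
        final long id;
        final String title;
        Row(long id, String title) { this.id = id; this.title = title; }
    }

    /** Keeps only rows newer than the indexer's watermark (assumes increasing ids). */
    static List<Row> notYetIndexed(List<Row> dbMatches, long lastIndexedId) {
        List<Row> pending = new ArrayList<>();
        for (Row r : dbMatches) {
            if (r.id > lastIndexedId) pending.add(r);
        }
        return pending;
    }

    /** Renders unranked new rows first, then the ranked titles from the index. */
    static String render(List<Row> pending, List<String> rankedTitles) {
        StringBuilder sb = new StringBuilder();
        if (!pending.isEmpty()) {
            sb.append(pending.size()).append(" new result(s), not yet ranked:\n");
            for (Row r : pending) sb.append("  ").append(r.title).append('\n');
        }
        sb.append(rankedTitles.size()).append(" ranked result(s):\n");
        for (String t : rankedTitles) sb.append("  ").append(t).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Row> dbMatches = Arrays.asList(new Row(5, "old row"), new Row(11, "new row"));
        List<Row> pending = notYetIndexed(dbMatches, 10);
        System.out.print(render(pending, Arrays.asList("ranked a", "ranked b")));
    }
}
```

Because the new rows are never scored, no attempt is made to interleave them
with the ranked list; as the reply puts it, the scoring is left to the user.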