lucene-general mailing list archives

From sunnyfr <johanna...@gmail.com>
Subject Re: commit often and lot of data cost too much?
Date Wed, 01 Apr 2009 08:06:42 GMT

Yep, but we won't change the system now :(
Or maybe I could have two kinds of update scheme?
One during the day that only adds the new videos (just new data), and another
one at night that updates all the characteristics of the videos: a full
update nightly and a light update for new videos during the day.
What do you think?
The other characteristics are not that important, but they are used for
filters, most viewed, comments ...

:)
Thanks Ted


Ted Dunning wrote:
> 
> What kind of updates are these?  New documents?  Small changes to existing
> documents?
> 
> Are the changing fields important for searching?
> 
> If the updates are not involved in searches, then it would be much better
> to put the non-searched characteristics onto an alternative storage system.
> That would drive down the update rate dramatically and leave you with a
> pretty simple system.
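> 
> A minimal sketch of that split (SolrJ on the search side; the stats
> store and the Stats class are placeholders -- memcached, a DB table,
> whatever you like):
> 
>     import java.util.ArrayList;
>     import java.util.List;
>     import java.util.Map;
>     import org.apache.solr.client.solrj.SolrQuery;
>     import org.apache.solr.client.solrj.SolrServer;
>     import org.apache.solr.client.solrj.response.QueryResponse;
>     import org.apache.solr.common.SolrDocument;
> 
>     // Search in Solr, then pull the fast-changing counters from the
>     // external store instead of reindexing them.
>     List<Stats> search(SolrServer solr, Map<String, Stats> statsStore,
>                        String query) throws Exception {
>         QueryResponse rsp = solr.query(new SolrQuery(query));
>         List<Stats> out = new ArrayList<Stats>();
>         for (SolrDocument d : rsp.getResults()) {
>             String id = (String) d.getFieldValue("id");
>             out.add(statsStore.get(id));  // fresh counters, no reindex
>         }
>         return out;
>     }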
> 
> If the updates *are* involved in searches, then you might consider using a
> system more like Katta than Solr.  You can then create a new shard out of
> the update and broadcast a mass delete to all nodes just before adding the
> new shard to the system.  This has the benefit of very fast updates and
> good balancing, but has the defect that you don't have persistence of your
> deletes until you do a full index again.  Your search nodes could write the
> updated index back to the persistent store, but that is scary without
> something like Hadoop to handle failed updates.
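> 
> In plain Lucene terms the node-local half looks roughly like this
> (Katta's own deploy client would drive the add-shard step across the
> nodes; deltaDir, analyzer, nodeWriter and the updated-doc lists are
> assumed to exist already):
> 
>     import org.apache.lucene.document.Document;
>     import org.apache.lucene.index.IndexWriter;
>     import org.apache.lucene.index.Term;
> 
>     // Build the new shard out of just the updated documents.
>     IndexWriter delta = new IndexWriter(deltaDir, analyzer, true,
>             IndexWriter.MaxFieldLength.LIMITED);
>     for (Document doc : updatedDocs) {
>         delta.addDocument(doc);
>     }
>     delta.close();
> 
>     // On every search node, just before the new shard goes live,
>     // mass-delete the old copies of the updated documents:
>     for (String id : updatedIds) {
>         nodeWriter.deleteDocuments(new Term("id", id));
>     }
>     nodeWriter.commit();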
> 
> On Tue, Mar 31, 2009 at 6:51 AM, sunnyfr <johanna.34@gmail.com> wrote:
> 
>>
>> I have about 14M documents, and my index is about 11G.
>> At the moment I update about 30,000 documents every 20 minutes.
>> Lucene is always merging data. What would you recommend?
>> Replication costs too much for the slaves; they always bring back whole
>> new index directories instead of just the new segments.
>>
>> Is there a way to get around this issue? What would you recommend to
>> people who need fresh updates on the slaves with a big amount of data?
>> Thanks a lot,
>>
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/commit-often-and-lot-of-data-cost-too-much--tp22804941p22821675.html
Sent from the Lucene - General mailing list archive at Nabble.com.

