directory-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Emmanuel Lécharny <elecha...@gmail.com>
Subject Re: Mavibot bulkloader update
Date Wed, 24 Sep 2014 14:30:24 GMT
Le 24/09/14 14:07, Kiran Ayyagari a écrit :
> On Wed, Sep 24, 2014 at 2:45 PM, Emmanuel Lécharny <elecharny@gmail.com>
> wrote:
>
>> Hi guys,
>>
>> a quick update on the bulkloader development...Using what Kiran did in
>> the mavibot-partition-bulkloader, I moved the part that specifically
>> handle the BTree bulkloading to Mavibot, because there is no reason
>> someone could not bulkload any kind of data into a BTree.
>>
>> I created a BulkLoader class in Mavibot for that purpose. The bulk
>> loading is a 2 phases process :
>> - sort the data
>> - create the btree
>>
>> We had something working for in-memory data (ie, when you can read
>> everything in memory), but when it comes to read huge set of data, you
>> have to write sorted data on disk. This was what I was working on
>> lately. Basically, we read a chunk of data, and when we reach a given
>> number (configurable), we sort what we read and write it back on disk. I
>> tested it with 10M elements, written on 100 files. The deal is to merge
>> the sorted data when we read them back from the files (as each file is
>> sorted, it's just a matter to take the sammlest value from all the files).
>>
>> All in all, this part is now working.
>>
>> The second phase was an issue in the current bulkload implementation :
>> it left the btree unbalanced (when you fill the leaves completely, the
>> last leaf may contain less than N/2 elements : the current
>> implementation didn't handle this case). It was quite a challenge to
>> balance a btree when you were buildling it, as the pb spread to all the
>> upper layers. After a few hours thinking about the best way to do it
>> without having to shuffle many already written pages, I realized that we
>> actually *know* how many entries we have in the BTree, as we have sorted
>> the entries at first ! Knowing the numbe rof elements is the key : you
>> can predict how the btree should be balanced.
>>
>> This is the part I'm working on atm.
>>
>> There are a few things that need some love still :
>> - the entrySerializer has to be moved from JDBM to the Mavibot Partition
>>
> can we move this to a xdbm-partition module?

Sure ! This is the exact place it should go to.
>
>> - the Mavibot partition builder has to be rewritten using the Mavibot
>> bulkloader (but that should be easy, as we just have to call the
>> BulkLoader primitive)
>> - the In-memory bulk load has to be implemented
>> - I'd like to implement a compact() method taht takes an existing BTree
>> and compact it on the fly
>>
> hmm, how about applying this to the entire database level rather than just
> an individual BTree?
The compact method will be for one single tree, but we can extend it to
the partition, but the  it will be a partition specific method.

Note that compacting a full partition is most certainly problematic, as
it will block writes for potentially minutes (or worse...)

OTOH, we benefit from the fact that the data are already sorted, so we
spare teh first phase.

> - we have to deal with the RecordManager, because atm we can't have it
>> opened while processing a btree. It could be interesting to be able to
>> compact a btree while the RM is up and running (assuming no write
>> operation is done in the mean time).
>>
>> right, this is why I think compacting should be applied to entire database
> instead of a single BTree.
>
> otoh, bulkloader will be quite handy here while compacting, all we need to
> do is pass the relevant cursors of the existing BTrees.
Yes.


Mime
View raw message