directory-dev mailing list archives

From Emmanuel Lécharny
Subject Re: [Mavibot] BulkLoad
Date Fri, 20 Jun 2014 15:40:54 GMT
On 20/06/2014 14:50, Howard Chu wrote:
> Emmanuel Lécharny wrote:
>> Hi guys,
>> Many thanks, Kiran, for the OOM fix!
>> That's one step toward fast loading of big databases.
>> The next steps are also critical. We are currently limited by the memory
>> size, as we store in memory the DNs we load. In order to go one step
>> further, we need to implement a system where we can process an LDIF file
>> with no limitation due to the available memory.
>> That supposes we process the LDIF file in chunks, and once the chunks are
>> sorted, we process them as a whole, pulling one element from each
>> of the sorted lists of DNs and picking the smallest to inject into the
>> BTree.
> Why do you store the DNs in memory? Why are you sorting them?

We need to build the RDN index, which contains ParentIDandRDN data
structures, where each element is a tuple of the parentID and the
current RDN. That means we must have seen the parent before we can deal
with the children. This is why we keep the DNs in memory.
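
To make the idea concrete, here is a rough sketch in Python (not Mavibot
code) of the chunk-sort-then-merge approach combined with the
parent-before-child constraint. Everything in it is an assumption made
for illustration: the function names (dn_key, sorted_chunks,
merge_chunks, build_rdn_index) are hypothetical, DNs are split naively
on commas (no escaped commas), the suffix is a single RDN, and in-memory
lists stand in for the sorted chunk files that would really live on disk.
The sort key reverses the RDNs so that a parent always sorts before its
children.

```python
import heapq
import itertools


def dn_key(dn):
    """Sort key: RDNs from root to leaf, so a parent sorts before its children.

    Assumption: naive split on ',' (no escaped commas in RDN values).
    """
    rdns = [rdn.strip().lower() for rdn in dn.split(",")]
    return tuple(reversed(rdns))


def sorted_chunks(dns, chunk_size):
    """Read DNs chunk by chunk and sort each chunk on its own.

    A real loader would write each sorted chunk to a temporary file;
    here each chunk is simply yielded as a list.
    """
    it = iter(dns)
    while True:
        chunk = list(itertools.islice(it, chunk_size))
        if not chunk:
            return
        chunk.sort(key=dn_key)
        yield chunk


def merge_chunks(chunks):
    """K-way merge: repeatedly pull the smallest head among the sorted chunks."""
    return heapq.merge(*chunks, key=dn_key)


def build_rdn_index(sorted_dns):
    """Yield ((parent_id, rdn), entry_id) tuples from DNs in sorted order.

    Because the order is parent-first, only the ancestors of the current
    entry are kept (a stack), not every DN seen so far.
    """
    stack = []  # (key, entry_id) for the current chain of ancestors
    next_id = 1
    for dn in sorted_dns:
        key = dn_key(dn)
        # Drop entries that are not the parent of the current DN.
        while stack and stack[-1][0] != key[:-1]:
            stack.pop()
        if stack:
            parent_id = stack[-1][1]
        elif len(key) == 1:
            parent_id = None  # suffix entry: no parent
        else:
            raise ValueError("parent not seen before child: " + dn)
        yield (parent_id, key[-1]), next_id
        stack.append((key, next_id))
        next_id += 1
```

With this ordering the children of an entry directly follow it, so the
memory needed while building the index depends on the depth of the tree
and on the chunk size, not on the total number of entries.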
