apex-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chandni Singh <chan...@datatorrent.com>
Subject Re: Fault-tolerant cache backed by a store
Date Mon, 09 Nov 2015 06:37:35 GMT
Forgot to attach the link.
https://docs.google.com/document/d/1gRWN9ufKSZSZD0N-pthlhpC9TZ8KwJ6hJlAX6nxl5f8/edit#heading=h.wlc0p58uzygb

On Sun, Nov 8, 2015 at 10:36 PM, Chandni Singh <chandni@datatorrent.com>
wrote:

> Hi,
> This contains the overview of large state management.
> Some parts need more description which I am working on but please free to
> go through it and any feedback is appreciated.
>
> Thanks,
> Chandni
>
>
> On Tue, Oct 20, 2015 at 8:31 AM, Pramod Immaneni <pramod@datatorrent.com>
> wrote:
>
>> This is a much needed component Chandni.
>>
>> The API for the cache will be important as users will be able to plugin
>> different implementations in future like those based off of popular
>> distributed in-memory caches. Ehcache is a popular cache mechanism and API
>> that comes to bind. It comes bundled with a non-distributed implementation
>> but there are commercial distributed implementations of it as well like
>> BigMemory.
>>
>> Given our needs for fault tolerance we may not be able to adopt the
>> ehcache
>> API as is but an extension of it might work. We would still provide a
>> default implementation but going off of a well recognized API will
>> facilitate development of other implementations in future based off of
>> popular implementations already available. We will need to investigate if
>> we can use the API as is or with relatively straightforward extensions
>> which will be a positive for using it. But if the API turns out to be
>> significantly deviating from what we need then that would be a negative.
>>
>> Also it would be great if we could support an iterator to scan all the
>> keys, lazy loading as needed, since this need comes up from time to time
>> in
>> different scenarios such as change data capture calculations.
>>
>> Thanks.
>>
>> On Mon, Oct 19, 2015 at 9:10 PM, Chandni Singh <chandni@datatorrent.com>
>> wrote:
>>
>> > Hi All,
>> >
>> > While working on making the Join operator fault-tolerant, we realized
>> the
>> > need of a fault-tolerant Cache in Malhar library.
>> >
>> > This cache is useful for any operator which is state-full and stores
>> > key/values for a very long period (more than an hour).
>> >
>> > The problem with just having a non-transient HashMap for the cache is
>> that
>> > over a period of time this state will become so large that
>> checkpointing it
>> > will be very costly and will cause bigger issues.
>> >
>> > In order to address this we need to checkpoint the state iteratively,
>> i.e.,
>> > save the difference in state at every application window.
>> >
>> > This brings forward the following broad requirements for the cache:
>> > 1. The cache needs to have a max size and is backed by a filesystem.
>> >
>> > 2. When this threshold is reached, then adding more data to it should
>> evict
>> > older entries from memory.
>> >
>> > 3. To minimize cache misses, a block of data is loaded in memory.
>> >
>> > 4. A block or bucket to which a key belongs is provided by the user
>> > (operator in this case) as the information about closeness in keys (that
>> > can potentially reduce future misses) is not known to the cache but to
>> the
>> > user.
>> >
>> > 5. lazy load the keys in case of operator failure
>> >
>> > 6. To offset the cost of loading a block of keys when there is a miss,
>> > loading can be done asynchronously with a callback that indicates when
>> the
>> > key is available. This allows the operator to process other keys which
>> are
>> > in memory.
>> >
>> > 7. data that is spilled over needs to be purged when it is not needed
>> > anymore.
>> >
>> >
>> > In past we solved this problem with BucketManager which is not in open
>> > source now and also there were some limitations with the bucket api -
>> the
>> > biggest one is that it doesn't allow to save multiple values for a key.
>> >
>> > My plan is to create a similar solution as BucketManager in Malhar with
>> > improved api.
>> > Also save the data on hdfs in TFile which provides better performance
>> when
>> > saving key/values.
>> >
>> > Thanks,
>> > Chandni
>> >
>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message