asterixdb-dev mailing list archives

From Jianfeng Jia <jianfeng....@gmail.com>
Subject Re: Should in-memory components from different dataset share the entire memory?
Date Thu, 10 Mar 2016 23:23:19 GMT
Another way is to allocate the entire global space upfront, but without an upper bound for
each dataset, so that bigger datasets get more pages. The drawback is that we may waste
space if all the datasets are small.

The ideal case is a single dynamic memory allocation manager with one global
upper bound, so that all datasets can share the space without an extra bound per dataset.
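
To make that ideal case concrete, here is a minimal Java sketch of a page budget with one global upper bound and no per-dataset bound, where pages are handed out only on demand. The class and method names (SharedPageBudget, tryAcquirePage, releaseAll) are hypothetical and are not part of the AsterixDB or Hyracks API. The sketch only does the bookkeeping; it does not model flushing, eviction, or the actual page buffers.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: one global page budget shared by all datasets.
 * Pages are reserved on demand, so a small dataset only holds what it uses.
 */
class SharedPageBudget {
    private final int globalMaxPages;   // the single global upper bound
    private int pagesInUse = 0;         // pages currently reserved by all datasets
    private final Map<String, Integer> pagesPerDataset = new HashMap<>();

    SharedPageBudget(int globalMaxPages) {
        if (globalMaxPages <= 0) {
            throw new IllegalArgumentException("globalMaxPages must be positive");
        }
        this.globalMaxPages = globalMaxPages;
    }

    /**
     * Reserves one page for the given dataset.
     * Returns false when the global budget is exhausted; a real system
     * would then have to flush or evict something before retrying.
     */
    synchronized boolean tryAcquirePage(String dataset) {
        if (pagesInUse >= globalMaxPages) {
            return false;
        }
        pagesInUse++;
        pagesPerDataset.merge(dataset, 1, Integer::sum);
        return true;
    }

    /** Releases every page held by the dataset (e.g. after a flush or a drop). */
    synchronized int releaseAll(String dataset) {
        Integer held = pagesPerDataset.remove(dataset);
        if (held == null) {
            return 0;
        }
        pagesInUse -= held;
        return held;
    }

    synchronized int pagesHeldBy(String dataset) {
        return pagesPerDataset.getOrDefault(dataset, 0);
    }

    synchronized int freePages() {
        return globalMaxPages - pagesInUse;
    }

    public static void main(String[] args) {
        SharedPageBudget budget = new SharedPageBudget(8);
        // The big dataset takes most of the pages; the tiny one takes only one.
        for (int i = 0; i < 6; i++) {
            budget.tryAcquirePage("main");
        }
        budget.tryAcquirePage("aux");
        System.out.println("main=" + budget.pagesHeldBy("main")
                + " aux=" + budget.pagesHeldBy("aux")
                + " free=" + budget.freePages());
    }
}
```

With this kind of accounting, dropping and re-creating a temporary dataset would simply return its pages to the shared pool through releaseAll, without reserving a fixed number of pages per dataset in advance.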


> On Mar 10, 2016, at 2:55 PM, Yingyi Bu <buyingyi@gmail.com> wrote:
> 
>>> A more fundamental question: is it possible that all those datasets share a
> global budget in a multi-tenant way?
> In principle, the budget should just be an upper bound. If a dataset doesn't
> need that much, it shouldn't pre-allocate all
> "storage.memorycomponent.numpages"
> pages.
> 
> However, in the current implementation, we pre-allocate all in-memory pages
> upfront:
> https://github.com/apache/incubator-asterixdb-hyracks/blob/master/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/VirtualBufferCache.java#L247
> 
> I think we should fix it to dynamically allocate memory when needed. (Disk
> buffer cache already does that.)
> 
> Best,
> Yingyi
> 
> 
> On Thu, Mar 10, 2016 at 2:46 PM, Jianfeng Jia <jianfeng.jia@gmail.com>
> wrote:
> 
>> Dear Devs,
>> 
>> I have some questions about the memory management of the in-memory
>> components for different datasets.
>> 
>> The current AsterixDB backing the cloudberry demo goes down every few days.
>> It always throws an exception like the following:
>> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: Failed
>> to open index with resource ID 7 since it does not exist.
>> 
>> As described in ASTERIXDB-1337, each dataset has a fixed budget no matter
>> how small or big it is. The number of datasets that can be loaded at the
>> same time is therefore also fixed: $number =
>> storage.memorycomponent.globalbudget/storage.memorycomponent.numpages. My
>> question is: if we have more than $number datasets, will eviction
>> happen? Will it evict the victim's entire dataset? Based on the
>> symptom of the above exception, it seems the metadata gets evicted. Could we
>> protect the metadata from eviction?
>> 
>> A more fundamental question: is it possible that all those datasets share a
>> global budget in a multi-tenant way?
>> In my workload there is one main dataset (~10 GB) and five tiny auxiliary
>> datasets (each <20 MB). In addition, the client creates a bunch of
>> temporary datasets, depending on how many concurrent users there are, and
>> each temp-dataset is "refreshed" for a new query. (The refresh is done by
>> dropping and re-creating the temp-dataset.) It's hard to find one
>> storage.memorycomponent.numpages value that makes every dataset happy.
>> 
>> 
>> 
>> Best,
>> 
>> Jianfeng Jia
>> PhD Candidate of Computer Science
>> University of California, Irvine
>> 
>> 



Best,

Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine

