asterixdb-dev mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: Should in-memory components from different dataset share the entire memory?
Date Fri, 11 Mar 2016 01:30:15 GMT
+1 for this incremental allocation proposal.  Also, our whole budgeting 
/ memory management model was version 0 - and we need to think about 
version 1 sometime in the not-so-distant future.  (Once upon a time we 
imagined an intelligent memory controller overseeing the union of buffer 
cache, working memories for queries, and in-memory components, and 
working with a future version of the query optimizer to make more 
intelligent choices. In the absence of statistics, workload info, and 
cost info, or runtime observations thereof, this was our first 
brain-dead approach to getting something running that we could improve 
later - and it seems to be getting towards later now. :-))
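
For readers of the archive, the incremental allocation idea endorsed above can be sketched roughly as follows. This is an illustrative sketch only, not the actual VirtualBufferCache code: the class name `LazyPagePool` and its methods are hypothetical, and the only assumption carried over from the thread is that the per-dataset page count acts as an upper bound rather than an amount reserved up front.

```python
class LazyPagePool:
    """Hands out pages on demand, up to a fixed budget (hypothetical name)."""

    def __init__(self, page_size, max_pages):
        self.page_size = page_size
        self.max_pages = max_pages  # the budget is only an upper bound
        self.pages = []             # nothing is allocated up front

    def allocate_page(self):
        """Allocate one more page and return its index; raise if over budget."""
        if len(self.pages) >= self.max_pages:
            raise MemoryError("in-memory component budget exhausted")
        self.pages.append(bytearray(self.page_size))
        return len(self.pages) - 1

    def allocated_bytes(self):
        """Memory actually held, which may be far below the budget."""
        return len(self.pages) * self.page_size

    def reset(self):
        """Release all pages, for example after the component is flushed."""
        self.pages.clear()
```

With this shape, a tiny dataset that only ever touches a few pages holds only those pages, while a large dataset can still grow to its full budget.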

On 3/10/16 2:55 PM, Yingyi Bu wrote:
>>> A more fundamental question: is it possible for all those datasets to share a
> global budget in a multi-tenant way?
> In principle, the budget should just be an upper bound. If a dataset doesn't
> need that much, it shouldn't pre-allocate all
> "storage.memorycomponent.numpages"
> pages.
>
> However, in the current implementation, we pre-allocate all in-memory pages
> upfront:
> https://github.com/apache/incubator-asterixdb-hyracks/blob/master/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/VirtualBufferCache.java#L247
>
> I think we should fix it to dynamically allocate memory when needed.  (Disk
> buffer cache already does that.)
>
> Best,
> Yingyi
>
>
> On Thu, Mar 10, 2016 at 2:46 PM, Jianfeng Jia <jianfeng.jia@gmail.com>
> wrote:
>
>> Dear Devs,
>>
>> I have some questions about the memory management of the in-memory
>> components for different datasets.
>>
>> The current AsterixDB instance backing the cloudberry demo goes down every
>> few days. It always throws an exception like the following:
>> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: Failed
>> to open index with resource ID 7 since it does not exist.
>>
>> As described in ASTERIXDB-1337, each dataset has a fixed budget no matter
>> how small or big it is. The number of datasets that can be loaded at the
>> same time is therefore also fixed: $number =
>> storage.memorycomponent.globalbudget/storage.memorycomponent.numpages. My
>> question is: if we have more than $number datasets, will eviction
>> happen? Will it evict the victim's entire dataset? Based on the
>> symptom in the exception above, it seems the metadata gets evicted. Could we
>> protect the metadata from eviction?
>>
>> A more fundamental question: is it possible for all those datasets to share a
>> global budget in a multi-tenant way?
>> In my workload there is one main dataset (~10 GB) and five tiny auxiliary
>> datasets (each < 20 MB). In addition, the client will create a number of
>> temporary datasets, depending on how many concurrent users there are, and each
>> temp dataset will be "refreshed" for a new query. (The refresh is done by
>> dropping and re-creating the temp dataset.) It's hard to find one
>> storage.memorycomponent.numpages value that makes every dataset happy.
>>
>>
>>
>> Best,
>>
>> Jianfeng Jia
>> PhD Candidate of Computer Science
>> University of California, Irvine
>>
>>
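
As a worked illustration of the $number formula in the original message, the sketch below computes how many datasets could be resident at once under a fixed per-dataset budget. It assumes the per-dataset budget is the page count multiplied by a page size in bytes; the function name and all numeric values are hypothetical and are not taken from any real configuration.

```python
def max_resident_datasets(global_budget_bytes, num_pages, page_size_bytes):
    """Number of datasets whose fixed in-memory budgets fit in the global budget."""
    per_dataset_bytes = num_pages * page_size_bytes  # fixed, regardless of dataset size
    return global_budget_bytes // per_dataset_bytes


# Hypothetical values: 512 MiB global budget, 256 pages of 128 KiB per dataset.
slots = max_resident_datasets(512 * 1024 * 1024, 256, 128 * 1024)
print(slots)  # 16
```

Under these made-up numbers, the one main and five auxiliary datasets from the workload above would leave ten slots for temporary datasets, and any further concurrent temp dataset would force an eviction.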

