asterixdb-dev mailing list archives

From Jianfeng Jia <jianfeng....@gmail.com>
Subject Re: Should in-memory components from different dataset share the entire memory?
Date Thu, 10 Mar 2016 23:01:34 GMT
Got it. Thanks Murtadha. 

> On Mar 10, 2016, at 2:55 PM, Murtadha Hubail <hubailmor@gmail.com> wrote:
> 
> Hi Jianfeng,
> 
> I just want to clarify that the exception is caused by metadata datasets being evicted and not opened again.
> This happens because metadata indexes get special treatment (they do not go through the same code path as other indexes). I have already filed an issue [1] for it and will submit a fix for the exception today.
> 
> The dynamic memory allocation needs its own separate discussion.
> 
> Cheers,
> Murtadha
> 
> [1] https://issues.apache.org/jira/browse/ASTERIXDB-1338
> 
>> On Mar 10, 2016, at 2:46 PM, Jianfeng Jia <jianfeng.jia@gmail.com> wrote:
>> 
>> Dear Devs,
>> 
>> I have some questions about the memory management of the in-memory components for different datasets.
>> 
>> The current AsterixDB backing the cloudberry demo goes down every few days. It always throws an exception like the following:
>> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: Failed to open index with resource ID 7 since it does not exist.
>> 
>> As described in ASTERIXDB-1337, each dataset has a fixed budget no matter how small or big it is. The number of datasets that can be loaded at the same time is therefore also fixed: $number = storage.memorycomponent.globalbudget/storage.memorycomponent.numpages. My question is: if we have more than $number datasets, will eviction happen? Will it evict the victim's entire dataset? Based on the symptom of the exception above, it seems the metadata gets evicted. Could we protect the metadata from eviction?
>> 
>> A more fundamental question: is it possible for all those datasets to share a global budget in a multi-tenant way?
>> In my workload there is one main dataset (~10 GB) and five tiny auxiliary datasets (each <20 MB). In addition, the client will create a bunch of temporary datasets, depending on how many concurrent users there are, and each temp-dataset will be "refreshed" for a new query. (The refresh is done by dropping and re-creating the temp-dataset.) It’s hard to find one storage.memorycomponent.numpages value that makes every dataset happy.
>> 
>> 
>> 
>> Best,
>> 
>> Jianfeng Jia
>> PhD Candidate of Computer Science
>> University of California, Irvine
>> 
> 



Best,

Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine
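
The fixed-budget arithmetic from the quoted message can be sketched as below. This is only an illustration, not AsterixDB code: it assumes the global budget is given in bytes and that each dataset reserves numpages pages of a fixed size, so `page_size_bytes`, the function names, and all numeric values are hypothetical. Metadata datasets are ignored for simplicity.

```python
def max_resident_datasets(global_budget_bytes: int, num_pages: int, page_size_bytes: int) -> int:
    """How many datasets fit in memory when each one reserves the same fixed budget.

    page_size_bytes is an assumption; the email divides the global budget
    by numpages directly.
    """
    if num_pages <= 0 or page_size_bytes <= 0:
        raise ValueError("num_pages and page_size_bytes must be positive")
    per_dataset_bytes = num_pages * page_size_bytes
    return global_budget_bytes // per_dataset_bytes


def must_evict(open_datasets: int, limit: int) -> bool:
    """True when more datasets are open than the fixed budget allows."""
    return open_datasets > limit


MIB = 1024 * 1024

# Hypothetical settings: 512 MiB global budget, 256 pages of 128 KiB per dataset.
limit = max_resident_datasets(512 * MIB, 256, 128 * 1024)

# Workload from the email: 1 main dataset + 5 auxiliary datasets,
# plus one temporary dataset per concurrent user (user counts are made up).
for users in (5, 10, 20):
    open_datasets = 1 + 5 + users
    print(f"users={users} open={open_datasets} limit={limit} "
          f"evict={must_evict(open_datasets, limit)}")
```

Because the per-dataset budget is the same for the ~10 GB dataset and the <20 MB ones, raising numpages to help the large dataset lowers the limit for everyone, which is the tension the message describes.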

