asterixdb-dev mailing list archives

From Murtadha Hubail <hubail...@gmail.com>
Subject Re: Should in-memory components from different dataset share the entire memory?
Date Thu, 10 Mar 2016 22:55:08 GMT
Hi Jianfeng,

I just want to clarify that the exception is caused by metadata datasets being evicted
and never reopened.
This happens because metadata indexes get special treatment (they do not go through the
same code path as other indexes). I have already filed an issue [1] for it and will submit
a fix for the exception today.

Dynamic memory allocation needs its own separate discussion.

Cheers,
Murtadha

[1] https://issues.apache.org/jira/browse/ASTERIXDB-1338

> On Mar 10, 2016, at 2:46 PM, Jianfeng Jia <jianfeng.jia@gmail.com> wrote:
> 
> Dear Devs,
> 
> I have some questions about the memory management of the in-memory components for different
> datasets.
> 
> The current AsterixDB instance backing the cloudberry demo goes down every few days. It
> always throws an exception like the following:
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: Failed to open index
> with resource ID 7 since it does not exist.
> 
> As described in ASTERIXDB-1337, each dataset has a fixed budget no matter how small or big
> it is. The number of datasets that can be loaded at the same time is therefore also fixed:
> $number = storage.memorycomponent.globalbudget/storage.memorycomponent.numpages. My question
> is: if we have more than $number datasets, will eviction happen? Will it evict the victim's
> entire dataset? Based on the symptom of the exception above, it seems the metadata gets
> evicted. Could we protect the metadata from eviction?
> 
> A more fundamental question: is it possible for all those datasets to share a global budget
> in a multi-tenant way?
> In my workload there is one main dataset (~10Gb) and five tiny auxiliary datasets (each of
> size <20M). In addition, the client creates a number of temporary datasets, depending on
> how many concurrent users there are, and each temp dataset is "refreshed" for every new
> query (the refresh is done by dropping and re-creating the temp dataset). It's hard to find
> one storage.memorycomponent.numpages value that makes every dataset happy.
> 
> 
> 
> Best,
> 
> Jianfeng Jia
> PhD Candidate of Computer Science
> University of California, Irvine
> 
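
The fixed-budget arithmetic from the quoted message can be sketched as follows. This is
only an illustration under stated assumptions: it assumes storage.memorycomponent.numpages
counts pages, so the per-dataset budget in bytes is numpages times a page size (the page
size is not mentioned in the message), and the numeric values are made up for the example
rather than taken from any real configuration. The function name is hypothetical.

```python
def max_resident_datasets(global_budget_bytes: int,
                          num_pages: int,
                          page_size_bytes: int) -> int:
    """Return how many datasets fit in memory when each gets a fixed budget.

    Assumption (not stated in the thread): every dataset reserves
    num_pages * page_size_bytes, regardless of how much data it holds.
    """
    per_dataset_bytes = num_pages * page_size_bytes
    if per_dataset_bytes <= 0:
        raise ValueError("per-dataset budget must be positive")
    # Integer division: a dataset that only partially fits does not count.
    return global_budget_bytes // per_dataset_bytes


# Illustrative values only (hypothetical, not read from a real deployment).
GLOBAL_BUDGET = 512 * 1024 * 1024   # 512 MiB shared by all datasets
NUM_PAGES = 256                     # pages reserved per dataset
PAGE_SIZE = 128 * 1024              # 128 KiB per page

capacity = max_resident_datasets(GLOBAL_BUDGET, NUM_PAGES, PAGE_SIZE)

# Workload from the message: one main dataset plus five auxiliary datasets.
# Whatever capacity is left is what the temporary datasets can share.
fixed_datasets = 1 + 5
temp_slots = max(capacity - fixed_datasets, 0)
print(capacity, temp_slots)
```

Under these assumptions a tiny auxiliary dataset reserves exactly as much memory as the
main one, which is why a single numpages value is hard to tune for a mixed workload.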

