asterixdb-users mailing list archives

Subject: Re: Data in AsterixDB skewing towards one node
Date: Wed, 04 Nov 2015 19:36:33 GMT
Hi Pouria,

I create internal datasets and load the data by reading record files from

Regards, Max

> Hi Max,
> Can you please explain this part a bit more:
> "… When I load the external data it is all saved on a single node"
> Are you using "external datasets" or "internal datasets, loaded from files
> on HDFS"?
> The fact is that if you are using "external datasets", then AsterixDB does
> not really load anything. It just gets the locations of the blocks on HDFS
> and remembers them. So in this case, if there is any issue with uniform
> distribution of data files, that is really related to HDFS and not
> AsterixDB. But if you are 'loading' an "internal" dataset by reading
> records from files on HDFS and you see issues with uniform distribution of
> the created on-disk components, then that is another issue and could be
> related to AsterixDB.
> Pouria
> On Wed, Nov 4, 2015 at 11:23 AM, <> wrote:
>> Hello,
>> I have an AsterixDB cluster running on 4 nodes, with the first being the
>> master node and a node controller running on each of them. As a test I
>> run TPC-H queries on them, loading the generated TPC-H datasets from a
>> Hadoop distributed file system.
>> When I load the external data it is all saved on a single node. For later
>> querying that means that most of the computations are done by that single
>> node, which slows down the whole query (and makes the distributed
>> computation idea obsolete).
>> By now I have tried to set up the system several times and, interestingly
>> enough, two times I was able to get a fully functional system.
>> Unfortunately I currently cannot reproduce a functional system state, and
>> whenever I try to do a new setup I get the data skewing towards one node.
>> Has that ever happened before? Do you know the reason for this or how to
>> fix that?
>> Regards, Max
