asterixdb-users mailing list archives

From Ian Maxon <ima...@uci.edu>
Subject Re: unable to load external data
Date Wed, 21 Oct 2015 06:17:54 GMT
No problem :) I would definitely try the latest master version then.
Asterix 0.8.6 uses Hadoop 0.20.2, which is really ancient. You will
probably be best off checking out from source and changing the Hadoop
dependency in the top-level Asterix pom from 2.2.0 to 2.6.0.
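
For reference, the change amounts to bumping the Hadoop version in the
top-level pom and rebuilding. Roughly along these lines (the exact
property name may differ in your checkout, so treat this as a sketch
and check the pom itself):

    <!-- asterixdb top-level pom.xml: assumed Hadoop version property -->
    <properties>
        <hadoop.version>2.6.0</hadoop.version>
    </properties>

Then a "mvn clean package -DskipTests" and redeploying the resulting
binaries to the cluster should pick up the newer client libraries.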

On Tue, Oct 20, 2015 at 3:31 PM,  <schultze@informatik.hu-berlin.de> wrote:
> I am using AsterixDB 0.8.6 and Hadoop 2.6.0.
>
> Thanks for the help,
> Max
>
>
>> Hi Max,
>> Which version of AsterixDB are you running? The old stable release
>> uses a really old version of Hadoop dependencies, so that might be it.
>> What's the version your HDFS cluster has? The latest master is using
>> 2.2.0 by default, but 2.4.0 or 2.6.0 should work as well.
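>>
>> If you are not sure what the cluster is running, something like the
>> following on one of the HDFS nodes should print it (plain Hadoop CLI,
>> nothing AsterixDB-specific):
>>
>>     hadoop version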
>>
>> Thanks,
>> -Ian
>>
>> On Tue, Oct 20, 2015 at 5:40 AM,  <schultze@informatik.hu-berlin.de>
>> wrote:
>>> Hello,
>>>
>>> I have done a cluster setup of AsterixDB on four nodes. Everything is
>>> running fine, and I want to load some data into the system to run some
>>> bigger examples. However, I am unable to do so using the description at
>>>
>>> https://asterixdb.ics.uci.edu/documentation/aql/externaldata.html
>>>
>>> I created a dataverse, a datatype and a dataset as follows:
>>>
>>> create dataverse tpch;
>>>
>>> use dataverse tpch;
>>> create type LineitemType as closed {
>>>       orderkey:int32,
>>>       partkey: int32,
>>>       suppkey: int32,
>>>       linenumber: int32,
>>>       quantity: double,
>>>       extendedprice: double,
>>>       discount: double,
>>>       tax: double,
>>>       returnflag: string,
>>>       linestatus: string,
>>>       shipdate: string,
>>>       commitdate: string,
>>>       receiptdate: string,
>>>       shipinstruct: string,
>>>       shipmode: string,
>>>       comment: string};
>>>
>>> create dataset lineitem(LineitemType) if not exists
>>>       primary key orderkey, linenumber;
>>>
>>> As described in the documentation linked above, there are two ways to
>>> load data: from a reachable HDFS or from the local file system
>>> (localfs). I have a running HDFS within the same network that contains
>>> the data I want to access, and I tried to reach it like this:
>>>
>>> load dataset lineitem using hdfs
>>> (("hdfs"="hdfs://192.168.127.11:50040"),
>>> ("path"="/user/schultzem/lineitem.tbl"),
>>> ("input-format"="text-input-format"),
>>> ("format"="delimited-text"),
>>> ("delimiter"="|"));
>>>
>>> However, I get an error message:
>>>
>>> Unable to create adapter org.apache.hadoop.ipc.RemoteException: Server
>>> IPC
>>> version 9 cannot communicate with client version 3 [AlgebricksException]
>>>
>>> All I found about this was an old issue from 2013 that recommends an
>>> older version of Hadoop, which is not an option for me.
>>>
>>> https://code.google.com/p/asterixdb/issues/detail?id=521
>>>
>>> Is this somehow fixable?
>>>
>>> The other option, loading data from the localfs, also throws an error.
>>>
>>> load dataset lineitem using localfs
>>> (("path"="192.168.127.21:///home/schultzem/tpch/TPCH_data_10GB/lineitem.tbl"),
>>>     ("format"="delimited-text"),
>>>     ("delimiter"="|"));
>>>
>>> leads to
>>>
>>> No node controllers found at the address: 192.168.127.21
>>> [AsterixException]
>>>
>>> which is the same error as for 127.0.0.1.
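>>>
>>> For comparison, the localfs examples in the documentation seem to use
>>> the node controller's logical name rather than an IP address, so I
>>> suppose the statement would have to look something like this (with a
>>> made-up NC name "nc1"):
>>>
>>> load dataset lineitem using localfs
>>> (("path"="nc1:///home/schultzem/tpch/TPCH_data_10GB/lineitem.tbl"),
>>>     ("format"="delimited-text"),
>>>     ("delimiter"="|"));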
>>>
>>> The documentation on external datasets linked above assumes that
>>> AsterixDB is used in local mode. Is that why I cannot reach the
>>> cluster nodes?
>>>
>>> Did I make a mistake accessing the data? How can I load data into the
>>> database?
>>>
>>> Regards, Max
>>>
>>
>
>
