asterixdb-users mailing list archives

From schul...@informatik.hu-berlin.de
Subject Re: unable to load external data
Date Tue, 20 Oct 2015 22:31:59 GMT
I am using AsterixDB 0.8.6 and Hadoop 2.6.0.

Thanks for the help,
Max


> Hi Max,
> Which version of AsterixDB are you running? The old stable release
> uses a really old version of Hadoop dependencies, so that might be it.
> What's the version your HDFS cluster has? The latest master is using
> 2.2.0 by default, but 2.4.0 or 2.6.0 should work as well.
>
> Thanks,
> -Ian
>
> On Tue, Oct 20, 2015 at 5:40 AM, <schultze@informatik.hu-berlin.de> wrote:
>> Hello,
>>
>> I have done a cluster setup of AsterixDB on four nodes. Everything is
>> running fine and I want to load some data into the system to run some
>> bigger examples. However, I am unable to do so using the description at
>>
>> https://asterixdb.ics.uci.edu/documentation/aql/externaldata.html
>>
>> I created a dataverse, a datatype and a dataset as follows:
>>
>> create dataverse tpch;
>>
>> use dataverse tpch;
>> create type LineitemType as closed {
>>       orderkey:int32,
>>       partkey: int32,
>>       suppkey: int32,
>>       linenumber: int32,
>>       quantity: double,
>>       extendedprice: double,
>>       discount: double,
>>       tax: double,
>>       returnflag: string,
>>       linestatus: string,
>>       shipdate: string,
>>       commitdate: string,
>>       receiptdate: string,
>>       shipinstruct: string,
>>       shipmode: string,
>>       comment: string}
>>
>> create dataset lineitem(LineitemType) if not exists
>>       primary key orderkey, linenumber;
>>
>> As described on the page linked above, there are two ways to load data:
>> from a reachable HDFS or from the localFS. I have a running HDFS within
>> the same network containing the data I want to access, and I tried to
>> reach it like this:
>>
>> load dataset lineitem using hdfs
>> (("hdfs"="hdfs://192.168.127.11:50040"),
>> ("path"="/user/schultzem/lineitem.tbl"),
>> ("input-format"="text-input-format"),
>> ("format"="delimited-text"),
>> ("delimiter"="|"));
>>
>> However I get an error message
>>
>> Unable to create adapter org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 3 [AlgebricksException]
>>
>> All I could find about this was an old issue from 2013 that recommends an
>> older version of Hadoop, which is not an option for me.
>>
>> https://code.google.com/p/asterixdb/issues/detail?id=521
>>
>> Is this somehow fixable?
>>
>> The other option to load data from the localFS also throws an error.
>>
>> load dataset lineitem using localfs
>> (("path"="192.168.127.21:///home/schultzem/tpch/TPCH_data_10GB/lineitem.tbl"),
>>     ("format"="delimited-text"),
>>     ("delimiter"="|"));
>>
>> leads to
>>
>> No node controllers found at the address: 192.168.127.21 [AsterixException]
>>
>> which is the same error as for 127.0.0.1.
>>
>> The linked documentation about external datasets assumes that AsterixDB
>> is used in local mode. Is this the reason why I cannot reach the cluster
>> nodes?
>>
>> Did I make a mistake accessing the data? How can I load data into the
>> database?
>>
>> Regards, Max
>>
>
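
The HDFS error quoted above names both sides of the mismatch: the server (the HDFS cluster) speaks IPC version 9, while the client (the Hadoop libraries bundled with AsterixDB) speaks IPC version 3. These are RPC protocol versions, not Hadoop release numbers. As a minimal sketch, the two numbers can be pulled out of such a message like this; the helper name `parse_ipc_mismatch` is hypothetical and not part of AsterixDB or Hadoop:

```python
import re

# Matches the mismatch text from the error quoted in the thread.
# \s+ tolerates the line wrapping that mail clients introduce.
IPC_MISMATCH = re.compile(
    r"Server\s+IPC\s+version\s+(\d+)\s+"
    r"cannot\s+communicate\s+with\s+client\s+version\s+(\d+)"
)


def parse_ipc_mismatch(message):
    """Return (server_version, client_version), or None if the text does not match.

    Hypothetical helper for illustration only.
    """
    match = IPC_MISMATCH.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


error = (
    "Unable to create adapter org.apache.hadoop.ipc.RemoteException: "
    "Server IPC version 9 cannot communicate with client version 3 "
    "[AlgebricksException]"
)

versions = parse_ipc_mismatch(error)
if versions is not None:
    server, client = versions
    print(f"server IPC version {server}, client IPC version {client}")
```

A client version lower than the server version, as here, is consistent with the suggestion in the reply that the Hadoop dependencies shipped with the older AsterixDB release are too old for the cluster.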


