tajo-dev mailing list archives

From Daniel Einspanjer <deinspan...@gmail.com>
Subject Getting started with sandbox on mac
Date Tue, 03 Dec 2013 19:10:33 GMT
I originally started playing with Tajo by installing it in a Linux VM
running Hadoop, but I'm now making changes to the source, and I'd really
like to test those changes on my dev machine, which is a Mac.

I'm trying to figure out how to sort out the dependencies and how to
debug with IntelliJ IDEA.

I'm trying to use a local pseudo-distributed instance of CDH 4.4 for this
dev work.

I've cloned the Tajo git repo and run:

mvn package -DskipTests -Pdist

I then went into the tajo-dist/target/tajo-0.8.0-SNAPSHOT directory
and edited the tajo-env.sh file to point HADOOP_HOME at my
cdh44/hadoop-2.0.0-cdh4.4.0 directory.
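
For reference, a minimal sketch of that tajo-env.sh edit. The $HOME prefix is an assumption made for illustration; only the trailing cdh44/hadoop-2.0.0-cdh4.4.0 part comes from my setup as described above.

```shell
# tajo-env.sh (fragment)
# Assumption: the CDH 4.4 tarball was unpacked under $HOME; adjust the prefix to the real location.
export HADOOP_HOME="$HOME/cdh44/hadoop-2.0.0-cdh4.4.0"
```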

I was able to create an App config in IntelliJ for TajoMaster and set
it up as mentioned previously in this mailing list.

When I tried to create an external table, I started to run into problems.

First, I tried creating a table1/data.csv file in HDFS, but I wasn't
sure how to reference the host in the location clause.
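
Roughly, the HDFS side of this could be sketched as follows. The sample rows are made up (the real contents of data.csv are not important here), and the sketch assumes the `hadoop` client from the CDH install is on the PATH; it skips the upload otherwise.

```shell
# Hypothetical sample rows, written to a scratch directory.
workdir="$(mktemp -d)"
printf '1,alice\n2,bob\n' > "$workdir/data.csv"

# Copy the file into an HDFS directory named table1.
# The upload is skipped when no hadoop client is available.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -mkdir /table1
  hadoop fs -put "$workdir/data.csv" /table1/data.csv
  hadoop fs -ls /table1
else
  echo "hadoop client not found; skipping HDFS upload"
fi
```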

I tried "location 'hdfs://localhost:8020/table1'", but that gave me
the following exception:

Failed on local exception: com.google.protobuf.InvalidProtocolBufferException:
 Message missing required fields: callId, status; Host Details :
 local host is: "den.local/"; destination host is:

I then tried just pointing at a local directory using "location
'file:/Users/deinspanjer/table1';" instead.
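
To make the local-file attempt easier to reproduce, here is a sketch that writes the statements to a file so they can be pasted into the Tajo shell. The two-column schema and the comma delimiter are placeholders, not the actual layout of my data, and the 'csvfile.delimiter' option name is taken from the Tajo getting-started examples of this era, so treat it as an assumption.

```shell
# Location used in the message; change it to try the hdfs:// form instead.
location="file:/Users/deinspanjer/table1"

# Write the DDL and the test query to a scratch file.
# Hypothetical schema: (id int, name text), comma-delimited.
ddl_file="$(mktemp)"
cat > "$ddl_file" <<EOF
create external table table1 (id int, name text)
  using csv with ('csvfile.delimiter'=',')
  location '$location';
select * from table1;
EOF

cat "$ddl_file"
```

Keeping the location in one variable makes it easy to compare the file: and hdfs: variants with otherwise identical statements.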

I was able to create the external table and describe it using \d, but
when I try to query it using a simple "select * from table1;", it
starts outputting lines similar to this and never stops:

Progress: 100%, response time: 0.751 sec
Progress: 100%, response time: 1.761 sec
Progress: 100%, response time: 2.788 sec
Progress: 100%, response time: 3.796 sec
Progress: 100%, response time: 90.325 sec

Progress: 100%, response time: 91.332 sec
Progress: 100%, response time: 92.337 sec

I'd appreciate any suggestions on how to resolve my issue with pointing at
HDFS, or on why the query against a local file doesn't complete.

