tajo-dev mailing list archives

From Jae Lee <otooil...@gmail.com>
Subject Re: I have question with tajo query
Date Tue, 12 Nov 2013 01:15:36 GMT
Hi Jihoon,

Thank you for your answer.
I have a further question about Q1.
I have already waited a long time for the query result, and I don't think that is normal.
The COUNT(*) query returned a result in only 300 seconds, but the DISTINCT, GROUP BY, and SUM
queries have been executing for a whole day.
I found another error message; please see the log below.
The error concerns the integer-type column named "Year".
The query was "select distinct year from departuredelay;"
I executed the same query on Hive, and it produced no error.
However, the Year column has some null or blank values.
The table was created as an EXTERNAL table over several CSV files on HDFS.
---------------------------------------------------------------------------

2013-11-11 18:34:01,436 ERROR worker.Task (Task.java:run(363)) -
java.lang.NumberFormatException: For input string: "Year"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:492)
    at java.lang.Integer.valueOf(Integer.java:582)
    at org.apache.tajo.datum.DatumFactory.createInt4(DatumFactory.java:140)
    at org.apache.tajo.storage.LazyTuple.createByTextBytes(LazyTuple.java:313)
    at org.apache.tajo.storage.LazyTuple.get(LazyTuple.java:126)
    at org.apache.tajo.engine.eval.FieldEval.eval(FieldEval.java:58)
    at org.apache.tajo.engine.planner.Projector.eval(Projector.java:87)
    at org.apache.tajo.engine.planner.physical.SeqScanExec.next(SeqScanExec.java:111)
    at org.apache.tajo.engine.planner.physical.HashAggregateExec.compute(HashAggregateExec.java:57)
    at org.apache.tajo.engine.planner.physical.HashAggregateExec.next(HashAggregateExec.java:83)
    at org.apache.tajo.engine.planner.physical.PartitionedStoreExec.next(PartitionedStoreExec.java:121)
    at org.apache.tajo.worker.Task.run(Task.java:355)
    at org.apache.tajo.worker.TaskRunner$1.run(TaskRunner.java:376)
    at java.lang.Thread.run(Thread.java:744)

-----------------------------------------------------------------------------
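
The input string in the exception is the literal text "Year", which suggests (this is an assumption, not something the log confirms) that a CSV header row is being read as data for the integer column. A minimal sketch for locating such cells in a CSV file follows. The function name, the sample rows, and the column position are hypothetical, and Python's int() is somewhat more lenient than Java's Integer.parseInt (it tolerates surrounding whitespace), so the check only approximates what the worker does.

```python
import csv
import io


def find_bad_int_cells(lines, column_index, delimiter=","):
    """Return (line_number, raw_value) pairs whose cell does not parse as an integer."""
    bad = []
    reader = csv.reader(lines, delimiter=delimiter)
    for line_number, row in enumerate(reader, start=1):
        # Treat a missing column the same way as an empty cell.
        value = row[column_index] if column_index < len(row) else ""
        try:
            int(value)
        except ValueError:
            bad.append((line_number, value))
    return bad


# Hypothetical sample: a header line plus one blank Year value.
sample = io.StringIO("Year,Month,DepDelay\n2008,1,5\n,2,7\n2008,3,9\n")
print(find_bad_int_cells(sample, column_index=0))
```

Running the same function over each CSV file (opened with open(path, newline="")) would list every header line and blank value that an integer parser is likely to reject.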



I have also attached my tajo-site.xml file.
Please check whether my configuration is correct.

hostname   | hadoop              | tajo   | DBMS
namenode   | namenode            | master | Maria (Metastore)
snamenode  | snamenode+datanode1 | worker |
datanode02 | datanode2           | worker |
datanode03 | datanode3           | worker |

-----------------------------------------------------------------------------
<configuration>
<property>
    <name>tajo.rootdir</name>
    <value>hdfs://namenode:9000/tajo</value>
</property>
<property>
    <name>tajo.master.umbilical-rpc.address</name>
    <value>namenode:26001</value>
</property>
<property>
    <name>tajo.master.client-rpc.address</name>
    <value>namenode:26002</value>
</property>
<property>
    <name>tajo.master.info-http.address</name>
    <value>namenode:26080</value>
</property>
<property>
    <name>tajo.catalog.client-rpc.address</name>
    <value>namenode:26005</value>
</property>
</configuration>
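
As a rough sanity check of a file like the one above, the sketch below reads the name/value pairs and confirms that every "*.address" entry has a numeric port and that no two entries share the same host:port. The helper names are hypothetical and the embedded XML is a shortened copy of the config above. It does not verify that the property names are the ones Tajo expects.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<configuration>
<property><name>tajo.rootdir</name><value>hdfs://namenode:9000/tajo</value></property>
<property><name>tajo.master.umbilical-rpc.address</name><value>namenode:26001</value></property>
<property><name>tajo.master.client-rpc.address</name><value>namenode:26002</value></property>
<property><name>tajo.catalog.client-rpc.address</name><value>namenode:26005</value></property>
</configuration>"""


def read_properties(xml_text):
    """Map each <name> to its <value> in a Hadoop-style configuration document."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name", default="").strip(): prop.findtext("value", default="").strip()
        for prop in root.findall("property")
    }


def rpc_endpoints(props):
    """Split every '*.address' value into (host, port); int() raises on a bad port."""
    endpoints = {}
    for name, value in props.items():
        if name.endswith(".address"):
            host, _, port = value.rpartition(":")
            endpoints[name] = (host, int(port))
    return endpoints


def shared_endpoints(endpoints):
    """Return the host:port pairs that are claimed by more than one property."""
    seen = {}
    for name, endpoint in endpoints.items():
        seen.setdefault(endpoint, []).append(name)
    return {ep: names for ep, names in seen.items() if len(names) > 1}


endpoints = rpc_endpoints(read_properties(SAMPLE))
print(endpoints)
print(shared_endpoints(endpoints))
```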


Regards,
Jae


2013/11/11 Jihoon Son <ghoonson@gmail.com>

> Hi Jae Lee,
> Thanks for your interest in Tajo.
>
> Here are my answers.
>
> 1. The timeout message looks like an error, but it does not mean that the
> query has failed. (We should change the message.)
> Could you please wait a while after executing a query?
> If any other errors occur, please report them to us.
>
> 2. Tajo's SQL commands are designed to follow those of traditional
> relational databases.
> In those systems, the 'DROP table' command deletes data from disks.
> However, we are also considering the Hive-style 'DROP table', because
> tables are generally very large.
>
> 3. Tajo currently does not provide any commands to kill executing queries.
> Instead, you should kill the master and every worker using the unix 'kill'
> command.
>
> If you have any other questions,
> please feel free to ask us.
>
> Thanks,
> Jihoon
>
>
> 2013/11/11 Jae Lee <otooiland@gmail.com>
>
> > Hello,
> >
> > :: I get an error message and a hanging query, as shown below.
> > It is from a clustered Tajo worker.
> > CentOS 6.2 + Hadoop 2.0.5 + Tajo 0.2.0
> > A plain count(*) query works, but queries that use DISTINCT or GROUP BY
> > hang and produce this error message.
> >
> > :: I have another question.
> > Tajo deletes the files on HDFS when I drop an EXTERNAL table. Is that normal?
> > Hive does not delete files when an external table is dropped.
> >
> > :: How can I kill Tajo jobs (queries)?
> >
> > ---------------------------------------------------------------------
> > 2013-11-11 18:44:22,751 WARN  worker.TaskRunner (TaskRunner.java:run(339)) - Timeout GetTask:eb_1384155011466_0005_000001,container_1384155011466_0005_01_000013, but retry
> > java.util.concurrent.TimeoutException
> >     at org.apache.tajo.rpc.CallFuture.get(CallFuture.java:81)
> >     at org.apache.tajo.worker.TaskRunner$1.run(TaskRunner.java:328)
> >     at java.lang.Thread.run(Thread.java:744)
> >
> >
> > Regards,
> > Jae
> >
>
>
>
> --
> Jihoon Son
>
> Database & Information Systems Group,
> Prof. Yon Dohn Chung Lab.
> Dept. of Computer Science & Engineering,
> Korea University
> 1, 5-ga, Anam-dong, Seongbuk-gu,
> Seoul, 136-713, Republic of Korea
>
> Tel : +82-2-3290-3580
> E-mail : jihoonson@korea.ac.kr
>
