tajo-dev mailing list archives

From Jihoon Son <jihoon...@apache.org>
Subject Re: Feedback for tajo-0.10.0
Date Mon, 16 Mar 2015 09:35:33 GMT
Honestly, even if there are sufficiently many input files, Impala may still
be faster than Tajo because its optimization for Parquet is better than
Tajo's.
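
To make the task-count reasoning quoted below concrete, here is a rough
sketch. It assumes a simplified model, not Tajo's actual scheduler: a
non-splittable file yields exactly one scan task, and a splittable file
yields one task per split. The function names and the 128MB split size are
hypothetical; the file sizes and the 9 workers come from the listing and
cluster description later in this thread.

```python
import math

# Sizes (bytes) of the four Parquet files listed in the thread.
PARQUET_FILE_SIZES = [1062932057, 1063205684, 1063236005, 543786632]
NUM_WORKERS = 9  # "1 tajo-master, 9 tajo-worker"


def scan_tasks(file_sizes, split_size=None):
    """Number of scan tasks under the simplified model.

    split_size=None models a non-splittable format: one task per file.
    Otherwise each file is cut into ceil(size / split_size) tasks.
    """
    if split_size is None:
        return len(file_sizes)
    return sum(math.ceil(size / split_size) for size in file_sizes)


def busy_workers(num_tasks, num_workers=NUM_WORKERS):
    """Workers that can have at least one scan task at the same time."""
    return min(num_tasks, num_workers)


if __name__ == "__main__":
    unsplit = scan_tasks(PARQUET_FILE_SIZES)
    split = scan_tasks(PARQUET_FILE_SIZES, split_size=128 * 1024 * 1024)
    print(unsplit, busy_workers(unsplit))  # 4 tasks, 4 busy workers
    print(split, busy_workers(split))      # 29 tasks, 9 busy workers
```

Under this model the 20 RCFile files give 20 tasks, enough to keep all 9
workers busy, while the 4 Parquet files leave 5 workers without scan work.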

On Mon, Mar 16, 2015 at 6:16 PM Jihoon Son <jihoonson@apache.org> wrote:

> Interesting.
> Here are some reasons I can think of.
>
> Impala can process Parquet files in a distributed manner even if their
> size is very large. This is because of its DBMS-like architecture. Impala
> has a storage manager and a query executor, which have completely separate
> roles during query processing. Once a query is executed, the storage
> manager continuously loads data from HDFS into memory. Query executors just
> process data loaded in memory according to the query plan. As a result,
> even though the number of data files is small, a large number of query
> executors can process them simultaneously.
>
> However, in Tajo, each query executor (i.e., task) has a scanner to read
> data from HDFS. During query processing, their roles are closely related.
> Thus, the number of tasks is mainly decided based on the number of files.
> (Of course, when input data are spread across all cluster nodes, the
> number of tasks is decided based on the number of CPU cores and disks per
> worker.)
>
> So, I expect that Tajo's performance with Parquet will be improved when
> there are sufficiently many input files.
> As I mentioned, this is just my suspicion.
> I will investigate further.
>
> Thanks,
> Jihoon
>
>
> On Mon, Mar 16, 2015 at 5:45 PM Azuryy Yu <azuryyyu@gmail.com> wrote:
>
>> HDFS block size is also 1GB
>>
>>
>> On Mon, Mar 16, 2015 at 4:18 PM, Jihoon Son <jihoonson@apache.org> wrote:
>>
>> > Right. A large file size can improve sequential scan performance on
>> > Parquet. However, if you want to use a large file size, it is
>> > recommended to also increase the HDFS block size to reduce the remote
>> > read cost. How large did you set the HDFS block size?
>> >
>> > I will also investigate Impala's good performance.
>> > It seems to be related to Impala's storage manager.
>> >
>> > Best,
>> > Jihoon
>> >
>> > On Mon, Mar 16, 2015 at 5:05 PM Azuryy Yu <azuryyyu@gmail.com> wrote:
>> >
>> > > Hi Jihoon,
>> > >
>> > > Impala is faster on Parquet than on other file formats, and Impala
>> > > advises against creating many small Parquet files. 1GB would be better.
>> > >
>> > >
>> > >
>> > >
>> > > On Mon, Mar 16, 2015 at 3:57 PM, Jihoon Son <jihoonson@apache.org> wrote:
>> > >
>> > > > Thanks!
>> > > > It is really interesting.
>> > > > I suspect that the large file size of Parquet makes Tajo slower. This
>> > > > is because Parquet is non-splittable, which means that only 4 workers
>> > > > read data from HDFS. In addition, if the HDFS block size is smaller
>> > > > than 1GB, a lot of data can be moved over the network during the scan
>> > > > phase.
>> > > >
>> > > > But I have no idea why Impala shows good performance.
>> > > > Maybe its caching scheme improves it.
>> > > >
>> > > > Best regards,
>> > > > Jihoon
>> > > >
>> > > > On Mon, Mar 16, 2015 at 4:16 PM Azuryy Yu <azuryyyu@gmail.com> wrote:
>> > > >
>> > > > > PS. My Parquet data was generated by Impala: "Insert into a parquet
>> > > > > table [SHUFFLE] ... AS select .... from a text table"
>> > > > >
>> > > > > On Mon, Mar 16, 2015 at 3:11 PM, Azuryy Yu <azuryyyu@gmail.com> wrote:
>> > > > >
>> > > > > > Hi Jihoon,
>> > > > > >
>> > > > > > Here is an example:
>> > > > > > My data: (each Parquet file is limited to 1GB)
>> > > > > >  hadoop fs -ls /data/basetable/par/dt=20150301/pf=pc
>> > > > > >
>> > > > > > -rw-r--r--   9 hadoop tajo 1062932057 2015-03-12 15:08
>> > > > > > /data/basetable/par/dt=20150301/pf=pc/cc456c9d427c88a3-3ead7e35ecf0da8_448517166_data.0.parq
>> > > > > > -rw-r--r--   9 hadoop tajo 1063205684 2015-03-12 15:11
>> > > > > > /data/basetable/par/dt=20150301/pf=pc/cc456c9d427c88a3-3ead7e35ecf0da8_448517166_data.1.parq
>> > > > > > -rw-r--r--   9 hadoop tajo 1063236005 2015-03-12 15:14
>> > > > > > /data/basetable/par/dt=20150301/pf=pc/cc456c9d427c88a3-3ead7e35ecf0da8_448517166_data.2.parq
>> > > > > > -rw-r--r--   9 hadoop tajo  543786632 2015-03-12 15:16
>> > > > > > /data/basetable/par/dt=20150301/pf=pc/cc456c9d427c88a3-3ead7e35ecf0da8_448517166_data.3.parq
>> > > > > >
>> > > > > > hadoop fs -ls /data/basetable/snappy/dt=20150301/pf=pc
>> > > > > >
>> > > > > > -rw-r--r--   9 tajo tajo  144059045 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00000
>> > > > > > -rw-r--r--   9 tajo tajo  144178118 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00001
>> > > > > > -rw-r--r--   9 tajo tajo  143642438 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00002
>> > > > > > -rw-r--r--   9 tajo tajo  143553142 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00003
>> > > > > > -rw-r--r--   9 tajo tajo  143849627 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00004
>> > > > > > -rw-r--r--   9 tajo tajo  144648456 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00005
>> > > > > > -rw-r--r--   9 tajo tajo  144647502 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00006
>> > > > > > -rw-r--r--   9 tajo tajo  144551053 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00007
>> > > > > > -rw-r--r--   9 tajo tajo  144017287 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00008
>> > > > > > -rw-r--r--   9 tajo tajo  144205111 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00009
>> > > > > > -rw-r--r--   9 tajo tajo  145066506 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00010
>> > > > > > -rw-r--r--   9 tajo tajo  144740791 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00011
>> > > > > > -rw-r--r--   9 tajo tajo  144198266 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00012
>> > > > > > -rw-r--r--   9 tajo tajo  143575440 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00013
>> > > > > > -rw-r--r--   9 tajo tajo  143922343 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00014
>> > > > > > -rw-r--r--   9 tajo tajo  143930019 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00015
>> > > > > > -rw-r--r--   9 tajo tajo  144253019 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00016
>> > > > > > -rw-r--r--   9 tajo tajo  144175506 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00017
>> > > > > > -rw-r--r--   9 tajo tajo  143072995 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00018
>> > > > > > -rw-r--r--   9 tajo tajo  143818118 2015-03-16 11:48
>> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00019
>> > > > > >
>> > > > > > Result:
>> > > > > >
>> > > > > > default> select sum (cast(movie_vv as bigint)), sum(cast(movie_cv as
>> > > > > > bigint)),sum(cast(movie_pt as bigint)) from snappy where pf='pc';
>> > > > > > Progress: 19%, response time: 1.87 sec
>> > > > > > Progress: 19%, response time: 1.873 sec
>> > > > > > Progress: 19%, response time: 2.276 sec
>> > > > > > Progress: 100%, response time: 2.372 sec
>> > > > > > ?sum_3,  ?sum_4,  ?sum_5
>> > > > > > -------------------------------
>> > > > > > 6928463,  6183665,  6055494385
>> > > > > > (1 rows, 2.372 sec, 27 B selected)
>> > > > > > default> select sum (cast(movie_vv as bigint)), sum(cast(movie_cv as
>> > > > > > bigint)),sum(cast(movie_pt as bigint)) from par where pf='pc';
>> > > > > > Progress: 0%, response time: 0.751 sec
>> > > > > > Progress: 0%, response time: 0.753 sec
>> > > > > > Progress: 0%, response time: 1.155 sec
>> > > > > > Progress: 0%, response time: 1.959 sec
>> > > > > > Progress: 0%, response time: 2.962 sec
>> > > > > > Progress: 0%, response time: 3.965 sec
>> > > > > > Progress: 0%, response time: 4.968 sec
>> > > > > > Progress: 0%, response time: 5.97 sec
>> > > > > > Progress: 12%, response time: 6.974 sec
>> > > > > > Progress: 12%, response time: 7.977 sec
>> > > > > > Progress: 12%, response time: 8.979 sec
>> > > > > > Progress: 12%, response time: 9.982 sec
>> > > > > > Progress: 25%, response time: 10.985 sec
>> > > > > > Progress: 100%, response time: 11.14 sec
>> > > > > > ?sum_3,  ?sum_4,  ?sum_5
>> > > > > > -------------------------------
>> > > > > > 6928463,  6183665,  6055494385
>> > > > > > (1 rows, 11.14 sec, 27 B selected)
>> > > > > >
>> > > > > > On Mon, Mar 16, 2015 at 2:58 PM, Jihoon Son <jihoonson@apache.org> wrote:
>> > > > > >
>> > > > > >> Azuryy, thanks for your feedback.
>> > > > > >> These are very interesting results.
>> > > > > >> Would you mind telling me how Tajo with Parquet is slower than Tajo
>> > > > > >> with RCFile?
>> > > > > >>
>> > > > > >> Thanks,
>> > > > > >> Jihoon
>> > > > > >>
>> > > > > >> On Mon, Mar 16, 2015 at 3:39 PM Hyunsik Choi <hyunsik@apache.org> wrote:
>> > > > > >>
>> > > > > >> > Hi Azuryy,
>> > > > > >> >
>> > > > > >> > Thanks for sharing the test results. They are very inspiring to us.
>> > > > > >> > Also, I'll file some JIRA issues for the problems that you found.
>> > > > > >> >
>> > > > > >> > Best regards,
>> > > > > >> > Hyunsik
>> > > > > >> >
>> > > > > >> > On Sun, Mar 15, 2015 at 10:58 PM, Azuryy Yu <azuryyyu@gmail.com> wrote:
>> > > > > >> > > Another correction:
>> > > > > >> > > My comparison of Impala-2.1.2 and Tajo-0.10.0 was unfair, because I
>> > > > > >> > > used Parquet with Impala and RCFILE (snappy) with Tajo. I should use
>> > > > > >> > > the same file format for both.
>> > > > > >> > >
>> > > > > >> > > Since I have already reached a clear conclusion that Impala works
>> > > > > >> > > better on Parquet than Tajo, I used RCFILE as the test data.
>> > > > > >> > >
>> > > > > >> > > *Tajo*:
>> > > > > >> > > default> select sum (cast(movie_vv as bigint)), sum(cast(movie_cv as
>> > > > > >> > > bigint)),sum(cast(movie_pt as bigint)) from snappy;
>> > > > > >> > > Progress: 0%, response time: 1.598 sec
>> > > > > >> > > Progress: 0%, response time: 1.6 sec
>> > > > > >> > > Progress: 0%, response time: 2.003 sec
>> > > > > >> > > Progress: 0%, response time: 2.806 sec
>> > > > > >> > > Progress: 37%, response time: 3.808 sec
>> > > > > >> > > Progress: 100%, response time: 4.792 sec
>> > > > > >> > > ?sum_3,  ?sum_4,  ?sum_5
>> > > > > >> > > -------------------------------
>> > > > > >> > > 22557920,  19648838,  2005366694576
>> > > > > >> > > (1 rows, 4.792 sec, 32 B selected)
>> > > > > >> > >
>> > > > > >> > > *Impala*:
>> > > > > >> > >  > select sum (cast(movie_vv as bigint)), sum(cast(movie_cv as
>> > > > > >> > > bigint)),sum(cast(movie_pt as bigint)) from snappy;
>> > > > > >> > > +-------------------------------+-------------------------------+-------------------------------+
>> > > > > >> > > | sum(cast(movie_vv as bigint)) | sum(cast(movie_cv as bigint)) | sum(cast(movie_pt as bigint)) |
>> > > > > >> > > +-------------------------------+-------------------------------+-------------------------------+
>> > > > > >> > > | 22557920                      | 19648838                      | 2005366694576                 |
>> > > > > >> > > +-------------------------------+-------------------------------+-------------------------------+
>> > > > > >> > > Fetched 1 row(s) in 11.12s
>> > > > > >> > >
>> > > > > >> > >
>> > > > > >> > >
>> > > > > >> > > On Mon, Mar 16, 2015 at 1:49 PM, Azuryy Yu <azuryyyu@gmail.com> wrote:
>> > > > > >> > >
>> > > > > >> > >> There is a typo in my email. I have corrected it here:
>> > > > > >> > >>
>> > > > > >> > >> for example:
>> > > > > >> > >>
>> > > > > >> > >>   <property>
>> > > > > >> > >>     <name>tajo.master.umbilical-rpc.address</name>
>> > > > > >> > >>     <value>1-1-1-1:26001</value>
>> > > > > >> > >>   </property>
>> > > > > >> > >>
>> > > > > >> > >> which works under tajo-0.9.0, but tajo-0.10.0 complains that
>> > > > > >> > >> "1-1-1-1:2601" is not a valid network address.
>> > > > > >> > >>
>> > > > > >> > >> I have to change to:
>> > > > > >> > >>   <property>
>> > > > > >> > >>     <name>tajo.master.umbilical-rpc.address</name>
>> > > > > >> > >>     <value>1.1.1.1:26001</value>
>> > > > > >> > >>   </property>
>> > > > > >> > >>
>> > > > > >> > >>
>> > > > > >> > >> On Mon, Mar 16, 2015 at 1:44 PM, Azuryy Yu <azuryyyu@gmail.com> wrote:
>> > > > > >> > >>
>> > > > > >> > >>> Hi,
>> > > > > >> > >>> I compiled the tajo-0.10 source against hadoop-2.6.0, and I am
>> > > > > >> > >>> posting some feedback here.
>> > > > > >> > >>>
>> > > > > >> > >>> My cluster:
>> > > > > >> > >>> 1 tajo-master, 9 tajo-worker
>> > > > > >> > >>> 24 CPU(logic), 64GB mem, 4TB*12 HDD
>> > > > > >> > >>>
>> > > > > >> > >>> Feedback:
>> > > > > >> > >>> 1) The task progress estimate is now correct on partitioned tables;
>> > > > > >> > >>> it was sometimes incorrect in tajo-0.9.0.
>> > > > > >> > >>> 2) Tajo configuration doesn't support hostnames in tajo-site.xml.
>> > > > > >> > >>> for example:
>> > > > > >> > >>>
>> > > > > >> > >>>   <property>
>> > > > > >> > >>>     <name>tajo.master.umbilical-rpc.address</name>
>> > > > > >> > >>>     <value>1-1-1-1:26001</value>
>> > > > > >> > >>>   </property>
>> > > > > >> > >>>
>> > > > > >> > >>> which works under tajo-0.9.0, but now it complains that
>> > > > > >> > >>> "1-1-1-1:2601" is not a valid network address.
>> > > > > >> > >>>
>> > > > > >> > >>> I have to change to:
>> > > > > >> > >>>   <property>
>> > > > > >> > >>>     <name>tajo.master.umbilical-rpc.address</name>
>> > > > > >> > >>>     <value>1.1.1.1:26001</value>
>> > > > > >> > >>>   </property>
>> > > > > >> > >>>
>> > > > > >> > >>> but we don't use IP addresses in our cluster, only hostnames, so I
>> > > > > >> > >>> made a small change in the code:
>> > > > > >> > >>> org.apache.tajo.validation.NetworkAddressValidator.java:
>> > > > > >> > >>> hostnamePattern = Pattern.compile("\\d*-\\d*-\\d*-\\d");
>> > > > > >> > >>> then it works.
>> > > > > >> > >>>
>> > > > > >> > >>> 3) I ran some tests on Parquet, RCFILE (snappy compressed), and
>> > > > > >> > >>> RCFILE (GZIP compressed).
>> > > > > >> > >>>
>> > > > > >> > >>> They contain the same data and differ only in file format.
>> > > > > >> > >>> The table has six partitions and 20 RCFILES; each Parquet file is
>> > > > > >> > >>> 1GB.
>> > > > > >> > >>>
>> > > > > >> > >>> RCFILE with snappy performs similarly to RCFILE with gzip,
>> > > > > >> > >>> but both are two to three times faster than Parquet.
>> > > > > >> > >>>
>> > > > > >> > >>> 4) I compared tajo-0.10 and Impala-2.1.2.
>> > > > > >> > >>> Impala provides very good support for Parquet, much better than
>> > > > > >> > >>> Tajo.
>> > > > > >> > >>>
>> > > > > >> > >>> But Impala is *slower* than Tajo with other formats,
>> > > > > >> > >>> such as (I don't use WHERE because I want to query all six
>> > > > > >> > >>> partitions together):
>> > > > > >> > >>>
>> > > > > >> > >>> *Impala*:
>> > > > > >> > >>>  > select sum (cast(movie_vv as bigint)), sum(cast(movie_cv as
>> > > > > >> > >>> bigint)),sum(cast(movie_pt as bigint)) from par;
>> > > > > >> > >>>
>> > > > > >> > >>> +-------------------------------+-------------------------------+-------------------------------+
>> > > > > >> > >>> | sum(cast(movie_vv as bigint)) | sum(cast(movie_cv as bigint)) | sum(cast(movie_pt as bigint)) |
>> > > > > >> > >>> +-------------------------------+-------------------------------+-------------------------------+
>> > > > > >> > >>> | 22557920                      | 19648838                      | 2005366694576                 |
>> > > > > >> > >>> +-------------------------------+-------------------------------+-------------------------------+
>> > > > > >> > >>> Fetched 1 row(s) in 6.02s
>> > > > > >> > >>>
>> > > > > >> > >>> *Tajo:*
>> > > > > >> > >>>
>> > > > > >> > >>> default> select sum (cast(movie_vv as bigint)), sum(cast(movie_cv as
>> > > > > >> > >>> bigint)),sum(cast(movie_pt as bigint)) from snappy;
>> > > > > >> > >>> Progress: 0%, response time: 1.598 sec
>> > > > > >> > >>> Progress: 0%, response time: 1.6 sec
>> > > > > >> > >>> Progress: 0%, response time: 2.003 sec
>> > > > > >> > >>> Progress: 0%, response time: 2.806 sec
>> > > > > >> > >>> Progress: 37%, response time: 3.808 sec
>> > > > > >> > >>> Progress: 100%, response time: 4.792 sec
>> > > > > >> > >>> ?sum_3,  ?sum_4,  ?sum_5
>> > > > > >> > >>> -------------------------------
>> > > > > >> > >>> 22557920,  19648838,  2005366694576
>> > > > > >> > >>> (1 rows, 4.792 sec, 32 B selected)
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>
>> > > > > >> >
>> > > > > >>
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
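
For the address validation problem reported at the start of this thread, a
more general check than a pattern tailored to one naming scheme could look
like the sketch below. It is written in Python for brevity and is not the
code of Tajo's NetworkAddressValidator; the function name and the choice of
RFC 1123 style labels (letters, digits and hyphens, with no leading or
trailing hyphen) are assumptions made for illustration.

```python
import re

# One hostname label: starts and ends with a letter or digit, hyphens
# allowed in between, at most 63 characters.
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"

# host:port, where host is one or more labels separated by dots.
_HOST_PORT = re.compile(
    rf"(?P<host>{_LABEL}(?:\.{_LABEL})*):(?P<port>\d{{1,5}})"
)


def is_valid_address(address: str) -> bool:
    """Return True if address looks like 'host:port' with a usable port."""
    match = _HOST_PORT.fullmatch(address)
    if match is None:
        return False
    return 0 < int(match.group("port")) <= 65535


if __name__ == "__main__":
    for candidate in ("1-1-1-1:26001", "1.1.1.1:26001", "-bad-:26001"):
        print(candidate, is_valid_address(candidate))
```

With this pattern both "1-1-1-1:26001" and "1.1.1.1:26001" are accepted,
so a hostname made of digits and hyphens does not need special handling.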
