tajo-dev mailing list archives

From Azuryy Yu <azury...@gmail.com>
Subject Re: Feedback for tajo-0.10.0
Date Mon, 16 Mar 2015 09:46:47 GMT
Is it possible for Tajo to reuse the Parquet bundle from Impala?


On Mon, Mar 16, 2015 at 5:35 PM, Jihoon Son <jihoonson@apache.org> wrote:

> Honestly, even if there are sufficiently many input files, Impala may be
> faster than Tajo because its optimization for Parquet is better than that
> of Tajo.
>
> On Mon, Mar 16, 2015 at 6:16 PM Jihoon Son <jihoonson@apache.org> wrote:
>
> > Interesting.
> > Here are some reasons I can think of.
> >
> > Impala can process Parquet files in a distributed manner even if their
> > size is very large. This is because of its DBMS-like architecture. Impala
> > has a storage manager and a query executor, which have totally separate
> > roles during query processing. Once a query is executed, the storage
> > manager continuously loads data from HDFS into memory. Query executors just
> > process data loaded in memory according to the query plan. As a result,
> > even though the number of data files is small, a large number of query
> > executors can process them simultaneously.
> >
> > However, in Tajo, each query executor (i.e., task) has a scanner to read
> > data from HDFS. During query processing, their roles are closely coupled.
> > Thus, the number of tasks is mainly decided based on the number of files.
> > (Of course, when input data are dispersed across the entire cluster, the
> > number of tasks is decided based on the number of CPU cores and disks per
> > worker.)
> >
> > So, I expect that Tajo's performance with Parquet will be improved when
> > there are sufficiently many input files.
> > As I mentioned, this is just my suspicion.
> > I will investigate further.
> >
> > Thanks,
> > Jihoon
> >
> >
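The reasoning in the message above (one scanner per task, task count driven mainly by file count) can be sketched as simple arithmetic. This is only an illustration under the thread's stated assumptions, not Tajo's scheduler: the class and method names are hypothetical, and treating each logical CPU as one task slot is an assumption made for the example.

```java
// Hypothetical helper, not part of Tajo: upper bound on concurrent scanners
// when every input file is read by exactly one task (non-splittable input).
final class ScanParallelism {
    private ScanParallelism() {}

    // inputFiles: number of non-splittable files in the scanned partition(s)
    // workers: number of worker nodes
    // slotsPerWorker: assumed number of tasks a worker can run at once
    static int concurrentScanners(int inputFiles, int workers, int slotsPerWorker) {
        if (inputFiles < 0 || workers <= 0 || slotsPerWorker <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        // Parallelism is capped by whichever is smaller: files or cluster slots.
        return Math.min(inputFiles, workers * slotsPerWorker);
    }

    public static void main(String[] args) {
        // Figures reported in this thread: 9 workers, 24 logical CPUs each,
        // 4 Parquet files versus 20 RCFiles in the pf=pc partition.
        System.out.println(concurrentScanners(4, 9, 24));
        System.out.println(concurrentScanners(20, 9, 24));
    }
}
```

Under these assumptions the Parquet partition keeps at most 4 scanners busy while the RCFile partition keeps 20 busy, which is consistent with the suspicion that more input files would help.
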
> > On Mon, Mar 16, 2015 at 5:45 PM Azuryy Yu <azuryyyu@gmail.com> wrote:
> >
> >> HDFS block size is also 1GB
> >>
> >>
> >> On Mon, Mar 16, 2015 at 4:18 PM, Jihoon Son <jihoonson@apache.org> wrote:
> >>
> >> > Right. A large file size can improve sequential scans on Parquet.
> >> > However, if you want to use a large file size, it is recommended to also
> >> > increase the HDFS block size to reduce the remote read cost.
> >> > What HDFS block size did you set?
> >> >
> >> > I will also investigate Impala's good performance.
> >> > It seems to be related to Impala's storage manager.
> >> >
> >> > Best,
> >> > Jihoon
> >> >
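The block-size advice above comes down to how many HDFS blocks one file spans: a reader that processes a whole file may find that some of those blocks have no local replica. A minimal sketch of that arithmetic follows; the class and method names are hypothetical, and 128MB is used only because it is the stock `dfs.blocksize` default in Hadoop 2.x.

```java
// Hypothetical helper: how many HDFS blocks a file of a given size occupies.
final class HdfsBlockMath {
    private HdfsBlockMath() {}

    static long blocksSpanned(long fileBytes, long blockBytes) {
        if (fileBytes < 0 || blockBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        // Ceiling division without floating point.
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    public static void main(String[] args) {
        long parquetFile = 1062932057L;   // first Parquet file size listed in this thread
        long defaultBlock = 128L << 20;   // 128MB
        long largeBlock = 1L << 30;       // 1GB
        System.out.println(blocksSpanned(parquetFile, defaultBlock)); // 8
        System.out.println(blocksSpanned(parquetFile, largeBlock));   // 1
    }
}
```

With a 1GB block size each of the listed Parquet files fits in a single block, so a task reading one file needs only one block to be local.
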
> >> > On Mon, Mar 16, 2015 at 5:05 PM Azuryy Yu <azuryyyu@gmail.com> wrote:
> >> >
> >> > > Hi Jihoon,
> >> > >
> >> > > Impala works much faster on Parquet than on other file formats, and
> >> > > Impala advises against creating many small Parquet files; 1GB files
> >> > > would be better.
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On Mon, Mar 16, 2015 at 3:57 PM, Jihoon Son <jihoonson@apache.org> wrote:
> >> > >
> >> > > > Thanks!
> >> > > > It is really interesting.
> >> > > > I suspect that the large file size of Parquet makes Tajo slower. This is
> >> > > > because Parquet is non-splittable, which means that only 4 workers read
> >> > > > data from HDFS. In addition, if the HDFS block size is smaller than 1GB,
> >> > > > a lot of data can be moved over the network during the scan phase.
> >> > > >
> >> > > > But I have no idea why Impala shows good performance.
> >> > > > Maybe its caching scheme helps.
> >> > > >
> >> > > > Best regards,
> >> > > > Jihoon
> >> > > >
> >> > > > On Mon, Mar 16, 2015 at 4:16 PM Azuryy Yu <azuryyyu@gmail.com> wrote:
> >> > > >
> >> > > > > PS. My Parquet data was generated by Impala: "Insert into a parquet table
> >> > > > > [SHUFFLE] ... AS select .... from a text table"
> >> > > > >
> >> > > > > On Mon, Mar 16, 2015 at 3:11 PM, Azuryy Yu <azuryyyu@gmail.com> wrote:
> >> > > > >
> >> > > > > > Hi Jihoon,
> >> > > > > >
> >> > > > > > Here is an example:
> >> > > > > > My data (each Parquet file is limited to 1GB):
> >> > > > > >  hadoop fs -ls /data/basetable/par/dt=20150301/pf=pc
> >> > > > > >
> >> > > > > > -rw-r--r--   9 hadoop tajo 1062932057 2015-03-12 15:08 /data/basetable/par/dt=20150301/pf=pc/cc456c9d427c88a3-3ead7e35ecf0da8_448517166_data.0.parq
> >> > > > > > -rw-r--r--   9 hadoop tajo 1063205684 2015-03-12 15:11 /data/basetable/par/dt=20150301/pf=pc/cc456c9d427c88a3-3ead7e35ecf0da8_448517166_data.1.parq
> >> > > > > > -rw-r--r--   9 hadoop tajo 1063236005 2015-03-12 15:14 /data/basetable/par/dt=20150301/pf=pc/cc456c9d427c88a3-3ead7e35ecf0da8_448517166_data.2.parq
> >> > > > > > -rw-r--r--   9 hadoop tajo  543786632 2015-03-12 15:16 /data/basetable/par/dt=20150301/pf=pc/cc456c9d427c88a3-3ead7e35ecf0da8_448517166_data.3.parq
> >> > > > > >
> >> > > > > > hadoop fs -ls /data/basetable/snappy/dt=20150301/pf=pc
> >> > > > > >
> >> > > > > > -rw-r--r--   9 tajo tajo  144059045 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00000
> >> > > > > > -rw-r--r--   9 tajo tajo  144178118 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00001
> >> > > > > > -rw-r--r--   9 tajo tajo  143642438 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00002
> >> > > > > > -rw-r--r--   9 tajo tajo  143553142 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00003
> >> > > > > > -rw-r--r--   9 tajo tajo  143849627 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00004
> >> > > > > > -rw-r--r--   9 tajo tajo  144648456 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00005
> >> > > > > > -rw-r--r--   9 tajo tajo  144647502 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00006
> >> > > > > > -rw-r--r--   9 tajo tajo  144551053 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00007
> >> > > > > > -rw-r--r--   9 tajo tajo  144017287 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00008
> >> > > > > > -rw-r--r--   9 tajo tajo  144205111 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00009
> >> > > > > > -rw-r--r--   9 tajo tajo  145066506 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00010
> >> > > > > > -rw-r--r--   9 tajo tajo  144740791 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00011
> >> > > > > > -rw-r--r--   9 tajo tajo  144198266 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00012
> >> > > > > > -rw-r--r--   9 tajo tajo  143575440 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00013
> >> > > > > > -rw-r--r--   9 tajo tajo  143922343 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00014
> >> > > > > > -rw-r--r--   9 tajo tajo  143930019 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00015
> >> > > > > > -rw-r--r--   9 tajo tajo  144253019 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00016
> >> > > > > > -rw-r--r--   9 tajo tajo  144175506 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00017
> >> > > > > > -rw-r--r--   9 tajo tajo  143072995 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00018
> >> > > > > > -rw-r--r--   9 tajo tajo  143818118 2015-03-16 11:48
> >> > > > > > /data/basetable/snappy/dt=20150301/pf=pc/part-r-00019
> >> > > > > >
> >> > > > > > Result:
> >> > > > > >
> >> > > > > > default> select sum (cast(movie_vv as bigint)), sum(cast(movie_cv as
> >> > > > > > bigint)),sum(cast(movie_pt as bigint)) from snappy where pf='pc';
> >> > > > > > Progress: 19%, response time: 1.87 sec
> >> > > > > > Progress: 19%, response time: 1.873 sec
> >> > > > > > Progress: 19%, response time: 2.276 sec
> >> > > > > > Progress: 100%, response time: 2.372 sec
> >> > > > > > ?sum_3,  ?sum_4,  ?sum_5
> >> > > > > > -------------------------------
> >> > > > > > 6928463,  6183665,  6055494385
> >> > > > > > (1 rows, 2.372 sec, 27 B selected)
> >> > > > > > default> select sum (cast(movie_vv as bigint)), sum(cast(movie_cv as
> >> > > > > > bigint)),sum(cast(movie_pt as bigint)) from par where pf='pc';
> >> > > > > > Progress: 0%, response time: 0.751 sec
> >> > > > > > Progress: 0%, response time: 0.753 sec
> >> > > > > > Progress: 0%, response time: 1.155 sec
> >> > > > > > Progress: 0%, response time: 1.959 sec
> >> > > > > > Progress: 0%, response time: 2.962 sec
> >> > > > > > Progress: 0%, response time: 3.965 sec
> >> > > > > > Progress: 0%, response time: 4.968 sec
> >> > > > > > Progress: 0%, response time: 5.97 sec
> >> > > > > > Progress: 12%, response time: 6.974 sec
> >> > > > > > Progress: 12%, response time: 7.977 sec
> >> > > > > > Progress: 12%, response time: 8.979 sec
> >> > > > > > Progress: 12%, response time: 9.982 sec
> >> > > > > > Progress: 25%, response time: 10.985 sec
> >> > > > > > Progress: 100%, response time: 11.14 sec
> >> > > > > > ?sum_3,  ?sum_4,  ?sum_5
> >> > > > > > -------------------------------
> >> > > > > > 6928463,  6183665,  6055494385
> >> > > > > > (1 rows, 11.14 sec, 27 B selected)
> >> > > > > >
> >> > > > > > On Mon, Mar 16, 2015 at 2:58 PM, Jihoon Son <jihoonson@apache.org> wrote:
> >> > > > > >
> >> > > > > >> Azuryy, thanks for your feedback.
> >> > > > > >> These are very interesting results.
> >> > > > > >> Would you mind showing me how Tajo with Parquet is slower than Tajo with
> >> > > > > >> RCFile?
> >> > > > > >>
> >> > > > > >> Thanks,
> >> > > > > >> Jihoon
> >> > > > > >>
> >> > > > > >> On Mon, Mar 16, 2015 at 3:39 PM Hyunsik Choi <hyunsik@apache.org> wrote:
> >> > > > > >>
> >> > > > > >> > Hi Azuryy,
> >> > > > > >> >
> >> > > > > >> > Thanks for sharing the test results. They are very inspiring to us.
> >> > > > > >> > Also, I'll file some JIRA issues for the problems that you found.
> >> > > > > >> > Best regards,
> >> > > > > >> > Hyunsik
> >> > > > > >> >
> >> > > > > >> > On Sun, Mar 15, 2015 at 10:58 PM, Azuryy Yu <azuryyyu@gmail.com> wrote:
> >> > > > > >> > > Another correction:
> >> > > > > >> > > My test was an unfair comparison of Impala-2.1.2 and Tajo-0.10.0,
> >> > > > > >> > > because I used Parquet with Impala and RCFILE (snappy) with Tajo. I
> >> > > > > >> > > should use the same file format for both.
> >> > > > > >> > >
> >> > > > > >> > > Since I've already reached a clear conclusion that Impala works better
> >> > > > > >> > > on Parquet than Tajo, I use RCFILE as the test data.
> >> > > > > >> > >
> >> > > > > >> > > *Tajo*:
> >> > > > > >> > > default> select sum (cast(movie_vv as bigint)), sum(cast(movie_cv as
> >> > > > > >> > > bigint)),sum(cast(movie_pt as bigint)) from snappy;
> >> > > > > >> > > Progress: 0%, response time: 1.598 sec
> >> > > > > >> > > Progress: 0%, response time: 1.6 sec
> >> > > > > >> > > Progress: 0%, response time: 2.003 sec
> >> > > > > >> > > Progress: 0%, response time: 2.806 sec
> >> > > > > >> > > Progress: 37%, response time: 3.808 sec
> >> > > > > >> > > Progress: 100%, response time: 4.792 sec
> >> > > > > >> > > ?sum_3,  ?sum_4,  ?sum_5
> >> > > > > >> > > -------------------------------
> >> > > > > >> > > 22557920,  19648838,  2005366694576
> >> > > > > >> > > (1 rows, 4.792 sec, 32 B selected)
> >> > > > > >> > >
> >> > > > > >> > > *Impala*:
> >> > > > > >> > >  > select sum (cast(movie_vv as bigint)), sum(cast(movie_cv as
> >> > > > > >> > > bigint)),sum(cast(movie_pt as bigint)) from snappy;
> >> > > > > >> > > +-------------------------------+-------------------------------+-------------------------------+
> >> > > > > >> > > | sum(cast(movie_vv as bigint)) | sum(cast(movie_cv as bigint)) | sum(cast(movie_pt as bigint)) |
> >> > > > > >> > > +-------------------------------+-------------------------------+-------------------------------+
> >> > > > > >> > > | 22557920                      | 19648838                      | 2005366694576                 |
> >> > > > > >> > > +-------------------------------+-------------------------------+-------------------------------+
> >> > > > > >> > > Fetched 1 row(s) in 11.12s
> >> > > > > >> > >
> >> > > > > >> > >
> >> > > > > >> > >
> >> > > > > >> > > On Mon, Mar 16, 2015 at 1:49 PM, Azuryy Yu <azuryyyu@gmail.com> wrote:
> >> > > > > >> > >
> >> > > > > >> > >> There was a typo in my email. Here is the corrected version:
> >> > > > > >> > >>
> >> > > > > >> > >> for example:
> >> > > > > >> > >>
> >> > > > > >> > >>   <property>
> >> > > > > >> > >>     <name>tajo.master.umbilical-rpc.address</name>
> >> > > > > >> > >>     <value>1-1-1-1:26001</value>
> >> > > > > >> > >>   </property>
> >> > > > > >> > >>
> >> > > > > >> > >> which works under tajo-0.9.0, but tajo-0.10.0 complains that
> >> > > > > >> > >> "1-1-1-1:26001" is not a valid network address.
> >> > > > > >> > >>
> >> > > > > >> > >> I have to change to:
> >> > > > > >> > >>   <property>
> >> > > > > >> > >>     <name>tajo.master.umbilical-rpc.address</name>
> >> > > > > >> > >>     <value>1.1.1.1:26001</value>
> >> > > > > >> > >>   </property>
> >> > > > > >> > >>
> >> > > > > >> > >>
> >> > > > > >> > >> On Mon, Mar 16, 2015 at 1:44 PM, Azuryy Yu <azuryyyu@gmail.com> wrote:
> >> > > > > >> > >>
> >> > > > > >> > >>> Hi,
> >> > > > > >> > >>> I compiled the tajo-0.10 source against hadoop-2.6.0, and here is some
> >> > > > > >> > >>> feedback.
> >> > > > > >> > >>>
> >> > > > > >> > >>> My cluster:
> >> > > > > >> > >>> 1 tajo-master, 9 tajo-worker
> >> > > > > >> > >>> 24 CPUs (logical), 64GB memory, 4TB*12 HDD
> >> > > > > >> > >>>
> >> > > > > >> > >>> Feedback:
> >> > > > > >> > >>> 1) Tajo's task progress estimate now works properly on partitioned
> >> > > > > >> > >>> tables; it was sometimes incorrect in tajo-0.9.0.
> >> > > > > >> > >>> 2) Tajo configuration doesn't support hostnames in tajo-site.xml.
> >> > > > > >> > >>> For example:
> >> > > > > >> > >>>
> >> > > > > >> > >>>   <property>
> >> > > > > >> > >>>     <name>tajo.master.umbilical-rpc.address</name>
> >> > > > > >> > >>>     <value>1-1-1-1:26001</value>
> >> > > > > >> > >>>   </property>
> >> > > > > >> > >>>
> >> > > > > >> > >>> which works under tajo-0.9.0, but now it complains that
> >> > > > > >> > >>> "1-1-1-1:26001" is not a valid network address.
> >> > > > > >> > >>>
> >> > > > > >> > >>> I have to change to:
> >> > > > > >> > >>>   <property>
> >> > > > > >> > >>>     <name>tajo.master.umbilical-rpc.address</name>
> >> > > > > >> > >>>     <value>1.1.1.1:26001</value>
> >> > > > > >> > >>>   </property>
> >> > > > > >> > >>>
> >> > > > > >> > >>> But we don't use IP addresses in our cluster, only hostnames, so I made
> >> > > > > >> > >>> a small change in the code:
> >> > > > > >> > >>> org.apache.tajo.validation.NetworkAddressValidator.java:
> >> > > > > >> > >>> hostnamePattern = Pattern.compile("\\d*-\\d*-\\d*-\\d");
> >> > > > > >> > >>> Then it works.
> >> > > > > >> > >>>
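The pattern in the workaround above only accepts names shaped like 1-1-1-1. As a more general illustration, a host:port check could accept any RFC 1123 style hostname. The sketch below is an assumption-laden example, not the actual code of Tajo's NetworkAddressValidator; the class name, method name, and regular expression are all hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical validator, not Tajo's implementation: accepts "host:port"
// where host is a dot-separated list of RFC 1123 labels.
final class HostPortCheck {
    // One label: letters, digits, and hyphens; no leading or trailing hyphen;
    // at most 63 characters.
    private static final String LABEL =
        "[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?";

    // Group 1 is the host, group 2 is the port digits.
    private static final Pattern HOST_PORT =
        Pattern.compile("^(" + LABEL + "(?:\\." + LABEL + ")*):(\\d{1,5})$");

    private HostPortCheck() {}

    static boolean isValid(String address) {
        if (address == null) {
            return false;
        }
        Matcher m = HOST_PORT.matcher(address);
        if (!m.matches() || m.group(1).length() > 253) {
            return false;
        }
        int port = Integer.parseInt(m.group(2));
        return port >= 1 && port <= 65535;
    }

    public static void main(String[] args) {
        System.out.println(isValid("1-1-1-1:26001"));  // hostname form from this thread
        System.out.println(isValid("1.1.1.1:26001"));  // dotted form from this thread
        System.out.println(isValid("bad_host:26001")); // underscore is not allowed
    }
}
```

Because digits-only labels are legal in this grammar, the dotted IPv4 form passes the same check, so a single pattern covers both values shown in the configuration snippets.
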
> >> > > > > >> > >>> 3) I ran some tests on Parquet, RCFILE (snappy compressed), and
> >> > > > > >> > >>> RCFILE (GZIP compressed).
> >> > > > > >> > >>>
> >> > > > > >> > >>> They hold the same data and differ only in file format.
> >> > > > > >> > >>> The table has six partitions, 20 RCFILEs, and each Parquet file is
> >> > > > > >> > >>> 1GB.
> >> > > > > >> > >>>
> >> > > > > >> > >>> RCFILE with snappy performs similarly to RCFILE with gzip,
> >> > > > > >> > >>> but both are two to three times faster than Parquet.
> >> > > > > >> > >>>
> >> > > > > >> > >>> 4) I compared tajo-0.10 and Impala-2.1.2.
> >> > > > > >> > >>> Impala provides very good support for Parquet, much better than Tajo.
> >> > > > > >> > >>>
> >> > > > > >> > >>> But Impala is *slower* than Tajo with other formats.
> >> > > > > >> > >>> For example (I don't use WHERE because I want to query all six
> >> > > > > >> > >>> partitions together):
> >> > > > > >> > >>>
> >> > > > > >> > >>> *Impala*:
> >> > > > > >> > >>>  > select sum (cast(movie_vv as bigint)), sum(cast(movie_cv as
> >> > > > > >> > >>> bigint)),sum(cast(movie_pt as bigint)) from par;
> >> > > > > >> > >>>
> >> > > > > >> > >>> +-------------------------------+-------------------------------+-------------------------------+
> >> > > > > >> > >>> | sum(cast(movie_vv as bigint)) | sum(cast(movie_cv as bigint)) | sum(cast(movie_pt as bigint)) |
> >> > > > > >> > >>> +-------------------------------+-------------------------------+-------------------------------+
> >> > > > > >> > >>> | 22557920                      | 19648838                      | 2005366694576                 |
> >> > > > > >> > >>> +-------------------------------+-------------------------------+-------------------------------+
> >> > > > > >> > >>> Fetched 1 row(s) in 6.02s
> >> > > > > >> > >>>
> >> > > > > >> > >>> *Tajo:*
> >> > > > > >> > >>>
> >> > > > > >> > >>> default> select sum (cast(movie_vv as bigint)), sum(cast(movie_cv as
> >> > > > > >> > >>> bigint)),sum(cast(movie_pt as bigint)) from snappy;
> >> > > > > >> > >>> Progress: 0%, response time: 1.598 sec
> >> > > > > >> > >>> Progress: 0%, response time: 1.6 sec
> >> > > > > >> > >>> Progress: 0%, response time: 2.003 sec
> >> > > > > >> > >>> Progress: 0%, response time: 2.806 sec
> >> > > > > >> > >>> Progress: 37%, response time: 3.808 sec
> >> > > > > >> > >>> Progress: 100%, response time: 4.792 sec
> >> > > > > >> > >>> ?sum_3,  ?sum_4,  ?sum_5
> >> > > > > >> > >>> -------------------------------
> >> > > > > >> > >>> 22557920,  19648838,  2005366694576
> >> > > > > >> > >>> (1 rows, 4.792 sec, 32 B selected)
