tajo-dev mailing list archives

From Hyunsik Choi <hyun...@apache.org>
Subject Re: [DISCUSS] 0.8.0 release and next roadmap
Date Tue, 15 Apr 2014 05:55:55 GMT
Thank you for the votes! Let's go ahead!

Cheers,
Hyunsik


On Tue, Apr 15, 2014 at 9:03 AM, ktpark <sirpkt@apache.org> wrote:

> +1
>
> I agree with Hyunsik.
> Sorry for late reply.
>
> On Apr 15, 2014, at 5:05 AM, Min Zhou <coderplay@gmail.com> wrote:
>
> > I only realized today that my reply hadn't been sent.
> >
> > +1
> >
> > Totally agree with Hyunsik. 0.9 is more appropriate for the next release.
> >
> > Min
> >
> >
> > On Mon, Apr 14, 2014 at 12:31 PM, David Chen <dchen@linkedin.com> wrote:
> >
> >> +1
> >>
> >> I agree with Hyunsik as well. I think since 1.0 increments the major
> >> version number, it should be used for a particularly significant
> >> release. :)
> >>
> >> Thanks,
> >> David
> >>
> >>
> >> On Apr 13, 2014, at 7:51 PM, Alvin Henrick <share.code@aol.com> wrote:
> >>
> >>> +1 Hyunsik.
> >>>
> >>> Thanks!
> >>> Warm Regards,
> >>> Alvin.
> >>>
> >>> On Apr 11, 2014, at 8:30 AM, Hyunsik Choi wrote:
> >>>
> >>>> Hi folks,
> >>>>
> >>>> I'd like to discuss the next version number. In Jira, we have
> >>>> provisionally used 1.0, but we haven't decided on the next major
> >>>> version. I propose 0.9 as the next major version. What do you think
> >>>> about this?
> >>>>
> >>>> Regards,
> >>>> Hyunsik
> >>>>
> >>>>
> >>>> On Thu, Apr 10, 2014 at 11:05 AM, Jihoon Son <jihoonson@apache.org> wrote:
> >>>>
> >>>>> Min, thanks for reminding us!
> >>>>> It's a mandatory issue.
> >>>>> We need to implement that feature ASAP.
> >>>>>
> >>>>> Thanks,
> >>>>> Jihoon
> >>>>>
> >>>>>
> >>>>> 2014-04-10 3:19 GMT+09:00 Hyunsik Choi <hyunsik@apache.org>:
> >>>>>
> >>>>>> Min,
> >>>>>>
> >>>>>> Yes, you are right. I think about it every day, but I missed it.
> >>>>>> Thank you for reminding me. It could be achieved by modifying the
> >>>>>> Query class to execute independent execution blocks in parallel.
> >>>>>> I'll add it to the wiki.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Hyunsik
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Apr 10, 2014 at 2:43 AM, Min Zhou <coderplay@gmail.com> wrote:
> >>>>>>
> >>>>>>> Yeah. Another issue: for a query like A join B, it seems Tajo scans
> >>>>>>> A in the first stage and then scans B in the second stage. It
> >>>>>>> doesn't run them in parallel, right?
> >>>>>>>
> >>>>>>>
> >>>>>>> Min
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Apr 9, 2014 at 10:10 AM, Hyunsik Choi <hyunsik@apache.org> wrote:
> >>>>>>>
> >>>>>>>> I've just updated the roadmap page. Please take a look at the
> >>>>>>>> section 'After 0.8.0'
> >>>>>>>> https://cwiki.apache.org/confluence/display/TAJO/Tajo+Roadmap
> >>>>>>>>
> >>>>>>>> If anything is missing or you have additional ideas, feel free to
> >>>>>>>> add them on that page or suggest them here. After we discuss them
> >>>>>>>> more, we will decide their priorities.
> >>>>>>>>
> >>>>>>>> Best regards,
> >>>>>>>> Hyunsik
> >>>>>>>>
> >>>>>>>> On Sat, Apr 5, 2014 at 12:06 AM, Hyunsik Choi <hyunsik@apache.org> wrote:
> >>>>>>>>> Hi Hyoungjun,
> >>>>>>>>>
> >>>>>>>>> Yes, TPC-H and TPC-DS scripts for Tajo are necessary. If we
> >>>>>>>>> provide users with a prepared benchmark environment, they can
> >>>>>>>>> test Tajo easily. I'll file your idea on the wiki. Thank you for
> >>>>>>>>> your suggestion.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Hyunsik
> >>>>>>>>>
> >>>>>>>>> On Fri, Apr 4, 2014 at 11:48 PM, 김형준 <babokim@gmail.com> wrote:
> >>>>>>>>>> Hi Hyunsik ,
> >>>>>>>>>>
> >>>>>>>>>> I ran benchmark tests with TPC-H and TPC-DS data. Benchmark
> >>>>>>>>>> scripts like those for Hive and Impala would make testing
> >>>>>>>>>> easier.
> >>>>>>>>>>
> >>>>>>>>>> https://github.com/rxin/TPC-H-Hive
> >>>>>>>>>> https://github.com/cartershanklin/hive-testbench
> >>>>>>>>>> https://github.com/cloudera/impala-tpcds-kit
> >>>>>>>>>>
> >>>>>>>>>> Thanks!
> >>>>>>>>>> Hyoungjun
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> 2014-04-04 23:40 GMT+09:00 Hyunsik Choi <hyunsik@apache.org>:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Jihoon,
> >>>>>>>>>>>
> >>>>>>>>>>> CUBE and ROLL-UP are key features for analytic problems. I
> >>>>>>>>>>> filed them on the wiki.
> >>>>>>>>>>>
> >>>>>>>>>>> TAJO-266 and TAJO-161 will give more optimization opportunities
> >>>>>>>>>>> to logical planning and distributed query planning. But I'm not
> >>>>>>>>>>> sure they can be included in the short-term roadmap. They are
> >>>>>>>>>>> necessary, but they are not required right now. In my view, it
> >>>>>>>>>>> would be reasonable to schedule them on the long-term roadmap.
> >>>>>>>>>>>
> >>>>>>>>>>> Warm regards,
> >>>>>>>>>>> Hyunsik
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, Apr 4, 2014 at 3:01 PM, Jihoon Son <jihoonson@apache.org> wrote:
> >>>>>>>>>>>> Hi Hyunsik,
> >>>>>>>>>>>> I'm very glad that we can release the next version soon.
> >>>>>>>>>>>> Also, I appreciate the guidelines for the next roadmap.
> >>>>>>>>>>>>
> >>>>>>>>>>>> In addition to the aforementioned features, I have two
> >>>>>>>>>>>> suggestions. The first is support for the CUBE operator
> >>>>>>>>>>>> (TAJO-259). Actually, I started it quite a long time ago, but
> >>>>>>>>>>>> it has been delayed because it had lower priority than the
> >>>>>>>>>>>> stability issues. But since this operator is widely used in
> >>>>>>>>>>>> analytic applications, we need to add this feature as soon as
> >>>>>>>>>>>> possible. So, in my opinion, it would be good to add this
> >>>>>>>>>>>> feature to the next roadmap.
> >>>>>>>>>>>>
> >>>>>>>>>>>> The second is advanced query optimization. TAJO-266 is an
> >>>>>>>>>>>> issue for making the query plan more flexible. After that, we
> >>>>>>>>>>>> can exploit the many optimization opportunities described in
> >>>>>>>>>>>> TAJO-161.
> >>>>>>>>>>>>
> >>>>>>>>>>>> What do you guys think about these issues?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>> Jihoon
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2014-04-04 14:24 GMT+09:00 Hyunsik Choi <hyunsik@apache.org>:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi folks,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'm very happy to see that our community is growing! Also,
> >>>>>>>>>>>>> it's a pleasure to discuss the Tajo 0.8.0 release. Recently,
> >>>>>>>>>>>>> I've tested various features in various contexts, and tried
> >>>>>>>>>>>>> to figure out if there are any critical problems. I think
> >>>>>>>>>>>>> that there are only a few issues and we can release 0.8.0
> >>>>>>>>>>>>> next week. If there are further issues to be solved before
> >>>>>>>>>>>>> the 0.8.0 release, feel free to suggest ideas.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Also, I'd like to discuss our next roadmap. We are open to
> >>>>>>>>>>>>> any suggestion from users, contributors, and committers.
> >>>>>>>>>>>>> Please fire away!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'm thinking that our next stage should focus on improving
> >>>>>>>>>>>>> the way Tajo runs on large clusters with thousands of nodes
> >>>>>>>>>>>>> and for a number of concurrent users. The key issues
> >>>>>>>>>>>>> associated with this include the following:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> * High availability
> >>>>>>>>>>>>> * Multi-tenancy scheduling
> >>>>>>>>>>>>> * More stability
> >>>>>>>>>>>>> * Improved shuffle
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The current work status is as follows. Min is working on
> >>>>>>>>>>>>> Tajo's new scheduler (TAJO-540) based on Sparrow. I'll
> >>>>>>>>>>>>> support him. As far as I know, Alvin is working on TajoMaster
> >>>>>>>>>>>>> HA (TAJO-704). Also, some guys including myself are
> >>>>>>>>>>>>> investigating and solving the issues which occur in large
> >>>>>>>>>>>>> clusters. These issues should be solved in order to make Tajo
> >>>>>>>>>>>>> a complete, enterprise-ready product.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> In addition, there are some SQL feature support issues. Many
> >>>>>>>>>>>>> analytic problems require window functions. Also, IN
> >>>>>>>>>>>>> subqueries and scalar subqueries should be supported. So, I'd
> >>>>>>>>>>>>> like to schedule them with high priority. In my view, there
> >>>>>>>>>>>>> will be very few SQL support issues if Tajo provides these
> >>>>>>>>>>>>> features.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Besides those areas, David is working on a nested schema and
> >>>>>>>>>>>>> its related work (TAJO-710). I guess this will take quite a
> >>>>>>>>>>>>> while because it requires a lot of hard work. So, it would be
> >>>>>>>>>>>>> great to schedule the nested schema loosely. Those are just
> >>>>>>>>>>>>> my thoughts, anyhow.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Aside from the discussion of our roadmap, I'd like to suggest
> >>>>>>>>>>>>> that we release more frequently after the 0.8.0 release. So
> >>>>>>>>>>>>> far, there has been a long period between each release
> >>>>>>>>>>>>> because Tajo is undergoing heavy development. By 'releasing
> >>>>>>>>>>>>> early, releasing often', we will create a tighter feedback
> >>>>>>>>>>>>> loop between users and developers.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I think that there are many additional interesting issues to
> >>>>>>>>>>>>> be included in our roadmap. Feel free to suggest your ideas.
> >>>>>>>>>>>>> We will arrange our short-term roadmap and long-term roadmap
> >>>>>>>>>>>>> based on your suggestions.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thank you all so much for your contribution!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Warm Regards,
> >>>>>>>>>>>>> Hyunsik
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Tajo - Big Data Warehouse System on Hadoop
> >>>>>>>>>> http://tajo.apache.org/
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> My research interests are distributed systems, parallel computing and
> >>>>>>> bytecode based virtual machine.
> >>>>>>>
> >>>>>>> My profile:
> >>>>>>> http://www.linkedin.com/in/coderplay
> >>>>>>> My blog:
> >>>>>>> http://coderplay.javaeye.com
> >>>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
> >>
> >
> >
> > --
> > My research interests are distributed systems, parallel computing and
> > bytecode based virtual machine.
> >
> > My profile:
> > http://www.linkedin.com/in/coderplay
> > My blog:
> > http://coderplay.javaeye.com
>
>
