tajo-user mailing list archives

From "Christian Schwabe" <Christian.Schw...@gmx.com>
Subject Re: Tajo History, Technical Background
Date Fri, 29 Aug 2014 12:02:54 GMT

Hello Hyunsik

A while ago I found this presentation (http://www.slideshare.net/gruter/hadoop-summit-2014-query-optimization-and-jitbased-vectorized-execution-in-apache-tajo),
which explains Tajo's query processing in detail, but I wanted to deal with the basics of Tajo
first. I think I have understood them now and would like to ask more detailed questions.
Some questions about this presentation are still open, and I would like to clarify them here.


Page 6: Can you explain in more detail exactly which tasks the modules in the Tajo Master perform?
Page 7: What is a "DAG" controller? What does the abbreviation "DAG" mean? Can you explain in
more detail what exactly happens in each step of the figure?
Page 12: What is a "BMT" controller? What does the abbreviation "BMT" mean?
Page 21: What is a "GC" controller? What does the abbreviation "GC" mean?

I thank you and your team for all your help and for answering every question I have had
so far.


P.S.: I am currently writing my thesis, but at the same time I would like to give something
back for the support I have received from this community. Is it possible to take on smaller
tasks, such as grooming the documentation or other things that are accessible to me?


Warm regards,
Chris



On 27.08.2014 at 11:03:56, Christian Schwabe wrote:
> Hello Hyunsik,
> 
> Thank you very much for your detailed description of how Tajo was created.
> Tajo became an Apache Top-Level Project in March 2014. What exactly does this
> status mean? What added value does it bring for you?
> The current progress of Tajo is very promising. What exactly have you planned
> for the near future?
> 
> On the roadmap (http://wiki.apache.org/tajo/Roadmap) all entries are outdated.
> This is quite a problem given the rapid progress of Tajo. Documentation and
> transparency should not be lost sight of ;)
> 
> Warm regards,
> Chris
> 
> 
> On 26.08.2014 at 04:42, Hyunsik Choi <hyunsik@apache.org> wrote:
> 
> > Hi,
> > 
> > I'm sorry for the late reply. My name is Hyunsik Choi; I am one of the
> > founders of Tajo and now the PMC chair of the Tajo project.
> > 
> > I'm going to explain the origin of Tajo. It was a research project in
> > the Database Lab at Korea University. It started in May 2010. Initially,
> > we started it as an alternative to Hive. We designed Tajo to
> > take advantage of both shared-nothing parallel databases and
> > specialized distributed data processing systems, like MapReduce,
> > Dryad, and Dremel.
> > 
> > Jihoon Son and I mainly worked on the Tajo prototype. Later, Tajo
> > became the subject of my Ph.D. dissertation. At that time, I was also
> > working on a paper, "Parallel data processing with MapReduce: a
> > survey", ACM SIGMOD Record 2011
> > (http://dl.acm.org/citation.cfm?id=2094118). I was investigating lots
> > of distributed processing systems and learned many things from them.
> > So, I made an effort to reflect the great design considerations of other
> > distributed processing systems in the design of Tajo.
> > 
> > From the beginning, the design goals were scalability, high throughput,
> > advanced query optimization, and fault tolerance. We are still
> > pursuing them.
> > 
> > Since 2013, Gruter, a big data company, has supported the Tajo project,
> > and it employs some full-time contributors (i.e., 3 PMC members and one
> > committer), including me.
> > 
> > As you mentioned, the Tajo documentation does not keep up with the current
> > status of the Tajo project, because Tajo is evolving very rapidly and we do
> > not have enough contributors to update the documentation continuously.
> > We have only updated the documentation periodically, for each release. We
> > are recruiting contributors for code and documentation.
> > 
> > Q. How did you come to the name of Tajo?
> > 
> > When we decided to propose Tajo as an ASF incubation project, the
> > members of the DB Lab voted for a proper name suited to the Hadoop
> > ecosystem. We wanted to use an animal name like other systems in the
> > Hadoop ecosystem. Finally, we chose Tajo, meaning ostrich in Korean.
> > 
> > If you have more questions about Tajo, feel free to ask anything.
> > 
> > Best regards,
> > Hyunsik
> > 
> > On Sun, Aug 24, 2014 at 2:29 AM, Hyunsik Choi <hyunsik@apache.org> wrote:
> > > Hi Chris,
> > > 
> > > Nice question! Tajo also has an interesting history. I'll give the
> > > details of the history tomorrow because it is too late here :)
> > > 
> > > Best regards,
> > > Hyunsik
> > > 
> > > On Sat, Aug 23, 2014 at 1:28 AM, Christian Schwabe
> > > <Christian.Schwabe@gmx.com> wrote:
> > > > Hello everyone,
> > > > 
> > > > For about three months now I have been working with Tajo. In that time
> > > > I have gained an insight into the documentation; in particular, I now know
> > > > how to get started with Tajo and which errors can occur, have gotten an
> > > > overview of the Jira tickets, and have read the existing documentation.
> > > > 
> > > > I'm fascinated by how fast this community has grown, how far you have
> > > > come so far, and the potential Tajo has shown.
> > > > What I would now like to look at more closely is the historical and
> > > > technical view of Tajo.
> > > > That means I ask myself questions like: How did you come to the name of
> > > > Tajo? When was the first milestone actually set? Everywhere I read the year
> > > > 2013. But is this actually the first time Tajo was thought about? Who is
> > > > the initiator of this project?
> > > > Above all, the technical processes would interest me, and certainly
> > > > others, very much. Apart from a few presentations on tajo.apache.org >> News
> > > > there is little documentation, or I have not found it yet.
> > > > Looking at the Jira tickets and the documentation
> > > > (https://cwiki.apache.org/confluence/display/TAJO/Apache+TAJO+Home,
> > > > http://tajo.apache.org/docs/0.8.0/index.html), I have the impression that
> > > > transparency has been somewhat neglected alongside the rapid technological
> > > > developments. This is only my own personal opinion and does not criticize
> > > > any individual.
> > > > I appreciate your work very much and, coming from Computer Science with
> > > > Business, can understand what this means for development work.
> > > > 
> > > > Can you give me more information on the points mentioned above?
> > > > 
> > > > P.S.: I hope I was not misunderstood. I want to look more behind the scenes
> > > > of Tajo and learn to understand the technical background as well as the
> > > > birth and historical development of Tajo.
> > > > 
> > > > Best regards,
> > > > Chris




