tajo-user mailing list archives

From Christian Schwabe <Christian.Schw...@gmx.com>
Subject Re: Tajo History, Technical Background
Date Wed, 27 Aug 2014 09:03:56 GMT
Hello Hyunsik,

Thank you very much for your detailed description of how Tajo was created.
Tajo became an Apache Top-Level Project in March 2014. What exactly does this status mean? What
added value does it bring for you?
The current progress of Tajo is very promising. What exactly did you have done for the near

All entries on the roadmap (http://wiki.apache.org/tajo/Roadmap) are outdated. This is quite
a problem given the rapid progress of Tajo. Documentation and transparency should not be
lost sight of ;)

Warm regards,

On 26.08.2014 at 04:42, Hyunsik Choi <hyunsik@apache.org> wrote:

> Hi,
> I'm sorry for the late reply. My name is Hyunsik Choi. I am one of the
> founders of Tajo and am now the PMC chair of the Tajo project.
> I'm going to explain the origin of Tajo. It was a research project in
> the Database Lab at Korea University. It started in May 2010. At first,
> we started it as an alternative to Hive. We designed Tajo to
> take advantage of both shared-nothing parallel databases and
> specialized distributed data processing systems, like MapReduce,
> Dryad, and Dremel.
> Jihoon Son and I did most of the work on the Tajo prototype. Later, Tajo
> became the subject of my Ph.D. dissertation. At that time, I was also
> working on a paper, "Parallel data processing with MapReduce: a
> survey", ACM SIGMOD Record 2011
> (http://dl.acm.org/citation.cfm?id=2094118). I was investigating many
> distributed processing systems and learned many things from them.
> So I made an effort to carry the best design considerations of other
> distributed processing systems over into the design of Tajo.
> From the beginning, the design goals were scalability, high throughput,
> advanced query optimization, and fault tolerance. We are still
> pursuing them.
> Since 2013, Gruter, a big data company, has supported the Tajo project,
> and it employs some full-time contributors (namely, 3 PMC members and one
> committer), including me.
> As you mentioned, the Tajo documentation does not keep up with the current
> status of the project because Tajo is evolving very rapidly and we do
> not have enough contributors to update the documentation continuously.
> We have only updated the documentation periodically, for each release. We
> are recruiting contributors for code and documentation.
> Q. How did you come to the name of Tajo?
> When we decided to propose Tajo as an ASF incubation project, the
> members of the DB Lab voted for a proper name suited to the Hadoop
> ecosystem. We wanted to use an animal name like other systems in
> the Hadoop ecosystem. Finally, we chose Tajo, meaning ostrich in Korean.
> If you have more questions about Tajo, feel free to ask anything.
> Best regards,
> Hyunsik
> On Sun, Aug 24, 2014 at 2:29 AM, Hyunsik Choi <hyunsik@apache.org> wrote:
>> Hi Chris,
>> Nice question! Tajo also has an interesting history. I'll give the
>> details of the history tomorrow because it is too late here :)
>> Best regards,
>> Hyunsik
>> On Sat, Aug 23, 2014 at 1:28 AM, Christian Schwabe
>> <Christian.Schwabe@gmx.com> wrote:
>>> Hello everyone,
>>> For about three months now I have been working with Tajo. In that time I
>>> have gained an insight into the documentation; in particular, I now know
>>> how to get started with Tajo and which errors can occur. I have also got
>>> an overview of the Jira tickets and read the existing documentation.
>>> I'm fascinated by how fast this community has grown, how far you have come
>>> so far, and the potential that Tajo has.
>>> What I would now like to look into more closely is the historical and
>>> technical view of Tajo.
>>> That means I ask myself questions like: How did you come to the name
>>> Tajo? When was the first milestone actually set? Everywhere I read the year
>>> 2013. But is that really the first time anyone thought about Tajo?
>>> Who is the initiator of this project?
>>> Above all, the technical processes would interest me, and certainly others,
>>> very much. Apart from a few presentations on tajo.apache.org >> News there
>>> is little documentation, or I have not found it yet.
>>> Looking at the Jira tickets and documentation
>>> (https://cwiki.apache.org/confluence/display/TAJO/Apache+TAJO+Home,
>>> http://tajo.apache.org/docs/0.8.0/index.html ) I have the impression that
>>> transparency has been somewhat neglected alongside the rapid technological
>>> developments. This is only my own personal opinion and is not meant to
>>> criticize any individual.
>>> I appreciate your work very much and, coming from Computer Science with
>>> Business, I can understand how much development work this means.
>>> Can you give me more information on the points mentioned above?
>>> P.S.: I hope I have not been misunderstood. I want to look further behind
>>> the scenes of Tajo and learn to understand its technical background, its
>>> origins, and its historical development.
>>> Best regards,
>>> Chris
