incubator-blur-dev mailing list archives

From Aaron McCurry <amccu...@gmail.com>
Subject Supporting multiple version of Hadoop
Date Wed, 30 Oct 2013 02:51:02 GMT
There has been some interest lately in getting Blur to run successfully
on CDH4.  While I think the code will run correctly, I know that configuring
Blur in that environment is challenging.  There are other versions on the
horizon as well: HDP, CDH5 (at some point), IBM's version, and all
the official Apache versions, 0.20.x through 2.2.0.  Another big problem
beyond configuration is the set of supporting libraries that Hadoop brings
with it (or no longer brings, in the case of CDH4 and Hadoop 2.x).

So I propose that for 0.3.0 we make it a goal to support both legacy
1.x Hadoop and Hadoop 2.x.  I hope everyone has some ideas on how to
achieve this goal, but I will throw one out here and see what people think.

I believe that we need to isolate our dependency on Hadoop behind some
well-defined interfaces (and I'm not just talking about Java interfaces):
primarily one interface for storage, and another for write-ahead logging.
With a modular approach and a nice classloader to isolate all the Hadoop
dependencies from Blur, we would also gain the ability to update library
versions that would normally collide with the versions in Hadoop, namely
Jetty.
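
To make the idea a bit more concrete, here is a rough sketch of what that
boundary might look like.  Everything in it is an assumption for the sake of
discussion: the names Storage, WriteAheadLog, and IsolatingClassLoader are
made up and are not existing Blur classes.  The loader is child-first, so it
looks in its own jars (the Hadoop distribution and its libraries) before
asking the parent, except for JDK classes and a short list of shared package
prefixes that both sides need to agree on.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical storage boundary; Blur core would only see this type. */
interface Storage {
    InputStream open(String path) throws IOException;
    OutputStream create(String path) throws IOException;
    boolean exists(String path) throws IOException;
}

/** Hypothetical write-ahead log boundary. */
interface WriteAheadLog extends Closeable {
    void append(byte[] record) throws IOException;
    void sync() throws IOException;
}

/**
 * Child-first class loader (illustrative). Classes under the shared
 * prefixes, plus the JDK, always come from the parent so both sides
 * see the same interface types; everything else is looked up in this
 * loader's own URLs first, then in the parent as a fallback.
 */
final class IsolatingClassLoader extends URLClassLoader {
    private final String[] sharedPrefixes;

    IsolatingClassLoader(URL[] urls, ClassLoader parent, String... sharedPrefixes) {
        super(urls, parent);
        this.sharedPrefixes = sharedPrefixes.clone();
    }

    /** True if the class must be loaded by the parent loader. */
    boolean isShared(String className) {
        if (className.startsWith("java.") || className.startsWith("javax.")) {
            return true;
        }
        for (String prefix : sharedPrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            if (isShared(name)) {
                // Standard parent-first delegation for shared types.
                return super.loadClass(name, resolve);
            }
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                try {
                    // Child-first: look in the isolated jars before the parent.
                    loaded = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in the isolated jars; fall back to the parent.
                    return super.loadClass(name, resolve);
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }
}
```

Under this sketch, each supported Hadoop version would ship as its own module
implementing Storage and WriteAheadLog, loaded through a loader like the one
above, so Blur could carry its own Jetty version without colliding with
Hadoop's.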

Let me know what you think.

Thanks!

Aaron
