hive-dev mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: [Discuss] project chop up
Date Wed, 07 Aug 2013 19:57:02 GMT
I think that is a good idea. I have been thinking about it a lot. I
especially hate how the offline build is now broken.

However I think it is going to take some time. There are some tricks like
how we build hive-exec jar that are not very clean to do in maven. I am
very interested.

The last initiative we spoke about on the list was moving off Forrest. I would
like to finish/start that before we get onto the project chop up.


On Wed, Aug 7, 2013 at 3:06 PM, Brock Noland <brock@cloudera.com> wrote:

> Thus far there hasn't been any dissent to managing our modules with maven.
> In addition there have been several positive comments on a move towards
> maven. I'd like to add that Ivy seems to have issues managing multiple versions
> of libraries. For example, in HIVE-3632 the Ivy cache had to be cleared when
> testing patches that installed the new version of DataNucleus. I have had
> the same issue on HIVE-4388. Requiring the deletion of the ivy cache
> is extremely painful for developers that don't have access to high
> bandwidth connections or live in areas far from California where most of
> these jars are hosted.
>
> I'd like to propose we move towards Maven.
>
>
> On Sat, Jul 27, 2013 at 1:19 PM, Mohammad Islam <mislam77@yahoo.com>
> wrote:
>
> >
> >
> > Yes hive build and test cases got convoluted as the project scope
> > gradually increased. This is the time to take action!
> >
> > Based on my other Apache experiences, I prefer the option #3 "Breakup the
> > projects within our own source tree". Make multiple modules or
> > sub-projects. By default, only key modules will be built.
> >
> > Maven could be a possible candidate.
> >
> > Regards,
> > Mohammad
> >
> >
> >
> > ________________________________
> >  From: Edward Capriolo <edlinuxguru@gmail.com>
> > To: "dev@hive.apache.org" <dev@hive.apache.org>
> > Sent: Saturday, July 27, 2013 7:03 AM
> > Subject: Re: [Discuss] project chop up
> >
> >
> > Or feel free to suggest different approach. I am used to managing software
> > as multi-module maven projects.
> > From a development standpoint if I was working on beeline, it would be nice
> > to only require some of the sub-projects to be open in my IDE to do that.
> > Also managing everything globally is not ideal.
> >
> > Hive's project layout, build, and test infrastructure is just funky. It has
> > to do a few interesting things (shims, testing), but I do not think what we
> > are doing justifies the massive ant build system we have. Ant is so ten
> > years ago.
> >
> >
> >
> > On Sat, Jul 27, 2013 at 12:04 AM, Alan Gates <gates@hortonworks.com>
> > wrote:
> >
> > > But I assume they'd still be a part of targets like package, tar, and
> > > binary?  Making them compile and test separately and explicitly load the
> > > core Hive jars from maven/ivy seems reasonable.
> > >
> > > Alan.
> > >
> > > On Jul 26, 2013, at 8:40 PM, Brock Noland wrote:
> > >
> > > > Hi,
> > > >
> > > > > I think that's part of it but I'd like to decouple the downstream projects
> > > > > even further so that the only connection is the dependency on the hive jars.
> > > >
> > > > Brock
> > > > > On Jul 26, 2013 10:10 PM, "Alan Gates" <gates@hortonworks.com> wrote:
> > > > >
> > > > >> I'm not sure how this is different from what hcat does today.  It needs
> > > > >> Hive's jars to compile, so it's one of the last things in the compile step.
> > > > >> Would moving the other modules you note to be in the same category be
> > > > >> enough?  Did you want to also make it so that the default ant target
> > > > >> doesn't compile those?
> > > >>
> > > >> Alan.
> > > >>
> > > >> On Jul 26, 2013, at 4:09 PM, Edward Capriolo wrote:
> > > >>
> > > > >>> My mistake on saying hcat was a fork metastore. I had a brain fart for a
> > > > >>> moment.
> > > > >>>
> > > > >>> One way we could do this is create a folder called downstream. In our
> > > > >>> release step we can execute the downstream builds and then copy the files
> > > > >>> we need back. So nothing downstream will be on the classpath of the main
> > > > >>> project.
> > > > >>>
> > > > >>> This could help us breakup ql as well. Things like exotic file formats,
> > > > >>> and things that are pluggable like zk locking can go here. That might be
> > > > >>> overkill.
> > > > >>>
> > > > >>> For now we can focus on building downstream and hivethrift1 might be the
> > > > >>> first thing to try to downstream.
> > > >>>
> > > >>>
> > > > >>> On Friday, July 26, 2013, Thejas Nair <thejas@hortonworks.com> wrote:
> > > > >>>> +1 to the idea of making the build of core hive and other downstream
> > > > >>>> components independent.
> > > > >>>>
> > > > >>>> bq.  I was under the impression that Hcat and hive-metastore was
> > > > >>>> supposed to merge up somehow.
> > > > >>>>
> > > > >>>> The metastore code was never forked. Hcat was just using
> > > > >>>> hive-metastore and making the metadata available to rest of hadoop
> > > > >>>> (pig, java MR..).
> > > > >>>> A lot of the changes that were driven by hcat goals were being made in
> > > > >>>> hive-metastore. You can think of hcat as set of libraries that let pig
> > > > >>>> and java MR use hive metastore. Since hcat is closely tied to
> > > > >>>> hive-metastore, it makes sense to have them in same project.
> > > >>>>
> > > > >>>> On Fri, Jul 26, 2013 at 6:33 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
> > > > >>>>> Also I believe hcatalog web can fall into the same designation.
> > > > >>>>>
> > > > >>>>> Question: hcatalog was initially a big hive-metastore fork. I was under the
> > > > >>>>> impression that Hcat and hive-metastore was supposed to merge up somehow.
> > > > >>>>> What is the status on that? I remember that was one of the core reasons we
> > > > >>>>> brought it in.
> > > >>>>>
> > > > >>>>> On Friday, July 26, 2013, Edward Capriolo <edlinuxguru@gmail.com> wrote:
> > > > >>>>>> I prefer option 3 as well.
> > > >>>>>>
> > > >>>>>>
> > > > >>>>>> On Fri, Jul 26, 2013 at 12:52 AM, Brock Noland <brock@cloudera.com> wrote:
> > > > >>>>>>>
> > > > >>>>>>> On Thu, Jul 25, 2013 at 9:48 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
> > > >>>>>>>
> > > > >>>>>>>> I have been developing on a dual core 2 GB RAM laptop for years
> > > > >>>>>>>> now. With the addition of hcatalog, hive-thrift2, and some other growth,
> > > > >>>>>>>> trying to develop hive in Eclipse on this machine crawls, especially if
> > > > >>>>>>>> 'build automatically' is turned on. As we look to add on more things this
> > > > >>>>>>>> is only going to get worse.
> > > >>>>>>>>
> > > >>>>>>>> I am also noticing issues like this:
> > > >>>>>>>>
> > > >>>>>>>> https://issues.apache.org/jira/browse/HIVE-4849
> > > >>>>>>>>
> > > > >>>>>>>> What I think we should do is strip down/out optional parts of hive.
> > > > >>>>>>>>
> > > > >>>>>>>> 1) Hive Hbase
> > > > >>>>>>>> This should really be its own project. To do this right we really have to
> > > > >>>>>>>> have multiple branches since hbase is not backwards compatible.
> > > > >>>>>>>>
> > > > >>>>>>>> 2) Hive Web Interface
> > > > >>>>>>>> Now really a big project but not really critical; it can just as easily be
> > > > >>>>>>>> built separately.
> > > > >>>>>>>>
> > > > >>>>>>>> 3) hive thrift 1
> > > > >>>>>>>> We have hive thrift 2 now, it is time for the sun to set on hivethrift1.
> > > > >>>>>>>>
> > > > >>>>>>>> 4) odbc
> > > > >>>>>>>> Not entirely convinced about this one but it is really not critical to
> > > > >>>>>>>> running hive.
> > > > >>>>>>>>
> > > > >>>>>>>> What I think we should do is create sub-projects for the above things or
> > > > >>>>>>>> simply move them into directories that do not build with hive. Ideally they
> > > > >>>>>>>> would use maven to pull dependencies.
> > > >>>>>>>>
> > > >>>>>>>> What does everyone think?
> > > >>>>>>>>
> > > >>>>>>>
> > > > >>>>>>> I agree that projects like the HBase handler and probably others as well
> > > > >>>>>>> should somehow be "downstream" projects which simply depend on the hive
> > > > >>>>>>> jars.  I see a couple alternatives for this:
> > > > >>>>>>>
> > > > >>>>>>> * Take the "module" in question to the Apache Incubator
> > > > >>>>>>> * Move the "module" in question to the Apache Extras
> > > > >>>>>>> * Breakup the projects within our own source tree
> > > >>>>>>>
> > > >>>>>>> I'd prefer the third option at this point.
> > > >>>>>>>
> > > >>>>>>> Brock
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> Brock
> > > >>>>>>
> > > >>>>>>
> > > >>>>
> > > >>
> > > >>
> > >
> > >
> >
>
>
>
> --
> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
>
