incubator-crunch-dev mailing list archives

From Matthias Friedrich <>
Subject Re: New module to share user functions
Date Thu, 27 Sep 2012 16:09:43 GMT

I'm fine with any name that makes at least some sense to a non-native speaker :)


On Thursday, 2012-09-27, Rahul wrote:
> I have named it crunch-bytes, but I like crunch-bars as well. :)
> Please chip in with your suggestions.
> regards
> Rahul
> On 26-09-2012 21:36, Matthias Friedrich wrote:
> >OK, then let's do it! As soon as we've agreed on a name, of course :)
> >
> >Regards,
> >   Matthias
> >
> >On Wednesday, 2012-09-26, Rahul wrote:
> >>Hi,
> >>
> >>I believe every project has a bunch of interesting users who can
> >>provide additional food for thought to others. Hadoop provides lots
> >>of random opportunities to people and the same should be possible
> >>with crunch. I would be delighted to see what people are able to
> >>pull off using the existing things. These contributions should be
> >>kept in crunch: we are pretty young and will at times go through
> >>various refactorings, and keeping them in crunch will keep them up to
> >>date.
> >>
> >>And yes, +1 to the idea of keeping dependencies to crunch-core only.
> >>
> >>regards,
> >>rahul
> >>On 26-09-2012 04:32, Josh Wills wrote:
> >>>I like the idea of having a place in the project that showcases the
> >>>cool things that you can do with it-- something more advanced and
> >>>broadly applicable than the starter pipelines we have in
> >>>crunch-examples, the kind of stuff that you can't easily do using tools
> >>>like Hive and Pig.
> >>>
> >>>I also agree that we don't want to get into dependency creep, so I'd
> >>>be inclined to limit crunch-bytes (crunch-berries? crunch-bars?
> >>>crunch-abs?) to just those dependencies that are also in crunch-core.
> >>>I think the Bloom Filter stuff meets this criterion.
> >>>
> >>>The project is still young enough that our problem is much more likely
> >>>to be attracting new folks than it is to be getting overwhelmed with
> >>>random contributions, so my inclination is to be welcoming.
> >>>
> >>>On Tue, Sep 25, 2012 at 11:29 AM, Matthias Friedrich <> wrote:
> >>>>Hi Rahul,
> >>>>
> >>>>I think it would be really great to have an ecosystem of
> >>>>micro-libraries around Crunch for all kinds of cool stuff that is
> >>>>relevant for smaller audiences, just like your Bloom filters.
> >>>>
> >>>>But since I expect most of this stuff to be so extremely special, it
> >>>>would in my opinion make more sense to put this into small, focused
> >>>>and independent projects that can be released separately from each
> >>>>other and don't need to go through Crunch's review process. It would
> >>>>make dependency management easier for users, too, in case a library
> >>>>needs additional dependencies.
> >>>>
> >>>>We could maintain a registry of these projects on Crunch's homepage
> >>>>so people can find them easily (I expect most of them would end up
> >>>>at GitHub because it's perfect for this kind of thing). If a project
> >>>>turns out to be interesting for a larger audience, we can still add it
> >>>>to Crunch core.
> >>>>
> >>>>Regards,
> >>>>   Matthias
> >>>>
> >>>>On Tuesday, 2012-09-25, Rahul wrote:
> >>>>>There can be interesting use-cases like BloomFilters which do not
> >>>>>have a place in the current set of Crunch modules. These functions
> >>>>>are kind of utility functions that can be used in Crunch. We need
> >>>>>to create a place where users can share such functions. In the earlier
> >>>>>discussion for BloomFilters we thought of something that is well
> >>>>>along the lines of PiggyBank. I had a look at the module, but in
> >>>>>Pig's structure it sits under the contrib module, as there
> >>>>>are other modules like penny for monitoring and zebra for storage.
> >>>>>
> >>>>>I have created a module named *crunch-bytes*, for issue
> >>>>>, which is a direct
> >>>>>sub-module in crunch-parent. I named it so because I felt it will
> >>>>>provide a space to have all those interesting data computations
> >>>>>that we cannot have in core.
> >>>>>
> >>>>>Please share your thoughts on this.
> >>>>>
> >>>>>regards,
> >>>>>rahul
> >>>>>
> >>>
