incubator-crunch-dev mailing list archives

From Josh Wills
Subject Re: BloomFilters in Crunch
Date Mon, 20 Aug 2012 14:59:37 GMT
Hey Rahul,

Very cool use case. A thought: isn't the name of the file that
contains the bloom filter a better key than the boolean? That way, I
could point the input at an entire directory of files and have it
build bloom filters for all of them for me.
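
To make the idea concrete, here is a minimal sketch in plain Java, not
Crunch code. It assumes the input has already been read into a map from
file name to the values found in that file. SimpleBloomFilter and
PerFileBloomFilters are hypothetical names for illustration; a real job
would use an existing filter class and a grouping step instead.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy Bloom filter (hypothetical); stands in for whatever filter class a real job uses. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    private int position(String value, int i) {
        int h1 = value.hashCode();
        int h2 = (h1 >>> 16) | 1; // forced odd so the step is never zero
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String value) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(value, i));
        }
    }

    /** May return a false positive, never a false negative. */
    boolean mightContain(String value) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(value, i))) {
                return false;
            }
        }
        return true;
    }
}

/** Builds one filter per input file, keyed by file name rather than by a boolean. */
final class PerFileBloomFilters {
    static Map<String, SimpleBloomFilter> build(
            Map<String, List<String>> valuesByFile, int size, int hashCount) {
        Map<String, SimpleBloomFilter> filters = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : valuesByFile.entrySet()) {
            SimpleBloomFilter filter = new SimpleBloomFilter(size, hashCount);
            for (String value : entry.getValue()) {
                filter.add(value);
            }
            filters.put(entry.getKey(), filter);
        }
        return filters;
    }
}
```

With the file name as the key, a lookup such as
filters.get("a.txt").mightContain("apple") tells you which file's filter
matched, which a boolean key cannot.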

It seems useful to me in general, but I'm not quite sure where to put
it: it's more useful than an example, but not such a common use case
that we would put it in core. We need something like the equivalent of
Pig's piggybank.


On Mon, Aug 20, 2012 at 12:58 AM, Rahul wrote:
> Hi,
> Today I tried to create BloomFilters using Crunch; attached is the test case
> for it. I do not know if there is a better way of accomplishing the
> same.
> I think APIs to create/load BloomFilters could be a good add-on to Crunch's
> existing set. If people feel like it could be added then I can make a patch
> for the same.
> regards,
> Rahul
