apex-dev mailing list archives

From Chandni Singh <chan...@datatorrent.com>
Subject Re: BloomFilter in Malhar
Date Thu, 10 Dec 2015 23:17:45 GMT
Any takers for MinHash?

On Wed, Dec 9, 2015 at 3:13 AM, Chaitanya Chebolu <chaitanya@datatorrent.com> wrote:

> Hi Chandni,
>
> Yes, I have an implementation of BloomFilter, and it can be added to
> Malhar. I need to update the branch first, and then I will open a PR.
>
> Regards,
> Chaitanya
>
> On Wed, Dec 9, 2015 at 1:42 PM, Chandni Singh <chandni@datatorrent.com>
> wrote:
>
> > Chaitanya,
> >
> > I believe you have an implementation of BloomFilter in your fork. Do you
> > think that can be added to Malhar?
> >
> > Chandni
> >
> > On Tue, Dec 8, 2015 at 9:02 PM, David Yan <david@datatorrent.com> wrote:
> >
> > > Bloom Filter, MinHash, and HyperLogLog are some of the commonly used
> > > algorithms in Big Data.  I think having them in the Malhar library
> > > would be a good idea.
> > >
> > > There's a ticket for HyperLogLog that was created a long time ago:
> > > https://malhar.atlassian.net/browse/MLHR-1822
> > >
> > > On Tue, Dec 8, 2015 at 5:42 PM, Chandni Singh <chandni@datatorrent.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > We need to add a BloomFilter implementation in Malhar. ManagedState
> > > > has a use for it, and I am pretty sure we will come up with more and
> > > > more use cases that will need it. Tim's suggestion on
> > > > Spill-able/Spooled data structures may use it too.
> > > >
> > > > Chandni
> > > >
> > >
> >
>
