couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: multiview using bloom filters
Date Sat, 25 Sep 2010 03:43:55 GMT
I would say if you find it somewhere with an EPL header, then that's a
good sign. But best to check with the original author that it was his
intent. As an interesting aside, I'm not even sure if it's possible
that someone who's not an employee of Ericsson can release something
under the EPL.

I'm not being a pain on purpose here, but the ASF is particular about
provenance in code and licensing. Alternatively, I haven't studied the
code, so in the worst case I'll read the white paper and try and apply
it to Basho's ebloom project.

Paul Davis

On Fri, Sep 24, 2010 at 11:21 PM, Norman Barker <norman.barker@gmail.com> wrote:
> Paul,
>
> yes, performance is actually much better (some of our harder
> queries, e.g. all docs over time with field X (two views), are 10x
> faster). I am testing with docs that in total emit ~100K keys
> (following the raindrop megaview).
>
> Some of the scalable bloom filter project files contained EPL headers,
> others didn't. Googling for the source code, I had seen other projects
> add the EPL headers to the bit array files, so I did the same. I will
> contact the author as he seems active on the erlang mailing lists, and
> if not I will write a bloom filter from scratch. The theory is well
> documented, though I like his code!
>
> thanks for your help, let me know any suggestions you may have.
>
> thanks,
>
> Norman
>
>
>
> On Fri, Sep 24, 2010 at 7:16 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>> Norman,
>>
>> Just glanced through. Looks better. Any feeling for performance differences?
>>
>> Also, I glanced at the original files that you linked to. The bit
>> array files didn't have a license, but what you've got there does have
>> EPL headers. We need to make sure we have permission to do so. I would
>> assume as much, but we have to be careful about such things in the
>> ASF. You only need to get an email from the original author saying it's
>> ok.
>>
>> I'm a bit caught up with some other code at the moment, I'll give it a
>> more thorough combing over tomorrow.
>>
>> Paul
>>
>> On Fri, Sep 24, 2010 at 7:54 PM, Norman Barker <norman.barker@gmail.com> wrote:
>>> Hi,
>>>
>>> thanks to Paul's excellent suggestion I have rewritten the multiview
>>> to use bloom filters. I had a concern that a bloom filter per view
>>> would use too much memory, but thanks in the main to the excellent
>>> implementation of bloom filters in erlang
>>> (http://sites.google.com/site/scalablebloomfilters/) they seem to be
>>> very space efficient.
>>>
>>> New code is here
>>>
>>> http://github.com/normanb/couchdb/
>>>
>>> The code is simple, all one process. Once we have agreed the approach
>>> we can decide if there is any benefit in making the bloom filter
>>> generation occur in a separate process (using a gen_server).
>>>
>>> Comments as always appreciated, I will continue adding to the test suite.
>>>
>>> thanks for the help,
>>>
>>> Norman
>>>
>>
>
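
For readers unfamiliar with the idea discussed in the thread, here is a
minimal sketch of how a bloom filter per view could be used to intersect
two views. It is written in Python for brevity and is not the code from
the linked repository or the scalable bloom filter project. The names
BloomFilter and intersect_views, the fixed (non-scalable) sizing, and the
choice of SHA-256 with double hashing are all assumptions made for
illustration.

```python
import hashlib
import math


class BloomFilter:
    """Fixed-size Bloom filter (hypothetical helper, not the Erlang library)."""

    def __init__(self, capacity, error_rate=0.01):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.size = max(8, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.hashes = max(1, round(self.size / capacity * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def intersect_views(view_a_ids, view_b_ids):
    """Return the doc ids found in both views, in view_b order (assumed semantics)."""
    view_a_ids = list(view_a_ids)
    bloom = BloomFilter(capacity=max(1, len(view_a_ids)))
    for doc_id in view_a_ids:
        bloom.add(doc_id)

    # Cheap probabilistic pre-filter: drops ids that are definitely not in view A.
    candidates = [doc_id for doc_id in view_b_ids if doc_id in bloom]

    # Exact check to remove false positives. A set stands in here for what
    # would be a lookup against the real view index.
    exact = set(view_a_ids)
    return [doc_id for doc_id in candidates if doc_id in exact]
```

A bloom filter never reports a false negative, so the pre-filter cannot
drop a genuine match. It can report false positives, which is why the
sketch keeps an exact check as a final step.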
