couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Volker Mische <volker.mis...@gmail.com>
Subject Re: Multi-dimensional queries?
Date Tue, 09 Nov 2010 14:57:30 GMT
Hi Norman, Micheal,

I spent a lot of time reading about multi dimensional data structures.
There are several interesting ones (I'd go for an RR*-tree if there were
single inserts only). Though one thing you have to consider (which I
haven't since recently) is the bulk insertion. For CouchDB, because of
the append only nature, you need to be able to perform efficient bulk
insertions. This is really hard stuff.

There are not so many bulk insertion algorithms out there (I'm not
talking about bulk loading, which would mean you create the whole
structure from scratch).

Having a good insertion algorithm is the key to have a fast queries. As
previously mentioned, almost all data structures I read about are about
single inserts only.

Cheers,
  Volker

On 11/09/2010 03:45 PM, Norman Barker wrote:
> Hi Michael, Volker
> 
> I have been thinking for a while about hacking a ub-tree for couch,
> http://en.wikipedia.org/wiki/UB-tree, but am concerned about the
> patent issues, http://archives.postgresql.org/pgsql-general/2002-04/msg00349.php,
> I know this was discussed a lot on the postgresql lists, wonder if
> anyone here knows any more about it?
> 
> Norman
> 
> On Tue, Nov 9, 2010 at 2:28 AM, Volker Mische <volker.mische@gmail.com> wrote:
>> Hi Michael,
>>
>> On 11/09/2010 07:14 AM, Michael Zedeler wrote:
>>> Hi CouchDB-users (and developers).
>>>
>>> Thanks for a great product and very useful discussions on the mailing list.
>>>
>>> Does anyone know if there has been an effort into making
>>> multi-dimensional queries possible?
>>>
>>> Given a d-dimensional keyspace, I'd like to be able to get all elements
>>> in a d-dimensinal hypercube. It could be implemented using kd-trees.
>>>
>>> Any thoughts?
>>>
>>> Regards,
>>>
>>> Michael.
>>>
>>
>> You could use an kd-tree, though I'd prefer a multidimensional variation
>> of the r-tree. The reason is, that the kd-tree is space partitioning,
>> which means you need to know the range the keys are in upfront (or do
>> some tricks which will make it more like an r-tree).
>>
>> In an r-tree you partition the data, so your data structure grows as
>> data from outside the current maximum range comes in.
>>
>> I'd love to see a multidimensional index, but atm I'm still busy with my
>> 2-dimensional index. So if you'd like to hack on it and need any help,
>> let me know.
>>
>> Cheers,
>>  Volker
>>


Mime
View raw message