From: Wout Mertens
To: user@couchdb.apache.org
Subject: Re: using couchdb for 100GB geodata from openstreetmap
Date: Tue, 24 Mar 2009 22:08:47 +0100
Message-Id: <40D121A8-F445-4471-B56F-52CEE27FED30@gmail.com>
In-Reply-To: <49C79AB9.8020202@gmail.com>

How about this:

1. Create view functions that map coordinates into square cells that
   contain "not too many" results, and create both an X,Y and a Y,X index.
2. Create reduce functions that count the number of items in those cells.
3. Now query for everything in a rectangle or square by asking an
   _external process.
4. The _external process does local queries.
   4a. First it finds out whether there are fewer results when querying
       on X or on Y by getting the count.
   4b. Then for the selected range (let's say X), it queries
       [startX+i, startY...endY] where i goes from 0 to endX-startX.
   4c. It keeps only the requested document types and returns them. If
       there are not too many types, you can make X,Y views per type.

Is that naive? As long as the cells are not too large and not too small,
you wouldn't be processing many more documents than you actually need,
and using an _external process keeps the back-and-forth view traffic
very local.

Wout.

On Mar 23, 2009, at 3:20 PM, Volker Mische wrote:

> James Marca wrote:
>> On Mon, Mar 23, 2009 at 12:07:46AM +0100, Volker Mische wrote:
>>> Is there a difference between a custom view engine and the approach I
>>> try to pursue with GeoCouch? It still comes down to the fact that you
>>> need to store the spatial information in some kind of (external)
>>> spatial index.
>>
> [...]
>>
>> The Geo Couch approach, as I understand it (again, forgive any
>> ignorance on my part) is an external process that responds to queries,
>> right? So there isn't an index that CouchDB understands, and you
>> can't run map/reduce type queries over it. At least, that's why I
>> didn't try to load up GeoCouch. I was looking for a view engine that
>> was a bit more native to the map/reduce aspects of CouchDB.
>
> This is how it works at the moment. The next thing I'll be working on is
> _external and view intersection. This means that the external process
> returns some results which will be combined with the results of a view.
> This way you get (hopefully) the power of CouchDB for attribute queries,
> but high speed for the spatial part of the query.
>
> Cheers,
> Volker
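
A minimal sketch of steps 1 to 4 from the top of this message, in Python. It makes no CouchDB calls: the two views are simulated with sorted in-memory lists, so every name here (CELL_SIZE, cell, CellIndex, build_indexes, query_rect) is hypothetical and only illustrates the idea. The count in step 4a is read as "how many rows fall in the whole X strip versus the whole Y strip", which is an assumption about what was meant. The final exact-bounds check is also an addition, needed because cells on the edge of the rectangle can hold points that lie outside it.

```python
from bisect import bisect_left, bisect_right
from math import floor

CELL_SIZE = 0.5  # hypothetical cell edge, in the same units as the coordinates


def cell(x, y, size=CELL_SIZE):
    """Step 1: map a coordinate to integer cell indices."""
    return (floor(x / size), floor(y / size))


class CellIndex:
    """In-memory stand-in for one view: rows sorted by a two-part cell key."""

    def __init__(self, rows):
        self.rows = sorted(rows, key=lambda row: row[0])
        self.keys = [key for key, _ in self.rows]

    def count(self, start_key, end_key):
        """Step 2 stand-in: number of rows with start_key <= key <= end_key."""
        return bisect_right(self.keys, end_key) - bisect_left(self.keys, start_key)

    def range(self, start_key, end_key):
        """Documents with start_key <= key <= end_key, in key order."""
        lo = bisect_left(self.keys, start_key)
        hi = bisect_right(self.keys, end_key)
        return [doc for _, doc in self.rows[lo:hi]]


def build_indexes(docs):
    """Build the X,Y index and the Y,X index over the same documents."""
    xy, yx = [], []
    for doc in docs:
        cx, cy = cell(doc["x"], doc["y"])
        xy.append(((cx, cy), doc))
        yx.append(((cy, cx), doc))
    return CellIndex(xy), CellIndex(yx)


def query_rect(xy, yx, x0, y0, x1, y1, types=None):
    """Steps 3 and 4: what the _external process would do for one rectangle."""
    (cx0, cy0), (cx1, cy1) = cell(x0, y0), cell(x1, y1)
    inf = float("inf")
    # 4a: compare how many rows sit in the X strip and in the Y strip.
    n_x = xy.count((cx0, -inf), (cx1, inf))
    n_y = yx.count((cy0, -inf), (cy1, inf))
    if n_x <= n_y:
        index, (a0, a1), (b0, b1) = xy, (cx0, cx1), (cy0, cy1)
    else:
        index, (a0, a1), (b0, b1) = yx, (cy0, cy1), (cx0, cx1)
    # 4b: one range query per cell row along the chosen axis.
    hits = []
    for a in range(a0, a1 + 1):
        hits.extend(index.range((a, b0), (a, b1)))
    # 4c: keep requested types; also trim points outside the exact rectangle.
    return [
        d for d in hits
        if x0 <= d["x"] <= x1 and y0 <= d["y"] <= y1
        and (types is None or d["type"] in types)
    ]
```

To try it, build the indexes once with build_indexes(docs), where each document is a dict with "x", "y" and "type" fields, then call query_rect(xy, yx, x0, y0, x1, y1, types={"cafe"}). In a real setup the two CellIndex objects would presumably be replaced by range queries against the two views, and count by the reduce from step 2.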