Message-Id: <6CE5154F-A2B9-49C6-96F5-8C329346591B@apache.org>
From: Jan Lehnardt
To: user@couchdb.apache.org
Subject: Re: A couple of couchdb questions.
Date: Tue, 24 Mar 2009 23:14:04 +0100

On 24 Mar 2009, at 23:03, Gary Smith wrote:

> Jan,
>
> I understand your responses below. I think the implementation of
> the write/pull slave makes sense. The reason I was thinking about
> using Apache httpd proxy is that we already have a set of these
> set up via HA doing something similar. All of the servers will be
> public in the sense that they are on our private network and
> available to everything on the network.

You can still use the proxy for reads outside of your network.

> The replication isn't the biggest problem. We could code it to do
> both at the same time and only consider it accepted if both succeed.

The problem with this is that you'll end up with different revision
numbers for each write; replication ensures that both nodes have the
same rev ids.

> Problem is sometimes there is network latency, but then that
> affects any type of guaranteed commit process.
>
> Anyway, I will probably set up a test environment in the next week or
> so and see how well it plays with what I want to do.

Good luck, keep sharing your results :)

Cheers
Jan
--

>
> Gary
>
> ________________________________________
> From: Jan Lehnardt [jan@apache.org]
> Sent: Tuesday, March 24, 2009 2:46 PM
> To: user@couchdb.apache.org
> Subject: Re: A couple of couchdb questions.
>
> Hi Gary,
>
>
> On 24 Mar 2009, at 22:14, Gary Smith wrote:
>
>> Hello,
>>
>> I'm working to implement a document warehouse for about 15m
>> documents. These documents range between 10kb - 500kb (legacy
>> archived PDFs). Currently we do this by maintaining a MySQL
>> database with the documents stored on a variety of servers
>> (consisting of about 6TB).
>> Most of the problems that we encounter
>> are a) backups and b) physical access to the documents as they are
>> on a private network. Since not much changes, backups aren't really
>> that much of a problem (but restoring is very slow). We are now
>> looking to add documents to this regularly (about 10k per week). So
>> we are looking to implement something new, or at least, more useful.
>>
>> So we thought about using Amazon S3 for storage but these documents
>> fall under HIPAA constraints so we have decided to do this in house.
>>
>> Looking at CouchDB, it pretty much does what we are looking to do.
>> We really only want to store a document and maybe some very basic
>> metadata (which we currently do by having both a PDF and a metadata
>> file). Implementation doesn't look like a problem with the
>> documented API.
>
> This sounds like a good application.
>
>
>> So, the questions.
>>
>> I would like to break this down into multiple servers and
>> incorporate replication at the same time. The documentation says
>> that pull is recommended over push but doesn't mention why.
>
> Pull replication is faster, more reliable and better at picking up
> failed replications again.
>
>
>> Does push replication require the slave (or other node) to accept
>> the PUT/POST request as completed?
>
> Not sure what you mean here.
>
>
>> If we choose pull replication instead of push, I assume that this is
>> something we will need to schedule via crontab, or does it
>> have a background process that constantly syncs? The API looks like
>> just a single GET request.
>
> It's a POST request, but yeah, there's no automation. See below.
>
>
>> Either way, here is what we are looking to do at this time. At two
>> separate locations we will have multiple servers, set up in a
>> master/master configuration. We should not run into any conflicts, as
>> updates are not allowed. IDs are unique (MD5 checksum and some
>> other unique information).
>
> Good.
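
For reference, a minimal sketch of the pull replication call discussed
above. It only builds the request; sending it (from cron or from the
application) is left out, since nothing in CouchDB schedules it for you.
The host names and the database name "docs" are made up for
illustration. The request goes to the node that should receive the
data, with the remote node as the source.

```python
import json


def pull_replication_request(target_host, source_host, db):
    """Build the request asking target_host to pull db from source_host.

    Returns (method, url, body). Host names and the database name are
    hypothetical; the request is sent to the node that receives the data.
    """
    url = f"http://{target_host}/_replicate"
    body = {
        # Remote database, addressed by its full URL.
        "source": f"http://{source_host}/{db}",
        # Local database on target_host, addressed by name.
        "target": db,
    }
    return "POST", url, json.dumps(body)


method, url, body = pull_replication_request(
    "b1.example:5984", "a1.example:5984", "docs"
)
print(method, url)
print(body)
```

Swapping the two hosts (and posting to the source node with a remote
target URL) would make it a push replication instead.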
>
>
>> We wanted to use 4 servers at each location, partly because each
>> server has 4TB of space (actually 3TB of RAID 5). Each server will
>> hold files based upon the first digit of the MD5 checksum (0-3 on
>> server A, 4-7 on server B, 8-A on server C, and B-F on server D).
>> We were thinking of using Apache's URL rewriting to proxy the
>> request to the proper server. This should work for GET, PUT and
>> POST.
>
> This sounds sensible. Other proxies can handle that as well, but if
> you are comfortable with Apache httpd, no reason to not use it.
>
>
>> We will also have the backup server at the second location (which
>> will be the active for their location) using the same approach.
>>
>> What would be most useful is to be able to ensure that before a
>> commit is accepted on a server we could guarantee that it has been
>> replicated to a second box.
>>
>> Any ideas or suggestions on that?
>
> I think it is easiest when your application knows about the fact that
> there is no single server when doing a write. Reads can still be
> served through a single interface, but to make a proxy understand the
> replication protocol is probably not worth the effort.
>
> Here's how I'd set it up: Your application sends a PUT request to your
> Apache proxy (or to the correct backend node directly), and the
> document will end up on one of the four backend servers. Your
> application also knows about the backend nodes and the partition
> rules, and can access the nodes directly, in both locations. After the
> PUT request that creates the document completes in location 1, you
> send a POST request to the backend server in location 2 telling it to
> replicate from the node that the doc was written to in location 1
> (issuing a pull replication). When this call returns successfully (if
> not, you just retry), you return to the original caller that wanted to
> save the document.
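
The flow described above can be sketched roughly as follows. This is
only an illustration: put_doc and trigger_pull are hypothetical
callables the application would supply (for example, thin wrappers
around the PUT to location 1 and the replication POST to location 2),
and the bounded retry count is an assumption, since the thread only
says "you just retry".

```python
import time


def save_with_replication(put_doc, trigger_pull, doc_id, doc,
                          retries=3, delay=0.0):
    """Write in location 1, then ask location 2 to pull the change.

    put_doc(doc_id, doc) and trigger_pull() are hypothetical callables
    that return True on success. Returns "replicated", "unreplicated"
    or "put_failed".
    """
    if not put_doc(doc_id, doc):
        return "put_failed"  # nothing was written, nothing to replicate
    for _ in range(retries):
        if trigger_pull():
            return "replicated"  # both locations hold the same revision
        time.sleep(delay)  # wait before asking location 2 again
    # The document exists in location 1 only; the caller decides
    # whether that counts as a successful save.
    return "unreplicated"


# Demo with stand-ins: the pull fails once, then succeeds.
attempts = []


def fake_put(doc_id, doc):
    return True


def flaky_pull():
    attempts.append(1)
    return len(attempts) >= 2


print(save_with_replication(fake_put, flaky_pull, "abc123", {"title": "x"}))
```

Returning a status rather than a plain boolean keeps "written but not
yet replicated" distinguishable from "not written at all".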
>
> Downsides:
>
> - You need to deal with the fact that the communication between
> backends might not work and decide what to do: save anyway (allowing
> reduced reliability and weak consistency), or deny the write because
> the consistency and availability requirements can no longer be
> guaranteed.
>
> - Your backend nodes must be public.
>
>
> In the future, CouchDB will be able to do the partitioning for you
> (solving the second downside). The first one can never be circumvented
> in a distributed system.
>
>
> Cheers
> Jan
> --
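
Until CouchDB does the partitioning itself, the application needs the
partition rule from Gary's original message. A minimal sketch, assuming
the document ID starts with the hex MD5 digest (the thread only says
the ID contains the checksum "and some other unique information", so
the exact layout is a guess). The server names A-D and the digit ranges
are taken as written in the thread.

```python
import hashlib

# Ranges exactly as given in the thread. They are uneven as stated:
# 8-A covers three hex digits, B-F covers five.
PARTITIONS = {
    "A": "0123",
    "B": "4567",
    "C": "89a",
    "D": "bcdef",
}


def server_for(doc_id):
    """Pick the backend for an ID that begins with a hex MD5 digest."""
    first = doc_id[:1].lower()
    for server, digits in PARTITIONS.items():
        # The "first and" guard matters: "" is "in" every string.
        if first and first in digits:
            return server
    raise ValueError(f"ID does not start with a hex digit: {doc_id!r}")


def server_for_content(data):
    """Pick the backend from the MD5 checksum of the raw file bytes."""
    return server_for(hashlib.md5(data).hexdigest())


print(server_for("5d41402abc4b2a76b9719d911017c592"))  # first digit 5
print(server_for_content(b"hello"))
```

The same function can serve both the proxy configuration and the
application's direct access to backend nodes, so the two never
disagree about where a document lives.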