From: Riyad Kalla
Date: Thu, 7 Mar 2013 13:58:44 -0800
Subject: Re: CouchDB compaction not catching up.
To: user@couchdb.apache.org

Will be very curious how you end up solving this; please keep us posted!

Sent from my iPhone

On Mar 7, 2013, at 1:47 PM, Nicolas Peeters wrote:

> See my answers in the text. I know there are all kinds of workarounds
> possible, and it seems that this is actually not such a big problem for
> other users.
> Maybe this "extreme" case warrants more practical workarounds indeed.
>
> On Thu, Mar 7, 2013 at 4:12 PM, Riyad Kalla wrote:
>
>> To Simon's point, exactly where I was headed. Your issue is that
>> compaction cannot catch up due to write velocity, so you need to avoid
>> compaction (and, by extension, replication, since the issue is that
>> your background writes cannot catch up). The only way to do that is
>> some working model where you simply discard the data file when done
>> and start anew.
> Indeed. Unless the file actually gets so big that you can't possibly do
> anything. But then again, it is maybe a design issue in the amount of
> stuff being logged.
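Riyad's "discard the data file when done and start anew" maps onto plain
CouchDB HTTP calls: write to a database named for the current week, and
DELETE an expired one outright instead of compacting it. A minimal sketch
in Python, assuming a local CouchDB instance, the third-party `requests`
library, and a hypothetical `logs_` naming scheme (none of which come from
the thread):

    import datetime

    import requests  # assumption: third-party HTTP client

    COUCH = "http://localhost:5984"  # assumption: local CouchDB instance

    def weekly_db_name(day: datetime.date) -> str:
        """Name databases by ISO year/week, e.g. 'logs_2013_w10'."""
        year, week, _ = day.isocalendar()
        return f"logs_{year}_w{week:02d}"

    def rotate(today: datetime.date) -> None:
        """Create this week's database and drop an expired one wholesale.

        DELETE removes the database file in one shot, reclaiming space
        immediately, with none of the sustained I/O that compaction needs
        in order to copy live documents into a new file.
        """
        current = weekly_db_name(today)
        # Keep one full week of history (the retention mentioned later in
        # the thread): delete the database from two weeks back, not last
        # week's.
        expired = weekly_db_name(today - datetime.timedelta(weeks=2))

        r = requests.put(f"{COUCH}/{current}")
        if r.status_code not in (201, 412):  # 412 = already exists
            r.raise_for_status()

        r = requests.delete(f"{COUCH}/{expired}")
        if r.status_code not in (200, 404):  # 404 = already gone
            r.raise_for_status()

    rotate(datetime.date(2013, 3, 7))

Switching writers to the new database at a known boundary is the only part
that needs coordination; the deletes themselves are trivial.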
>
>> You mentioned clearing a few hundred records at a time after a tx
>> completes, so it sounds like over the period of a week you should be
>> turning over your entire data set completely, right?
>
> Typically, yes.
>
>> I wonder if there could be a solution here like fronting a few CouchDB
>> instances with nginx and using a cron job: on day 5 or 7, flipping
>> inbound traffic to a hot (empty) standby while processing the
>> remaining data off the old master and then clearing it out, while
>> writes are directed to the new master for the next week?
>
> Wow. That's an impressive workaround, but it would work indeed. I'd
> prefer using standard features (that can also be easily driven by a web
> app or something, which is the case).
>
>> Again, this only makes sense depending on data usage and on whether
>> the pending data off the slave would need to stay accessible to a
>> front end like search. Ultimately what I am suggesting here is a
>> solution where you always have a CouchDB instance to write logs to,
>> but you are never trying to compact, which would require some clever
>> juggling between instances.
>>
>> Alternatively... your problem is write performance, so I would be
>> curious whether provisioned-IOPS instances would cure this for you
>> right out of the box, with no engineering work.
>>
>> Longer term? Probably check out AWS Redshift.
>
> At the moment, we're looking at an alternative, which is to use
> Logstash and write either to files and/or stream to ElasticSearch.
> Deletion would be achieved by deleting a whole "index" in bulk (a bit
> like the solution mentioned above). We'll keep CouchDB for the
> "important" logs, and transaction logs are possibly going to be dealt
> with in a different way.
>
>
>> Sent from my iPhone
>>
>> On Mar 7, 2013, at 1:58 AM, Nicolas Peeters wrote:
>>
>>> Simon,
>>>
>>> That's actually a very good suggestion, and we actually implemented
>>> it (we had one DB per "process"). The problem was that the size of
>>> the DB sometimes outgrew our disks (1TB!) (and sometimes we needed to
>>> keep the data around for longer periods), so we discarded it in the
>>> end.
>>>
>>> This is, however, a workaround. And the main question was about the
>>> compaction not catching up (which may be a problem in some other
>>> cases).
>>>
>>> On Thu, Mar 7, 2013 at 9:58 AM, Simon Metson wrote:
>>>
>>>> What about making a database per day/week and dropping the whole lot
>>>> in one go?
>>>>
>>>> On Thursday, 7 March 2013 at 08:50, Nicolas Peeters wrote:
>>>>
>>>>> So the use case is some kind of transactional log associated with
>>>>> some kind of long-running process (1 day). For each process, a few
>>>>> hundred thousand lines of "logging" are inserted. When the process
>>>>> has completed (user approval), we would like to delete all the
>>>>> associated "logs". Marking items as deleted is not really the
>>>>> issue. Recovering the space is.
>>>>>
>>>>> The data should ideally be available for up to a week or so.
>>>>>
>>>>> On Thu, Mar 7, 2013 at 9:24 AM, Riyad Kalla wrote:
>>>>>
>>>>>> Nicolas,
>>>>>> Can you provide some insight into how you decide which large
>>>>>> batches of records to delete, and roughly how big (MB/GB-wise)
>>>>>> those batches are? What is the required longevity of this tx
>>>>>> information in this couch store? Is this just temporary storage,
>>>>>> or is this the system of record and what you are deleting in large
>>>>>> batches is just temporary intermediary data?
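For the large batch deletes Riyad is asking about: in CouchDB these are
typically issued as tombstones through the `_bulk_docs` endpoint. A
minimal sketch in Python; the `requests` library, the database name, and
the shape of the input are illustrative assumptions:

    import requests  # assumption: third-party HTTP client

    COUCH = "http://localhost:5984"  # assumption: local CouchDB instance
    DB = "tx_logs"                   # hypothetical database name

    def bulk_delete(doc_ids_revs):
        """Mark a batch of documents deleted in a single request.

        `doc_ids_revs` is an iterable of (doc_id, current_rev) pairs.
        CouchDB writes a tombstone revision per document; the disk space
        is only reclaimed later, by compaction -- the very step this
        thread says cannot keep up.
        """
        payload = {
            "docs": [
                {"_id": doc_id, "_rev": rev, "_deleted": True}
                for doc_id, rev in doc_ids_revs
            ]
        }
        r = requests.post(f"{COUCH}/{DB}/_bulk_docs", json=payload)
        r.raise_for_status()
        # Per-document results; entries with an "error" key (e.g. rev
        # conflicts) need to be retried with a fresh _rev.
        return r.json()

This is why "marking items as deleted is not really the issue; recovering
the space is": the tombstones are cheap, but the file only shrinks once
compaction rewrites it, whereas dropping a whole database (previous
sketch) sidesteps compaction entirely.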
>>>>>> Understanding how you are using the data and turning over the data
>>>>>> could help assess some alternative strategies.
>>>>>>
>>>>>> Best,
>>>>>> Riyad
>>>>>>
>>>>>> On Thu, Mar 7, 2013 at 12:19 AM, Nicolas Peeters wrote:
>>>>>>
>>>>>>> Hi CouchDB Users,
>>>>>>>
>>>>>>> *Disclaimer: I'm very aware that the use case is definitely not
>>>>>>> the best for CouchDB, but for now, we have to deal with it.*
>>>>>>>
>>>>>>> *Scenario:*
>>>>>>>
>>>>>>> We have a fairly large (~750GB) CouchDB (1.2.0) database that is
>>>>>>> being used for transactional logs (very write-heavy). (Bad
>>>>>>> idea/design, I know, but that's beside the point of this question;
>>>>>>> we're looking at alternative designs.) Once in a while, we delete
>>>>>>> some of the records in large batches, and we have scheduled auto
>>>>>>> compaction, checking every 2 hours.
>>>>>>>
>>>>>>> This is the compaction config:
>>>>>>>
>>>>>>> [Inline image: compaction configuration, not preserved in the
>>>>>>> archive; a representative config sketch follows this message]
>>>>>>>
>>>>>>> From what I can see, the DB is being hammered significantly every
>>>>>>> 12 hours, and the compaction sometimes takes 24 hours (with 100GB
>>>>>>> of log data; sometimes much more, up to 500GB).
>>>>>>>
>>>>>>> We run on EC2: large instances with EBS, no striping (yet), no
>>>>>>> provisioned IOPS. We tried fatter machines, but the improvement
>>>>>>> was really minimal.
>>>>>>>
>>>>>>> *The problem:*
>>>>>>>
>>>>>>> The problem is that compaction takes a very long time (e.g. 12h+)
>>>>>>> and reduces the performance of the entire stack. The main issue
>>>>>>> seems to be that it's hard for the compaction process to "keep up"
>>>>>>> with the insertions, which is why it takes so long. Also, the
>>>>>>> compaction of the view takes a long time (sometimes the view is
>>>>>>> 100GB). During the re-compaction of the view, clients don't get a
>>>>>>> response, which is blocking the processes.
>>>>>>>
>>>>>>> [Inline image: view compaction graph, not preserved in the archive]
>>>>>>>
>>>>>>> The view compaction takes approx. 8 hours, so indexing for the
>>>>>>> view is slower; in the time the view indexes, another 300k
>>>>>>> insertions have been done (and it doesn't catch up). The only way
>>>>>>> to solve the problem was to throttle the number of inserts from
>>>>>>> the app itself, after which the view compaction eventually
>>>>>>> completed. Had we continued to insert at the same rate, it would
>>>>>>> never have finished (and ultimately, we would have run out of disk
>>>>>>> space).
>>>>>>>
>>>>>>> Any recommendations for setting this up on EC2 are welcome. Also,
>>>>>>> configuration settings for the compaction would be helpful.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Nicolas
>>>>>>>
>>>>>>> PS: We are happily using CouchDB for other (more traditional) use
>>>>>>> cases where it does very well.
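The compaction-config screenshot in Nicolas's post did not survive the
archive. For reference, CouchDB 1.2's automatic compaction daemon is
configured in `local.ini` along these lines; only the two-hour check
interval comes from the post, and the fragmentation thresholds below are
illustrative values, not Nicolas's actual settings:

    [compaction_daemon]
    ; how often (in seconds) the daemon checks fragmentation;
    ; 7200 matches the "checking every 2 hours" from the post
    check_interval = 7200
    ; ignore files smaller than this many bytes
    min_file_size = 131072

    [compactions]
    ; illustrative thresholds: compact a database or its views once this
    ; share of the file is wasted space; {from, ...}/{to, ...} entries can
    ; additionally restrict compaction to a time window
    _default = [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}]

On a write load like this one, the daemon can keep re-triggering a
compaction pass that never catches up with insertions, which matches the
symptoms described above; the database-rotation approach discussed earlier
in the thread avoids the daemon entirely for the log databases.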