Subject: Re: Data inconsistency after restart
From: Alexey Serbin
To: user@kudu.apache.org
Date: Thu, 7 Dec 2017 11:48:01 -0800

Hi Petter,

Before going too deep into attempts to find the place where the data was
lost, I just wanted to make sure we definitely know that the data was
delivered from the client to the server side. Did you verify that the
client didn't report any errors during data ingestion? Most likely you
did, but I just wanted to make sure.

BTW, what kind of client did you use for the data ingestion?

Thanks!

Kind regards,

Alexey


On 12/6/17 3:56 PM, Andrew Wong wrote:
> Hi Petter,
>
>> Before we shut down we could only see the following in the logs.
>> I.e., no sign that ingestion was still ongoing.
>
> Interesting. Just to be sure, was that seen on one tserver, or did you
> see them across all of them?
>
>> But if the maintenance_manager performs important jobs that are
>> required to ensure that all data is inserted, then I can understand
>> why we ended up with inconsistent data.
>
> The maintenance manager's role is somewhat orthogonal to writes: data
> is first written to the on-disk write-ahead log and also kept
> in-memory to be accessible by scans. The maintenance manager
> periodically shuttles this in-memory data to disk, among various other
> tasks like cleaning up WAL segments, compacting rowsets, etc. Given
> that, a lack of maintenance ops shouldn't cause incorrectness in data,
> even after restarting.
>
>> I would assume this means that it does not guarantee consistency
>> if new data is inserted but should give valid (and same) results
>> if no new data is inserted?
>
> Right, if /all/ tservers are truly caught up and done processing the
> writes, with no tablet copies going on, and with no new data coming
> in, then the results should be consistent.
>
> Hope this helped,
> Andrew
>
> On Wed, Dec 6, 2017 at 7:33 AM, Boris Tyukin wrote:
>> This is smart; we are doing the same thing. But the best part that
>> attracts me to Kudu is replacing our main HDFS storage with Kudu to
>> enable near-real-time use cases and to avoid dealing with HBase and a
>> Lambda-architecture mess, so reliability and scalability are a big
>> deal for us as we are looking to move most of our data to Kudu.
>>
>> On Wed, Dec 6, 2017 at 9:58 AM, Petter von Dolwitz (Hem) wrote:
>>> Hi Boris,
>>>
>>> We do not have a Cloudera contract at the moment. Until we have
>>> gained more Kudu experience, we keep our master data in Parquet
>>> format so that we can rebuild the Kudu tables upon errors. We are
>>> still in the early learning phase.
>>>
>>> Br,
>>> Petter
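As a side note on Alexey's question above about client-reported errors:
with the Java client, per-row failures do not necessarily surface as
exceptions; in the background flush modes they accumulate on the session
and are easy to miss unless they are checked explicitly. Below is a
minimal sketch of that check, not the client the thread participants
actually used; the master address, table name, and column names are made
up for illustration.

    import org.apache.kudu.client.*;

    public class IngestWithErrorChecks {
      public static void main(String[] args) throws KuduException {
        // Hypothetical master address and table name.
        KuduClient client =
            new KuduClient.KuduClientBuilder("kudu-master-1:7051").build();
        try {
          KuduTable table = client.openTable("my_table");
          KuduSession session = client.newSession();
          // With AUTO_FLUSH_BACKGROUND, apply() does not throw on per-row
          // failures; errors accumulate on the session instead.
          session.setFlushMode(
              SessionConfiguration.FlushMode.AUTO_FLUSH_BACKGROUND);

          for (long i = 0; i < 1000; i++) {
            Insert insert = table.newInsert();
            insert.getRow().addLong("id", i);           // hypothetical key column
            insert.getRow().addString("payload", "x");  // hypothetical value column
            session.apply(insert);
          }
          session.flush();

          // Rows the servers rejected (or that never made it) show up here.
          RowErrorsAndOverflowStatus pending = session.getPendingErrors();
          if (pending.isOverflowed()) {
            System.err.println("error buffer overflowed; some errors were dropped");
          }
          for (RowError error : pending.getRowErrors()) {
            // isAlreadyPresent() marks duplicate-key rejections (the
            // "Modified: 0 rows, N errors" case discussed further down),
            // as opposed to rows that failed for other reasons.
            String kind = error.getErrorStatus().isAlreadyPresent()
                ? "duplicate: " : "failed: ";
            System.err.println(kind + error.getOperation());
          }
          session.close();
        } finally {
          client.close();
        }
      }
    }

If ingestion goes through an intermediary such as Impala or Spark, the
same idea applies: the per-row error counts reported by that tool are the
place to confirm that every row reached the tablet servers.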
>>> 2017-12-06 14:35 GMT+01:00 Boris Tyukin:
>>>> This is definitely a concerning thread for us, as we are looking to
>>>> use Impala for storing mission-critical company data. Petter, are
>>>> you a paid Cloudera customer, btw? I wonder if you opened a support
>>>> ticket as well.
>>>>
>>>> On Wed, Dec 6, 2017 at 7:26 AM, Petter von Dolwitz (Hem) wrote:
>>>>> Thanks for your reply Andrew!
>>>>>
>>>>>> How did you verify that all the data was inserted and how did you
>>>>>> find some data missing?
>>>>>
>>>>> This was done using Impala. We counted the rows for groups
>>>>> representing the chunks we inserted.
>>>>>
>>>>>> Following up on what I posted, take a look at
>>>>>> https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans .
>>>>>> It seems definitely possible that not all of the rows had finished
>>>>>> inserting when counting, or that the scans were sent to a stale
>>>>>> replica.
>>>>>
>>>>> Before we shut down we could only see the following in the logs,
>>>>> i.e., no sign that ingestion was still ongoing:
>>>>>
>>>>> kudu-tserver.ip-xx-yyy-z-nnn.root.log.INFO.20171201-065232.90314:I1201
>>>>> 07:27:35.010694 90793 maintenance_manager.cc:383] P
>>>>> a38902afefca4a85a5469d149df9b4cb: we have exceeded our soft memory
>>>>> limit (current capacity is 67.52%). However, there are no ops
>>>>> currently runnable which would free memory.
>>>>>
>>>>> Also, the (Cloudera) metric
>>>>> total_kudu_rows_inserted_rate_across_kudu_replicas showed zero.
>>>>>
>>>>> Still, it seems like some data became inconsistent after the
>>>>> restart. If the maintenance_manager performed important jobs that
>>>>> are required to ensure that all data is inserted, then I could
>>>>> understand why we ended up with inconsistent data. But, if I
>>>>> understand you correctly, you are saying that these jobs are not
>>>>> critical for ingestion. In the link you provided I read "Impala
>>>>> scans are currently performed as READ_LATEST and have no
>>>>> consistency guarantees." I would assume this means that it does not
>>>>> guarantee consistency if new data is inserted, but should give
>>>>> valid (and the same) results if no new data is inserted?
>>>>>
>>>>> I have not tried the ksck tool yet. Thank you for the reminder. I
>>>>> will have a look.
>>>>>
>>>>> Br,
>>>>> Petter
>>>>>
>>>>> 2017-12-06 1:31 GMT+01:00 Andrew Wong:
>>>>>>> How did you verify that all the data was inserted and how did
>>>>>>> you find some data missing? I'm wondering if it's possible that
>>>>>>> the initial "missing" data was data that Kudu was still in the
>>>>>>> process of inserting (albeit slowly, due to memory backpressure
>>>>>>> or somesuch).
>>>>>>
>>>>>> Following up on what I posted, take a look at
>>>>>> https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans .
>>>>>> It seems definitely possible that not all of the rows had finished
>>>>>> inserting when counting, or that the scans were sent to a stale
>>>>>> replica.
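As a side note on the READ_LATEST point discussed above: a verification
count can be made repeatable by scanning at a snapshot instead of at the
latest state. The rough sketch below uses the Java client; the master
address, table name, and projected column are again made up, and it
scans the whole table from a single client, so it is much slower than an
Impala count.

    import java.util.Collections;
    import org.apache.kudu.client.*;

    public class SnapshotCount {
      public static void main(String[] args) throws KuduException {
        // Hypothetical master address and table name.
        KuduClient client =
            new KuduClient.KuduClientBuilder("kudu-master-1:7051").build();
        try {
          KuduTable table = client.openTable("my_table");
          KuduScanner scanner = client.newScannerBuilder(table)
              // READ_AT_SNAPSHOT pins the scan to one timestamp, so a
              // repeated count is not affected by concurrent writes the
              // way a READ_LATEST scan can be.
              .readMode(AsyncKuduScanner.ReadMode.READ_AT_SNAPSHOT)
              // Project a single small column; only the row count matters.
              .setProjectedColumnNames(Collections.singletonList("id"))
              .build();
          long count = 0;
          while (scanner.hasMoreRows()) {
            count += scanner.nextRows().getNumRows();
          }
          scanner.close();
          System.out.println("rows: " + count);
        } finally {
          client.close();
        }
      }
    }

An explicit snapshot timestamp can also be supplied through
snapshotTimestampMicros(...) on the scanner builder if the same point in
time needs to be counted more than once.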
>>>>>> On Tue, Dec 5, 2017 at 4:18 PM, Andrew Wong wrote:
>>>>>>> Hi Petter,
>>>>>>>
>>>>>>>> When we verified that all data was inserted we found that some
>>>>>>>> data was missing. We added this missing data, and on some chunks
>>>>>>>> we got the information that all rows were already present, i.e.
>>>>>>>> Impala says something like "Modified: 0 rows, nnnnnnn errors".
>>>>>>>> Doing the verification again now shows that the Kudu table is
>>>>>>>> complete. So, even though we did not insert any data on some
>>>>>>>> chunks, a count(*) operation over these chunks now returns a
>>>>>>>> different value.
>>>>>>>
>>>>>>> How did you verify that all the data was inserted and how did you
>>>>>>> find some data missing? I'm wondering if it's possible that the
>>>>>>> initial "missing" data was data that Kudu was still in the
>>>>>>> process of inserting (albeit slowly, due to memory backpressure
>>>>>>> or somesuch).
>>>>>>>
>>>>>>>> Now to my question. Will data be inconsistent if we recycle Kudu
>>>>>>>> after seeing soft memory limit warnings?
>>>>>>>
>>>>>>> Your data should be consistently written, even with those
>>>>>>> warnings. AFAIK they would cause a bit of slowness, not incorrect
>>>>>>> results.
>>>>>>>
>>>>>>>> Is there a way to tell when it is safe to restart Kudu to avoid
>>>>>>>> these issues? Should we use any special procedure when
>>>>>>>> restarting (e.g. only restart the tablet servers, only restart
>>>>>>>> one tablet server at a time, or something like that)?
>>>>>>>
>>>>>>> In general, you can use the `ksck` tool to check the health of
>>>>>>> your cluster. See
>>>>>>> https://kudu.apache.org/docs/command_line_tools_reference.html#cluster-ksck
>>>>>>> for more details. For restarting a cluster, I would recommend
>>>>>>> taking down all tablet servers at once; otherwise tablet replicas
>>>>>>> may try to replicate data from the server that was taken down.
>>>>>>>
>>>>>>> Hope this helped,
>>>>>>> Andrew
>>>>>>>
>>>>>>> On Tue, Dec 5, 2017 at 10:42 AM, Petter von Dolwitz (Hem) wrote:
>>>>>>>> Hi Kudu users,
>>>>>>>>
>>>>>>>> We just started to use Kudu (1.4.0+cdh5.12.1). To make a
>>>>>>>> baseline for evaluation we ingested three months' worth of data.
>>>>>>>> During ingestion we were seeing messages from the maintenance
>>>>>>>> threads that a soft memory limit had been reached. It seems like
>>>>>>>> the background maintenance threads stopped performing their
>>>>>>>> tasks at this point in time. It also seems like the memory was
>>>>>>>> never recovered even after stopping ingestion, so I guess there
>>>>>>>> was a large backlog being built up. I guess the root cause here
>>>>>>>> is that we were a bit too conservative when giving Kudu memory.
>>>>>>>> After a restart, a lot of maintenance tasks were started (i.e.
>>>>>>>> compaction).
>>>>>>>>
>>>>>>>> When we verified that all data was inserted we found that some
>>>>>>>> data was missing. We added this missing data, and on some chunks
>>>>>>>> we got the information that all rows were already present, i.e.
>>>>>>>> Impala says something like "Modified: 0 rows, nnnnnnn errors".
>>>>>>>> Doing the verification again now shows that the Kudu table is
>>>>>>>> complete. So, even though we did not insert any data on some
>>>>>>>> chunks, a count(*) operation over these chunks now returns a
>>>>>>>> different value.
>>>>>>>>
>>>>>>>> Now to my question. Will data be inconsistent if we recycle Kudu
>>>>>>>> after seeing soft memory limit warnings?
>>>>>>>>
>>>>>>>> Is there a way to tell when it is safe to restart Kudu to avoid
>>>>>>>> these issues? Should we use any special procedure when
>>>>>>>> restarting (e.g. only restart the tablet servers, only restart
>>>>>>>> one tablet server at a time, or something like that)?
>>>>>>>>
>>>>>>>> The table design uses 50 tablets per day (times 90 days). It is
>>>>>>>> 8 TB of data after 3x replication over 5 tablet servers.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Petter
>>>>>>>
>>>>>>> --
>>>>>>> Andrew Wong
>
> --
> Andrew Wong
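As a follow-up on the ksck suggestion above: the check is run from the
kudu command-line tool, e.g.

    kudu cluster ksck <master-address-1,master-address-2,master-address-3>

which reports unreachable tablet servers and under-replicated or
unhealthy tablets; there is also a checksum-scan option (the flag is
-checksum_scan in the releases I am aware of) that compares row data
across replicas, which is closer to the kind of verification discussed
in this thread. The soft-memory-limit messages themselves are governed
by the tablet server memory settings (--memory_limit_hard_bytes, with
the soft limit defined as a percentage of it via
--memory_limit_soft_percentage), so giving the tablet servers more
memory, as Petter suspected, is the usual remedy.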