Date: Fri, 22 Jul 2016 19:39:51 +0000
From: "David Knupp (Code Review)"
To: impala-cr@cloudera.com, dev@impala.incubator.apache.org
Reply-To: dknupp@cloudera.com
Subject: [Impala-CR](cdh5-trunk) IMPALA-2013: Issue Hbase queries individually during data-load.
X-Gerrit-MessageType: newchange
X-Gerrit-Change-Id: I911d972ba8ad3a2a084c8195074556153722c7e2
X-Gerrit-Commit: aec951dad49cfc7e24323415e9462373dfbc0bd1

David Knupp has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3728

Change subject: IMPALA-2013: Issue Hbase queries individually during data-load.
......................................................................

IMPALA-2013: Issue Hbase queries individually during data-load.

Loading data into HBase has traditionally been a bit flaky, with problems being hard to diagnose from existing logs. I think this is at least in part due to the fact that we have been relying on a command file to send queries to the HBase shell. When sending a series of queries in a file, the HBase shell does not check or halt operation after each query.

From https://hbase.apache.org/book.html#_read_hbase_shell_commands_from_a_command_file

"There is no way to programmatically check each individual command for success or failure. Also, though you see the output for each command, the commands themselves are not echoed to the screen so it can be difficult to line up the command with its output."
Even if the HBase process dies completely, our data load process laboriously continues to send commands to the shell.

Instead, the command file generated by generate-schema-statements.py should be iterated line-by-line, with each query being passed individually to the HBase shell, checking for errors in the output each time. If we get an error message, fail fast and loudly.

Also, fix several flake8 linter complaints, and replace print statements with specific log level output.

Change-Id: I911d972ba8ad3a2a084c8195074556153722c7e2
---
M bin/load-data.py
1 file changed, 102 insertions(+), 56 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/28/3728/1
--
To view, visit http://gerrit.cloudera.org:8080/3728
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I911d972ba8ad3a2a084c8195074556153722c7e2
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: David Knupp
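
For readers of the archive, the approach described in the commit message (iterate the generated command file line by line, pass each query to the HBase shell on its own, check the output, and fail fast) might look roughly like the sketch below. This is not the code from the patch set. The names `run_one_command`, `run_command_file` and `HBaseCommandError` are hypothetical, the use of `hbase shell -n` (non-interactive mode) is an assumption about how the shell is invoked, and treating the string "ERROR" in the output as a failure marker is likewise an assumption.

```python
import logging
import subprocess

LOG = logging.getLogger("load-data-sketch")


class HBaseCommandError(RuntimeError):
    """Raised when a single shell command appears to have failed (hypothetical)."""


def run_one_command(command, shell_cmd=("hbase", "shell", "-n")):
    """Send one command to a fresh shell process and return its combined output.

    Assumption: the shell reads the command from stdin and reports problems
    either through a non-zero exit status or an "ERROR" marker in its output.
    """
    proc = subprocess.Popen(
        list(shell_cmd),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    output, _ = proc.communicate(command + "\n")
    if proc.returncode != 0 or "ERROR" in output:
        raise HBaseCommandError("command failed: %s\n%s" % (command, output))
    return output


def run_command_file(lines, shell_cmd=("hbase", "shell", "-n")):
    """Run each non-empty line on its own and stop at the first failure.

    Returns the number of commands that completed. Any HBaseCommandError
    propagates to the caller, so later commands are never sent.
    """
    completed = 0
    for line in lines:
        command = line.strip()
        if not command:
            continue
        LOG.info("Executing HBase command: %s", command)
        run_one_command(command, shell_cmd)
        completed += 1
    return completed
```

A caller would open the generated command file and pass the file object to `run_command_file`, letting the exception abort the data load. Starting a new shell process per command is slow because each start launches a JVM, so a real implementation may prefer to keep one shell open and check output per command instead.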