Message-Id: <201703131917.v2DJH4xi015184@ip-10-146-233-104.ec2.internal>
Date: Mon, 13 Mar 2017 19:17:03 +0000
From: "Attila Jeges (Code Review)"
To: impala-cr@cloudera.com, reviews@impala.incubator.apache.org
CC: Marcel Kornacker, Michael Ho
Reply-To: attilaj@cloudera.com
Subject: [Impala-ASF-CR] IMPALA-3079: Fix sequence file writer
X-Gerrit-MessageType: newpatchset
X-Gerrit-Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487
X-Gerrit-Commit: e5e9a2277028b2483836d13dd75b8cd253884a71

Attila Jeges has uploaded a new patch set (#4).

Change subject: IMPALA-3079: Fix sequence file writer
......................................................................

IMPALA-3079: Fix sequence file writer

Before the fix, sequence file writer produced corrupt files in
some cases.

Steps to reproduce:

  SET ALLOW_UNSUPPORTED_FORMATS=1;
  create table store_sales_seq_snap like tpcds_parquet.store_sales
    stored as SEQUENCEFILE;
  insert into store_sales_seq_snap partition(ss_sold_date_sk)
    select * from tpcds_parquet.store_sales
    where ss_sold_date_sk between 2450816 and 2451200;

The insert statement produces a corrupt file that cannot be read back.

This change fixes:
- The implementation of zero-compressed encoding in ReadWriteUtil
  class.
- The calculation of block sizes in SnappyBlockCompressor class.
- Creating record/block compressed sequence files in
  HdfsSequenceTableWriter class.
Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487
---
M be/src/exec/hdfs-sequence-table-writer.cc
M be/src/exec/hdfs-sequence-table-writer.h
M be/src/exec/read-write-util-test.cc
M be/src/exec/read-write-util.h
M be/src/util/compress.cc
M be/src/util/decompress-test.cc
M testdata/workloads/functional-query/queries/QueryTest/seq-writer.test
M tests/query_test/test_compressed_formats.py
8 files changed, 396 insertions(+), 78 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/6107/4
--
To view, visit http://gerrit.cloudera.org:8080/6107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Marcel Kornacker
Gerrit-Reviewer: Michael Ho
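
Illustrative note on the zero-compressed encoding named in the commit message. The patch itself is not quoted here, so the following is only a minimal Python sketch of the variable-length integer layout used by Hadoop's WritableUtils.writeVLong/readVLong, which is assumed to be the format the ReadWriteUtil fix targets. The function names write_vlong and read_vlong are hypothetical and are not Impala identifiers. Values in [-112, 127] take one byte; other values take a marker byte encoding sign and length, followed by the magnitude in big-endian order.

```python
from typing import Tuple


def write_vlong(value: int) -> bytes:
    """Encode a signed 64-bit integer in Hadoop-style zero-compressed form.

    Hypothetical helper for illustration; not taken from the Impala patch.
    """
    if not -(1 << 63) <= value < (1 << 63):
        raise ValueError("value does not fit in a signed 64-bit integer")
    # Small values are stored directly in a single byte.
    if -112 <= value <= 127:
        return bytes([value & 0xFF])
    marker = -112
    if value < 0:
        value = ~value  # one's complement, now non-negative
        marker = -120
    num_bytes = (value.bit_length() + 7) // 8
    # The marker byte records both the sign and the payload length.
    marker -= num_bytes
    return bytes([marker & 0xFF]) + value.to_bytes(num_bytes, "big")


def read_vlong(buf: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one value at offset; return (value, bytes_consumed)."""
    first = buf[offset]
    if first >= 128:
        first -= 256  # reinterpret as a signed byte
    if first >= -112:
        return first, 1
    if first < -120:
        num_bytes, negative = -120 - first, True
    else:
        num_bytes, negative = -112 - first, False
    payload = buf[offset + 1:offset + 1 + num_bytes]
    if len(payload) != num_bytes:
        raise ValueError("truncated zero-compressed integer")
    magnitude = int.from_bytes(payload, "big")
    return (~magnitude if negative else magnitude), 1 + num_bytes
```

A round trip such as read_vlong(write_vlong(2450816)) should return the original value together with the number of bytes consumed, which is the property a reader of the written file relies on.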