Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id D5E4B200BBD for ; Tue, 8 Nov 2016 10:53:18 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id D4CF0160B0C; Tue, 8 Nov 2016 09:53:18 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id CFEA7160AFA for ; Tue, 8 Nov 2016 10:53:17 +0100 (CET) Received: (qmail 74701 invoked by uid 500); 8 Nov 2016 09:53:17 -0000 Mailing-List: contact dev-help@hawq.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hawq.incubator.apache.org Delivered-To: mailing list dev@hawq.incubator.apache.org Received: (qmail 74690 invoked by uid 99); 8 Nov 2016 09:53:17 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 08 Nov 2016 09:53:17 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 900C41A9A63 for ; Tue, 8 Nov 2016 09:53:16 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -6.219 X-Spam-Level: X-Spam-Status: No, score=-6.219 tagged_above=-999 required=6.31 tests=[KAM_ASCII_DIVIDERS=0.8, KAM_LAZY_DOMAIN_SECURITY=1, RCVD_IN_DNSWL_HI=-5, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RP_MATCHES_RCVD=-2.999] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id x1S-mBD94EGC for ; Tue, 8 Nov 2016 09:53:15 +0000 (UTC) Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with SMTP id 7262B5F23C for ; Tue, 8 Nov 2016 09:53:14 +0000 (UTC) Received: (qmail 68991 invoked by uid 99); 8 Nov 2016 09:51:58 -0000 Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 08 Nov 2016 09:51:58 +0000 Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 865332C0276 for ; Tue, 8 Nov 2016 09:51:58 +0000 (UTC) Date: Tue, 8 Nov 2016 09:51:58 +0000 (UTC) From: "Chunling Wang (JIRA)" To: dev@hawq.incubator.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Created] (HAWQ-1149) Built-in function gp_persistent_build_all loses data in gp_relfile_node and gp_persistent_relfile_node MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Tue, 08 Nov 2016 09:53:19 -0000 Chunling Wang created HAWQ-1149: ----------------------------------- Summary: Built-in function gp_persistent_build_all loses data in gp_relfile_node and gp_persistent_relfile_node Key: HAWQ-1149 URL: https://issues.apache.org/jira/browse/HAWQ-1149 Project: Apache HAWQ Issue Type: Bug Components: Core Reporter: Chunling Wang Assignee: Lei Chang When we create a new table, and insert data into it. There will be records in gp_relfile_node, gp_persistent_relfile_node and gp_persistent_relation_node. But if we run the HAWQ build-in function gp_persistent_build_all, we will find that the record in gp_relfile_node and gp_persistent_relfile_node for this table is lost. And if there are more than 1 file in this talbe, we will get error when we drop this table. Here are the steps to recur this bug: 1. Create table a, and insert data into a with two concurrent process: {code} postgres=# create table a(id int); CREATE TABLE postgres=# insert into a select generate_series(1, 10000000); INSERT 0 10000000 {code} {code} postgres=# insert into a select generate_series(10000000, 20000000); INSERT 0 10000001 {code} 2. Check the persistent table and find two files in this table's directory: {code} postgres=# select oid from pg_class where relname='a'; oid --------- 3017232 (1 row) postgres=# select * from gp_relfile_node where relfilenode_oid=3017232; relfilenode_oid | segment_file_num | persistent_tid | persistent_serial_num -----------------+------------------+----------------+----------------------- 3017232 | 1 | (4,128) | 855050 3017232 | 2 | (4,129) | 855051 (2 rows) postgres=# select * from gp_persistent_relation_node where relfilenode_oid=3017232; tablespace_oid | database_oid | relfilenode_oid | persistent_state | reserved | parent_xid | persistent_serial_num | previous_free_tid ----------------+--------------+-----------------+------------------+----------+------------+-----------------------+------------------- 16385 | 16387 | 3017232 | 2 | 0 | 0 | 158943 | (0,0) (1 row) postgres=# select * from gp_persistent_relfile_node where relfilenode_oid=3017232; tablespace_oid | database_oid | relfilenode_oid | segment_file_num | relation_storage_manager | persistent_state | relation_bufpool_kind | parent_xid | persistent_serial_num | previous_free_tid ----------------+--------------+-----------------+------------------+--------------------------+------------------+-----------------------+------------+-----------------------+------------------- 16385 | 16387 | 3017232 | 1 | 2 | 2 | 0 | 0 | 855050 | (0,0) 16385 | 16387 | 3017232 | 2 | 2 | 2 | 0 | 0 | 855051 | (0,0) (2 rows) hadoop fs -ls /hawq_default/16385/16387/3017232 -rw------- 3 wangchunling supergroup 100103584 2016-11-08 17:02 /hawq_default/16385/16387/3017232/1 -rw------- 3 wangchunling supergroup 100103600 2016-11-08 17:02 /hawq_default/16385/16387/3017232/2 {code} 3. Rebuilt persistent tables. {code} postgres=# insert into a select generate_series(10000000, 20000000); INSERT 0 10000001 postgres=# select gp_persistent_reset_all(); gp_persistent_reset_all ------------------------- 1 (1 row) postgres=# select gp_persistent_build_all(false); gp_persistent_build_all ------------------------- 1 (1 row) {code} 4. Check persistent table and find data lost in gp_relfile_node and gp_persistent_relfile_node. {code} postgres=# select * from gp_relfile_node where relfilenode_oid=3017232; relfilenode_oid | segment_file_num | persistent_tid | persistent_serial_num -----------------+------------------+----------------+----------------------- (0 rows) postgres=# select * from gp_persistent_relation_node where relfilenode_oid=3017232; tablespace_oid | database_oid | relfilenode_oid | persistent_state | reserved | parent_xid | persistent_serial_num | previous_free_tid ----------------+--------------+-----------------+------------------+----------+------------+-----------------------+------------------- 16385 | 16387 | 3017232 | 2 | 0 | 0 | 159020 | (0,0) (1 row) postgres=# select * from gp_persistent_relfile_node where relfilenode_oid=3017232; tablespace_oid | database_oid | relfilenode_oid | segment_file_num | relation_storage_manager | persistent_state | relation_bufpool_kind | parent_xid | persistent_serial_num | previous_free_tid ----------------+--------------+-----------------+------------------+--------------------------+------------------+-----------------------+------------+-----------------------+------------------- (0 rows) {code} 5. Drop talbe a and get error. {code} postgres=# DROP TABLE a; ERROR: TID for persistent 'Relation Directory: '16385/16387/3017232'' tuple is invalid (0,0) (index 2, transaction kind 'Commit') (persistentendxactrec.c:264) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)