From: wuchong
To: issues@flink.incubator.apache.org
Reply-To: issues@flink.incubator.apache.org
Subject: [GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource
Message-Id: <20170201154146.EFE57DFC63@git1-us-west.apache.org>
Date: Wed, 1 Feb 2017 15:41:46 +0000 (UTC)

Github user wuchong commented on the issue:

    https://github.com/apache/flink/pull/3149

Sorry for the late response.

Regarding `HBaseTableSchema`, I agree that we should move the `addColumn(...)` method into `HBaseTableSource`.

Regarding the nested vs. flat schema, I prefer the nested schema. It is more intuitive to use.

As for the nested schema not supporting projection pushdown, I think we should extend `ProjectableTableSource` to support pushing projections down into a composite type. We can keep the interface unchanged, i.e. `def projectFields(fields: Array[Int]): ProjectableTableSource[T]`, but the indexes in `fields` should be flat indexes. We can use the flat field indexes to do projection pushdown even if the schema is nested.

For example, for a table source with schema `a: Int, b: Row, c: Boolean`, the flat indexes of `a, b.b1, b.b2, c` are `0, 1, 2, 3`. So a projection `SELECT b.b1, c FROM T` will result in `fields` being `Array(1, 3)`.

What do you think?

For me the biggest drawback of a nested schema is the lacking support to push projections down.
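
To make the flat-index idea concrete, here is a minimal sketch of how the indexes in the example could be derived. None of the types below are Flink APIs: `FieldType`, `RowType`, `flatten` and `flatIndexes` are hypothetical names used only for illustration, and the types of `b1` and `b2` are assumed to be `Int` because the example does not state them.

```scala
// Hypothetical sketch, not Flink code: a tiny schema model that shows how
// flat indexes can be assigned to the leaf fields of a nested schema.
object FlatIndexSketch {
  sealed trait FieldType
  case object IntType extends FieldType
  case object BooleanType extends FieldType
  // A composite field holding named sub-fields, standing in for `Row`.
  final case class RowType(fields: Seq[(String, FieldType)]) extends FieldType

  // Enumerate leaf fields depth-first. The position of a leaf in the
  // returned sequence is its flat index.
  def flatten(fields: Seq[(String, FieldType)], prefix: String = ""): Seq[String] =
    fields.flatMap {
      case (name, RowType(nested)) => flatten(nested, prefix + name + ".")
      case (name, _)               => Seq(prefix + name)
    }

  // Map projected leaf names to flat indexes (-1 for an unknown name).
  def flatIndexes(schema: Seq[(String, FieldType)], projected: Seq[String]): Array[Int] = {
    val leaves = flatten(schema)
    projected.map(name => leaves.indexOf(name)).toArray
  }

  // Schema from the example: a: Int, b: Row, c: Boolean.
  // Assumption: b1 and b2 are Int.
  val schema: Seq[(String, FieldType)] = Seq(
    ("a", IntType),
    ("b", RowType(Seq(("b1", IntType), ("b2", IntType)))),
    ("c", BooleanType)
  )

  def main(args: Array[String]): Unit = {
    // prints: a, b.b1, b.b2, c
    println(flatten(schema).mkString(", "))
    // SELECT b.b1, c FROM T  ->  prints: Array(1, 3)
    println(flatIndexes(schema, Seq("b.b1", "c")).mkString("Array(", ", ", ")"))
  }
}
```

With this numbering, a table source could receive `Array(1, 3)` through the unchanged `projectFields` signature and resolve each index back to a leaf of the nested schema.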