From: Weiqing Yang
Date: Fri, 23 Jun 2017 13:47:13 -0700
Subject: Re: [DISCUSS] status of and plans for our hbase-spark integration
To: dev@hbase.apache.org

Not yet. I'll create one after we decide on the next move.

Thanks,
weiqing

On Fri, Jun 23, 2017 at 1:08 PM, Mike Drob wrote:

> Weiqing, do you have JIRA issues filed for your progress that folks can
> follow along?
>
> Sean, another item to watch might be the state of HBase+PySpark. Our
> friends at Huawei have done this work [1][2] and maybe it is another
> candidate for inclusion and first class support? I'm not sure who the
> best contact for this would be, however.
>
> Mike
>
> [1]: http://huaweibigdata.github.io/astro/
> [2]: https://github.com/Huawei-Spark/Spark-SQL-on-HBase
>
> On Fri, Jun 23, 2017 at 2:42 PM, Weiqing Yang wrote:
>
> > Thanks, Sean!
> >
> > You are right, SHC has its own pluggable system for data
> > encoding/decoding. Only the Phoenix encoding is missing from Apache
> > HBase Spark.
> >
> > We have shepherded almost all of the SHC changes from the SHC Git repo
> > to Apache HBase, except support for multiple secure HBase clusters and
> > the Phoenix data coder. Next week we’ll have a discussion to decide
> > whether these latest code changes will be shepherded to Apache HBase,
> > since SHC has used Phoenix encoding/decoding to support Phoenix data.
> > I'll update the next steps here after the discussion next week.
> >
> > For Composite Key, the current patch is still under review, but it
> > raises some concerns. That's also one of the reasons for bringing the
> > Phoenix encoding/decoding into SHC.
> >
> > Regards,
> >
> > Weiqing
> >
> > On Fri, Jun 23, 2017 at 12:20 PM, Stack wrote:
> >
> > > On Fri, Jun 23, 2017 at 10:30 AM, Sean Busbey wrote:
> > >
> > > > On Fri, Jun 23, 2017 at 12:06 PM, Stack wrote:
> > > > > On Wed, Jun 21, 2017 at 9:31 AM, Sean Busbey wrote:
> > > > > ....
> > > > > I don't know enough about the integration but is the 'handling
> > > > > of Phoenix encoded data' about mapping spark types to a
> > > > > serialization in hbase? If not, where is the need for seamless
> > > > > transforms between spark types and a natural hbase serialization
> > > > > listed? We need this IIRC.
> > > > >
> > > >
> > > > It's a subtask, really. We already have a pluggable system for
> > > > mapping between spark types and a couple of serialization options
> > > > (the docs need improvement?).
> > > >
> > > > SHC has its own pluggable system and has the addition of a phoenix
> > > > encoding. The set seems like the most likely out-of-the-box formats
> > > > folks might have something in. (Maybe Kite? I think it's
> > > > different than the rest.)
> > > >
> > > > Or are you saying we can just map all of it to the hbase-common
> > > > "types" and then do the pluggable part under it?
> > > >
> > >
> > > Not making any prescription. Was just worried about type marshalling
> > > in and out of spark, concerned that the serialization would be other
> > > than something 'natural' for hbase, that it not be performant, and
> > > that we might have a profusion of mechanisms.
> > >
> > > If a noted subtask, that's grand.
> > >
> > > Thanks,
> > > S