From: Bryan Bende
To: dev@nifi.apache.org
Subject: Re: Adding HBase Support for AtomicDistributedMapCacheClient
Date: Thu, 25 Apr 2019 20:04:40 -0400
Message-Id: <7144D548-34E3-48A9-B0EB-B0AAA51822AB@gmail.com>

Should be available through the existing scan methods; they take a ResultHandler which gets passed an array of ResultCells, and each one has the timestamp.

> On Apr 25, 2019, at 7:52 PM, Shawn Weeks wrote:
>
> I haven't looked at the other side of the equation yet, which is how to get the timestamp on fetch. That will probably require a change or a new scan method.
>
> Thanks
> Shawn
>
> -----Original Message-----
> From: Bryan Bende
> Sent: Thursday, April 25, 2019 4:29 PM
> To: dev@nifi.apache.org
> Subject: Re: Adding HBase Support for AtomicDistributedMapCacheClient
>
> Also just realized that we do have two versions of the HBase DMC client service, so they could each do different things.
>
> The HBase_1_1_2_ClientMapCacheService could call the original checkAndPut, and the HBase_2_x_ClientMapCacheService could call the new method.
>
> In this approach the 1_1_2 client service could throw unsupported for the new method since it would never be used.
>
> On Thu, Apr 25, 2019 at 5:25 PM Bryan Bende wrote:
>>
>> Thanks, I'm following now...
>>
>> I think adding the new method to the interface and throwing
>> UnsupportedOperationException for 1_1_2, or using the original
>> checkAndPut and implementing it in both services, would both be fine
>> solutions.
>>
>> I guess another variation might be to introduce the new method in the
>> interface, but in the 1_1_2 implementation just delegate back to the
>> original checkAndPut and ignore the timestamp, and document that it
>> isn't used in that implementation.
>> I don't love this, but it does
>> allow both services to implement the functionality and still leverage
>> the better solution for 2_x.
>>
>>
>> On Thu, Apr 25, 2019 at 3:54 PM Shawn Weeks wrote:
>>>
>>> Here is what I think the new checkAndPut or checkAndMutate method would look like. This also shows what the new mutate API looks like.
>>>
>>> @Override
>>> public boolean checkAndPut(String tableName, byte[] rowId, byte[] family, byte[] qualifier, byte[] value, long timestamp, PutColumn column) throws IOException {
>>>     try (final Table table = connection.getTable(TableName.valueOf(tableName))) {
>>>         Put put = new Put(rowId);
>>>         put.addColumn(
>>>             column.getColumnFamily(),
>>>             column.getColumnQualifier(),
>>>             column.getBuffer());
>>>         return table.checkAndMutate(rowId, family)
>>>             .qualifier(qualifier)
>>>             .ifEquals(value)
>>>             .timeRange(TimeRange.at(timestamp))
>>>             .thenPut(put);
>>>     }
>>> }
>>>
>>> If the atomic guarantee for the original checkAndPut is good enough then there is no reason I can't implement the atomic map cache for both versions of HBase.
>>>
>>> Thanks
>>> Shawn
>>>
>>> -----Original Message-----
>>> From: Bryan Bende
>>> Sent: Thursday, April 25, 2019 12:39 PM
>>> To: dev@nifi.apache.org
>>> Subject: Re: Adding HBase Support for
>>> AtomicDistributedMapCacheClient
>>>
>>> I'm not totally sure it would matter if there were changes in between; as long as the current value is what we thought it was, the changes we are sending back should be accurate as a replacement. As a simplified scenario, if the current value is 1 and thread-A retrieves that value, thread-B then changes it to 2 and back to 1 before thread-A can do anything, and then thread-A sends in 2 with a previous value of 1, that is still the correct replacement.
>>>
>>> I can see the argument for using the timestamp though...
>>> can you show the method signature of the new checkAndMutate method that would need to be added to the client service, and also which method of the HBase client it needs to call?
>>>
>>> Just so I can get an idea of the differences between 1.x and 2.x.
>>>
>>> On Thu, Apr 25, 2019 at 1:00 PM Shawn Weeks wrote:
>>>>
>>>> While checkAndPut is atomic as it's built now, it doesn't also support checking the timestamp range, which is included in the new checkAndMutate API. I had planned on using the cell's timestamp as the revision along with the value, to ensure not only that the value hadn't been changed but also that there hadn't been changes in between that just happened to put the value back.
>>>>
>>>> As I was looking at everything I had another question. Why is the cache currently using a scan instead of a get to fetch values from HBase? It seems like that would be much less performant considering we know the row key we're looking for.
>>>>
>>>>
>>>> Thanks
>>>> Shawn
>>>>
>>>> -----Original Message-----
>>>> From: Bryan Bende
>>>> Sent: Thursday, April 25, 2019 11:56 AM
>>>> To: dev@nifi.apache.org
>>>> Subject: Re: Adding HBase Support for
>>>> AtomicDistributedMapCacheClient
>>>>
>>>> Can it not be done with the existing checkAndPut method? [1]
>>>>
>>>> I think if you use the value as the revision it should work. Would be similar to how the Redis implementation works [2].
>>>>
>>>> [1]
>>>> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-services/nifi-hbase-client-service-api/src/main/java/org/apache/nifi/hbase/HBaseClientService.java#L65
>>>> [2]
>>>> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-redis-bundle/nifi-redis-extensions/src/main/java/org/apache/nifi/redis/service/RedisDistributedMapCacheClientService.java#L271
>>>>
>>>> On Thu, Apr 25, 2019 at 12:38 PM Shawn Weeks wrote:
>>>>>
>>>>> I'll need to add a check and mutate method to the HBaseClientService interface. Should I just extend it with an HBase2ClientService, or add checkAndMutate to the existing interface and just make it raise an exception if you try to use it against HBase 1? While HBase 1.x supports checkAndMutate, it doesn't provide a way to filter on timestamp, which is part of how I was going to implement the revision requirement for AtomicMapCache.
>>>>>
>>>>> Thanks
>>>>> Shawn
>>>>>
>>>>> -----Original Message-----
>>>>> From: Bryan Bende
>>>>> Sent: Thursday, April 25, 2019 9:11 AM
>>>>> To: dev@nifi.apache.org
>>>>> Subject: Re: Adding HBase Support for
>>>>> AtomicDistributedMapCacheClient
>>>>>
>>>>> I'm not aware of a JIRA, so I'd say go for it.
>>>>>
>>>>> On Wed, Apr 24, 2019 at 9:27 PM Shawn Weeks wrote:
>>>>>>
>>>>>> Seems like this should be fairly easy for HBase 2.x with the checkAndMutate functionality, and I was wondering if there is already a Jira for this. Otherwise I might make an attempt at it. It would be good to be able to support Wait/Notify and other things that need AtomicDistributedMapCacheClient using an Apache-developed product commonly found in a Hadoop cluster.
>>>>>>
>>>>>> Thanks
>>>>>> Shawn
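
For readers following the value-versus-timestamp question in this thread, the 1 -> 2 -> 1 scenario can be sketched without an HBase cluster. This is only an in-memory illustration, not the HBase or NiFi API: the class AbaSketch, its nested Cell, and both checkAndPut overloads are hypothetical stand-ins, and the timestamps are made-up numbers. The value-only overload is meant to mirror the spirit of the original checkAndPut, and the overload that also takes an expected timestamp is meant to mirror the timeRange(TimeRange.at(timestamp)) check in the 2.x snippet quoted above.

```java
import java.util.Arrays;

// Hypothetical in-memory sketch; none of these names come from HBase or NiFi.
final class AbaSketch {

    /** Stand-in for a single cell that holds a value and the timestamp of its last write. */
    static final class Cell {
        private byte[] value;
        private long timestamp;

        Cell(byte[] value, long timestamp) {
            this.value = value.clone();
            this.timestamp = timestamp;
        }

        synchronized byte[] value() { return value.clone(); }

        synchronized long timestamp() { return timestamp; }

        /** Unconditional write. */
        synchronized void put(byte[] newValue, long newTimestamp) {
            this.value = newValue.clone();
            this.timestamp = newTimestamp;
        }

        /** Value-only check: succeeds whenever the current value equals the expected one. */
        synchronized boolean checkAndPut(byte[] expected, byte[] newValue, long newTimestamp) {
            if (!Arrays.equals(value, expected)) {
                return false;
            }
            put(newValue, newTimestamp);
            return true;
        }

        /** Value plus timestamp check: also requires that no write happened since the read. */
        synchronized boolean checkAndPut(byte[] expected, long expectedTimestamp,
                                         byte[] newValue, long newTimestamp) {
            if (timestamp != expectedTimestamp || !Arrays.equals(value, expected)) {
                return false;
            }
            put(newValue, newTimestamp);
            return true;
        }
    }

    static byte[] bytes(int v) { return new byte[] {(byte) v}; }

    /** Replays the thread's scenario and reports whether thread-A's replace succeeded. */
    static boolean replaceAfterAba(boolean checkTimestamp) {
        Cell cell = new Cell(bytes(1), 100L);
        byte[] seenValue = cell.value();      // thread-A reads value 1 ...
        long seenTimestamp = cell.timestamp(); // ... written at timestamp 100
        cell.put(bytes(2), 101L);             // thread-B changes it to 2
        cell.put(bytes(1), 102L);             // thread-B changes it back to 1
        return checkTimestamp
                ? cell.checkAndPut(seenValue, seenTimestamp, bytes(2), 103L)
                : cell.checkAndPut(seenValue, bytes(2), 103L);
    }

    public static void main(String[] args) {
        System.out.println("value only:        " + replaceAfterAba(false));
        System.out.println("value + timestamp: " + replaceAfterAba(true));
    }
}
```

In this sketch the value-only replace succeeds and the value-plus-timestamp replace is rejected. That is the trade-off discussed in the thread: the first is sufficient if only the current value matters, and the second is needed if intermediate changes should also invalidate the revision.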