nifi-dev mailing list archives

From Shawn Weeks <swe...@weeksconsulting.us>
Subject RE: Adding HBase Support for AtomicDistributedMapCacheClient
Date Thu, 25 Apr 2019 23:52:01 GMT
I haven't looked at the other side of the equation yet, which is how to get the timestamp on fetch.
That will probably require a change to the existing scan method or a new scan method.

Thanks
Shawn

-----Original Message-----
From: Bryan Bende <bbende@gmail.com> 
Sent: Thursday, April 25, 2019 4:29 PM
To: dev@nifi.apache.org
Subject: Re: Adding HBase Support for AtomicDistributedMapCacheClient

Also just realized that we do have two versions of the HBase DMC client service, so they could each do different things.

The HBase_1_1_2_ClientMapCacheService could call the original checkAndPut, and the HBase_2_x_ClientMapCacheService could call the new method.

In this approach the 1_1_2 client service could throw unsupported for the new method since
it would never be used.
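
The two options for the 1_1_2 service discussed in this thread (throw for the new method, or delegate to the original checkAndPut and ignore the timestamp) can be sketched roughly as below. This is an in-memory illustration only: CacheClientSketch, InMemoryClientSketch, ThrowingClient112Sketch and DelegatingClient112Sketch are hypothetical names, the signatures are simplified assumptions rather than the real HBaseClientService interface, and nothing here talks to HBase.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the client service interface.
interface CacheClientSketch {
    // Original style of check: compare on value only.
    boolean checkAndPut(String rowId, byte[] expected, byte[] update);

    // Assumed new method: compare on value and on the cell timestamp.
    boolean checkAndPut(String rowId, byte[] expected, long timestamp, byte[] update);
}

// Shared in-memory behaviour for the value-only check.
abstract class InMemoryClientSketch implements CacheClientSketch {
    private final Map<String, byte[]> rows = new HashMap<>();

    @Override
    public synchronized boolean checkAndPut(String rowId, byte[] expected, byte[] update) {
        final byte[] current = rows.get(rowId);
        // Arrays.equals treats (null, null) as equal, so expected == null means "put if absent".
        final boolean matches = Arrays.equals(current, expected);
        if (matches) {
            rows.put(rowId, update);
        }
        return matches;
    }
}

// Option 1: the older client rejects the timestamp-aware method outright.
final class ThrowingClient112Sketch extends InMemoryClientSketch {
    @Override
    public boolean checkAndPut(String rowId, byte[] expected, long timestamp, byte[] update) {
        throw new UnsupportedOperationException("timestamp check is not available in this client");
    }
}

// Option 2: the older client accepts the call but ignores the timestamp.
final class DelegatingClient112Sketch extends InMemoryClientSketch {
    @Override
    public boolean checkAndPut(String rowId, byte[] expected, long timestamp, byte[] update) {
        // The timestamp is deliberately unused; this would need to be documented.
        return checkAndPut(rowId, expected, update);
    }
}
```

With option 1 a caller finds out immediately that the stronger check is unavailable. With option 2 the call succeeds but silently provides only the value-based guarantee.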

On Thu, Apr 25, 2019 at 5:25 PM Bryan Bende <bbende@gmail.com> wrote:
>
> Thanks, I'm following now...
>
> I think adding the new method to the interface and throwing 
> UnsupportedOperationException for 1_1_2, or using the original 
> checkAndPut and implementing it in both services, would both be fine 
> solutions.
>
> I guess another variation might be to introduce the new method in the 
> interface, but in the 1_1_2 implementation just delegate back to the 
> original checkAndPut and ignore the timestamp, and document that it 
> isn't used in that implementation. I don't love this, but it does 
> allow both services to implement the functionality and still leverage 
> the better solution for 2_x.
>
>
> On Thu, Apr 25, 2019 at 3:54 PM Shawn Weeks <sweeks@weeksconsulting.us> wrote:
> >
> > Here is what I think the new checkAndPut or checkAndMutate method would look like. This also shows what the new mutate API looks like.
> >
> >     @Override
> >     public boolean checkAndPut(String tableName, byte[] rowId, byte[] family, byte[] qualifier, byte[] value, long timestamp, PutColumn column) throws IOException {
> >         try (final Table table = connection.getTable(TableName.valueOf(tableName))) {
> >             Put put = new Put(rowId);
> >             put.addColumn(
> >                     column.getColumnFamily(),
> >                     column.getColumnQualifier(),
> >                     column.getBuffer());
> >             return table.checkAndMutate(rowId, family)
> >                     .qualifier(qualifier)
> >                     .ifEquals(value)
> >                     .timeRange(TimeRange.at(timestamp))
> >                     .thenPut(put);
> >         }
> >     }
> >
> > If the atomic guarantee for the original checkAndPut is good enough then there is no reason I can't implement the atomic map cache for both versions of HBase.
> >
> > Thanks
> > Shawn
> >
> > -----Original Message-----
> > From: Bryan Bende <bbende@gmail.com>
> > Sent: Thursday, April 25, 2019 12:39 PM
> > To: dev@nifi.apache.org
> > Subject: Re: Adding HBase Support for AtomicDistributedMapCacheClient
> >
> > I'm not totally sure it would matter if there were changes in between. As long as the current value is what we thought it was, the changes we are sending back should be accurate as a replacement. As a simplified scenario: if the current value is 1 and thread-A retrieves that value, thread-B then changes it to 2 and back to 1 before thread-A can do anything, and thread-A then sends in 2 with a previous of 1, that is still the correct replacement.
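
The 1, 2, 1 scenario can be replayed in memory to show how the two kinds of check differ. This is only an analogy built on java.util.concurrent: AbaDemo and its method names are made up for illustration, the integer stamp stands in for the cell timestamp, and the "threads" are simulated as sequential steps. It makes no claim about how HBase implements either check.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

// In-memory analogy only; nothing here calls HBase.
final class AbaDemo {

    // Value-only check: succeeds whenever the current value is what the caller read.
    static boolean valueOnlyReplace() {
        final String one = "1";
        final String two = "2";
        final AtomicReference<String> cell = new AtomicReference<>(one);

        final String seenByA = cell.get();       // thread-A reads 1
        cell.set(two);                            // thread-B writes 2 ...
        cell.set(one);                            // ... and then writes 1 again
        return cell.compareAndSet(seenByA, two);  // thread-A: replace 1 with 2
    }

    // Value-plus-stamp check: also requires that no write happened in between.
    static boolean valueAndStampReplace() {
        final String one = "1";
        final String two = "2";
        final AtomicStampedReference<String> cell = new AtomicStampedReference<>(one, 0);

        final int[] stampHolder = new int[1];
        final String seenByA = cell.get(stampHolder); // thread-A reads 1 at stamp 0
        final int stampSeenByA = stampHolder[0];

        cell.set(two, 1);                             // thread-B writes 2 (stamp 1)
        cell.set(one, 2);                             // thread-B writes 1 again (stamp 2)

        // thread-A: replace 1 with 2, but only if the stamp is still the one it read
        return cell.compareAndSet(seenByA, two, stampSeenByA, 3);
    }

    public static void main(String[] args) {
        System.out.println("value-only replace succeeded: " + valueOnlyReplace());
        System.out.println("value+stamp replace succeeded: " + valueAndStampReplace());
    }
}
```

The value-only check accepts thread-A's write, which matches the argument that the replacement is still correct. The stamped check rejects it, which is the stricter behaviour the timestamp-based approach is after.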
> >
> > I can see the argument for using the timestamp though... can you show the method signature of the new checkAndMutate method that would need to be added to the client service, and also which method of the HBase client it needs to call?
> >
> > Just so I can get an idea of the differences between 1.x and 2.x.
> >
> > On Thu, Apr 25, 2019 at 1:00 PM Shawn Weeks <sweeks@weeksconsulting.us> wrote:
> > >
> > > While checkAndPut as it's built now is atomic, it doesn't support also checking the timestamp range, which is included in the new checkAndMutate API. I had planned on using the cell's timestamp as the revision along with the value, to ensure not only that the value hadn't been changed but also that there hadn't been changes in between that just happened to put the value back.
> > >
> > > As I was looking at everything I had another question: why is the cache currently using a scan instead of a get to fetch values from HBase? It seems like that would be much less performant, considering we know the row key we're looking for.
> > >
> > >
> > > Thanks
> > > Shawn
> > >
> > > -----Original Message-----
> > > From: Bryan Bende <bbende@gmail.com>
> > > Sent: Thursday, April 25, 2019 11:56 AM
> > > To: dev@nifi.apache.org
> > > Subject: Re: Adding HBase Support for AtomicDistributedMapCacheClient
> > >
> > > Can it not be done with the existing checkAndPut method? [1]
> > >
> > > I think if you use the value as the revision it should work. It would be similar to how the Redis implementation works [2].
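
As a rough illustration of "value as the revision", the fetch, modify, replace cycle might look like the following. This is a sketch over a ConcurrentHashMap, not the Redis or HBase implementation; ValueAsRevisionSketch and its methods are hypothetical names chosen for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.UnaryOperator;

// Illustrative only: the fetched value itself acts as the revision.
final class ValueAsRevisionSketch {
    private final ConcurrentMap<String, String> store = new ConcurrentHashMap<>();

    String get(String key) {
        return store.get(key);
    }

    // Replace only if the stored value still equals what the caller fetched.
    // A null fetchedValue means the caller saw no entry, so only insert if still absent.
    boolean replace(String key, String fetchedValue, String newValue) {
        if (fetchedValue == null) {
            return store.putIfAbsent(key, newValue) == null;
        }
        return store.replace(key, fetchedValue, newValue);
    }

    // Fetch, modify, replace; retry if another writer changed the value first.
    String update(String key, UnaryOperator<String> change) {
        while (true) {
            final String fetched = store.get(key);
            final String updated = change.apply(fetched);
            if (replace(key, fetched, updated)) {
                return updated;
            }
        }
    }
}
```

A caller that holds a stale value gets false back from replace and has to fetch again, which is the same contract an atomic map cache client would expose with an explicit revision.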
> > >
> > > [1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-services/nifi-hbase-client-service-api/src/main/java/org/apache/nifi/hbase/HBaseClientService.java#L65
> > > [2] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-redis-bundle/nifi-redis-extensions/src/main/java/org/apache/nifi/redis/service/RedisDistributedMapCacheClientService.java#L271
> > >
> > > On Thu, Apr 25, 2019 at 12:38 PM Shawn Weeks <sweeks@weeksconsulting.us> wrote:
> > > >
> > > > I'll need to add a check-and-mutate method to the HBaseClientService interface. Should I just extend it with an HBase2ClientService, or add checkAndMutate to the existing interface and make it raise an exception if you try to use it against HBase 1.x? While HBase 1.x supports checkAndMutate, it doesn't provide a way to filter on timestamp, which is part of how I was going to implement the revision requirement for AtomicMapCache.
> > > >
> > > > Thanks
> > > > Shawn
> > > >
> > > > -----Original Message-----
> > > > From: Bryan Bende <bbende@gmail.com>
> > > > Sent: Thursday, April 25, 2019 9:11 AM
> > > > To: dev@nifi.apache.org
> > > > Subject: Re: Adding HBase Support for AtomicDistributedMapCacheClient
> > > >
> > > > I'm not aware of a JIRA, so I'd say go for it.
> > > >
> > > > On Wed, Apr 24, 2019 at 9:27 PM Shawn Weeks <sweeks@weeksconsulting.us> wrote:
> > > > >
> > > > > Seems like this should be fairly easy for HBase 2.x with the checkAndMutate functionality, and I was wondering if there is already a Jira for this. Otherwise I might make an attempt at it. It would be good to be able to support Wait/Notify and other things that need AtomicDistributedMapCacheClient using an Apache-developed product commonly found in a Hadoop cluster.
> > > > >
> > > > > Thanks
> > > > > Shawn