accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1551) Introduce Generic Supertypes to Replace Text
Date Fri, 05 Jul 2013 16:01:48 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700952#comment-13700952 ]

Keith Turner commented on ACCUMULO-1551:
----------------------------------------

I took a look at the changes. This API change is built around EntryConverter, which converts
a Key+Value to an arbitrary Java object. As you mentioned, this approach may not work well
with heterogeneous data. Typo suffered from the same issue; however, I think Typo was more
rigid than EntryConverter. At first glance I think EntryConverter is an improvement over
Typo; it seems more flexible. I think some examples of using the API would make it easier
to evaluate and understand.
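
As a rough illustration of the kind of example that would help, here is a minimal sketch of a converter applied to each scanned entry. It is not taken from the pull request: RawEntry, convertAll, AS_STRING, and this EntryConverter signature are hypothetical stand-ins chosen so the snippet runs without Accumulo on the classpath, and the real interface may look quite different.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EntryConverterSketch {

    // Hypothetical stand-in for an Accumulo Key+Value pair, held as raw bytes.
    static final class RawEntry {
        final byte[] row;
        final byte[] qualifier;
        final byte[] value;

        RawEntry(String row, String qualifier, String value) {
            this.row = row.getBytes(StandardCharsets.UTF_8);
            this.qualifier = qualifier.getBytes(StandardCharsets.UTF_8);
            this.value = value.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Hypothetical converter: one raw entry in, one application object out.
    interface EntryConverter<T> {
        T convert(RawEntry entry);
    }

    // Applies the converter to every entry, as a typed scanner might do internally.
    static <T> List<T> convertAll(Iterable<RawEntry> entries, EntryConverter<T> converter) {
        List<T> out = new ArrayList<>();
        for (RawEntry e : entries) {
            out.add(converter.convert(e));
        }
        return out;
    }

    static String utf8(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    // Example converter for homogeneous data: each entry becomes "row/qualifier=value".
    static final EntryConverter<String> AS_STRING =
        e -> utf8(e.row) + "/" + utf8(e.qualifier) + "=" + utf8(e.value);

    public static void main(String[] args) {
        List<RawEntry> entries = Arrays.asList(
            new RawEntry("user1", "name", "Ada"),
            new RawEntry("user1", "age", "36"));
        for (String s : convertAll(entries, AS_STRING)) {
            System.out.println(s);
        }
    }
}
```

With heterogeneous data, a single converter like AS_STRING would presumably have to branch on the column to decide how to decode each value, which is where the concern above comes in.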

Do you plan to address writing data?

Does anyone know of other APIs that abstract Accumulo's API? Typo and Gora were mentioned in
the description. I have also seen [Accumulo-Fluent|https://github.com/Berico-Technologies/Accumulo-Fluent].
When experimenting with Typo I took the approach of building a prototype API on top of the existing
Accumulo API. I found this to be an easier way to explore the concept.



                
> Introduce Generic Supertypes to Replace Text
> --------------------------------------------
>
>                 Key: ACCUMULO-1551
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1551
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Ed Kohlwey
>
> I wanted to create a new ticket for my thoughts on this. I'd like to introduce a paradigm
> similar to the object inspectors used in Hive to get data in and out of Accumulo.
> The base motivation for this is that the Accumulo API is inconsistent. It is difficult
> for application developers to use and creates a lot of confusion for new developers because
> of the inconsistent use of Text, CharSequence, and byte[] for representing various parts of
> the keys. This is totally unnecessary and is in my mind a huge black eye.
> Aside from providing a mechanism that could eventually be used to increase read performance
> in the client, this would also provide a simpler paradigm for application developers and would
> accomplish some aspects of ORM, a la Typo and Gora (although distinct from the goals and
> scope of Gora).
> I've attached an initial pull request/code review outlining how I think the refactoring
> would work in the scanner. Basically, the old API would be preserved by introducing generic supertypes,
> and a class that allows serialization directly from the ByteSequence objects.
> While it may be true that some people have highly heterogeneous data in their tables, the
> worst case scenario here is that you just use the ByteSequences directly. This will, however,
> allow substantially simpler access even in that base case by making the access pattern consistent.
> In other cases, where a scan is only done over a particular column, or the data is very homogeneous,
> the benefit is even greater.
> https://github.com/ekohlwey/accumulo/compare/apache:trunk...ACCUMULO-1551

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
