accumulo-dev mailing list archives

From: Christopher Tubbs
Subject: Re: feedback on Typo
Date: Mon, 13 Aug 2012 21:12:04 GMT

Am I right in assuming that this is about simplifying the API for
storing typed data in the key, and not about providing a mechanism for
querying? Isn't this really just about storing data in a structure you've
already decided is a good fit for whatever your query mechanism is?

On Mon, Aug 13, 2012 at 6:03 PM, Josh Elser wrote:
> Even with something as simple as a pair, things can start getting difficult.
> I suppose it really revolves around the level of support you want to provide
> at scan time, e.g. "find all pairs where the second is 'x'".
>
> Spending a few minutes thinking about it, an index could be a separate table
> but wouldn't necessarily have to be. It depends on the complexity of the
> structure you're trying to index. Using the Pair example again, you could
> reserve a column (family) for index records that simply invert the Pair in
> the colqual.
> On 08/13/2012 11:06 AM, Keith Turner wrote:
>> On Sun, Aug 12, 2012 at 9:36 PM, Josh Elser wrote:
>>> Neat idea, Keith.
>>> Have you thought about how to support more complex types? Specifically,
>>> arrays, hashes and the nesting of those? Any thoughts about indexing for
>>> those complex types?
>> Yeah I was thinking that would be nice.  I see a lot of users putting
>> multiple types into the row and/or columns.  Could have something like
>> TupleEncoder<List<A>>.   TupleEncoder would need to encode its elements
>> such that it sorts correctly.  However, this may be cumbersome to use
>> if you want to use different types.  For example I want a row composed
>> of a Long and String.  I was thinking of having the following types to
>> handle this case.
>> class Pair<A,B> extends LexEncoder {
>>     Pair(LexEncoder<A> enc1, LexEncoder<B> enc2);
>>     A getFirst() {}
>>     B getSecond() {}
>> }
>> class Triple<A,B,C> { /* follows same pattern as Pair */ }
>> class Quadruple<A,B,C,D> { /* follows same pattern as Pair */ }
>> This would allow a user to write code like the following that makes it
>> easy to work with a row composed of a Long and String.
>> Pair<Long, String>  pair;
>> long l = pair.getFirst();
>> String s = pair.getSecond();
>> I am still thinking the tuple concept through.
>> I was not considering indexing.  I assume you mean creating an index
>> in another table?
>>> Initial thoughts are that it would make the most sense to place Typo at the
>>> contrib level (or something equivalent). The reason being: Typo doesn't
>>> change the underlying functionality of Accumulo; it only provides a layer on
>>> top of it that makes life easier for developers.
>> I think putting it in contrib makes sense.
>>> On 08/10/2012 07:07 PM, Keith Turner wrote:
>>>> I put together a simple abstraction layer for Accumulo that makes it
>>>> easier to read and write Java objects to Accumulo key and value
>>>> fields.  The data written to Accumulo sorts correctly
>>>> lexicographically.
>>>> I put the code on github and would like some feedback on the design
>>>> and whether it should be included with Accumulo.
>>>> It's still a little rough and I need to add encoders for all of the
>>>> primitive types.
>>>> Keith
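
The thread describes a Pair whose encoded bytes must sort the same way as its elements. The following is a minimal sketch of one way that could work, not Typo's actual code. Every name here (PairEncodingSketch, this LexEncoder interface, encodePair, splitPair, compareUnsigned) is hypothetical and chosen only for illustration. It assumes a Long is encoded as 8 big-endian bytes with the sign bit flipped, a String as its UTF-8 bytes, and that the first element is escaped so a 0x00 byte can act as the separator.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PairEncodingSketch {

    /** Hypothetical contract: encoded bytes compare (unsigned) like the values. */
    interface LexEncoder<T> {
        byte[] encode(T value);
        T decode(byte[] bytes);
    }

    /** Flipping the sign bit makes negative longs sort before positive ones. */
    static final LexEncoder<Long> LONG = new LexEncoder<Long>() {
        public byte[] encode(Long v) {
            return ByteBuffer.allocate(8).putLong(v ^ Long.MIN_VALUE).array();
        }
        public Long decode(byte[] b) {
            return ByteBuffer.wrap(b).getLong() ^ Long.MIN_VALUE;
        }
    };

    /** UTF-8 bytes; order follows the encoded bytes, not String.compareTo. */
    static final LexEncoder<String> STRING = new LexEncoder<String>() {
        public byte[] encode(String v) {
            return v.getBytes(StandardCharsets.UTF_8);
        }
        public String decode(byte[] b) {
            return new String(b, StandardCharsets.UTF_8);
        }
    };

    /** Escapes 0x00 and 0x01 in the first element, then writes 0x00 as separator. */
    static <A, B> byte[] encodePair(LexEncoder<A> encA, A a, LexEncoder<B> encB, B b) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte x : encA.encode(a)) {
            if (x == 0x00) { out.write(0x01); out.write(0x01); }
            else if (x == 0x01) { out.write(0x01); out.write(0x02); }
            else { out.write(x); }
        }
        out.write(0x00);
        byte[] second = encB.encode(b);
        out.write(second, 0, second.length);
        return out.toByteArray();
    }

    /** Returns { unescaped first element bytes, second element bytes }. */
    static byte[][] splitPair(byte[] bytes) {
        ByteArrayOutputStream first = new ByteArrayOutputStream();
        int i = 0;
        while (bytes[i] != 0x00) {
            if (bytes[i] == 0x01) {
                i++; // the byte after the escape marker says which value was escaped
                first.write(bytes[i] == 0x01 ? 0x00 : 0x01);
            } else {
                first.write(bytes[i]);
            }
            i++;
        }
        return new byte[][] { first.toByteArray(), Arrays.copyOfRange(bytes, i + 1, bytes.length) };
    }

    /** Unsigned byte-wise comparison of two byte arrays. */
    static int compareUnsigned(byte[] x, byte[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int c = (x[i] & 0xff) - (y[i] & 0xff);
            if (c != 0) return c;
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        byte[] low = encodePair(LONG, -5L, STRING, "b");
        byte[] high = encodePair(LONG, 3L, STRING, "a");
        System.out.println("(-5, b) sorts before (3, a): " + (compareUnsigned(low, high) < 0));

        byte[][] parts = splitPair(high);
        System.out.println(LONG.decode(parts[0]) + ", " + STRING.decode(parts[1]));
    }
}
```

Under the same assumptions, calling encodePair with the encoders and values swapped (String first, Long second) would produce the inverted form suggested earlier in the thread for index records in the colqual.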
