accumulo-dev mailing list archives

From Josh Elser <>
Subject Re: AccumuloInputFormat getters
Date Thu, 17 Jul 2014 02:05:18 GMT
(sorry for the bifurcation of the thread -- had another thought I wanted 
to write)

Yeah, I can appreciate the difficulty of a user being unable to get back 
the same object that was serialized. At a glance, I would think the 
getters should mirror the "current" way we recommend interacting with 
the API (as in, not a deprecated method). I don't know if that's 
possible across all of the methods, but for some of them I think it 
would be well worth doing.

For context, this limitation has bitten me working in both Hive and Pig. 
I don't have control over the Configuration object that I'm handed from 
the framework, and I have no way to determine whether or not the proper 
configuration still exists within the current Configuration. I want to 
be able to see if the proper Accumulo configuration items are still 
present in the Configuration (to debug an issue), but I have no way to 
do it easily.
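
As a rough illustration of the manual workaround (dumping the 
Configuration and searching for an expected value under *some* key), 
here is a minimal sketch. It relies only on the fact that Hadoop's 
Configuration is an Iterable of Map.Entry<String, String>, so the 
helper accepts any such Iterable and runs without Hadoop on the 
classpath. The class name, the method name, and the 
"example.accumulo.table" key are hypothetical, chosen for illustration; 
they are not the keys Accumulo actually writes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Accumulo or Hadoop.
class ConfigGrep {

    /**
     * Returns every key whose value contains the expected text,
     * in the iteration order of the supplied entries.
     * A Hadoop Configuration can be passed directly, since it
     * implements Iterable<Map.Entry<String, String>>.
     */
    static List<String> keysWithValue(Iterable<Map.Entry<String, String>> conf,
                                      String expected) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, String> entry : conf) {
            String value = entry.getValue();
            if (value != null && value.contains(expected)) {
                keys.add(entry.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // Stand-in for a framework-supplied Configuration; the keys are made up.
        Map<String, String> fake = new LinkedHashMap<>();
        fake.put("example.accumulo.table", "mytable");
        fake.put("mapreduce.job.name", "demo");

        // Which keys, if any, carry the table name we expect?
        System.out.println(keysWithValue(fake.entrySet(), "mytable"));
    }
}
```

This only answers "is the value in there under some key"; working out 
which key the value is supposed to be paired with is still left to the 
reader, which is the gap public getters would close.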

Subclassing AIF/AOF seems unnecessarily complex to me. I'll see if I can 
come up with something simple to add.

On 7/16/14, 9:59 PM, Christopher wrote:
> Well, you can subclass to introspect. And, if you feel the API can be
> improved by offering stronger getter/setter support with the stability
> guarantees that we care about for public API, go ahead. (It probably
> wouldn't change much anyway, since we now treat protected as public API,
> too, I think). I won't object to the improvements... just explaining why
> it's like that. My concern if you were to do this would be whether this
> would actually add too much bloat or not to consumers of the API who don't
> need to subclass, and the lack of 1-to-1 in many cases... but if you can
> address those things sufficiently, I wouldn't object.
> --
> Christopher L Tubbs II
> On Wed, Jul 16, 2014 at 9:55 PM, Josh Elser <> wrote:
>> Ultimately, I feel like there's a big problem when I, an "experienced
>> Accumulo developer", am getting frustrated with the API.
>> As it stands right now, I have no way to introspect the contents of a
>> Configuration to ensure that the state is as I expect it to be. I'm stuck
>> dumping the entire configuration, and grep'ing it to see if the values I
>> expect are in there with *some* key. If so, I then have to try to unravel
>> what exactly is the appropriate key that the value should be paired with.
>> I can understand the complexity in the storage of relevant data within the
>> Configuration, but this seems unnecessarily complicated to me.
>> On 7/16/14, 9:48 PM, Josh Elser wrote:
>>> The value of the name of the table that the AccumuloInputFormat is going
>>> to read is subject to change? Isn't the point of a getter that it can
>>> unwrap the specifics of the serialization within the configuration and
>>> present the high-level constructs (username, AuthenticationToken, table
>>> name, IteratorSettings, etc) that users expect?
>>> On 7/16/14, 9:46 PM, Christopher wrote:
>>>> Because those things represent internals of the configuration that are
>>>> subject to change, and we don't want end users becoming dependent on
>>>> them.
>>>> They are protected, because they may be needed for subclassing, where the
>>>> subclass assumes some greater risk than an end user of the API.
>>>> --
>>>> Christopher L Tubbs II
>>>> On Wed, Jul 16, 2014 at 9:43 PM, Josh Elser <>
>>>> wrote:
>>>>> Why are all of the getters on the AccumuloInputFormat protected (really,
>>>>> InputFormatBase) instead of public?
>>>>> This has repeatedly infuriated me as it makes it impossible for me to
>>>>> verify that the Configuration actually has the data in it as needed.
>>>>> It seems intentional so I figured I would ask before making a ticket
>>>>> changing it.
>>>>> - Josh