accumulo-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Josh Elser <josh.el...@gmail.com>
Subject Re: Pros and Cons of moving SKVI to public API
Date Wed, 11 May 2016 01:22:35 GMT
Oy, that really did not come across well for me in email-form. Can you 
use paste.a.o or something?

+1 for moving "internal-only" iterators and IteratorUtils. Neither are 
things that we intend for users to need.

IMO, IteratorEnvironment was also kind of hokey/goofy (I never really 
used it either). I could go either way on it.

Keith Turner wrote:
> I modified the apilyzer config in the 1.8 branch to include iterators.   It
> spit out the following problems with types.
>
> I think we could move IteratorUtil out of
> org.apache.accumulo.core.iterators.  Also we could possibly move the
> iterators in org.apache.accumulo.core.iterators.system somewhere else (or
> declare it as an exception).  Could possibly deprecate IteratorEnvironment.
> getConfig() and add a replacement method that returns a Map.
>
> Problems :
>
>    CONTEXT
> TYPE
> FIELD/METHOD                        NON-PUBLIC REFERENCE
>
>    METHOD_RETURN
> org.apache.accumulo.core.iterators.IteratorEnvironment
> getConfig(...)
> org.apache.accumulo.core.conf.AccumuloConfiguration
>    METHOD_PARAM
> org.apache.accumulo.core.iterators.IteratorUtil
> parseIterConf(...)
> org.apache.accumulo.core.conf.AccumuloConfiguration
>    METHOD_PARAM
> org.apache.accumulo.core.iterators.IteratorUtil
> loadIterators(...)
> org.apache.accumulo.core.data.impl.KeyExtent
>    METHOD_PARAM
> org.apache.accumulo.core.iterators.IteratorUtil
> loadIterators(...)
> org.apache.accumulo.core.conf.AccumuloConfiguration
>    METHOD_PARAM
> org.apache.accumulo.core.iterators.IteratorUtil
> loadIterators(...)
> org.apache.accumulo.core.data.impl.KeyExtent
>    METHOD_PARAM
> org.apache.accumulo.core.iterators.IteratorUtil
> loadIterators(...)
> org.apache.accumulo.core.conf.AccumuloConfiguration
>    METHOD_PARAM
> org.apache.accumulo.core.iterators.IteratorUtil
> loadIterators(...)
> org.apache.accumulo.core.data.impl.KeyExtent
>    METHOD_PARAM
> org.apache.accumulo.core.iterators.IteratorUtil
> loadIterators(...)
> org.apache.accumulo.core.conf.AccumuloConfiguration
>    METHOD_PARAM
> org.apache.accumulo.core.iterators.IteratorUtil
> loadIterators(...)
> org.apache.accumulo.core.data.impl.KeyExtent
>    METHOD_PARAM
> org.apache.accumulo.core.iterators.IteratorUtil
> loadIterators(...)
> org.apache.accumulo.core.conf.AccumuloConfiguration
>    METHOD_PARAM
> org.apache.accumulo.core.iterators.IteratorUtil
> loadIterators(...)
> org.apache.accumulo.core.data.impl.KeyExtent
>    METHOD_PARAM
> org.apache.accumulo.core.iterators.IteratorUtil
> loadIterators(...)
> org.apache.accumulo.core.conf.AccumuloConfiguration
>    CTOR_PARAM
> org.apache.accumulo.core.iterators.system.MapFileIterator
> (...)
> org.apache.accumulo.core.conf.AccumuloConfiguration
>    METHOD_RETURN
> org.apache.accumulo.core.iterators.system.MapFileIterator
> getSample(...)
> org.apache.accumulo.core.file.FileSKVIterator
>    METHOD_PARAM
> org.apache.accumulo.core.iterators.system.MapFileIterator
> getSample(...)
> org.apache.accumulo.core.sample.impl.SamplerConfigurationImpl
>    CTOR_PARAM
> org.apache.accumulo.core.iterators.system.MultiIterator
> (...)
> org.apache.accumulo.core.data.impl.KeyExtent
>    CTOR_PARAM
> org.apache.accumulo.core.iterators.system.SampleIterator
> (...)                               org.apache.accumulo.core.sample.Sampler
>    CTOR_PARAM
> org.apache.accumulo.core.iterators.system.SequenceFileIterator
> (...)                               org.apache.hadoop.io.SequenceFile$Reader
>    METHOD_RETURN
> org.apache.accumulo.core.iterators.system.SequenceFileIterator
> getSample(...)
> org.apache.accumulo.core.file.FileSKVIterator
>    METHOD_PARAM
> org.apache.accumulo.core.iterators.system.SequenceFileIterator
> getSample(...)
> org.apache.accumulo.core.sample.impl.SamplerConfigurationImpl
>    FIELD
> org.apache.accumulo.core.iterators.system.VisibilityFilter
> cache
> org.apache.commons.collections.map.LRUMap
>    FIELD
> org.apache.accumulo.core.iterators.user.TransformingIterator
> log                                 org.slf4j.Logger
>
> Total : 22
>
>
> On Mon, May 9, 2016 at 4:54 PM, Christopher<ctubbsii@apache.org>  wrote:
>
>> Hey Keith. Just wanted to ping you on this to see if you've had a chance to
>> look.
>>
>> On Fri, Mar 25, 2016 at 12:17 PM Keith Turner<keith@deenlo.com>  wrote:
>>
>>> On Thu, Mar 24, 2016 at 8:27 PM, Christopher<ctubbsii@apache.org>
>> wrote:
>>>> It seems there's a general agreement that we treat it like public API
>> and
>>>> we should just call it public API.
>>>>
>>>> I just don't know how we're going to actually say this is public API,
>>>> without addressing the issue of the other iterators in the same
>> package,
>>>> unless we're okay with going back to a more complicated API definition
>>>> which calls out specific classes instead of whole packages.
>>>
>>>> Keith, you spent the most time thinking about how to convey the public
>>> API
>>>> in the README. What do you think about how to actually make this
>> happen?
>>>
>>> I will look into it at the end of next week and try to come up with a
>>> strategy for incorporating it into the public API.  Need to analyze what
>>> types are used by the existing iterators.  In 1.7.0 I tried to make API
>>> types only reference other API types.  Things that violated this type
>>> constraint were deprecated and a replacement that met the type constraint
>>> was introduced.  Maybe we can follow this path with iterators in 1.8.0,
>> not
>>> sure.
>>>
>>>
>>>> Also, what would we want to say about the 1.6 and 1.7 branches API?
>> Maybe
>>>> if we can guarantee that there aren't any changes in those
>>>> classes/packages, we can just add them to the README as a new
>> declaration
>>>> of API (but not actually an addition in the semver sense), but if there
>>> are
>>>> any changes, that's going to complicate things, because semver
>> guarantees
>>>> should mean no public API changes between bugfix releases. I guess I
>>> should
>>>> actually check to see if there have been any changes...
>>>>
>>>> On Thu, Mar 24, 2016 at 7:15 PM Keith Turner<keith@deenlo.com>  wrote:
>>>>
>>>>> On Thu, Mar 24, 2016 at 5:54 PM, Billie Rinaldi<
>>>> billie.rinaldi@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On Thu, Mar 24, 2016 at 1:15 PM, Christopher<ctubbsii@apache.org>
>>>>> wrote:
>>>>>>> We do have the opportunity to move to a new improved API, if
>>> somebody
>>>>>> were
>>>>>>> to put time into it. I guess that's true whether we put this
in
>> the
>>>>>> public
>>>>>>> API officially or not.
>>>>>>
>>>>>> Agreed.  Even if we do create a new API, we can't change or drop
>> the
>>>>>> existing API without breaking a lot of people's code.  I feel like
>>> SKVI
>>>>> is
>>>>>> in a category of things that we treat as though they're in the
>> public
>>>>> API,
>>>>>> so we might as well say it is.
>>>>>>
>>>>> +1 to that
>>>>>
>>>>> Ensuring what exists is stable is a separate issue from creating a
>> new
>>>>> iterator API.
>>>>>
>>>>>
>>>>>>
>>>>>>> I think maybe the hardest part is that we don't
>>>>>>> really want to put just the interface in the API... but it exists
>>> in
>>>> a
>>>>>>> package with a bunch of other classes which probably shouldn't
be
>>>>> public
>>>>>>> API. So, some thought needs to be put into *how* we're going
to
>> do
>>>> it,
>>>>>> too.
>>>>>>> On Thu, Mar 24, 2016 at 3:27 PM William Slacum<
>> wslacum@gmail.com>
>>>>>> wrote:
>>>>>>>> It should be public API. It's one of the core reasons for
>>> choosing
>>>>>>> Accumulo
>>>>>>>> over a similar project like HBase or Cassandra. Allegedly,
Jeff
>>>> "Mean
>>>>>>> Gene"
>>>>>>>> Dean said we got the concept correct as well :)
>>>>>>>>
>>>>>>>> Personally I hate the current API from a usability standpoint
>>> (ie,
>>>>> the
>>>>>>>> generic types are useless and already encoded in the name,
it
>>>>>> needlessly
>>>>>>>> diverges from the standard java Iterator calling standards),
>> but
>>>>> it's a
>>>>>>>> strong, identifying feature we have.
>>>>>>>>
>>>>>>>> On Thu, Mar 24, 2016 at 2:50 PM, Christopher<
>>> ctubbsii@apache.org>
>>>>>>> wrote:
>>>>>>>>> Accumulators,
>>>>>>>>>
>>>>>>>>> What are the pros and cons that you can see for moving
the
>>>>>>>>> SortedKeyValueIterator into the public API?
>>>>>>>>>
>>>>>>>>> Right now, I think there's still some need for improvement
in
>>> the
>>>>>>>> Iterator
>>>>>>>>> API, and many of the iterators may not be stable enough
to
>>> really
>>>>>>>> recommend
>>>>>>>>> people use without some serious caveats (because we may
not
>> be
>>>> able
>>>>>> to
>>>>>>>> keep
>>>>>>>>> their API stable very easily). So, there's a con.
>>>>>>>>>
>>>>>>>>> In the pros side, iterators are a core feature of Accumulo,
>> and
>>>>>> nearly
>>>>>>>> all
>>>>>>>>> of Accumulo's distributed processing capabilities are
>> dependent
>>>>> upon
>>>>>>>> them.
>>>>>>>>> It is reasonable to expect users to take advantage of
them,
>> and
>>>>> we've
>>>>>>> at
>>>>>>>>> least tried to be cautious about changing the iterators
in
>>>>>> incompatible
>>>>>>>>> ways, even if they aren't in the public API.
>>>>>>>>>
>>>>>>>>> Recently, this came up when we stripped out all the
>> non-public
>>>> API
>>>>>>>> javadocs
>>>>>>>>> from the website. (reported by Dan Blum on the user list
on
>>> March
>>>>>> 4th:
>>>>>>>>>
>> http://mail-archives.apache.org/mod_mbox/accumulo-user/201603.mbox/%3C066a01d17658%24bc9dc1b0%2435d94510%24%40bbn.com%3E
>>>>>>>>> )
>>>>>>>>>
>>>>>>>>> What would it take for us to feel comfortable moving
them to
>>> the
>>>>>> public
>>>>>>>>> API? Do we need a better interface first, or should we
>> isolate
>>>> the
>>>>>>> other
>>>>>>>>> iterators into another package (some of that has already
been
>>>>> done),
>>>>>> or
>>>>>>>>> should we wait for a proper public API package (2.0?)
to
>>> provide
>>>>> this
>>>>>>>>> interface in?
>>>>>>>>>
>

Mime
View raw message