accumulo-dev mailing list archives

From Donald Miner <dmi...@clearedgeit.com>
Subject Re: [VOTE] Apache Accumulo 1.5.1-RC3
Date Fri, 28 Mar 2014 17:24:20 GMT
Ah, if all I need to do is change the class name to
org.apache.accumulo.core.client.mapreduce.RangeInputSplit .... I feel kind
of dumb. I didn't realize it was renamed. I can do that.
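
For reference, here is a minimal sketch of the rename and of the workaround suggested further down the thread (re-adding the old nested class as a deprecated subclass of the relocated one). All class names below are simplified, hypothetical stand-ins nested inside one demo class so the example compiles on its own; they are not the real Accumulo classes, and the thread does not settle whether such a shim restores binary compatibility in every case.

```java
/**
 * Hypothetical, simplified stand-ins for the classes discussed in this thread.
 * The real Accumulo classes live in their own packages and carry more state.
 */
final class CompatShimSketch {

    /** Stands in for the relocated, now top-level RangeInputSplit. */
    static class RangeInputSplit {
        private final String[] locations;

        RangeInputSplit(String... locations) {
            this.locations = locations.clone();
        }

        String[] getLocations() {
            return locations.clone();
        }
    }

    /** Stands in for InputFormatBase, which used to declare the split as a nested class. */
    static class InputFormatBase {

        /** Old nested name, kept as an empty deprecated subclass of the relocated class. */
        @Deprecated
        static class RangeInputSplit extends CompatShimSketch.RangeInputSplit {
            RangeInputSplit(String... locations) {
                super(locations);
            }
        }
    }

    /** Code written against the old nested name still compiles and runs. */
    @SuppressWarnings("deprecation")
    static String[] oldStyleCaller() {
        InputFormatBase.RangeInputSplit split =
            new InputFormatBase.RangeInputSplit("node1", "node2");
        return split.getLocations();
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", oldStyleCaller()));
        // An instance of the relocated class is not an instance of the shim, so code
        // that casts framework-created splits to the old type could still fail.
        Object created = new RangeInputSplit("node3");
        System.out.println(created instanceof InputFormatBase.RangeInputSplit);
    }
}
```

The shim only helps callers that construct or name the old type themselves; anything that receives instances of the relocated class and casts them to the old nested type would still break, which is one reason simply switching to the new class name is the cleaner fix.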

On a separate note (maybe more appropriate for the user list, but I'm keeping
it here for continuity's sake):

We have an application that has daemons running on every Hadoop node.
When we kick off a query in our application, this is what happens:
   - we figure out what splits there are on the table we want and what
tablet servers those splits are on
   - we tell our application's daemons to do a range scan across the
tablets that are colocated on the same node (1 range scan per tablet)
   - daemon processes data
   - etc.

So, we need both the split information and the locality information to get
this to work right.

Is there a better way to do this? Teasing this information out of an
InputFormat class seems like kind of a hack. We could use TabletLocator,
but that's not in the public API either. Is there a right way for a
client to get locality information?
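
To make the workflow above concrete, here is a rough sketch of just the grouping step, in plain Java with no Accumulo calls. It assumes the split points (e.g. from listSplits) and the host of each tablet have already been obtained somehow; how to get the hosts through the public API is exactly the open question. The names LocalityPlanner, TabletRange, and planByHost are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: groups per-tablet scan ranges by the node hosting each tablet. */
final class LocalityPlanner {

    /** One tablet's row range; a null bound means unbounded (first or last tablet). */
    static final class TabletRange {
        final String startRowExclusive;
        final String endRowInclusive;

        TabletRange(String startRowExclusive, String endRowInclusive) {
            this.startRowExclusive = startRowExclusive;
            this.endRowInclusive = endRowInclusive;
        }

        @Override
        public String toString() {
            return "(" + (startRowExclusive == null ? "-inf" : startRowExclusive)
                + ", " + (endRowInclusive == null ? "+inf" : endRowInclusive) + "]";
        }
    }

    /**
     * Builds one range per tablet and buckets the ranges by host.
     *
     * @param sortedSplits  the table's split points, in sorted order
     * @param hostPerTablet the host of each tablet, in tablet order; N splits
     *                      define N + 1 tablets, so this needs N + 1 entries
     */
    static Map<String, List<TabletRange>> planByHost(List<String> sortedSplits,
                                                     List<String> hostPerTablet) {
        if (hostPerTablet.size() != sortedSplits.size() + 1) {
            throw new IllegalArgumentException("expected exactly one host per tablet");
        }
        Map<String, List<TabletRange>> plan = new LinkedHashMap<>();
        String previous = null;
        for (int i = 0; i < hostPerTablet.size(); i++) {
            // The last tablet has no end split, so its range is open-ended.
            String end = i < sortedSplits.size() ? sortedSplits.get(i) : null;
            plan.computeIfAbsent(hostPerTablet.get(i), h -> new ArrayList<>())
                .add(new TabletRange(previous, end));
            previous = end;
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, List<TabletRange>> plan = planByHost(
            Arrays.asList("g", "n", "t"),
            Arrays.asList("node1", "node2", "node1", "node3"));
        // Each daemon would then run one range scan per entry in its own list.
        plan.forEach((host, ranges) -> System.out.println(host + " -> " + ranges));
    }
}
```

Each range is start-exclusive and end-inclusive, matching how a tablet covers the rows after the previous split up to and including its own split point.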

-d


On Fri, Mar 28, 2014 at 1:12 PM, Sean Busbey <busbey+lists@cloudera.com> wrote:

> The README is already clear that everything under those packages is
> included, with the exception of the impl package.
>
> In my reading, that means all Classes and Interfaces in any package under
> the client package, and everything in those classes that has either
> public or protected access.
>
> I guess this should be included in our pending discussion about
> compatibility across versions?
>
>
> On Fri, Mar 28, 2014 at 12:02 PM, Josh Elser <josh.elser@gmail.com> wrote:
>
> > Also, reading back through this chain, it was stated as unclear
> > whether or not an inner class of a class in the public API is also,
> > itself, in the public API.
> >
> > This should also be clarified in our definition of public API in the
> > README. Obviously, Don and Sean both agree that it should be. The
> > discussion of those on the vote didn't. Doesn't really matter to me
> > either way.
> >
> >
> > On 3/28/14, 9:50 AM, Josh Elser wrote:
> >
> >> Ah, I missed the recursiveness of the o.a.a.c.c.
> >>
> >> But, like I mentioned in the other message, I don't think binary compat
> >> was achieved, but the package name, constructors, and methods existing
> >> in 1.5.0 were maintained AFAIK. Are we asserting binary compat here as
> >> well?
> >>
> >> I'm trying to understand if we actually didn't follow our own rules, or
> >> if the expectations of the community are exceeding the rules we have for
> >> ourselves. I think we're in the latter right now.
> >>
> >> On 3/28/14, 9:41 AM, Sean Busbey wrote:
> >>
> >>> According to the definition of the public API in version 1.5.0,
> >>> RangeInputSplit is a part of the public API.
> >>>
> >>>
> >>> On Fri, Mar 28, 2014 at 11:26 AM, Josh Elser <josh.elser@gmail.com>
> >>> wrote:
> >>>
> >>>> Devil's advocate: RangeInputSplit isn't part of the public API either, so
> >>>> it comes with the same risks that TabletLocator would.
> >>>>
> >>>> It sounds more like the definition of "public api" should be expanded to
> >>>> prevent this in future cases. I need to look at what exactly broke
> >>>> for Don.
> >>>>
> >>>>
> >>>> On 3/28/14, 9:12 AM, Sean Busbey wrote:
> >>>>
> >>>>> Don,
> >>>>>
> >>>>> If you can file a jira with some example code that covers what parts of
> >>>>> the 1.5.0 API you hit, I can see if I can put together a patch to get
> >>>>> you working.
> >>>>>
> >>>>> That would give you a patch you could apply on top of 1.5.1 now, and
> >>>>> when 1.5.2 comes out it would correctly support the API.
> >>>>>
> >>>>> -Sean
> >>>>>
> >>>>> On Fri, Mar 28, 2014 at 8:49 AM, Donald Miner <dminer@clearedgeit.com>
> >>>>> wrote:
> >>>>>
> >>>>>> I'm starting to dig around for a workaround and figured someone might be
> >>>>>> able to help me right away.
> >>>>>>
> >>>>>> In digging deeper, we were using RangeInputSplit because it gave us the
> >>>>>> splits AND the locations. We use the locations for some data locality
> >>>>>> placing in our distributed application. listSplits only gives us splits.
> >>>>>>
> >>>>>> Is there an easy way to get both of these pieces of information
> >>>>>> together?
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Mar 27, 2014 at 3:28 PM, Josh Elser <josh.elser@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Ack, sorry about that, Don.
> >>>>>>>
> >>>>>>> We probably should have been more strict about that. It's tough to make a
> >>>>>>> call about a public class that someone *might* be using.
> >>>>>>>
> >>>>>>>
> >>>>>>> On 3/27/14, 12:26 PM, Donald Miner wrote:
> >>>>>>>
> >>>>>>>> Sorry to necro this thread, just wanted to throw my 2 cents in.
> >>>>>>>>
> >>>>>>>> We had some user code referencing this code directly and our application
> >>>>>>>> no longer works in 1.5.1. Just found out today when installing on 1.5.1.
> >>>>>>>> In retrospect, we should have been using .listSplits from
> >>>>>>>> TableOperations, but instead we were using the RangeInputSplit method to
> >>>>>>>> get the splits for a table.
> >>>>>>>>
> >>>>>>>> I guess since we probably shouldn't have been doing that, I don't know
> >>>>>>>> if that's a case for this not being deleted without going to
> >>>>>>>> deprecated... but we did have a nasty surprise and a deprecation warning
> >>>>>>>> would have been nice.
> >>>>>>>>
> >>>>>>>> -d
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Tue, Feb 25, 2014 at 11:33 PM, Adam Fuchs <afuchs@apache.org>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> I'll buy that the RangeInputSplit is probably not referenced directly in
> >>>>>>>>> user code. In this case it's probably not a big enough change to delay
> >>>>>>>>> the release.
> >>>>>>>>>
> >>>>>>>>> Adam
> >>>>>>>>> On Feb 25, 2014 6:19 PM, "Christopher" <ctubbsii@apache.org> wrote:
> >>>>>>>>>
> >>>>>>>>>> I don't know that this inner class used for M/R should be considered
> >>>>>>>>>> public API... nor do I imagine it will cause compatibility problems if
> >>>>>>>>>> users aren't referencing it in their code (which there's no reason to
> >>>>>>>>>> expect them to). I don't know if anybody is subclassing
> >>>>>>>>>> RangeInputSplit, but I'd suspect that it's an acceptable risk.
> >>>>>>>>>> Re-adding an inner class that subclasses the now external one may be a
> >>>>>>>>>> good workaround. I don't think this would require recompilation for
> >>>>>>>>>> runtime compatibility, but if it does, I think that's probably
> >>>>>>>>>> acceptable.
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Christopher L Tubbs II
> >>>>>>>>>> http://gravatar.com/ctubbsii
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Feb 25, 2014 at 6:13 PM, Josh Elser <josh.elser@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> I haven't checked what would happen. If you subclassed the
> >>>>>>>>>>> RangeInputSplit, it's rather likely that you'd need a recompilation.
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 2/25/14, 5:59 PM, John Vines wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Will it? Clients don't interact with that code at all directly.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Feb 25, 2014 at 5:57 PM, Adam Fuchs <afuchs@apache.org>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for running that checker, Keith. Should we not be worried about
> >>>>>>>>>>>>> the removal of InputFormatBase.RangeInputSplit? If I read correctly this
> >>>>>>>>>>>>> will break both binary (runtime) compatibility and code (compile-time)
> >>>>>>>>>>>>> compatibility. Can somebody make an argument for why this is not a
> >>>>>>>>>>>>> problem in a minor release with no previous deprecation?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Is there a quick way to fix this, like by subclassing the
> >>>>>>>>>>>>> org.apache.accumulo.core.client.mapred.RangeInputSplit in a
> >>>>>>>>>>>>> o.a.a.c.c.mapred.InputFormatBase.RangeInputSplit that we mark as
> >>>>>>>>>>>>> deprecated?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Adam
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Feb 25, 2014 at 5:17 PM, Keith Turner <keith@deenlo.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> I ran a utility [1] to analyze API diffs [2] between 1.5.0 and
> >>>>>>>>>>>>>> 1.5.1-RC3. The configs I used are the two xml files in the parent [3]
> >>>>>>>>>>>>>> of the report. I think the diff looks ok. I used jars from 1.5.0 and
> >>>>>>>>>>>>>> 1.5.1-RC3 bin.tar.gz.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [1] : http://ispras.linuxbase.org/index.php/Java_API_Compliance_Checker
> >>>>>>>>>>>>>> [2] : http://people.apache.org/~kturner/1.5.0_to_1.5.1-RC3/compat_report.html
> >>>>>>>>>>>>>> [3] : http://people.apache.org/~kturner/1.5.0_to_1.5.1-RC3/
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Mon, Feb 24, 2014 at 8:01 PM, Josh Elser <josh.elser@gmail.com>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> All,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Please consider the following candidate as Apache Accumulo 1.5.1 -- now
> >>>>>>>>>>>>>>> with 100% more CHANGES changes.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Git artifacts: The staging repository was built from the tag
> >>>>>>>>>>>>>>> "1.5.1-rc3" (3478f71a).
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Maven Staging Repo:
> >>>>>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapacheaccumulo-1002
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Source tarball:
> >>>>>>>>>>>>>>> http://repository.apache.org/content/repositories/orgapacheaccumulo-1002/org/apache/accumulo/accumulo/1.5.1/accumulo-1.5.1-src.tar.gz
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Binary tarball:
> >>>>>>>>>>>>>>> http://repository.apache.org/content/repositories/orgapacheaccumulo-1002/org/apache/accumulo/accumulo/1.5.1/accumulo-1.5.1-bin.tar.gz
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Changes since 1.5.1-RC2: ACCUMULO-2324, ACCUMULO-2361, ACCUMULO-2369,
> >>>>>>>>>>>>>>> ACCUMULO-2378, ACCUMULO-2379, ACCUMULO-2380, ACCUMULO-2385,
> >>>>>>>>>>>>>>> ACCUMULO-2387, ACCUMULO-2390
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Keys: http://www.apache.org/dist/accumulo/KEYS
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Final CHANGES:
> >>>>>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=accumulo.git;a=blob_plain;f=CHANGES;hb=3478f71ae888f8d73aaa93837319a6dbb4ba0c8a
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Testing: Unit test and auto-tests passed successfully. Ran a short
> >>>>>>>>>>>>>>> (~2hrs) CI on 6 node installation. Ran a brief (~1hr) CI test on one
> >>>>>>>>>>>>>>> machine with the newly-released Hadoop-2.3.0. Built from src tarball,
> >>>>>>>>>>>>>>> and verified functionality with bin tarball.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Since there are very minor changes compared to 1.5.1-RC2, this vote will
> >>>>>>>>>>>>>>> be open for the next 72 hours (2/28/2014 0100 UTC).
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Upon successful completion of this vote, a 1.5.1 gpg-signed Git tag will
> >>>>>>>>>>>>>>> be created from 3478f71a and the above staging repository will be
> >>>>>>>>>>>>>>> promoted.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - Josh
> >>>>>> --
> >>>>>>
> >>>>>> Donald Miner
> >>>>>> Chief Technology Officer
> >>>>>> ClearEdge IT Solutions, LLC
> >>>>>> Cell: 443 799 7807
> >>>>>> www.clearedgeit.com
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>
>



-- 

Donald Miner
Chief Technology Officer
ClearEdge IT Solutions, LLC
Cell: 443 799 7807
www.clearedgeit.com
