avro-dev mailing list archives

From "Scott Carey (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-545) Move mapreduce bindings out of avro jar
Date Sat, 11 Sep 2010 01:51:35 GMT

    [ https://issues.apache.org/jira/browse/AVRO-545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908263#action_12908263 ]

Scott Carey commented on AVRO-545:
----------------------------------

Related to: https://issues.apache.org/jira/browse/AVRO-647


None of the bits of Avro that Hadoop will use depend on Hadoop.
None of the bits of Avro that use Hadoop will be called by Hadoop.
If this were not so, then moving it out of the jar would not be a possible solution to the
problem.

There is no circular dependency between Hadoop itself and Avro unless Hadoop decides to use
classes in o.a.a.mapred, or a user decides to call those in a Task. But in that case the
inclusion might actually be desired!

Because Hadoop puts all of its dependencies at the front of the classpath for a Task, no
user will be able to run a newer version of Avro than the one that ships in Hadoop.  With
the mapred package broken out, a user might at least have a chance of using a different
version of that package, provided it was compatible with the 'avro-core' version Hadoop
was using; but the safe bet would be to force the exact same version and bundle it with Hadoop.

So at first I thought this was important to break out for Hadoop's sake, but now I don't.
It's important for Avro's sake, for users and applications that don't use Hadoop.

It might be the other way around.  Using Avro in Hadoop is blocked by Hadoop sorting out its
classpath issues, since it currently forces all user Tasks to run with its dependencies (there
is no separate classloader for Tasks, for instance).
There is a Hadoop ticket for that, but I can't seem to locate it right now.


Avro does not lie about its dependency on Hadoop.  There is never a time that an Avro user
needs Hadoop.  Although "provided" scope might be more appropriate than "optional", it's functionally
the same in this case.
The only way that a user can execute any avro.mapred code is to run from inside Hadoop, where
the hadoop jars and dependencies come from hadoop and not from any packaging the user may
attempt.  Specifying the dependency as a runtime dependency would be a lie -- the execution
context (Hadoop) is expected to provide it.
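
For reference, the two Maven declarations being compared look roughly like the fragment below.
This is only a sketch: the groupId, artifactId and version are illustrative placeholders, not
copied from Avro's actual build files. "provided" keeps the jar on the compile and test
classpath and assumes the runtime container supplies it; "optional" compiles against it but
does not pass it on to projects that depend on Avro.

```xml
<!-- Illustrative coordinates only; not taken from Avro's real pom. -->

<!-- Variant 1: "provided" scope. Present on the compile/test classpath,
     expected to be supplied by the execution context (here, Hadoop). -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>0.20.2</version>
  <scope>provided</scope>
</dependency>

<!-- Variant 2: "optional". Default compile scope, but not inherited
     transitively by projects that depend on this one. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>0.20.2</version>
  <optional>true</optional>
</dependency>
```

Only one of the two variants would appear in a given pom. In either case a plain Avro user
does not get the Hadoop jars pulled in transitively.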



> Move mapreduce bindings out of avro jar
> ---------------------------------------
>
>                 Key: AVRO-545
>                 URL: https://issues.apache.org/jira/browse/AVRO-545
>             Project: Avro
>          Issue Type: Sub-task
>          Components: java
>    Affects Versions: 1.4.0
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>            Priority: Blocker
>             Fix For: 1.4.1
>
>         Attachments: avro-545.patch
>
>
> MapReduce should not depend on any jars (eg. avro's main jar) that also depend on the
> MapReduce jar.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

