hadoop-common-issues mailing list archives

From "Steve Loughran (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-7977) Allow Hadoop clients and services to run in an OSGi container
Date Fri, 24 Feb 2012 14:38:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-7977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215636#comment-13215636
] 

Steve Loughran commented on HADOOP-7977:
----------------------------------------

To answer Sanjay's questions in HADOOP-6484:

h5. Benefits and target audience: Is this work targeted for managing/running hadoop for developers
or for production use? Briefly describe the benefits.

# It could be used for better deployment and management of the services *in small clusters*,
where the memory requirements of the NN and JT aren't great. Being able to deploy the entire
set of services for a worker node or a (single) master node in a single process would result
in a lighter system load.

# If the TT started (marked) tasks within the OSGi container (or a preloaded peer OSGi container),
Map and Reduce jobs would be able to execute without all the JVM startup delays.

h5. Besides adding the manifests to jar files will it require adding more invasive changes
such as special interfaces for stopping and starting hadoop daemons?

* Adding the headers will have no impact on the existing daemons, because they don't run in
an OSGi container.

* Nor does any of the Hadoop code play games with classloaders, which is one thing that OSGi
does differently.

* HADOOP-5731 shows a problem that existed when trying to run IPC under a security manager;
this may be a barrier to OSGi container use. If the problem exists client-side and is still
there after a switch to protobuf everywhere, it may need fixing anyway.

* The MRv2 service model could be re-used by some OSGi helper code that manages the lifecycle
of services, because you no longer need per-service code to start/stop them.

* I'd expect some new entry points to be needed to start the services under OSGi, but
they should be wrapper layers on the existing code. If they depended on OSGi services they
could live off to one side; if they needed to be in the same package as existing code, things
might get trickier.
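
As a rough illustration of the wrapper-layer idea above, here is a minimal sketch of a generic lifecycle manager, assuming every service exposes the same start/stop contract. The names {{Lifecycle}}, {{CompositeLifecycle}} and {{recording}} are hypothetical and are neither the MRv2 service API nor the OSGi API; under OSGi, something like a {{BundleActivator}} could delegate its start and stop calls to a manager of this shape.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class LifecycleSketch {

    /** Hypothetical minimal lifecycle contract; not the real Hadoop or OSGi interface. */
    interface Lifecycle {
        void start() throws Exception;
        void stop();
    }

    /** Starts children in registration order and stops them in reverse order. */
    static final class CompositeLifecycle {
        private final List<Lifecycle> children = new ArrayList<>();
        private final Deque<Lifecycle> started = new ArrayDeque<>();

        void add(Lifecycle child) {
            children.add(child);
        }

        void startAll() throws Exception {
            try {
                for (Lifecycle child : children) {
                    child.start();
                    started.push(child); // remember only what actually started
                }
            } catch (Exception e) {
                stopAll(); // roll back the services that did start
                throw e;
            }
        }

        void stopAll() {
            while (!started.isEmpty()) {
                started.pop().stop(); // most recently started service stops first
            }
        }
    }

    /** Stand-in service that records its lifecycle events instead of doing real work. */
    static Lifecycle recording(String name, List<String> events) {
        return new Lifecycle() {
            @Override public void start() { events.add("start " + name); }
            @Override public void stop() { events.add("stop " + name); }
        };
    }

    public static void main(String[] args) throws Exception {
        List<String> events = new ArrayList<>();
        CompositeLifecycle workerNode = new CompositeLifecycle();
        workerNode.add(recording("datanode", events));
        workerNode.add(recording("tasktracker", events));
        workerNode.startAll();
        workerNode.stopAll();
        System.out.println(events);
    }
}
```

The point of the sketch is that the manager needs no per-service knowledge: anything that implements the shared contract can be composed into a single-process worker or master node.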

h5. Will this be used for management after deployment has been done through some other mechanism
or will this work also enable the deployment in a cluster?

* Karaf is interesting in that not only is it yet another OSGi container, it also has a
built-in SSHD, so anyone can ssh in remotely, authenticate themselves and issue management
commands (start/stop services, see logs, etc.):
[http://felix.apache.org/site/41-console-and-commands.html]. It works on Windows too,
which doesn't normally ship with an sshd.

* I wonder if you can get at the logs through Karaf, including any from jobs stored on the
workers? That would be useful.

* Karaf itself doesn't do remote deployment, AFAIK. Bringing up a ZooKeeper client on each
Karaf instance and waiting for instructions via ZK would always be possible.

Overall, I think it could be good. Adding the headers is low risk, and the other features
could be useful, though it will take some work to see what problems arise.
                
> Allow Hadoop clients and services to run in an OSGi container
> -------------------------------------------------------------
>
>                 Key: HADOOP-7977
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7977
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: util
>    Affects Versions: 0.24.0
>         Environment: OSGi client runtime (Spring &c), possibly service runtime (e.g. Apache Karaf)
>            Reporter: Steve Loughran
>            Priority: Minor
>
> There's been past discussion on running Hadoop client and service code in OSGi. This JIRA issue exists to wrap up the needs and issues.
> # client-side use of public Hadoop APIs would seem most important.
> # service-side deployments could offer benefits. The non-standard Hadoop Java security configuration may interfere with this goal.
> # testing would all be functional with dependencies on external services, to make things harder.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
