From: Steve Loughran <stevel@apache.org>
Date: Fri, 12 Jun 2009 11:26:37 +0100
To: core-user@hadoop.apache.org
Subject: Re: Running Hadoop/Hbase in a OSGi container

Ninad Raut wrote:
> OSGi provides navigability to your components and creates a life cycle for
> each of those components, viz. install, start, stop, un-deploy, etc. This
> is why we are thinking of building our components with OSGi. The problem
> we are facing is that our components use MapReduce and HDFS, and as such
> the OSGi container cannot detect the Hadoop MapRed engine or HDFS.
>
> I have searched the net, and it looks like people are working on, or have
> had success with, running Hadoop in an OSGi container....
>
> Ninad
1. I am doing work on a simple lifecycle for the services (start/stop/ping), which is not OSGi; OSGi worries a lot about classloading and versioning. Check out HADOOP-3628 for this. There is a rough sketch of the lifecycle idea below.

2. You can run it under OSGi systems, such as the OSGi branch of SmartFrog: http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/branches/core-branch-osgi/, or under non-OSGi tools. Either way, these tools are left dealing with classloading and the like.

3. Any container is going to have to deal with the problem that there are bits of all the services that call System.exit(): by running under a security manager, trapping the call, raising an exception, etc. (sketch below).

4. Any container is then going to have to deal with the fact that, from 0.20 onwards, Hadoop does things with security policy that are incompatible with normal Java security managers: whatever security manager you have for trapping system exits can't extend the default one.

5. Any container also has to deal with the fact that every service (namenode, job tracker, etc.) makes a lot of assumptions about singletons: that it has exclusive use of filesystem objects retrieved through FileSystem.get(), and the like (sketch below). While OSGi can deal with that through its classloading work, it's still fairly complex.

6. There are also lots of JVM memory/thread management issues; see the various Hadoop bugs.

If you look at the slides of what I've been up to, you can see that it can be done:
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/doc/dynamic_hadoop_clusters.ppt

However,
* you really need to run every service in its own process, for memory and reliability alone
* it's pretty leading edge
* you will have to invest the time and effort to get it working

If you want to do the work, start with what I've been doing and bring it up under the OSGi container of your choice. You can come and play with our tooling; I'm cutting a release today of this week's Hadoop trunk merged with my branch. It is, of course, experimental, as even the trunk is a bit up-and-down on feature stability.

-steve
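
PS: to make points 1, 3 and 5 a bit more concrete, some rough sketches. First, the lifecycle. This is just the shape of the thing, not the real HADOOP-3628 API; the interface and method names here are mine:

    import java.io.IOException;

    // Illustrative only; see HADOOP-3628 for the actual design discussion.
    public interface ServiceLifecycle {
      void start() throws IOException;  // bring the service up
      void ping() throws IOException;   // liveness check: throw if unhealthy
      void stop() throws IOException;   // shut down cleanly, release resources
    }

The point is that every service exposes the same small set of operations, so a container can manage namenodes, datanodes and job/task trackers uniformly.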
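
For point 3, the classic trick for trapping System.exit() is a security manager that turns the exit call into an exception the container can catch. A minimal sketch; remember point 4, though: a manager like this can't extend the one Hadoop 0.20's security-policy work expects, so treat it as illustration only:

    import java.security.Permission;

    public class ExitTrappingSecurityManager extends SecurityManager {

      public static class ExitTrappedException extends SecurityException {
        public final int status;
        public ExitTrappedException(int status) {
          super("System.exit(" + status + ") trapped");
          this.status = status;
        }
      }

      // Turn every System.exit() into an exception instead of killing the JVM.
      @Override
      public void checkExit(int status) {
        throw new ExitTrappedException(status);
      }

      // Permit everything else; a real container would be much stricter.
      @Override
      public void checkPermission(Permission perm) {
      }
    }

    // Installed once, before any Hadoop service starts:
    //   System.setSecurityManager(new ExitTrappingSecurityManager());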
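
For point 5, the FileSystem singleton problem in a nutshell: two components in the same JVM that both call FileSystem.get() share one cached object, so the first component to close "its" filesystem breaks the other. Something like:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SharedFilesystemHazard {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Both "components" get the same cached instance back.
        FileSystem fsA = FileSystem.get(conf);
        FileSystem fsB = FileSystem.get(conf);
        System.out.println(fsA == fsB);   // true: one shared singleton

        fsA.close();                 // component A shuts down...
        fsB.exists(new Path("/"));   // ...and B fails ("Filesystem closed" on HDFS)
      }
    }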