From: Mike Lyon
Date: Thu, 23 Feb 2012 10:32:49 -1000
Subject: Re: Experience with Hadoop in production
To: "common-user@hadoop.apache.org"

Just be sure you have that corporate card available 24x7 when you need
to call support ;)

Sent from my iPhone

On Feb 23, 2012, at 10:30, Serge Blazhievsky wrote:

> What I have seen companies do often is that they will use free version of
> the commercial vendor and only get their support if there are major
> problems that they cannot solve on their own.
>
> That way you will get free distribution and insurance that you have
> support if something goes wrong.
>
> Serge
>
> On 2/23/12 10:42 AM, "Jamack, Peter" wrote:
>
>> A lot of it depends on your staff and their experiences.
>> Maybe they don't have hadoop, but if they were involved with large
>> databases, data warehouse, etc they can utilize their skills & experiences
>> and provide a lot of help.
>> If you have linux admins, system admins, network admins with years of
>> experience, they will be a goldmine. At the other end, database
>> developers who know SQL, programmers who know Java, and so on can really
>> help staff up your 'big data' team. Having a few people who know ETL would
>> be great too.
>>
>> The biggest problem I've run into seems to be how big the Hadoop
>> project/team is or is not. Sometimes it's just an 'experimental'
>> department and therefore half the people are only 25-50 percent available
>> to help out. And if they aren't really that knowledgeable about Hadoop,
>> it tends to be one of those not-enough-time-in-the-day scenarios. And
>> the few people dedicated to the Hadoop project(s) will bear the brunt of
>> the work.
>>
>> It's like any ecosystem. To do it right, you might need system/network
>> admins, a storage person to actually know how to set up the proper storage
>> architecture, maybe a security expert, a few programmers, and a few data
>> people. If you're combining analytics, that's another group. Of course
>> most companies outside the Google and Facebooks of the world will have a
>> few people dedicated to Hadoop. Which means you need somebody who knows
>> storage, knows networking, knows Linux, knows how to be a system admin,
>> knows security, and maybe other things (AKA if you have a firewall issue,
>> somebody needs to figure out ways to make it work through or around), and
>> then you need some programmers who either know MapReduce or can pretty much
>> figure it out because they've done Java for years.
>>
>> Peter J
>>
>> On 2/23/12 10:17 AM, "Pavel Frolov" wrote:
>>
>>> Hi,
>>>
>>> We are going into 24x7 production soon and we are considering whether we
>>> need vendor support or not. We use a free vendor distribution of Cluster
>>> Provisioning + Hadoop + HBase and looked at their Enterprise version but
>>> it is very expensive for the value it provides (additional functionality +
>>> support), given that we've already ironed out many of our performance and
>>> tuning issues on our own and with generous help from the community (e.g.
>>> all of you).
>>>
>>> So, I wanted to run it through the community to see if anybody can share
>>> their experience of running a Hadoop cluster (50+ nodes with Apache
>>> releases or Vendor distributions) in production, with in-house support
>>> only, and how difficult it was. How many people were involved, etc.
>>>
>>> Regards,
>>> Pavel
>>
>