hadoop-common-user mailing list archives

From "GOEKE, MATTHEW (AG/1000)" <matthew.go...@monsanto.com>
Subject RE: Experience with Hadoop in production
Date Fri, 24 Feb 2012 18:21:42 GMT
I would add that it also depends on how thoroughly you have vetted your use cases. If you have
already ironed out how ad-hoc access works, Kerberos vs Firewall and network segmentation,
how code submission works, procedures for various operational issues, backup of your data,
etc (the list is a couple hundred bullets long at minimum...) on your current cluster then
there might be little need for that support. However, if you are still hoping to figure that stuff
out, then you could potentially be in a world of hurt when you attempt the transition
with just your own staff. It also helps to have that outside advice in certain situations
to resolve cross-department conflicts over how the cluster will be implemented.


-----Original Message-----
From: Mike Lyon [mailto:mike.lyon@gmail.com] 
Sent: Thursday, February 23, 2012 2:33 PM
To: common-user@hadoop.apache.org
Subject: Re: Experience with Hadoop in production

Just be sure you have that corporate card available 24x7 when you need
to call support ;)

Sent from my iPhone

On Feb 23, 2012, at 10:30, Serge Blazhievsky
<Serge.Blazhiyevskyy@nice.com> wrote:

> What I have seen companies often do is use the free version of
> a commercial vendor's distribution and only buy their support if there are major
> problems that they cannot solve on their own.
> That way you will get free distribution and insurance that you have
> support if something goes wrong.
> Serge
> On 2/23/12 10:42 AM, "Jamack, Peter" <PJamack@consilium1.com> wrote:
>> A lot of it depends on your staff and their experiences.
>> Maybe they don't have Hadoop experience, but if they were involved with large
>> databases, data warehouses, etc., they can utilize their skills & experiences
>> and provide a lot of help.
>> If you have linux admins, system admins, network admins with years of
>> experience, they will be a goldmine.    At the other end, database
>> developers who know SQL, programmers who know Java, and so on can really
>> help staff up your 'big data' team. Having a few people who know ETL would
>> be great too.
>> The biggest problem I've run into seems to be how big the Hadoop
>> project/team is or is not. Sometimes it's just an 'experimental'
>> department and therefore half the people are only 25-50 percent available
>> to help out.  And if they aren't really that knowledgeable about Hadoop,
>> it tends to be one of those not-enough-time-in-the-day scenarios.  And
>> the few people dedicated to the Hadoop project(s) will get the brunt of
>> the work.
>> It's like any ecosystem.  To do it right, you might need system/network
>> admins, a storage person to actually know how to set up the proper storage
>> architecture, maybe a security expert,  a few programmers, and a few data
>> people.   If you're combining analytics, that's another group.  Of course
>> most companies outside the Googles and Facebooks of the world will have only a
>> few people dedicated to Hadoop.  Which means you need somebody who knows
>> storage, knows networking, knows Linux, knows how to be a system admin,
>> knows security, and maybe other things (e.g., if you have a firewall issue,
>> somebody needs to figure out ways to make it work through or around), and
>> then you need some programmers who either know MapReduce or can pretty much
>> figure it out because they've done Java for years.
>> Peter J
>> On 2/23/12 10:17 AM, "Pavel Frolov" <pfrolov@gmail.com> wrote:
>>> Hi,
>>> We are going into 24x7 production soon and we are considering whether we
>>> need vendor support or not.  We use a free vendor distribution of Cluster
>>> Provisioning + Hadoop + HBase and looked at their Enterprise version, but it
>>> is very expensive for the value it provides (additional functionality +
>>> support), given that we've already ironed out many of our performance and
>>> tuning issues on our own and with generous help from the community (e.g.
>>> all of you).
>>> So, I wanted to run it through the community to see if anybody can share
>>> their experience of running a Hadoop cluster (50+ nodes with Apache
>>> releases or Vendor distributions) in production, with in-house support
>>> only, and how difficult it was.  How many people were involved, etc.
>>> Regards,
>>> Pavel
