hadoop-common-user mailing list archives

From daemeon reiydelle <daeme...@gmail.com>
Subject Re: Hadoop on premise versus cloud
Date Mon, 26 Oct 2015 18:16:14 GMT
In addition to the data privacy concern described well below, there are a
couple of other areas you might consider (you can also respond privately,
where I can be a bit more candid). My experience with most banks (I work
with most of the players in the EU and US) is that (1) below drives
development heavily into the cloud, in spite of (2).

   1. Physical plant processes?
      1. Your culture and processes may add months to a hardware delivery
      cycle for production systems.
      2. Does the data center under your employer's control even have the
      available racks, power, network ports and router software versions to
      support this (bonding/stacking/teaming multiple 10 Gbit ports)?
      3. Big data gets much less interesting and viable when layers like
      SAN and heavy hypervisors (VMware, Citrix) get into the mix; ditto "full
      nightly backups" and other interesting confusions about the tech, not to
      mention being heavily committed to this and requiring full DR and zero
      data loss.
   2. Data issues
      1. Are the data sources "inside" your employer's network, which would
      require extra authorizations to allow them to connect to a cloud provider?
      2. Are the consumers of the data going to be able to access the
      cluster? (Similar questions apply if an intermediating data manipulation
      tool is accessed by your employees/consumers.)
   3. As to data privacy
      1. There are several data center providers who are legally and
      entirely based within the continental EU (Netherlands, Germany), or the
      UK if you think that is an alternative. To my knowledge Amazon is not
      yet there but will be. I do not know if it is generally available, but
      Google Compute does have such ring-fenced facilities in the EU, etc.
   4. Now the real motivations:
      1. Startup costs: while you figure out your data ingest complexity and
      your user expectations get clarified, you seldom know what you will
      need with the precision that management needs for their planning cycle.
      2. Capital (hardware) costs are zero and all costs can be written off
      in the current period. (Management has no hit to their capital expenses.)
      3. Changing (increasing ;{) costs can be directly tied to specific
      activities as they occur (customer wants more X, additional data
      sources Y, and data Z% dirtier than expected ... you know the drill, yes?)


*“Life should not be a journey to the grave with the intention of arriving
safely in a pretty and well preserved body, but rather to skid in broadside
in a cloud of smoke, thoroughly used up, totally worn out, and loudly
proclaiming ‘Wow! What a Ride!’” - Hunter Thompson

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872*

On Mon, Oct 26, 2015 at 8:01 AM, Leonard, Michael <Michael.Leonard@opco.com>
wrote:

> Hi,
> I work at a large financial institution. I’m exploring deploying Hadoop
> and I’m trying to understand why I would deploy on premise when the cloud
> is faster and easier. What are the pros/cons of each? How does pricing
> compare between on premise and cloud deployments?
> Any color would be very helpful. Thank you in advance.
> Sincerely,
> Michael
