falcon-dev mailing list archives

From Venkat Ranganathan <vranganat...@hortonworks.com>
Subject Meetup on Data Management and Data Movement
Date Mon, 14 Mar 2016 22:43:36 GMT
This meetup will include Falcon and Sqoop content, with user stories on using both
alongside other components. Please see the details below.

==
Hadoop Data Management and Data Movement

Thursday, March 24, 2016
6:30 PM to 9:30 PM

Hortonworks HQ
5470 Great America Parkway, Santa Clara, CA (map<https://maps.google.com/maps?f=q&hl=en&q=5470+Great+America+Parkway%2C+Santa+Clara%2C+CA%2C+us>)

Join us to get the latest scoop on Sqoop and Falcon, and to learn how customers approach
Hadoop data management in their use cases.


For many enterprises, getting data into a data lake can be a big challenge.  Part of that
challenge is having enterprise-grade governance over who is loading or exporting the data
and what they are doing with it.


Apache Falcon<http://falcon.apache.org/> allows an enterprise to process a single massive
dataset stored in HDFS in multiple ways—for batch, interactive and streaming applications.
With more data and more users of that data, Apache Falcon’s data governance capabilities
play a critical role in managing data pipelines at scale. As the value of Hadoop data increases,
so does the importance of cleaning that data, preparing it for business intelligence tools,
and removing it from the cluster when it outlives its useful life.


The Falcon framework can also leverage other Hadoop components, such as Pig, HDFS, and Oozie<http://oozie.apache.org/>.
Falcon simplifies pipeline management by providing a framework to define, deploy, and
manage data pipelines.


RDBMS data is another primary data source for the data lake.  Apache Sqoop<http://sqoop.apache.org/>
is an open source tool to move structured data from an RDBMS to HDFS.


Come to this meetup to learn how customers are managing their data pipelines, hear about
the current state of Falcon and its roadmap, and find out what’s coming with Sqoop.


Agenda

6:30-7:00 Doors Open: Registration, Welcome & Networking

7:00-7:20 Hadoop data management use case

7:20-8:00 Apache Falcon data management features in 0.9 and demo

8:00-8:20 Talk on Apache Falcon futures

8:20-8:40 Apache Sqoop 2 (tentative)

8:40 pm Review, Close and Thank you for Attending

==