falcon-dev mailing list archives

From "Suhas Vasu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-1096) Scan Hive Metastore to automatically create Falcon feeds for existing Hive tables
Date Mon, 16 Mar 2015 09:08:38 GMT

    [ https://issues.apache.org/jira/browse/FALCON-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362952#comment-14362952 ]

Suhas Vasu commented on FALCON-1096:
------------------------------------

As per FALCON-703, we are introducing a plugin that listens to JMS messages and registers
the partitions on the HCatalog table.

I would prefer a separate plugin that serves this purpose, so that it can be enabled only by users
who really need the feature.


> Scan Hive Metastore to automatically create Falcon feeds for existing Hive tables
> ---------------------------------------------------------------------------------
>
>                 Key: FALCON-1096
>                 URL: https://issues.apache.org/jira/browse/FALCON-1096
>             Project: Falcon
>          Issue Type: New Feature
>            Reporter: Adam Kawa
>
> In my organisation we create a Hive table for each production dataset in HDFS. When creating
> a Hive table, you supply a lot of information about your dataset: its name, its fields with their
> types and comments, the location, the data format, properties in the form of key-value pairs,
> and a meaningful description of the dataset. We think of Hive as a central and nicely documented
> repository of our datasets.
> When using Falcon, we again need to create a Falcon feed for each dataset (one that corresponds
> to a Hive table) and even specify multiple redundant properties (e.g. description).
> To make it simpler, Falcon could scan the Hive Metastore, automatically create a feed
> for each Hive table, and inherit its properties.
> The properties of Hive tables could also be used when searching for a dataset in the new
> Falcon Web UI, e.g. field name, field comment, file format (other statistics such as total
> file size and last modification or access time could also be used).
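
The idea in the issue description, deriving one feed per Hive table and inheriting its properties, might look roughly like the sketch below. Everything in it is an assumption made for illustration: HiveTable and feed_from_table are hypothetical names, the table list is hard-coded rather than read from a real metastore, and the resulting dictionary is not the Falcon feed schema.

```python
from dataclasses import dataclass, field


@dataclass
class HiveTable:
    """Hypothetical stand-in for the metadata a metastore scan would return."""
    database: str
    name: str
    location: str
    description: str = ""
    properties: dict = field(default_factory=dict)


def feed_from_table(table: HiveTable) -> dict:
    """Derive feed-like attributes from a table so they need not be retyped."""
    return {
        # Assumed naming convention: "<database>-<table>".
        "name": f"{table.database}-{table.name}",
        "description": table.description,
        "location": table.location,
        # Copy the dict so later edits to the feed do not touch the table.
        "properties": dict(table.properties),
    }


# Hard-coded sample tables; a real scan would query the Hive Metastore.
tables = [
    HiveTable("prod", "clicks", "/data/prod/clicks", "Raw click events", {"format": "orc"}),
    HiveTable("prod", "users", "/data/prod/users"),
]
feeds = [feed_from_table(t) for t in tables]
```

Keeping the mapping in one small function would also make it straightforward to place behind an optional plugin, as suggested in the comment above.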



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
