hadoop-hive-dev mailing list archives

From "Vaibhav Aggarwal (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1620) Patch to write directly to S3 from Hive
Date Wed, 08 Sep 2010 18:25:32 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907349#action_12907349 ]

Vaibhav Aggarwal commented on HIVE-1620:
----------------------------------------

This is why the patch uses the task id instead of the attempt id when writing to S3.
Every attempt of a task writes to the same file; on S3, the last attempt to commit the file wins.

Hadoop tasks are supposed to be idempotent, so this should work.
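The naming scheme described above can be sketched as follows. This is a minimal illustration, not code from the attached patch: it assumes Hadoop's standard attempt-id string format ("attempt_<jobid>_<type>_<task>_<attempt>") and simply strips the trailing attempt number, so that every attempt of the same task resolves to the same S3 key.

```java
// Hypothetical sketch (not from HIVE-1620.patch): derive a task-scoped
// output name from a Hadoop attempt id, so speculative or retried attempts
// of the same task all write to one S3 key and the last commit wins.
public class TaskOutputName {

    // e.g. "attempt_201009081200_0001_m_000003_2"
    //   -> "task_201009081200_0001_m_000003"
    static String taskScopedName(String attemptId) {
        // Drop the trailing "_<attempt number>" ...
        int lastUnderscore = attemptId.lastIndexOf('_');
        String withoutAttempt = attemptId.substring(0, lastUnderscore);
        // ... and swap the "attempt_" prefix for "task_".
        return withoutAttempt.replaceFirst("^attempt_", "task_");
    }

    public static void main(String[] args) {
        // Two attempts of the same task map to the same output name.
        System.out.println(taskScopedName("attempt_201009081200_0001_m_000003_0"));
        System.out.println(taskScopedName("attempt_201009081200_0001_m_000003_2"));
    }
}
```

Because both attempts resolve to the same key, correctness relies on the idempotence noted above: whichever attempt commits last overwrites the other with identical content.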

> Patch to write directly to S3 from Hive
> ---------------------------------------
>
>                 Key: HIVE-1620
>                 URL: https://issues.apache.org/jira/browse/HIVE-1620
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Vaibhav Aggarwal
>            Assignee: Vaibhav Aggarwal
>         Attachments: HIVE-1620.patch
>
>
> We want to submit a patch to Hive which allows users to write files directly to S3.
> This patch allows users to specify an S3 location as the table output location, eliminating
the need to copy data from HDFS to S3.
> Users can run Hive queries directly over data stored in S3.
> This patch helps integrate Hive with S3 more tightly and quickly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

