hive-issues mailing list archives

From Sergio Peña (JIRA) <>
Subject [jira] [Commented] (HIVE-14270) Write temporary data to HDFS when doing inserts on tables located on S3
Date Wed, 27 Jul 2016 21:07:21 GMT


Sergio Peña commented on HIVE-14270:

Another concern with a configuration variable is that Hive won't be able to guarantee the
quality of new values set by users, since there would be no tests to verify them. We will
provide S3 tests for now, but if something fails with Azure, for instance, then users will
complain about it. I think it would be better to define a list of supported blob store
schemes for now that are verified by tests and committers. What do you think?
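To illustrate the idea of a fixed list of supported schemes, a hive-site.xml fragment could
look something like the following. This is only a sketch; the property name
hive.blobstore.supported.schemes and its default value are assumptions for illustration, not
something confirmed in this thread:

```xml
<!-- Hypothetical configuration: restrict blob-store handling to schemes
     that are covered by integration tests (illustrative property name). -->
<property>
  <name>hive.blobstore.supported.schemes</name>
  <value>s3,s3a,s3n</value>
  <description>
    Comma-separated list of URI schemes that Hive treats as blob stores.
    Only schemes on this list get blob-store-specific behavior, so adding
    an untested scheme (e.g. an Azure one) is an explicit user choice.
  </description>
</property>
```

With an explicit whitelist, behavior for any scheme outside the list stays unchanged, which keeps untested filesystems from silently taking the blob-store code path.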

Another thing: I am thinking of adding a set of S3 tests, like Hadoop has, in a separate
task. This testing is taking me a little more time, and S3 is already supported in Hive
without such an S3 integration test. Also, this patch will unblock the other subtasks of the
S3 umbrella JIRA. Does this sound reasonable?

> Write temporary data to HDFS when doing inserts on tables located on S3
> -----------------------------------------------------------------------
>                 Key: HIVE-14270
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>         Attachments: HIVE-14270.1.patch
> Currently, when doing INSERT statements on tables located on S3, Hive writes and reads
> temporary (or intermediate) files to S3 as well.
> If HDFS is still the default filesystem in Hive, then we can keep such temporary files
> on HDFS to make things run faster.
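A sketch of how the proposed behavior might be exposed as configuration. The property name
hive.blobstore.use.blobstore.as.scratchdir below is an assumption chosen for illustration;
the thread does not specify the final name:

```xml
<!-- Hypothetical configuration: when false, Hive keeps temporary/scratch
     data for blob-store tables on the default filesystem (e.g. HDFS)
     instead of writing intermediate files to S3 (illustrative name). -->
<property>
  <name>hive.blobstore.use.blobstore.as.scratchdir</name>
  <value>false</value>
  <description>
    If false, INSERT statements on blob-store tables write intermediate
    data to the default filesystem and only the final results to S3,
    avoiding slow rename/read cycles against the blob store.
  </description>
</property>
```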

This message was sent by Atlassian JIRA
