hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-20517) Creation of staging directory and Move operation is taking time in S3
Date Tue, 11 Sep 2018 14:47:01 GMT

    [ https://issues.apache.org/jira/browse/HIVE-20517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610735#comment-16610735 ]

Hive QA commented on HIVE-20517:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12939204/HIVE-20517.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14939 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/13715/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13715/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13715/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12939204 - PreCommit-HIVE-Build

> Creation of staging directory and Move operation is taking time in S3
> ---------------------------------------------------------------------
>
>                 Key: HIVE-20517
>                 URL: https://issues.apache.org/jira/browse/HIVE-20517
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>    Affects Versions: 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-20517.01.patch, HIVE-20517.02.patch
>
>
> Operations such as insert and add partition create a staging directory, generate the files
> there, and then move them to the actual location. In the replication flow, the files are
> first copied to the staging directory and then moved (renamed) to the actual table location.
> On S3, a move is not an atomic operation: internally it is a copy followed by a delete, so it
> cannot guarantee the required consistency. It is therefore better to copy the files directly
> to the actual location. This avoids both the staging directory creation (which takes
> 1-2 seconds on S3) and the move (which takes time proportional to file size).
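
The two flows described above can be contrasted with a minimal sketch. This is not the code from the attached patch and does not use Hive or Hadoop APIs: local java.nio paths stand in for S3 locations, and the class name, method names, and staging-directory prefix are all hypothetical. It only illustrates that the staged flow needs a directory creation, a copy, and a move, while the direct flow needs a single copy.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative only: local paths stand in for S3 locations; all names are hypothetical.
final class DirectCopySketch {

    // Staged flow: create a staging directory, copy into it, then move to the table location.
    // On S3 the move is itself a copy followed by a delete, so the data is written twice.
    static Path copyViaStaging(Path source, Path tableDir) throws IOException {
        Path staging = Files.createTempDirectory(tableDir, ".staging_");
        Path staged = staging.resolve(source.getFileName());
        Files.copy(source, staged);
        Path target = tableDir.resolve(source.getFileName());
        Files.move(staged, target, StandardCopyOption.REPLACE_EXISTING);
        Files.delete(staging); // remove the now-empty staging directory
        return target;
    }

    // Direct flow: one copy straight to the final location, no staging directory and no move.
    static Path copyDirect(Path source, Path tableDir) throws IOException {
        Path target = tableDir.resolve(source.getFileName());
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Both methods leave the same file in the table directory. On a local filesystem the difference is negligible, but on an object store where a rename is a copy plus a delete, the staged variant pays for the extra directory creation and a second full copy of the data.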



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
