hive-issues mailing list archives

From "Bing Li (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-13850) File name conflict when have multiple INSERT INTO queries running in parallel
Date Thu, 26 May 2016 14:41:12 GMT

     [ https://issues.apache.org/jira/browse/HIVE-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-13850:
---------------------------
    Description: 
We have an application which connects to HiveServer2 via JDBC.
The application executes "INSERT INTO" queries against the same table.

If many users run the application at the same time, some of the INSERTs can fail.

The root cause is in Hive.checkPaths(), which uses the following loop to check whether the
destination file already exists. When multiple inserts run in parallel, this leads to a conflict,
since two queries can settle on the same "_copy_N" name before either has moved its file.

for (int counter = 1; fs.exists(itemDest) || destExists(result, itemDest); counter++) {
  itemDest = new Path(destf, name + ("_copy_" + counter) + filetype);
}
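
One way to avoid this kind of check-then-act conflict is to let the file system itself pick the winner for each name. The sketch below is only an illustration under stated assumptions: it uses java.nio.file on a local file system as a stand-in for HDFS, and the class and method names (CopyNameClaim, claimDestination, runParallelInserts) are hypothetical. It is not the Hive code and not the patch for this issue.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

// Hypothetical illustration; local files stand in for HDFS paths.
class CopyNameClaim {

    /** Reserves name + filetype, or the first free name_copy_N + filetype, in destDir. */
    static Path claimDestination(Path destDir, String name, String filetype) {
        Path dest = destDir.resolve(name + filetype);
        for (int counter = 1; ; counter++) {
            try {
                // createFile checks and creates in one atomic step,
                // so exactly one caller can win each candidate name.
                return Files.createFile(dest);
            } catch (FileAlreadyExistsException e) {
                dest = destDir.resolve(name + "_copy_" + counter + filetype);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Simulates parallel inserts; returns the number of files in the table directory. */
    static long runParallelInserts(int writers) {
        ExecutorService pool = Executors.newFixedThreadPool(writers);
        try {
            Path tableDir = Files.createTempDirectory("scalding_stats");
            List<Future<Path>> moves = new ArrayList<>();
            for (int i = 0; i < writers; i++) {
                Path staged = Files.createTempFile("staging", "_000000_0");
                moves.add(pool.submit(() -> {
                    Path dest = claimDestination(tableDir, "000000_0", "");
                    // Overwriting is safe here: dest is the empty placeholder
                    // that this writer has just reserved for itself.
                    return Files.move(staged, dest, StandardCopyOption.REPLACE_EXISTING);
                }));
            }
            for (Future<Path> move : moves) {
                move.get(); // rethrows any failure from a writer
            }
            try (Stream<Path> files = Files.list(tableDir)) {
                return files.count();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runParallelInserts(16) + " files written by 16 writers");
    }
}
```

Because each name is reserved before the move, no two writers ever target the same "_copy_N" file, and no writer sees a "destination exists" failure. Whether the same reserve-then-move approach is appropriate on HDFS is an open question that this sketch does not answer.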


The Error Message
===========================
In the Hive log:
org.apache.hadoop.hive.ql.metadata.HiveException: copyFiles: error while moving files!!! Cannot move hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-10000/000000_0 to hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/000000_0_copy_9014
        at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2719)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1645)

In the Hadoop log:
WARN  hdfs.StateChange (FSDirRenameOp.java:unprotectedRenameTo(174)) - DIR* FSDirectory.unprotectedRenameTo: failed to rename /apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-10000/000000_0 to /apps/hive/warehouse/metadata.db/scalding_stats/000000_0_copy_9014 because destination exists

  was:
We have an application which connects to HiveServer2 via JDBC.
The application executes "INSERT INTO" queries against the same table.

If many users run the application at the same time, some of the INSERTs can fail.

In the Hive log:
org.apache.hadoop.hive.ql.metadata.HiveException: copyFiles: error while moving files!!! Cannot move hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-10000/000000_0 to hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/000000_0_copy_9014
        at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2719)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1645)

In the Hadoop log:
WARN  hdfs.StateChange (FSDirRenameOp.java:unprotectedRenameTo(174)) - DIR* FSDirectory.unprotectedRenameTo: failed to rename /apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-10000/000000_0 to /apps/hive/warehouse/metadata.db/scalding_stats/000000_0_copy_9014 because destination exists


> File name conflict when have multiple INSERT INTO queries running in parallel
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-13850
>                 URL: https://issues.apache.org/jira/browse/HIVE-13850
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.1
>            Reporter: Bing Li
>            Assignee: Bing Li
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
