hadoop-common-issues mailing list archives

From "Duo Xu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-11685) StorageException complaining " no lease ID" during HBase distributed log splitting
Date Mon, 09 Mar 2015 17:50:41 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-11685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Xu updated HADOOP-11685:
----------------------------
    Description: 
This is similar to HADOOP-11523, but in a different place. During HBase distributed log splitting,
multiple threads access the same folder, "recovered.edits". However, many places in our WASB
code did not acquire a lease and simply passed null to Azure storage, which caused this issue.

{code}
2015-02-26 03:21:28,871 WARN org.apache.hadoop.hbase.regionserver.SplitLogWorker: log splitting of WALs/workernode4.hbaseproddm2001.g6.internal.cloudapp.net,60020,1422071058425-splitting/workernode4.hbaseproddm2001.g6.internal.cloudapp.net%2C60020%2C1422071058425.1424914216773 failed, returning error
java.io.IOException: org.apache.hadoop.fs.azure.AzureException: java.io.IOException
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.checkForErrors(HLogSplitter.java:633)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.access$000(HLogSplitter.java:121)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$OutputSink.finishWriting(HLogSplitter.java:964)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.finishWritingAndClose(HLogSplitter.java:1019)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:359)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:223)
	at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:142)
	at org.apache.hadoop.hbase.regionserver.handler.HLogSplitterHandler.process(HLogSplitterHandler.java:79)
	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.fs.azure.AzureException: java.io.IOException
	at org.apache.hadoop.fs.azurenative.AzureNativeFileSystemStore.storeEmptyFolder(AzureNativeFileSystemStore.java:1477)
	at org.apache.hadoop.fs.azurenative.NativeAzureFileSystem.mkdirs(NativeAzureFileSystem.java:1862)
	at org.apache.hadoop.fs.azurenative.NativeAzureFileSystem.mkdirs(NativeAzureFileSystem.java:1812)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1815)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getRegionSplitEditsPath(HLogSplitter.java:502)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.createWAP(HLogSplitter.java:1211)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.getWriterAndPath(HLogSplitter.java:1200)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1243)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:851)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:843)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:813)
Caused by: java.io.IOException
	at com.microsoft.windowsazure.storage.core.Utility.initIOException(Utility.java:493)
	at com.microsoft.windowsazure.storage.blob.BlobOutputStream.close(BlobOutputStream.java:282)
	at org.apache.hadoop.fs.azurenative.AzureNativeFileSystemStore.storeEmptyFolder(AzureNativeFileSystemStore.java:1472)
	... 10 more
Caused by: com.microsoft.windowsazure.storage.StorageException: There is currently a lease on the blob and no lease ID was specified in the request.
	at com.microsoft.windowsazure.storage.StorageException.translateException(StorageException.java:163)
	at com.microsoft.windowsazure.storage.core.StorageRequest.materializeException(StorageRequest.java:306)
	at com.microsoft.windowsazure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:229)
	at com.microsoft.windowsazure.storage.blob.CloudBlockBlob.commitBlockList(CloudBlockBlob.java:248)
	at com.microsoft.windowsazure.storage.blob.BlobOutputStream.commit(BlobOutputStream.java:319)
	at com.microsoft.windowsazure.storage.blob.BlobOutputStream.close(BlobOutputStream.java:279)
	... 11 more
{code}

  was:
This is similar to HADOOP-11523, but in a different place. During HBase distributed log splitting,
multiple threads access the same folder, "recovered.edits". However, many places in our WASB
code did not acquire a lease and simply passed null to Azure storage, which caused this issue.

{code}
2015-02-26 03:21:28,871 WARN org.apache.hadoop.hbase.regionserver.SplitLogWorker: log splitting of WALs/workernode4.hbaseproddm2001.g6.internal.cloudapp.net,60020,1422071058425-splitting/workernode4.hbaseproddm2001.g6.internal.cloudapp.net%2C60020%2C1422071058425.1424914216773 failed, returning error
java.io.IOException: org.apache.hadoop.fs.azure.AzureException: java.io.IOException
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.checkForErrors(HLogSplitter.java:633)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.access$000(HLogSplitter.java:121)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$OutputSink.finishWriting(HLogSplitter.java:964)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.finishWritingAndClose(HLogSplitter.java:1019)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:359)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:223)
	at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:142)
	at org.apache.hadoop.hbase.regionserver.handler.HLogSplitterHandler.process(HLogSplitterHandler.java:79)
	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.fs.azure.AzureException: java.io.IOException
	at org.apache.hadoop.fs.azurenative.AzureNativeFileSystemStore.storeEmptyFolder(AzureNativeFileSystemStore.java:1477)
	at org.apache.hadoop.fs.azurenative.NativeAzureFileSystem.mkdirs(NativeAzureFileSystem.java:1862)
	at org.apache.hadoop.fs.azurenative.NativeAzureFileSystem.mkdirs(NativeAzureFileSystem.java:1812)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1815)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getRegionSplitEditsPath(HLogSplitter.java:502)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.createWAP(HLogSplitter.java:1211)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.getWriterAndPath(HLogSplitter.java:1200)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1243)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:851)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:843)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:813)
Caused by: java.io.IOException
	at com.microsoft.windowsazure.storage.core.Utility.initIOException(Utility.java:493)
	at com.microsoft.windowsazure.storage.blob.BlobOutputStream.close(BlobOutputStream.java:282)
	at org.apache.hadoop.fs.azurenative.AzureNativeFileSystemStore.storeEmptyFolder(AzureNativeFileSystemStore.java:1472)
	... 10 more
Caused by: com.microsoft.windowsazure.storage.StorageException: There is currently a lease on the blob and no lease ID was specified in the request.
	at com.microsoft.windowsazure.storage.StorageException.translateException(StorageException.java:163)
	at com.microsoft.windowsazure.storage.core.StorageRequest.materializeException(StorageRequest.java:306)
	at com.microsoft.windowsazure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:229)
	at com.microsoft.windowsazure.storage.blob.CloudBlockBlob.commitBlockList(CloudBlockBlob.java:248)
	at com.microsoft.windowsazure.storage.blob.BlobOutputStream.commit(BlobOutputStream.java:319)
	at com.microsoft.windowsazure.storage.blob.BlobOutputStream.close(BlobOutputStream.java:279)
	... 11 more
{code}

The fix is simple: acquire a lease before the mkdir operation. However, this might hurt
performance slightly, so I will run a perf test locally before submitting the patch.
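
To make the failure and the proposed fix concrete, here is a minimal sketch. It does not use the Azure storage SDK or the WASB code. LeasedBlob is a hypothetical in-memory stand-in for a blob that supports leases, and the method names only loosely mirror storeEmptyFolder from the stack trace. It assumes the store rejects any write that omits the lease ID while a lease is active, which is what the StorageException above reports.

```java
import java.util.UUID;

/** Hypothetical in-memory stand-in for a leasable blob. Not the Azure SDK. */
class LeasedBlob {
    private String activeLeaseId;   // null when nobody holds a lease
    private boolean exists;

    /** Take the lease, or fail if another caller already holds it. */
    synchronized String acquireLease() {
        if (activeLeaseId != null) {
            throw new IllegalStateException("a lease is already held");
        }
        activeLeaseId = UUID.randomUUID().toString();
        return activeLeaseId;
    }

    /** Give the lease back; only the current holder may do so. */
    synchronized void releaseLease(String leaseId) {
        if (activeLeaseId == null || !activeLeaseId.equals(leaseId)) {
            throw new IllegalStateException("lease ID does not match");
        }
        activeLeaseId = null;
    }

    /** Write the folder marker. leaseId may be null, as in the failing code path. */
    synchronized void storeEmptyFolder(String leaseId) {
        if (activeLeaseId != null && leaseId == null) {
            throw new IllegalStateException(
                "There is currently a lease on the blob and no lease ID was specified in the request.");
        }
        if (activeLeaseId != null && !activeLeaseId.equals(leaseId)) {
            throw new IllegalStateException("lease ID does not match");
        }
        exists = true;
    }

    synchronized boolean exists() {
        return exists;
    }
}

class LeaseSketch {
    /** Failing pattern: write the folder marker and pass null as the lease ID. */
    static boolean mkdirWithoutLease(LeasedBlob folder) {
        try {
            folder.storeEmptyFolder(null);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    /** Proposed pattern: acquire the lease, write with its ID, always release. */
    static boolean mkdirWithLease(LeasedBlob folder) {
        String leaseId = folder.acquireLease();
        try {
            folder.storeEmptyFolder(leaseId);
            return true;
        } finally {
            folder.releaseLease(leaseId);
        }
    }

    public static void main(String[] args) {
        LeasedBlob recoveredEdits = new LeasedBlob();
        String heldByOther = recoveredEdits.acquireLease(); // another writer holds the lease
        System.out.println("without lease: " + mkdirWithoutLease(recoveredEdits));
        recoveredEdits.releaseLease(heldByOther);
        System.out.println("with lease: " + mkdirWithLease(recoveredEdits));
    }
}
```

In this toy model a second writer that calls acquireLease while the lease is held fails immediately. A real patch would need to decide whether to wait and retry in that case, and that extra round trip is where the performance cost mentioned above would come from.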


> StorageException complaining " no lease ID" during HBase distributed log splitting
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-11685
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11685
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: tools
>            Reporter: Duo Xu
>            Assignee: Duo Xu


