accumulo-notifications mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1998) Encrypted WALogs seem to be excessively buffering
Date Wed, 08 Jan 2014 20:58:51 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865878#comment-13865878 ]

ASF subversion and git services commented on ACCUMULO-1998:
-----------------------------------------------------------

Commit 443cba7a7a3838b547880f1c49f2a9e0128692cd in branch refs/heads/master from [~vines]
[ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=443cba7 ]

ACCUMULO-1998

All encrypted walog events are now individually blocked on disk. This adds a
maxBlockSize parameter (mostly to handle OOM from mismatched crypto). Because
of this behavior, as well as PKCS5 behavior, I have turned off all padding on the default
crypto configs; padding should not be used, as it can cause data loss in walogs. I have
hammered 5 instances on and off every minute for 22 hours and counting with no related issues,
so I deem it a fix.
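
For readers unfamiliar with the approach, here is a minimal sketch of what "individually blocked" events with a maxBlockSize guard could look like. It is not the code from the commit above. The class name, method names, block layout (length, then IV, then ciphertext), and the AES/CTR/NoPadding transformation are all assumptions chosen for illustration. Each event is finalized with doFinal() and flushed as its own block, so nothing is left in a cipher buffer, and the reader rejects an implausible length before allocating memory.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

// Hypothetical illustration only; this is NOT Accumulo's walog code.
public class BlockedCryptoSketch {
  // Assumed transformation: a streaming mode with no padding.
  static final String TRANSFORM = "AES/CTR/NoPadding";
  static final int IV_LEN = 16;

  // Encrypt one event and write it as a self-contained block:
  // [int ciphertext length][16-byte IV][ciphertext].
  public static void writeBlock(DataOutputStream out, SecretKeySpec key, byte[] event)
      throws IOException, GeneralSecurityException {
    byte[] iv = new byte[IV_LEN];
    new SecureRandom().nextBytes(iv); // fresh IV for every block
    Cipher cipher = Cipher.getInstance(TRANSFORM);
    cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
    byte[] ciphertext = cipher.doFinal(event); // finalize now, nothing stays buffered
    out.writeInt(ciphertext.length);
    out.write(iv);
    out.write(ciphertext);
    out.flush();
  }

  // Read one block, refusing lengths outside [0, maxBlockSize] so that a
  // garbage length field cannot trigger a huge allocation.
  public static byte[] readBlock(DataInputStream in, SecretKeySpec key, int maxBlockSize)
      throws IOException, GeneralSecurityException {
    int length = in.readInt();
    if (length < 0 || length > maxBlockSize) {
      throw new IOException("Block length " + length + " is outside [0, " + maxBlockSize + "]");
    }
    byte[] iv = new byte[IV_LEN];
    in.readFully(iv);
    byte[] ciphertext = new byte[length];
    in.readFully(ciphertext);
    Cipher cipher = Cipher.getInstance(TRANSFORM);
    cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
    return cipher.doFinal(ciphertext);
  }

  public static void main(String[] args) throws Exception {
    SecretKeySpec key = new SecretKeySpec(new byte[16], "AES"); // all-zero demo key
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(sink);
    writeBlock(out, key, "mutation-1".getBytes("UTF-8"));
    writeBlock(out, key, "mutation-2".getBytes("UTF-8"));

    DataInputStream in = new DataInputStream(new ByteArrayInputStream(sink.toByteArray()));
    System.out.println(new String(readBlock(in, key, 1 << 20), "UTF-8"));
    System.out.println(new String(readBlock(in, key, 1 << 20), "UTF-8"));
  }
}
```

In this sketch the length field is written in the clear, which may differ from the real on-disk format. The point is only that a bounded, per-event block keeps both the writer's buffering and the reader's allocation predictable.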


> Encrypted WALogs seem to be excessively buffering
> -------------------------------------------------
>
>                 Key: ACCUMULO-1998
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1998
>             Project: Accumulo
>          Issue Type: Bug
>            Reporter: Michael Allen
>            Assignee: John Vines
>            Priority: Blocker
>             Fix For: 1.6.0
>
>         Attachments: 0001-ACCUMULO-1998-Working-around-the-cipher-s-buffer-by-.patch,
>                      0001-ACCUMULO-1998-forcing-Buffered-crypto-stream-to-flus.patch,
>                      0001-ACCUMULO-1998.patch,
>                      0002-ACCUMULO-1998-forcing-Buffered-crypto-stream-to-flus.patch,
>                      0002-ACCUMULO-1998.patch,
>                      0003-ACCUMULO-1998-forcing-Buffered-crypto-stream-to-flus.patch,
>                      0004-ACCUMULO-1998-forcing-Buffered-crypto-stream-to-flus.patch
>
>
> The reproduction steps around this are a little bit fuzzy but basically we ran a moderate
> workload against a 1.6.0 server.  Encryption happened to be turned on but that doesn't seem
> to be germane to the problem.  After doing a moderate amount of work, Accumulo is refusing
> to start up, spewing this error over and over to the log:
> {noformat}
> 2013-12-10 10:23:02,529 [tserver.TabletServer] WARN : exception while doing multi-scan
> java.lang.RuntimeException: java.io.IOException: Failed to open hdfs://10.10.1.115:9000/accumulo/tables/!0/table_info/A000042x.rf
> 	at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:1125)
> 	at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> 	at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
> 	at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
> 	at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: Failed to open hdfs://10.10.1.115:9000/accumulo/tables/!0/table_info/A000042x.rf
> 	at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:333)
> 	at org.apache.accumulo.tserver.FileManager.access$500(FileManager.java:58)
> 	at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:478)
> 	at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFileRefs(FileManager.java:466)
> 	at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:486)
> 	at org.apache.accumulo.tserver.Tablet$ScanDataSource.createIterator(Tablet.java:2027)
> 	at org.apache.accumulo.tserver.Tablet$ScanDataSource.iterator(Tablet.java:1989)
> 	at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:163)
> 	at org.apache.accumulo.tserver.Tablet.lookup(Tablet.java:1565)
> 	at org.apache.accumulo.tserver.Tablet.lookup(Tablet.java:1672)
> 	at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:1114)
> 	... 6 more
> Caused by: java.io.FileNotFoundException: File does not exist: /accumulo/tables/!0/table_info/A000042x.rf
> 	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchLocatedBlocks(DFSClient.java:2006)
> 	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1975)
> 	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1967)
> 	at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:735)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:165)
> 	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
> 	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBCFile(CachableBlockFile.java:256)
> 	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.access$000(CachableBlockFile.java:143)
> 	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$MetaBlockLoader.get(CachableBlockFile.java:212)
> 	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBlock(CachableBlockFile.java:313)
> 	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:367)
> 	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:143)
> 	at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:825)
> 	at org.apache.accumulo.core.file.rfile.RFileOperations.openReader(RFileOperations.java:79)
> 	at org.apache.accumulo.core.file.DispatchingFileFactory.openReader(FileOperations.java:119)
> 	at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:314)
> 	... 16 more
> {noformat}
> Here are some other pieces of context:
> HDFS contents:
> {noformat}
> ubuntu@ip-10-10-1-115:/data0/logs/accumulo$ hadoop fs -lsr /accumulo/tables/
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 00:32 /accumulo/tables/!0
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 01:06 /accumulo/tables/!0/default_tablet
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 10:49 /accumulo/tables/!0/table_info
> -rw-r--r--   5 accumulo hadoop       1698 2013-12-10 00:34 /accumulo/tables/!0/table_info/F0000000.rf
> -rw-r--r--   5 accumulo hadoop      43524 2013-12-10 01:53 /accumulo/tables/!0/table_info/F000062q.rf
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 00:32 /accumulo/tables/+r
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 10:45 /accumulo/tables/+r/root_tablet
> -rw-r--r--   5 accumulo hadoop       2070 2013-12-10 10:45 /accumulo/tables/+r/root_tablet/A0000738.rf
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 00:33 /accumulo/tables/1
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 00:33 /accumulo/tables/1/default_tablet
> {noformat}
> ZooKeeper entries
> {noformat}
> [zk: localhost:2181(CONNECTED) 6] get /accumulo/371cfa3e-fe96-4a50-92e9-da7572589ffa/root_tablet/dir
> hdfs://10.10.1.115:9000/accumulo/tables/+r/root_tablet
> cZxid = 0x1b
> ctime = Tue Dec 10 00:32:56 EST 2013
> mZxid = 0x1b
> mtime = Tue Dec 10 00:32:56 EST 2013
> pZxid = 0x1b
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 54
> numChildren = 0
> {noformat}
> I'm going to preserve the state of this machine in HDFS for a while but not forever,
> so if there are other pieces of context people need, let me know.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
