accumulo-notifications mailing list archives

From "John Vines (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ACCUMULO-1998) Encrypted WALogs seem to be excessively buffering
Date Fri, 20 Dec 2013 00:35:07 GMT

     [ https://issues.apache.org/jira/browse/ACCUMULO-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Vines updated ACCUMULO-1998:
---------------------------------

    Attachment: 0001-ACCUMULO-1998-forcing-Buffered-crypto-stream-to-flus.patch

So this solution seems too simple. I'm worried about the repercussions of flushing before
syncing: the flush() will clear out the buffered crypto stream, but the call then keeps
propagating to the streams underneath. Depending on the underlying filesystem, it may do
nothing or it may keep flushing.
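
To make the concern concrete, here is a minimal Java sketch. It is not taken from the attached patch: java.io.BufferedOutputStream stands in for the buffered crypto stream, and FlushBeforeSync and RecordingSink are hypothetical names invented for illustration. It only shows that flush() on a buffering wrapper first drains its own buffer and then calls flush() on the stream it wraps.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class FlushBeforeSync {

  /** Hypothetical stand-in for the file system stream: records bytes and flush() calls. */
  static class RecordingSink extends OutputStream {
    final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    int flushCalls = 0;

    @Override
    public void write(int b) {
      bytes.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) {
      bytes.write(b, off, len);
    }

    @Override
    public void flush() {
      // A real file system stream may do nothing here, or may do real work.
      flushCalls++;
    }
  }

  /** Returns {bytes in sink before flush, bytes in sink after flush, downstream flush calls}. */
  static int[] demo() {
    try {
      RecordingSink sink = new RecordingSink();
      // Assumption: the buffered crypto stream behaves like a plain buffering wrapper.
      OutputStream buffered = new BufferedOutputStream(sink, 4096);

      buffered.write(new byte[100]);   // smaller than the buffer, so it stays in the wrapper
      int before = sink.bytes.size();

      buffered.flush();                // drains the buffer, then calls sink.flush()
      int after = sink.bytes.size();

      return new int[] {before, after, sink.flushCalls};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    int[] r = demo();
    System.out.println("before=" + r[0] + " after=" + r[1] + " downstreamFlushes=" + r[2]);
  }
}
```

Run as-is, this should print `before=0 after=100 downstreamFlushes=1`: nothing reaches the sink until flush(), and the same call also reaches the sink's own flush(). How expensive that downstream flush() is depends entirely on the stream beneath the wrapper.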

> Encrypted WALogs seem to be excessively buffering
> -------------------------------------------------
>
>                 Key: ACCUMULO-1998
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1998
>             Project: Accumulo
>          Issue Type: Bug
>            Reporter: Michael Allen
>            Priority: Blocker
>             Fix For: 1.6.0
>
>         Attachments: 0001-ACCUMULO-1998-forcing-Buffered-crypto-stream-to-flus.patch
>
>
> The reproduction steps around this are a little bit fuzzy but basically we ran a moderate
> workload against a 1.6.0 server.  Encryption happened to be turned on but that doesn't seem
> to be germane to the problem.  After doing a moderate amount of work, Accumulo is refusing
> to start up, spewing this error over and over to the log:
> {noformat}
> 2013-12-10 10:23:02,529 [tserver.TabletServer] WARN : exception while doing multi-scan
> java.lang.RuntimeException: java.io.IOException: Failed to open hdfs://10.10.1.115:9000/accumulo/tables/!0/table_info/A000042x.rf
> 	at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:1125)
> 	at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> 	at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
> 	at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
> 	at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: Failed to open hdfs://10.10.1.115:9000/accumulo/tables/!0/table_info/A000042x.rf
> 	at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:333)
> 	at org.apache.accumulo.tserver.FileManager.access$500(FileManager.java:58)
> 	at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:478)
> 	at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFileRefs(FileManager.java:466)
> 	at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:486)
> 	at org.apache.accumulo.tserver.Tablet$ScanDataSource.createIterator(Tablet.java:2027)
> 	at org.apache.accumulo.tserver.Tablet$ScanDataSource.iterator(Tablet.java:1989)
> 	at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:163)
> 	at org.apache.accumulo.tserver.Tablet.lookup(Tablet.java:1565)
> 	at org.apache.accumulo.tserver.Tablet.lookup(Tablet.java:1672)
> 	at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:1114)
> 	... 6 more
> Caused by: java.io.FileNotFoundException: File does not exist: /accumulo/tables/!0/table_info/A000042x.rf
> 	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchLocatedBlocks(DFSClient.java:2006)
> 	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1975)
> 	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1967)
> 	at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:735)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:165)
> 	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
> 	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBCFile(CachableBlockFile.java:256)
> 	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.access$000(CachableBlockFile.java:143)
> 	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$MetaBlockLoader.get(CachableBlockFile.java:212)
> 	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBlock(CachableBlockFile.java:313)
> 	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:367)
> 	at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:143)
> 	at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:825)
> 	at org.apache.accumulo.core.file.rfile.RFileOperations.openReader(RFileOperations.java:79)
> 	at org.apache.accumulo.core.file.DispatchingFileFactory.openReader(FileOperations.java:119)
> 	at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:314)
> 	... 16 more
> {noformat}
> Here are some other pieces of context:
> HDFS contents:
> {noformat}
> ubuntu@ip-10-10-1-115:/data0/logs/accumulo$ hadoop fs -lsr /accumulo/tables/
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 00:32 /accumulo/tables/!0
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 01:06 /accumulo/tables/!0/default_tablet
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 10:49 /accumulo/tables/!0/table_info
> -rw-r--r--   5 accumulo hadoop       1698 2013-12-10 00:34 /accumulo/tables/!0/table_info/F0000000.rf
> -rw-r--r--   5 accumulo hadoop      43524 2013-12-10 01:53 /accumulo/tables/!0/table_info/F000062q.rf
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 00:32 /accumulo/tables/+r
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 10:45 /accumulo/tables/+r/root_tablet
> -rw-r--r--   5 accumulo hadoop       2070 2013-12-10 10:45 /accumulo/tables/+r/root_tablet/A0000738.rf
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 00:33 /accumulo/tables/1
> drwxr-xr-x   - accumulo hadoop          0 2013-12-10 00:33 /accumulo/tables/1/default_tablet
> {noformat}
> ZooKeeper entries:
> {noformat}
> [zk: localhost:2181(CONNECTED) 6] get /accumulo/371cfa3e-fe96-4a50-92e9-da7572589ffa/root_tablet/dir
> hdfs://10.10.1.115:9000/accumulo/tables/+r/root_tablet
> cZxid = 0x1b
> ctime = Tue Dec 10 00:32:56 EST 2013
> mZxid = 0x1b
> mtime = Tue Dec 10 00:32:56 EST 2013
> pZxid = 0x1b
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 54
> numChildren = 0
> {noformat}
> I'm going to preserve the state of this machine in HDFS for a while but not forever,
> so if there are other pieces of context people need, let me know.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
