accumulo-notifications mailing list archives

From "Sean Busbey (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-1867) Split failed during conditional randomwalk test
Date Fri, 08 Nov 2013 19:06:17 GMT


Sean Busbey commented on ACCUMULO-1867:

the "Wrong FS" error happens when an HDFS client writing to an HA setup is misconfigured and
relies on absolute paths. the paths should carry the nameservice, not a particular NameNode.
Otherwise, whenever there is a failover you'll get this kind of error for whatever files have
the other NameNode in their path.

I don't know why HA HDFS lets you successfully access paths that name the currently active
NameNode, since it just means people catch this kind of error later. :(
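The mismatch behind the "Wrong FS" message can be sketched with plain `java.net.URI`. This is a simplified illustration of the spirit of `org.apache.hadoop.fs.FileSystem.checkPath`, not its actual code: a FileSystem instance bound to one authority rejects absolute paths whose authority differs, which is exactly why NameNode-specific paths break after a failover while nameservice paths do not.

```java
import java.net.URI;

public class WrongFsCheck {
    // Sketch of the authority check (assumption: Hadoop's real checkPath
    // is more involved, but the failure mode shown here is the same).
    static void checkPath(URI fsUri, URI path) {
        String expected = fsUri.getAuthority();
        String actual = path.getAuthority();
        if (actual != null && !actual.equals(expected)) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        // A file path recorded with the other NameNode's authority fails
        // against a FileSystem bound to nn1:
        try {
            checkPath(URI.create("hdfs://nn1:6093"),
                      URI.create("hdfs://nn2:9001/accumulo-1.6/tables/2/t-0000ew3/F0000ex7.rf"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        // A nameservice-based path matches a nameservice-bound FileSystem
        // regardless of which NameNode is currently active:
        checkPath(URI.create("hdfs://mycluster"),
                  URI.create("hdfs://mycluster/accumulo-1.6/tables/2/t-0000ew3/F0000ex7.rf"));
        System.out.println("nameservice path accepted");
    }
}
```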

Does your cluster have a proper nameservice configuration?
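For reference, a minimal HA client configuration of the kind asked about above might look like the following; the nameservice name `mycluster` and the host names are placeholders, while the property keys and the `ConfiguredFailoverProxyProvider` class are standard HDFS HA settings.

```xml
<!-- core-site.xml: clients address the nameservice, never a NameNode -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>

<!-- hdfs-site.xml -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn1-host:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>nn2-host:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

With this in place, paths like `hdfs://mycluster/accumulo-1.6/...` resolve through the failover proxy provider, so a NameNode failover does not invalidate stored file paths.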

> Split failed during conditional randomwalk test
> -----------------------------------------------
>                 Key: ACCUMULO-1867
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>            Reporter: Keith Turner
>            Priority: Critical
>             Fix For: 1.6.0
> I left the conditional random walk test running overnight against 1.6.0-SNAPSHOT configured
to use two namenodes. After running for a few hours, a client saw a split operation fail and
I saw the following corresponding error message in the tserver logs.
> {noformat}
> 2013-11-08 12:31:59,227 [util.FileUtil] DEBUG: Too many indexes (33) to open at once
for null null, reducing in tmpDir = /accumulo-1.6/tmp/idxReduce_1116774712
> 2013-11-08 12:31:59,369 [thrift.ProcessFunction] ERROR: Internal error processing splitTablet
> java.lang.IllegalArgumentException: Wrong FS: hdfs://nn2:9001/accumulo-1.6/tables/2/t-0000ew3/F0000ex7.rf,
expected: hdfs://nn1:6093
>         at org.apache.hadoop.fs.FileSystem.checkPath(
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(
>         at
>         at
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBCFile(
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.access$000(
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$MetaBlockLoader.get(
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBlock(
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(
>         at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(
>         at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(
>         at org.apache.accumulo.core.file.rfile.RFileOperations.openIndex(
>         at org.apache.accumulo.core.file.DispatchingFileFactory.openIndex(
>         at org.apache.accumulo.server.util.FileUtil.reduceFiles(
>         at org.apache.accumulo.server.util.FileUtil.estimatePercentageLTE(
>         at org.apache.accumulo.tserver.Tablet.split(
>         at org.apache.accumulo.tserver.TabletServer.splitTablet(
>         at org.apache.accumulo.tserver.TabletServer.access$1600(
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.splitTablet(
>         at sun.reflect.GeneratedMethodAccessor34.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(
>         at java.lang.reflect.Method.invoke(
>         at org.apache.accumulo.trace.instrument.thrift.TraceWrap$1.invoke(
>         at $Proxy10.splitTablet(Unknown Source)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$splitTablet.getResult(
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$splitTablet.getResult(
>         at org.apache.thrift.ProcessFunction.process(
>         at org.apache.thrift.TBaseProcessor.process(
>         at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(
>         at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(
>         at org.apache.accumulo.server.util.TServerUtils$THsHaServer$
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(
>         at java.util.concurrent.ThreadPoolExecutor$
>         at
>         at
>         at
> {noformat}
> nn1 is the default namenode.  The "Too many indexes" message may be important.  That
message indicates the split entered a special code path that handles tablets w/ lots of files.

This message was sent by Atlassian JIRA
