Date: Thu, 20 Feb 2014 18:12:24 +0000 (UTC)
From: "Eric Newton (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Subject: [jira] [Commented] (ACCUMULO-2388) Continuous Ingest clients die during HDFS agitation

    [ https://issues.apache.org/jira/browse/ACCUMULO-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907263#comment-13907263 ]

Eric Newton commented on ACCUMULO-2388:
---------------------------------------

The log shows this tablet server was held; minor compactions were probably hanging. But there are no errors bubbling up from DFSClient. Do you know what sort of agitation was going on at this time? NN or a bunch of DNs?

> Continuous Ingest clients die during HDFS agitation
> ---------------------------------------------------
>
>                 Key: ACCUMULO-2388
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2388
>             Project: Accumulo
>          Issue Type: Bug
>          Components: test, tserver
>         Environment: 1.6.0-SNAPSHOT (sha-1: 0da9a56)
>                      cdh4.5.0
>            Reporter: Mike Drob
>              Labels: 16_qa_bug
>             Fix For: 1.6.0
>
>         Attachments: tserver1.log
>
>
> I was running continuous ingest on a 7 node cluster (5 slaves) and after enabling HDFS agitation, my clients died.
> {code:title=ingest.err}
> Thread "org.apache.accumulo.test.continuous.ContinuousIngest" died java.lang.reflect.InvocationTargetException
> java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.accumulo.start.Main$1.run(Main.java:137)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.reflect.UndeclaredThrowableException
>     at $Proxy9.addMutation(Unknown Source)
>     at org.apache.accumulo.test.continuous.ContinuousIngest.main(ContinuousIngest.java:212)
>     ... 6 more
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.accumulo.trace.instrument.TraceProxy$2.invoke(TraceProxy.java:43)
>     ... 8 more
> Caused by: org.apache.accumulo.core.client.MutationsRejectedException: # constraint violations : 0 security codes: {} # server errors 1 # exceptions 0
>     at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.checkForFailures(TabletServerBatchWriter.java:537)
>     at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.addMutation(TabletServerBatchWriter.java:258)
>     at org.apache.accumulo.core.client.impl.BatchWriterImpl.addMutation(BatchWriterImpl.java:43)
>     ... 12 more
> {code}
> {code:title=ingest.out}
> UUID 1392844086463 f822a6a9-9592-4b3a-ab3b-1c172be20b96
> FLUSH 1392844135523 49047 6165 1000000 1000000
> FLUSH 1392844165594 30071 7787 2000000 1000000
> FLUSH 1392844195875 30281 7816 3000000 1000000
> FLUSH 1392844226787 30912 8086 4000000 1000000
> FLUSH 1392844257194 30407 7989 5000000 1000000
> FLUSH 1392844287518 30324 7743 6000000 1000000
> FLUSH 1392844325833 38315 10933 7000000 1000000
> FLUSH 1392844364708 38875 7916 8000000 1000000
> FLUSH 1392844395818 31110 8104 9000000 1000000
> 2014-02-19 13:16:57,444 [impl.TabletServerBatchWriter] ERROR: Server side error on tserver1:10011: org.apache.thrift.TApplicationException: Internal error processing applyUpdates
> 2014-02-19 13:16:57,446 [impl.TabletServerBatchWriter] ERROR: Failed to send tablet server tserver1:10011 its batch : Error on server tserver1:10011
> org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server tserver1:10011
>     at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.sendMutationsToTabletServer(TabletServerBatchWriter.java:937)
>     at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.access$1600(TabletServerBatchWriter.java:616)
>     at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter$SendTask.send(TabletServerBatchWriter.java:801)
>     at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter$SendTask.run(TabletServerBatchWriter.java:765)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
>     at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.thrift.TApplicationException: Internal error processing applyUpdates
>     at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
>     at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_closeUpdate(TabletClientService.java:431)
>     at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.closeUpdate(TabletClientService.java:417)
>     at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.sendMutationsToTabletServer(TabletServerBatchWriter.java:899)
>     ... 11 more
> {code}
> {code:title=tserver.log}
> 2014-02-19 13:16:56,156 [util.TServerUtils$THsHaServer] WARN : Got an IOException in internalRead!
> java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcher.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:198)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:171)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
>     at org.apache.thrift.transport.TNonblockingSocket.read(TNonblockingSocket.java:141)
>     at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.internalRead(AbstractNonblockingServer.java:515)
>     at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.read(AbstractNonblockingServer.java:355)
>     at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleRead(AbstractNonblockingServer.java:202)
>     at org.apache.thrift.server.TNonblockingServer$SelectAcceptThread.select(TNonblockingServer.java:198)
>     at org.apache.thrift.server.TNonblockingServer$SelectAcceptThread.run(TNonblockingServer.java:154)
> {code}
> Note that this last message was not propagated to the monitor for some reason, but that is likely a different issue. (I had been seeing other WARN messages show up earlier.)

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)