Subject: Re: bulk ingest without mapred
From: William Slacum
To: user@accumulo.apache.org
Date: Tue, 8 Apr 2014 10:30:12 -0400

java.io.FileNotFoundException: File does not exist: bulk/entities_fails/failures

sticks out to me. It looks like a relative path. Where does that directory exist on your file system? (A sketch of qualifying the paths follows the quoted message below.)

On Tue, Apr 8, 2014 at 9:40 AM, pdread wrote:

> Hi
>
> I interface to an Accumulo cloud (100s of nodes) which I don't maintain.
> I'll try to keep this short: the interface app is used to ingest millions
> of docs/week from various streams, and some of them are needed in near real
> time. A problem came up where the tservers would not stay up and our ingest
> would halt. The admins are working on fixing this, but I'm not optimistic.
> Others who have run into this tell me it's the use of Mutations that is
> causing the problem, and that it will go away if I do bulk ingest. However,
> MapReduce is way too slow to spin up and does not map to our architecture.
>
> So here is what I have been trying to do. After much research I think I
> should be able to bulk ingest if I create the RFile and feed it to
> TableOperations.importDirectory(). I can create the RFile OK, at least I
> think so. I create the "failure" directory using Hadoop's file system API. I
> check that the failure directory is there and is a directory, but when I
> feed it to the import I get an error in the Accumulo master log saying it
> cannot find the failure directory. The interesting thing is that I have
> traced the code through the Accumulo client, and it checks successfully for
> the load file and the failure directory. What am I doing wrong?
>
> First the client error:
>
> org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForTableOperation
>         at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:290)
>         at org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:258)
>         at org.apache.accumulo.core.client.admin.TableOperationsImpl.importDirectory(TableOperationsImpl.java:945)
>         at airs.medr.accumulo.server.table.EntityTable.writeEntities(EntityTable.java:130)
>
> Now the master log exception:
>
> 2014-04-08 08:33:50,609 [thrift.MasterClientService$Processor] ERROR: Internal error processing waitForTableOperation
> java.lang.RuntimeException: java.io.FileNotFoundException: File does not exist: bulk/entities_fails/failures
>         at org.apache.accumulo.server.master.Master$MasterClientServiceHandler.waitForTableOperation(Master.java:1053)
>         at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.accumulo.cloudtrace.instrument.thrift.TraceWrap$1.invoke(TraceWrap.java:59)
>         at $Proxy6.waitForTableOperation(Unknown Source)
>         at org.apache.accumulo.core.master.thrift.MasterClientService$Processor$waitForTableOperation.process(MasterClientService.java:2004)
>         at org.apache.accumulo.core.master.thrift.MasterClientService$Processor.process(MasterClientService.java:1472)
>         at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:154)
>         at org.apache.thrift.server.TNonblockingServer$FrameBuffer.invoke(TNonblockingServer.java:631)
>         at org.apache.accumulo.server.util.TServerUtils$THsHaServer$Invocation.run(TServerUtils.java:202)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.FileNotFoundException: File does not exist: bulk/entities_fails/failures
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:528)
>         at org.apache.accumulo.server.trace.TraceFileSystem.getFileStatus(TraceFileSystem.java:797)
>         at org.apache.accumulo.server.master.tableOps.BulkImport.call(BulkImport.java:157)
>         at org.apache.accumulo.server.master.tableOps.BulkImport.call(BulkImport.java:110)
>         at org.apache.accumulo.server.master.tableOps.TraceRepo.call(TraceRepo.java:65)
>         at org.apache.accumulo.server.fate.Fate$TransactionRunner.run(Fate.java:65)
>
> Thoughts?
>
> Thanks
>
> Paul
>
> --
> View this message in context:
> http://apache-accumulo.1065345.n5.nabble.com/bulk-ingest-without-mapred-tp8904.html
> Sent from the Users mailing list archive at Nabble.com.
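For context on the diagnosis above: HDFS resolves a path with no scheme and no leading slash against the calling user's working directory (typically /user/<username>), so a client running as one user and a master running as another can resolve "bulk/entities_fails/failures" to two different locations; the client-side existence check can also pass against the local file system while the master checks HDFS. Below is a minimal sketch of the import call with fully qualified paths. It assumes an already-built Connector, a placeholder table name and bulk directory, and the four-argument importDirectory signature from the 1.4/1.5-era client API.

import org.apache.accumulo.core.client.Connector;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: table name and bulk directory are placeholders.
public class BulkImportSketch {
  public static void importBulk(Connector connector) throws Exception {
    Configuration conf = new Configuration();   // picks up core-site.xml/hdfs-site.xml from the classpath
    FileSystem fs = FileSystem.get(conf);       // must resolve to HDFS, not the local file system

    // Qualify both directories so the master sees absolute hdfs:// paths,
    // not paths relative to some user's working directory.
    Path bulkDir = fs.makeQualified(new Path("/bulk/entities_fails/files"));
    Path failDir = fs.makeQualified(new Path("/bulk/entities_fails/failures"));

    // The failures directory must exist and be empty before the import starts.
    fs.mkdirs(failDir);

    // 1.4/1.5-era signature: (table, bulk dir, failures dir, setTime)
    connector.tableOperations().importDirectory(
        "entities", bulkDir.toString(), failDir.toString(), false);
  }
}

With the directories created and qualified this way, the getFileStatus() check the master performs in BulkImport.call() should be looking at the same absolute HDFS locations the client already validated.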
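On the original question of building the RFile itself without MapReduce: Accumulo 1.8 and later expose a public RFile writer (org.apache.accumulo.core.client.rfile.RFile); the version in this thread (the cloudtrace and Fate package names suggest 1.4.x) predates it and would have to use the internal org.apache.accumulo.core.file.FileOperations writer instead. The sketch below uses the later public API with placeholder paths and cell values; keys must be appended in sorted order, and bulk import expects files with the .rf extension.

import java.nio.charset.StandardCharsets;
import org.apache.accumulo.core.client.rfile.RFile;
import org.apache.accumulo.core.client.rfile.RFileWriter;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: writes one sorted RFile into the bulk directory that
// importDirectory() will later load. Path and cell contents are placeholders.
public class RFileWriteSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path rfile = fs.makeQualified(new Path("/bulk/entities_fails/files/data_0001.rf"));

    // Public API in Accumulo 1.8+; earlier releases only have internal writer classes.
    try (RFileWriter writer = RFile.newWriter()
                                   .to(rfile.toString())
                                   .withFileSystem(fs)
                                   .build()) {
      // Keys must be appended in sorted order (row, column family, qualifier, ...).
      writer.append(new Key("row0001", "cf", "cq"),
                    new Value("v1".getBytes(StandardCharsets.UTF_8)));
      writer.append(new Key("row0002", "cf", "cq"),
                    new Value("v2".getBytes(StandardCharsets.UTF_8)));
    }
  }
}

Once a file like this is sitting in the bulk directory, the importDirectory() call from the earlier sketch assigns it to tablets without sending any Mutations through the tablet servers' write path.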