accumulo-user mailing list archives

From Dong Zhou <dz...@phemi.com>
Subject Re: Failed to assign map files to tablets during Bulk Import
Date Fri, 09 Feb 2018 22:36:10 GMT
Yes, we tried that too. The accumulo rfile-info command reads the file
correctly.
A couple more things I forgot to mention in the original email:

   1. The same rfile loaded successfully into a brand new empty table.
   2. The same rfile also loaded successfully into the same type of table
   on a different cluster. By "the same type", I mean the table already
   contains data generated with similar row ids.
   3. The Accumulo version we are using is 1.7.2-cdh5.5.0
   4. The Accumulo monitor page also shows warning messages like:
      - Cannot split tablet o;TITLE|<some data>... TRUNCATED;TITLE|<some
      data>... TRUNCATED, selected split point too long.  Length :  1155888
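
For reference, the length check implied by the warning in item 4 can be
sketched as below. This is a toy illustration, not Accumulo code: the class
and method names are hypothetical, and the 10 KB limit is an assumption (it is
believed to match the default of the table.split.endrow.size.max property,
which should be verified against the actual table configuration).

```java
import java.nio.charset.StandardCharsets;

class SplitPointCheck {
    // Assumed limit in bytes; a hypothetical stand-in for the table's configured maximum.
    static final int ASSUMED_MAX_END_ROW_BYTES = 10 * 1024;

    // Returns true when the row id, encoded as UTF-8, is longer than maxBytes.
    static boolean tooLongForSplit(String rowId, int maxBytes) {
        return rowId.getBytes(StandardCharsets.UTF_8).length > maxBytes;
    }

    public static void main(String[] args) {
        // A short row id shaped like the masked ones in the logs.
        String shortRow = "SESSION_ID|abc123";

        // A synthetic row id padded to the length reported in the warning.
        StringBuilder sb = new StringBuilder("TITLE|");
        while (sb.length() < 1155888) {
            sb.append('x');
        }
        String longRow = sb.toString();

        System.out.println("short row too long: "
                + tooLongForSplit(shortRow, ASSUMED_MAX_END_ROW_BYTES));
        System.out.println("long row too long:  "
                + tooLongForSplit(longRow, ASSUMED_MAX_END_ROW_BYTES));
    }
}
```

Running a check like this over the row ids while generating RFiles would show
whether very long rows are present in the data before the import is attempted.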


On Fri, Feb 9, 2018 at 2:23 PM Keith Turner <keith@deenlo.com> wrote:

> On Fri, Feb 9, 2018 at 5:13 PM, Dong Zhou <dzhou@phemi.com> wrote:
> > The bulk import operation did finish, but it was marked as failed.
> > The rfile that failed to import was moved into the failure directory.
> >
> > We tried to import this rfile manually using importdirectory. The exact
> > same outcome occurred, and the same error appeared in the tserver log
> > file.
>
> Can you try running: accumulo rfile-info <path to file>
>
> I am curious if that command can read the file.
>
>
> >
> >
> > On Fri, Feb 9, 2018 at 1:46 PM Keith Turner <keith@deenlo.com> wrote:
> >>
> >> On Fri, Feb 9, 2018 at 2:26 PM, Dong Zhou <dzhou@phemi.com> wrote:
> >> > Hi All,
> >> >
> >> > We were trying to write some data into an Accumulo table that contains
> >> > roughly 3.7 trillion entries using Bulk Import.
> >> >
> >> > Our code generates RFiles in a distributed fashion, and most of them go
> >> > into the table perfectly, except for 1 RFile.
> >> >
> >> > The bulk import failed with the following error message. Please note
> >> > that I have shortened the error message and also masked out some of
> >> > the data.
> >> >
> >> >
> >>
> >> This message is from an intermediate tserver.  It is possible that this
> >> intermediate thread was running after the bulk import completed; more
> >> on this below.
> >>
> >> > 2018-02-05 21:47:07,244 [client.BulkImporter] ERROR: Encountered unknown exception in assignMapFiles.
> >> > org.apache.thrift.TApplicationException: Internal error processing bulkImport
> >> >         at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
> >> >         at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
> >> >         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_bulkImport(TabletClientService.java:594)
> >> >         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.bulkImport(TabletClientService.java:577)
> >> >         at org.apache.accumulo.server.client.BulkImporter.assignMapFiles(BulkImporter.java:600)
> >> >         at org.apache.accumulo.server.client.BulkImporter.access$400(BulkImporter.java:77)
> >> >         at org.apache.accumulo.server.client.BulkImporter$AssignmentTask.run(BulkImporter.java:479)
> >> >         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >> >         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >> >         at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
> >> >         at java.lang.Thread.run(Thread.java:745)
> >> > 2018-02-05 21:47:07,244 [client.BulkImporter] INFO : Could not assign 1 map
> >> > files to tablet o;SESSION_ID|****;SESSION_ID|**** because :
> >> > org.apache.thrift.TApplicationException: Internal error processing
> >> > bulkImport .  Will retry ...
> >> > 2018-02-05 21:47:07,244 [client.BulkImporter] INFO : Could not assign 1 map
> >> > files to tablet o;SESSION_ID|****;SESSION_ID|**** because :
> >> > org.apache.thrift.TApplicationException: Internal error processing
> >> > bulkImport .  Will retry ...
> >> > 2018-02-05 21:47:07,244 [client.BulkImporter] INFO : Could not assign 1 map
> >> > files to tablet o;SESSION_ID|****;SESSION_ID|**** because :
> >> > org.apache.thrift.TApplicationException: Internal error processing
> >> > bulkImport .  Will retry ...
> >> > 2018-02-05 21:47:07,244 [client.BulkImporter] INFO : Could not assign 1 map
> >> > files to tablet o;SESSION_ID|****;SESSION_ID|**** because :
> >> > org.apache.thrift.TApplicationException: Internal error processing
> >> > bulkImport .  Will retry ...
> >> > 2018-02-05 21:47:07,244 [client.BulkImporter] INFO : Could not assign 1 map
> >> > files to tablet o;SESSION_ID|****;SESSION_ID|**** because :
> >> > org.apache.thrift.TApplicationException: Internal error processing
> >> > bulkImport .  Will retry ...
> >> > ...
> >> > ...
> >> > ...
> >>
> >> Each bulk import has a transaction id in ZooKeeper.  When a bulk import
> >> finishes, it removes this transaction id from ZooKeeper (this is a
> >> step in the bulk import FATE tx).  Writes are made to the metadata table
> >> as part of a bulk import.  The metadata table has a constraint that acts
> >> as a sanity check, ensuring the bulk import is still active.  The
> >> exception below is this check failing.
> >>
> >> Below is the code that removes this id from ZK.
> >>
> >> https://github.com/apache/accumulo/blob/rel/1.8.1/server/master/src/main/java/org/apache/accumulo/master/tableOps/CompleteBulkImport.java#L42
> >>
> >> Bulk import spawns worker threads on intermediate tablet servers to
> >> inspect rfile indexes and determine where the files should go.  These
> >> intermediate tablet servers instruct other tablet servers to load
> >> files.  The error message below is from one of these 2nd hop tservers.
> >> It's possible that this spawned worker thread was still running after
> >> the bulk import had completed, in which case this message could be
> >> ignored.  Did the bulk import operation fail or succeed from the
> >> Accumulo client perspective?  It's also possible that this error message
> >> is unrelated to the reason the bulk import failed.  Was there one file
> >> in the fail dir?
> >>
> >> > 2018-02-05 21:47:07,494 [impl.Writer] ERROR: error sending update to <tserver ip address>:<tserver port>:
> >> > ConstraintViolationException(violationSummaries:[TConstraintViolationSummary(constrainClass:org.apache.accumulo.server.constraints.MetadataConstraints,
> >> > violationCode:8, violationDescription:Bulk load transaction no longer
> >> > running, numberOfViolatingMutations:1)])
> >> > 2018-02-05 21:47:07,494 [util.MetadataTableUtil] ERROR: null
> >> > ConstraintViolationException(violationSummaries:[TConstraintViolationSummary(constrainClass:org.apache.accumulo.server.constraints.MetadataConstraints,
> >> > violationCode:8, violationDescription:Bulk load transaction no longer
> >> > running, numberOfViolatingMutations:1)])
> >> >         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$update_result$update_resultStandardScheme.read(TabletClientService.java:15928)
> >> >         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$update_result$update_resultStandardScheme.read(TabletClientService.java:15896)
> >> >         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$update_result.read(TabletClientService.java:15830)
> >> >         at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
> >> >         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_update(TabletClientService.java:468)
> >> >         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.update(TabletClientService.java:451)
> >> >         at org.apache.accumulo.core.client.impl.Writer.updateServer(Writer.java:72)
> >> >         at org.apache.accumulo.core.client.impl.Writer.update(Writer.java:98)
> >> >         at org.apache.accumulo.server.util.MetadataTableUtil.update(MetadataTableUtil.java:153)
> >> >         at org.apache.accumulo.server.util.MetadataTableUtil.update(MetadataTableUtil.java:145)
> >> >         at org.apache.accumulo.server.util.MetadataTableUtil.updateTabletDataFile(MetadataTableUtil.java:196)
> >> >         at org.apache.accumulo.tserver.tablet.Tablet.updatePersistedTime(Tablet.java:2657)
> >> >         at org.apache.accumulo.tserver.tablet.DatafileManager.importMapFiles(DatafileManager.java:268)
> >> >         at org.apache.accumulo.tserver.tablet.Tablet.importMapFiles(Tablet.java:2376)
> >> >         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.bulkImport(TabletServer.java:435)
> >> >         at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
> >> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> >         at java.lang.reflect.Method.invoke(Method.java:498)
> >> >         at org.apache.accumulo.core.trace.wrappers.RpcServerInvocationHandler.invoke(RpcServerInvocationHandler.java:46)
> >> >         at org.apache.accumulo.server.rpc.RpcWrapper$1.invoke(RpcWrapper.java:74)
> >> >         at com.sun.proxy.$Proxy22.bulkImport(Unknown Source)
> >> >         at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
> >> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> >         at java.lang.reflect.Method.invoke(Method.java:498)
> >> >         at org.apache.accumulo.server.rpc.TCredentialsUpdatingInvocationHandler.invokeMethod(TCredentialsUpdatingInvocationHandler.java:154)
> >> >         at org.apache.accumulo.server.rpc.TCredentialsUpdatingInvocationHandler.invoke(TCredentialsUpdatingInvocationHandler.java:58)
> >> >         at com.sun.proxy.$Proxy22.bulkImport(Unknown Source)
> >> >         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$bulkImport.getResult(TabletClientService.java:2585)
> >> >         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$bulkImport.getResult(TabletClientService.java:2569)
> >> >         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> >> >         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> >> >         at org.apache.accumulo.server.rpc.UGIAssumingProcessor.process(UGIAssumingProcessor.java:102)
> >> >         at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:63)
> >> >         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:225)
> >> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >> >         at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
> >> >         at java.lang.Thread.run(Thread.java:745)
> >> > ...
> >> > ...
> >> > ...
> >>
> >> I think this error message is from an intermediate tablet server.
> >>
> >> > 2018-02-05 21:47:07,526 [client.BulkImporter] ERROR: Encountered unknown exception in assignMapFiles.
> >> > org.apache.thrift.TApplicationException: Internal error processing bulkImport
> >> >         at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
> >> >         at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
> >> >         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_bulkImport(TabletClientService.java:594)
> >> >         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.bulkImport(TabletClientService.java:577)
> >> >         at org.apache.accumulo.server.client.BulkImporter.assignMapFiles(BulkImporter.java:600)
> >> >         at org.apache.accumulo.server.client.BulkImporter.access$400(BulkImporter.java:77)
> >> >         at org.apache.accumulo.server.client.BulkImporter$AssignmentTask.run(BulkImporter.java:479)
> >> >         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >> >         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >> >         at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
> >> >         at java.lang.Thread.run(Thread.java:745)
> >> > ...
> >> > ...
> >> > ...
> >> > 2018-02-05 21:47:08,542 [client.BulkImporter] INFO : Could not assign 1 map
> >> > files to tablet o;checksum|****;checksum|**** because :
> >> > org.apache.thrift.TApplicationException: Internal error processing
> >> > bulkImport .  Will retry ...
> >> > 2018-02-05 21:47:08,542 [client.BulkImporter] INFO : Could not assign 1 map
> >> > files to tablet o;checksum|****;checksum|**** because :
> >> > org.apache.thrift.TApplicationException: Internal error processing
> >> > bulkImport .  Will retry ...
> >> > 2018-02-05 21:47:08,542 [client.BulkImporter] INFO : Could not assign 1 map
> >> > files to tablet o;checksum|****;checksum|**** because :
> >> > org.apache.thrift.TApplicationException: Internal error processing
> >> > bulkImport .  Will retry ...
> >> > 2018-02-05 21:47:08,542 [client.BulkImporter] INFO : Could not assign 1 map
> >> > files to tablet o;checksum|****;checksum|**** because :
> >> > org.apache.thrift.TApplicationException: Internal error processing
> >> > bulkImport .  Will retry ...
> >> > 2018-02-05 21:47:08,542 [client.BulkImporter] INFO : Could not assign 1 map
> >> > files to tablet o;checksum|****;checksum|**** because :
> >> > org.apache.thrift.TApplicationException: Internal error processing
> >> > bulkImport .  Will retry ...
> >> >
> >> >
>
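
Keith's description of the transaction check, quoted above, can be modeled
with a toy sketch. It assumes nothing about Accumulo's real classes:
BulkTxModel and its methods are hypothetical, and an in-memory set stands in
for the transaction ids kept in ZooKeeper. It only illustrates why a
late-running worker thread would see its metadata write rejected once the
bulk import has completed.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class BulkTxModel {
    // Hypothetical stand-in for the bulk import transaction ids stored in ZooKeeper.
    private final Set<Long> activeTx = ConcurrentHashMap.newKeySet();

    // A bulk import starts: its transaction id becomes active.
    void begin(long tid) {
        activeTx.add(tid);
    }

    // The bulk import completes: its transaction id is removed.
    void complete(long tid) {
        activeTx.remove(tid);
    }

    // Models the sanity check: a file load is accepted only while its
    // transaction id is still active.
    boolean acceptLoad(long tid, String file) {
        return activeTx.contains(tid);
    }

    public static void main(String[] args) {
        BulkTxModel model = new BulkTxModel();
        long tid = 42L; // arbitrary illustrative id

        model.begin(tid);
        System.out.println("during import: " + model.acceptLoad(tid, "example.rf"));

        model.complete(tid);
        // A worker thread that is still retrying after completion is rejected,
        // analogous to "Bulk load transaction no longer running".
        System.out.println("after import:  " + model.acceptLoad(tid, "example.rf"));
    }
}
```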
