Subject: Re: Setting up storage class 1 and 2
From: "Jonas Pfefferle"
To: dev@crail.apache.org, "David Crespi"
Date: Tue, 02 Jul 2019 16:01:46 +0200

I typically use the start-crail.sh script. In that case you have to put all the command-line arguments in the slaves file. The configuration files still need to be identical on all nodes; in our setup we keep the conf file on an NFS share, so we don't have to bother with keeping it synchronized between the nodes.

Regards,
Jonas

On Tue, 2 Jul 2019 13:48:31 +0000, David Crespi wrote:
> Thanks for the info Jonas.
>
> Quick question… do you typically start the datanodes from the
> namenode using the command line?
>
> I’ve been launching containers independently of the namenode. The
> containers do have the same base configuration file, but I pass in
> behaviors via environment variables.
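For reference, the slaves-file approach Jonas describes could look like the sketch below. The hostnames are made up, and the exact per-host argument syntax should be verified against the Crail run documentation; the tier class names and the "-c" storage-class flag appear later in this thread:

```
# $CRAIL_HOME/conf/slaves (sketch) -- one host per line, followed by the
# arguments start-crail.sh passes to that host's datanode:
datanode1 -t org.apache.crail.storage.rdma.RdmaStorageTier -c 1
datanode2 -t org.apache.crail.storage.nvmf.NvmfStorageTier -c 2
```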
>
> Regards,
>
> David
>
> ________________________________
> From: Jonas Pfefferle
> Sent: Tuesday, July 2, 2019 4:27:05 AM
> To: dev@crail.apache.org; David Crespi
> Subject: Re: Setting up storage class 1 and 2
>
> Hi David,
>
> We run a great mix of configurations of NVMf and RDMA storage tiers with
> different storage classes, e.g. 3 storage classes where one group of NVMf
> datanodes is 0, another group of NVMf servers is 1, and the RDMA datanodes
> are storage class 2. So this should work. I understand that the setup
> might be a bit tricky in the beginning.
>
> From your logs I see that you do not use the same configuration file for
> all containers. It is crucial that e.g. the order of storage types etc. is
> the same in all configuration files. They have to be identical. To specify
> a storage class for a datanode you need to append "-c 1" (storage class 1)
> when starting the datanode. You can find the details of how exactly this
> works here: https://incubator-crail.readthedocs.io/en/latest/run.html
> The last example in "Starting Crail manually" talks about this.
>
> Regarding the patched version, I have to take another look. Please use the
> Apache Crail master for now (it will hang with Spark at the end of your
> job, but it should run through).
>
> Regards,
> Jonas
>
> On Tue, 2 Jul 2019 00:27:33 +0000, David Crespi wrote:
>> Jonas,
>>
>> Just wanted to be sure I’m doing things correctly. It runs okay without
>> adding in the NVMf datanode (i.e. completes teragen). When I add the
>> NVMf node in, even without using it on the run, it hangs during the
>> terasort, with nothing being written to the datanode – only the
>> metadata is created (i.e. /spark).
>>
>> My config is:
>>
>> 1 namenode container
>> 1 rdma datanode storage class 1 container
>> 1 nvmf datanode storage class 1 container
>>
>> The namenode is showing that both datanodes are starting up as
>> Type 0 to storage class 0… is that correct?
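Jonas's "-c 1" instruction above corresponds to launch commands along these lines. This is only a sketch: the install path is an assumption, and the exact flag placement should be checked against the "Starting Crail manually" section of the docs page he links; the tier class names are the ones used elsewhere in this thread:

```
# Start an RDMA datanode in storage class 1 (sketch):
$CRAIL_HOME/bin/crail datanode -t org.apache.crail.storage.rdma.RdmaStorageTier -c 1

# Start an NVMf datanode in storage class 2 (sketch):
$CRAIL_HOME/bin/crail datanode -t org.apache.crail.storage.nvmf.NvmfStorageTier -c 2
```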
>>
>> NameNode log at startup:
>>
>> 19/07/01 17:18:16 INFO crail: initalizing namenode
>> 19/07/01 17:18:16 INFO crail: crail.version 3101
>> 19/07/01 17:18:16 INFO crail: crail.directorydepth 16
>> 19/07/01 17:18:16 INFO crail: crail.tokenexpiration 10
>> 19/07/01 17:18:16 INFO crail: crail.blocksize 1048576
>> 19/07/01 17:18:16 INFO crail: crail.cachelimit 0
>> 19/07/01 17:18:16 INFO crail: crail.cachepath /dev/hugepages/cache
>> 19/07/01 17:18:16 INFO crail: crail.user crail
>> 19/07/01 17:18:16 INFO crail: crail.shadowreplication 1
>> 19/07/01 17:18:16 INFO crail: crail.debug true
>> 19/07/01 17:18:16 INFO crail: crail.statistics false
>> 19/07/01 17:18:16 INFO crail: crail.rpctimeout 1000
>> 19/07/01 17:18:16 INFO crail: crail.datatimeout 1000
>> 19/07/01 17:18:16 INFO crail: crail.buffersize 1048576
>> 19/07/01 17:18:16 INFO crail: crail.slicesize 65536
>> 19/07/01 17:18:16 INFO crail: crail.singleton true
>> 19/07/01 17:18:16 INFO crail: crail.regionsize 1073741824
>> 19/07/01 17:18:16 INFO crail: crail.directoryrecord 512
>> 19/07/01 17:18:16 INFO crail: crail.directoryrandomize true
>> 19/07/01 17:18:16 INFO crail: crail.cacheimpl org.apache.crail.memory.MappedBufferCache
>> 19/07/01 17:18:16 INFO crail: crail.locationmap
>> 19/07/01 17:18:16 INFO crail: crail.namenode.address crail://minnie:9060?id=0&size=1
>> 19/07/01 17:18:16 INFO crail: crail.namenode.blockselection roundrobin
>> 19/07/01 17:18:16 INFO crail: crail.namenode.fileblocks 16
>> 19/07/01 17:18:16 INFO crail: crail.namenode.rpctype org.apache.crail.namenode.rpc.tcp.TcpNameNode
>> 19/07/01 17:18:16 INFO crail: crail.namenode.log
>> 19/07/01 17:18:16 INFO crail: crail.storage.types org.apache.crail.storage.nvmf.NvmfStorageTier,org.apache.crail.storage.rdma.RdmaStorageTier
>> 19/07/01 17:18:16 INFO crail: crail.storage.classes 2
>> 19/07/01 17:18:16 INFO crail: crail.storage.rootclass 1
>> 19/07/01 17:18:16 INFO crail: crail.storage.keepalive 2
>> 19/07/01 17:18:16 INFO crail: round robin block selection
>> 19/07/01 17:18:16 INFO crail: round robin block selection
>> 19/07/01 17:18:16 INFO narpc: new NaRPC server group v1.0, queueDepth 32, messageSize 512, nodealy true, cores 2
>> 19/07/01 17:18:16 INFO crail: crail.namenode.tcp.queueDepth 32
>> 19/07/01 17:18:16 INFO crail: crail.namenode.tcp.messageSize 512
>> 19/07/01 17:18:16 INFO crail: crail.namenode.tcp.cores 2
>> 19/07/01 17:18:17 INFO crail: new connection from /192.168.1.164:39260
>> 19/07/01 17:18:17 INFO narpc: adding new channel to selector, from /192.168.1.164:39260
>> 19/07/01 17:18:17 INFO crail: adding datanode /192.168.3.100:4420 of type 0 to storage class 0
>> 19/07/01 17:18:17 INFO crail: new connection from /192.168.1.164:39262
>> 19/07/01 17:18:17 INFO narpc: adding new channel to selector, from /192.168.1.164:39262
>> 19/07/01 17:18:18 INFO crail: adding datanode /192.168.3.100:50020 of type 0 to storage class 0
>>
>> The RDMA datanode – it is set to have 4x1GB hugepages:
>>
>> 19/07/01 17:18:17 INFO crail: crail.version 3101
>> 19/07/01 17:18:17 INFO crail: crail.directorydepth 16
>> 19/07/01 17:18:17 INFO crail: crail.tokenexpiration 10
>> 19/07/01 17:18:17 INFO crail: crail.blocksize 1048576
>> 19/07/01 17:18:17 INFO crail: crail.cachelimit 0
>> 19/07/01 17:18:17 INFO crail: crail.cachepath /dev/hugepages/cache
>> 19/07/01 17:18:17 INFO crail: crail.user crail
>> 19/07/01 17:18:17 INFO crail: crail.shadowreplication 1
>> 19/07/01 17:18:17 INFO crail: crail.debug true
>> 19/07/01 17:18:17 INFO crail: crail.statistics false
>> 19/07/01 17:18:17 INFO crail: crail.rpctimeout 1000
>> 19/07/01 17:18:17 INFO crail: crail.datatimeout 1000
>> 19/07/01 17:18:17 INFO crail: crail.buffersize 1048576
>> 19/07/01 17:18:17 INFO crail: crail.slicesize 65536
>> 19/07/01 17:18:17 INFO crail: crail.singleton true
>> 19/07/01 17:18:17 INFO crail: crail.regionsize 1073741824
>> 19/07/01 17:18:17 INFO crail: crail.directoryrecord 512
>> 19/07/01 17:18:17 INFO crail: crail.directoryrandomize true
>> 19/07/01 17:18:17 INFO crail: crail.cacheimpl org.apache.crail.memory.MappedBufferCache
>> 19/07/01 17:18:17 INFO crail: crail.locationmap
>> 19/07/01 17:18:17 INFO crail: crail.namenode.address crail://minnie:9060
>> 19/07/01 17:18:17 INFO crail: crail.namenode.blockselection roundrobin
>> 19/07/01 17:18:17 INFO crail: crail.namenode.fileblocks 16
>> 19/07/01 17:18:17 INFO crail: crail.namenode.rpctype org.apache.crail.namenode.rpc.tcp.TcpNameNode
>> 19/07/01 17:18:17 INFO crail: crail.namenode.log
>> 19/07/01 17:18:17 INFO crail: crail.storage.types org.apache.crail.storage.rdma.RdmaStorageTier
>> 19/07/01 17:18:17 INFO crail: crail.storage.classes 1
>> 19/07/01 17:18:17 INFO crail: crail.storage.rootclass 1
>> 19/07/01 17:18:17 INFO crail: crail.storage.keepalive 2
>> 19/07/01 17:18:17 INFO disni: creating RdmaProvider of type 'nat'
>> 19/07/01 17:18:17 INFO disni: jverbs jni version 32
>> 19/07/01 17:18:17 INFO disni: sock_addr_in size mismatch, jverbs size 28, native size 16
>> 19/07/01 17:18:17 INFO disni: IbvRecvWR size match, jverbs size 32, native size 32
>> 19/07/01 17:18:17 INFO disni: IbvSendWR size mismatch, jverbs size 72, native size 128
>> 19/07/01 17:18:17 INFO disni: IbvWC size match, jverbs size 48, native size 48
>> 19/07/01 17:18:17 INFO disni: IbvSge size match, jverbs size 16, native size 16
>> 19/07/01 17:18:17 INFO disni: Remote addr offset match, jverbs size 40, native size 40
>> 19/07/01 17:18:17 INFO disni: Rkey offset match, jverbs size 48, native size 48
>> 19/07/01 17:18:17 INFO disni: createEventChannel, objId 140349068383088
>> 19/07/01 17:18:17 INFO disni: passive endpoint group, maxWR 32, maxSge 4, cqSize 3200
>> 19/07/01 17:18:17 INFO disni: createId, id 140349068429968
>> 19/07/01 17:18:17 INFO disni: new server endpoint, id 0
>> 19/07/01 17:18:17 INFO disni: launching cm processor, cmChannel 0
>> 19/07/01 17:18:17 INFO disni: bindAddr, address /192.168.3.100:50020
>> 19/07/01 17:18:17 INFO disni: listen, id 0
>> 19/07/01 17:18:17 INFO disni: allocPd, objId 140349068679808
>> 19/07/01 17:18:17 INFO disni: setting up protection domain, context 100, pd 1
>> 19/07/01 17:18:17 INFO disni: PD value 1
>> 19/07/01 17:18:17 INFO crail: crail.storage.rdma.interface enp94s0f1
>> 19/07/01 17:18:17 INFO crail: crail.storage.rdma.port 50020
>> 19/07/01 17:18:17 INFO crail: crail.storage.rdma.storagelimit 4294967296
>> 19/07/01 17:18:17 INFO crail: crail.storage.rdma.allocationsize 1073741824
>> 19/07/01 17:18:17 INFO crail: crail.storage.rdma.datapath /dev/hugepages/rdma
>> 19/07/01 17:18:17 INFO crail: crail.storage.rdma.localmap true
>> 19/07/01 17:18:17 INFO crail: crail.storage.rdma.queuesize 32
>> 19/07/01 17:18:17 INFO crail: crail.storage.rdma.type passive
>> 19/07/01 17:18:17 INFO crail: crail.storage.rdma.backlog 100
>> 19/07/01 17:18:17 INFO crail: crail.storage.rdma.connecttimeout 1000
>> 19/07/01 17:18:17 INFO narpc: new NaRPC server group v1.0, queueDepth 32, messageSize 512, nodealy true
>> 19/07/01 17:18:17 INFO crail: crail.namenode.tcp.queueDepth 32
>> 19/07/01 17:18:17 INFO crail: crail.namenode.tcp.messageSize 512
>> 19/07/01 17:18:17 INFO crail: crail.namenode.tcp.cores 2
>> 19/07/01 17:18:17 INFO crail: rdma storage server started, address /192.168.3.100:50020, persistent false, maxWR 32, maxSge 4, cqSize 3200
>> 19/07/01 17:18:17 INFO disni: starting accept
>> 19/07/01 17:18:18 INFO crail: connected to namenode(s) minnie/192.168.1.164:9060
>> 19/07/01 17:18:18 INFO crail: datanode statistics, freeBlocks 1024
>> 19/07/01 17:18:18 INFO crail: datanode statistics, freeBlocks 2048
>> 19/07/01 17:18:19 INFO crail: datanode statistics, freeBlocks 3072
>> 19/07/01 17:18:19 INFO crail: datanode statistics, freeBlocks 4096
>> 19/07/01 17:18:19 INFO crail: datanode statistics, freeBlocks 4096
>>
>> NVMf datanode is showing 1TB.
>>
>> 19/07/01 17:23:57 INFO crail: datanode statistics, freeBlocks 1048576
>>
>> Regards,
>>
>> David
>>
>> ________________________________
>> From: David Crespi
>> Sent: Monday, July 1, 2019 3:57:42 PM
>> To: Jonas Pfefferle; dev@crail.apache.org
>> Subject: RE: Setting up storage class 1 and 2
>>
>> A standard pull from the repo, one that didn’t have the patches from
>> your private repo. I can put the patches back in both the client and
>> server containers if you really think it would make a difference.
>>
>> Are you guys running multiple types together? I’m running an RDMA
>> storage class 1, a NVMf Storage Class 1 and a NVMf Storage Class 2
>> together. I get errors when the RDMA is introduced into the mix. I
>> have a small amount of memory (4GB) assigned to the RDMA tier, and I'm
>> looking for it to fall into the NVMf class 1 tier. It appears to want
>> to do that, but gets screwed up… it looks like it’s trying to create
>> another set of qp’s for an RDMA connection. It even blew up spdk
>> trying to accomplish that.
>>
>> Do you guys have some documentation that shows what’s been tested
>> (mixes/variations) so far?
>>
>> Regards,
>>
>> David
>>
>> ________________________________
>> From: Jonas Pfefferle
>> Sent: Monday, July 1, 2019 12:51:09 AM
>> To: dev@crail.apache.org; David Crespi
>> Subject: Re: Setting up storage class 1 and 2
>>
>> Hi David,
>>
>> Can you clarify which unpatched version you are talking about? Are you
>> talking about the NVMf thread fix where I sent you a link to a branch
>> in my repository, or the fix we provided earlier for the Spark hang in
>> the Crail master?
>>
>> Generally, if you update, update all: clients and datanode/namenode.
>>
>> Regards,
>> Jonas
>>
>> On Fri, 28 Jun 2019 17:59:32 +0000, David Crespi wrote:
>>> Jonas,
>>> FYI - I went back to using the unpatched version of crail on the
>>> clients and it appears to work okay now with the shuffle and RDMA,
>>> with only the RDMA containers running on the server.
>>>
>>> Regards,
>>>
>>> David
>>>
>>> ________________________________
>>> From: David Crespi
>>> Sent: Friday, June 28, 2019 7:49:51 AM
>>> To: Jonas Pfefferle; dev@crail.apache.org
>>> Subject: RE: Setting up storage class 1 and 2
>>>
>>> Oh, and while I’m thinking about it Jonas, when I added the patches
>>> you provided the other day, I only added them to the spark containers
>>> (clients), not to my crail containers running on my storage server.
>>> Should the patches have been added to all of the containers?
>>>
>>> Regards,
>>>
>>> David
>>>
>>> ________________________________
>>> From: Jonas Pfefferle
>>> Sent: Friday, June 28, 2019 12:54:27 AM
>>> To: dev@crail.apache.org; David Crespi
>>> Subject: Re: Setting up storage class 1 and 2
>>>
>>> Hi David,
>>>
>>> At the moment, it is possible to add a NVMf datanode even if only the
>>> RDMA storage type is specified in the config. As you have seen, this
>>> will go wrong as soon as a client tries to connect to the datanode.
>>> Make sure to start the RDMA datanode with the appropriate classname,
>>> see: https://incubator-crail.readthedocs.io/en/latest/run.html
>>> The correct classname is org.apache.crail.storage.rdma.RdmaStorageTier.
>>>
>>> Regards,
>>> Jonas
>>>
>>> On Thu, 27 Jun 2019 23:09:26 +0000, David Crespi wrote:
>>>> Hi,
>>>> I’m trying to integrate the storage classes and I’m hitting another
>>>> issue when running terasort and just using the crail-shuffle with
>>>> HDFS as the tmp storage.
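The "identical configuration" requirement that comes up repeatedly in this thread boils down to every node sharing the same crail-site.conf storage section, including the order of storage types. The fragment below is only a sketch, reconstructed from the values in the namenode log quoted earlier in this thread (hostname minnie, NVMf-then-RDMA type order):

```
# crail-site.conf fragment (sketch, values from the namenode log above);
# must be identical on every node, including the order of storage types:
crail.namenode.address   crail://minnie:9060
crail.storage.types      org.apache.crail.storage.nvmf.NvmfStorageTier,org.apache.crail.storage.rdma.RdmaStorageTier
crail.storage.classes    2
crail.storage.rootclass  1
```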
>>>> The program just sits, after the following message:
>>>>
>>>> 19/06/27 15:59:20 DEBUG Client: IPC Client (1998371610) connection to NameNode-1/192.168.3.7:54310 from hduser: closed
>>>> 19/06/27 15:59:20 DEBUG Client: IPC Client (1998371610) connection to NameNode-1/192.168.3.7:54310 from hduser: stopped, remaining connections 0
>>>>
>>>> During this run, I’ve removed the two crail nvmf (class 1 and 2)
>>>> containers from the server, and I’m only running the namenode and a
>>>> rdma storage class 1 datanode. My spark configuration is also now
>>>> only looking at the rdma class. It looks as though it’s picking up
>>>> the NVMf IP and port in the INFO messages seen below. I must be
>>>> configuring something wrong, but I’ve not been able to track it
>>>> down. Any thoughts?
>>>>
>>>> ************************************
>>>> TeraSort
>>>> ************************************
>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>> SLF4J: Found binding in [jar:file:/crail/jars/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>> SLF4J: Found binding in [jar:file:/crail/jars/jnvmf-1.6-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>> SLF4J: Found binding in [jar:file:/crail/jars/disni-2.1-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>> SLF4J: Found binding in [jar:file:/usr/spark-2.4.2/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>> 19/06/27 15:59:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>> 19/06/27 15:59:07 INFO SparkContext: Running Spark version 2.4.2
>>>> 19/06/27 15:59:07 INFO SparkContext: Submitted application: TeraSort
>>>> 19/06/27 15:59:07 INFO SecurityManager: Changing view acls to: hduser
>>>> 19/06/27 15:59:07 INFO SecurityManager: Changing modify acls to: hduser
>>>> 19/06/27 15:59:07 INFO SecurityManager: Changing view acls groups to:
>>>> 19/06/27 15:59:07 INFO SecurityManager: Changing modify acls groups to:
>>>> 19/06/27 15:59:07 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hduser); groups with view permissions: Set(); users with modify permissions: Set(hduser); groups with modify permissions: Set()
>>>> 19/06/27 15:59:08 DEBUG InternalLoggerFactory: Using SLF4J as the default logging framework
>>>> 19/06/27 15:59:08 DEBUG InternalThreadLocalMap: -Dio.netty.threadLocalMap.stringBuilder.initialSize: 1024
>>>> 19/06/27 15:59:08 DEBUG InternalThreadLocalMap: -Dio.netty.threadLocalMap.stringBuilder.maxSize: 4096
>>>> 19/06/27 15:59:08 DEBUG MultithreadEventLoopGroup: -Dio.netty.eventLoopThreads: 112
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent0: -Dio.netty.noUnsafe: false
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent0: Java version: 8
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent0: sun.misc.Unsafe.theUnsafe: available
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent0: sun.misc.Unsafe.copyMemory: available
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent0: java.nio.Buffer.address: available
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent0: direct buffer constructor: available
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent0: java.nio.Bits.unaligned: available, true
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent0: jdk.internal.misc.Unsafe.allocateUninitializedArray(int): unavailable prior to Java9
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent0: java.nio.DirectByteBuffer.(long, int): available
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent: sun.misc.Unsafe: available
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent: -Dio.netty.tmpdir: /tmp (java.io.tmpdir)
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent: -Dio.netty.bitMode: 64 (sun.arch.data.model)
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent: -Dio.netty.noPreferDirect: false
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent: -Dio.netty.maxDirectMemory: 1029177344 bytes
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent: -Dio.netty.uninitializedArrayAllocationThreshold: -1
>>>> 19/06/27 15:59:08 DEBUG CleanerJava6: java.nio.ByteBuffer.cleaner(): available
>>>> 19/06/27 15:59:08 DEBUG NioEventLoop: -Dio.netty.noKeySetOptimization: false
>>>> 19/06/27 15:59:08 DEBUG NioEventLoop: -Dio.netty.selectorAutoRebuildThreshold: 512
>>>> 19/06/27 15:59:08 DEBUG PlatformDependent: org.jctools-core.MpscChunkedArrayQueue: available
>>>> 19/06/27 15:59:08 DEBUG ResourceLeakDetector: -Dio.netty.leakDetection.level: simple
>>>> 19/06/27 15:59:08 DEBUG ResourceLeakDetector: -Dio.netty.leakDetection.targetRecords: 4
>>>> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.numHeapArenas: 9
>>>> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.numDirectArenas: 10
>>>> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.pageSize: 8192
>>>> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.maxOrder: 11
>>>> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.chunkSize: 16777216
>>>> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.tinyCacheSize: 512
>>>> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.smallCacheSize: 256
>>>> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.normalCacheSize: 64
>>>> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.maxCachedBufferCapacity: 32768
>>>> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.cacheTrimInterval: 8192
>>>> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.useCacheForAllThreads: true
>>>> 19/06/27 15:59:08 DEBUG DefaultChannelId: -Dio.netty.processId: 2236 (auto-detected)
>>>> 19/06/27 15:59:08 DEBUG NetUtil: -Djava.net.preferIPv4Stack: false
>>>> 19/06/27 15:59:08 DEBUG NetUtil: -Djava.net.preferIPv6Addresses: false
>>>> 19/06/27 15:59:08 DEBUG NetUtil: Loopback interface: lo (lo, 127.0.0.1)
>>>> 19/06/27 15:59:08 DEBUG NetUtil: /proc/sys/net/core/somaxconn: 128
>>>> 19/06/27 15:59:08 DEBUG DefaultChannelId: -Dio.netty.machineId: 02:42:ac:ff:fe:1b:00:02 (auto-detected)
>>>> 19/06/27 15:59:08 DEBUG ByteBufUtil: -Dio.netty.allocator.type: pooled
>>>> 19/06/27 15:59:08 DEBUG ByteBufUtil: -Dio.netty.threadLocalDirectBufferSize: 65536
>>>> 19/06/27 15:59:08 DEBUG ByteBufUtil: -Dio.netty.maxThreadLocalCharBufferSize: 16384
>>>> 19/06/27 15:59:08 DEBUG TransportServer: Shuffle server started on port: 36915
>>>> 19/06/27 15:59:08 INFO Utils: Successfully started service 'sparkDriver' on port 36915.
>>>> 19/06/27 15:59:08 DEBUG SparkEnv: Using serializer: class org.apache.spark.serializer.KryoSerializer
>>>> 19/06/27 15:59:08 INFO SparkEnv: Registering MapOutputTracker
>>>> 19/06/27 15:59:08 DEBUG MapOutputTrackerMasterEndpoint: init
>>>> 19/06/27 15:59:08 INFO CrailShuffleManager: crail shuffle started
>>>> 19/06/27 15:59:08 INFO SparkEnv: Registering BlockManagerMaster
>>>> 19/06/27 15:59:08 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
>>>> 19/06/27 15:59:08 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
>>>> 19/06/27 15:59:08 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-15237510-f459-40e3-8390-10f4742930a5
>>>> 19/06/27 15:59:08 DEBUG DiskBlockManager: Adding shutdown hook
>>>> 19/06/27 15:59:08 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
>>>> 19/06/27 15:59:08 INFO SparkEnv: Registering OutputCommitCoordinator
>>>> 19/06/27 15:59:08 DEBUG OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: init
>>>> 19/06/27 15:59:08 DEBUG SecurityManager: Created SSL options for ui: SSLOptions{enabled=false, port=None, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
>>>> 19/06/27 15:59:08 INFO Utils: Successfully started service 'SparkUI' on port 4040.
>>>> 19/06/27 15:59:08 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.1.161:4040
>>>> 19/06/27 15:59:08 INFO SparkContext: Added JAR file:/spark-terasort/target/spark-terasort-1.1-SNAPSHOT-jar-with-dependencies.jar at spark://master:36915/jars/spark-terasort-1.1-SNAPSHOT-jar-with-dependencies.jar with timestamp 1561676348562
>>>> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://master:7077...
>>>> 19/06/27 15:59:08 DEBUG TransportClientFactory: Creating new connection to master/192.168.3.13:7077
>>>> 19/06/27 15:59:08 DEBUG AbstractByteBuf: -Dio.netty.buffer.bytebuf.checkAccessible: true
>>>> 19/06/27 15:59:08 DEBUG ResourceLeakDetectorFactory: Loaded default ResourceLeakDetector: io.netty.util.ResourceLeakDetector@5b1bb5d2
>>>> 19/06/27 15:59:08 DEBUG TransportClientFactory: Connection to master/192.168.3.13:7077 successful, running bootstraps...
>>>> 19/06/27 15:59:08 INFO TransportClientFactory: Successfully created connection to master/192.168.3.13:7077 after 41 ms (0 ms spent in bootstraps)
>>>> 19/06/27 15:59:08 DEBUG Recycler: -Dio.netty.recycler.maxCapacityPerThread: 32768
>>>> 19/06/27 15:59:08 DEBUG Recycler: -Dio.netty.recycler.maxSharedCapacityFactor: 2
>>>> 19/06/27 15:59:08 DEBUG Recycler: -Dio.netty.recycler.linkCapacity: 16
>>>> 19/06/27 15:59:08 DEBUG Recycler: -Dio.netty.recycler.ratio: 8
>>>> 19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20190627155908-0005
>>>> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20190627155908-0005/0 on worker-20190627152154-192.168.3.11-8882 (192.168.3.11:8882) with 2 core(s)
>>>> 19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor ID app-20190627155908-0005/0 on hostPort 192.168.3.11:8882 with 2 core(s), 1024.0 MB RAM
>>>> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20190627155908-0005/1 on worker-20190627152150-192.168.3.12-8881 (192.168.3.12:8881) with 2 core(s)
>>>> 19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor ID app-20190627155908-0005/1 on hostPort 192.168.3.12:8881 with 2 core(s), 1024.0 MB RAM
>>>> 19/06/27 15:59:08 DEBUG TransportServer: Shuffle server started on port: 39189
>>>> 19/06/27 15:59:08 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39189.
>>>> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20190627155908-0005/2 on worker-20190627152203-192.168.3.9-8884 (192.168.3.9:8884) with 2 core(s)
>>>> 19/06/27 15:59:08 INFO NettyBlockTransferService: Server created on master:39189
>>>> 19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor ID app-20190627155908-0005/2 on hostPort 192.168.3.9:8884 with 2 core(s), 1024.0 MB RAM
>>>> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20190627155908-0005/3 on worker-20190627152158-192.168.3.10-8883 (192.168.3.10:8883) with 2 core(s)
>>>> 19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor ID app-20190627155908-0005/3 on hostPort 192.168.3.10:8883 with 2 core(s), 1024.0 MB RAM
>>>> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20190627155908-0005/4 on worker-20190627152207-192.168.3.8-8885 (192.168.3.8:8885) with 2 core(s)
>>>> 19/06/27 15:59:08 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
>>>> 19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor ID app-20190627155908-0005/4 on hostPort 192.168.3.8:8885 with 2 core(s), 1024.0 MB RAM
>>>> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20190627155908-0005/0 is now RUNNING
>>>> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20190627155908-0005/3 is now RUNNING
>>>> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20190627155908-0005/4 is now RUNNING
>>>> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20190627155908-0005/1 is now RUNNING
>>>> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20190627155908-0005/2 is now RUNNING
>>>> 19/06/27 15:59:08 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, master, 39189, None)
>>>> 19/06/27 15:59:08 DEBUG DefaultTopologyMapper: Got a request for master
>>>> 19/06/27 15:59:08 INFO BlockManagerMasterEndpoint: Registering block manager master:39189 with 366.3 MB RAM, BlockManagerId(driver, master, 39189, None)
>>>> 19/06/27 15:59:08 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, master, 39189, None)
>>>> 19/06/27 15:59:08 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, master, 39189, None)
>>>> 19/06/27 15:59:09 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
>>>> 19/06/27 15:59:09 DEBUG SparkContext: Adding shutdown hook
>>>> 19/06/27 15:59:09 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
>>>> 19/06/27 15:59:09 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = false
>>>> 19/06/27 15:59:09 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
>>>> 19/06/27 15:59:09 DEBUG BlockReaderLocal: dfs.domain.socket.path =
>>>> 19/06/27 15:59:09 DEBUG RetryUtils: multipleLinearRandomRetry = null
>>>> 19/06/27 15:59:09 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@23f3dbf0
>>>> 19/06/27 15:59:09 DEBUG Client: getting client out of cache: org.apache.hadoop.ipc.Client@3ed03652
>>>> 19/06/27 15:59:09 DEBUG PerformanceAdvisory: Both short-circuit local reads and UNIX domain socket are disabled.
>>>> 19/06/27 15:59:09 DEBUG DataTransferSaslUtil: DataTransferProtocol >>>>not using SaslPropertiesResolver, no QOP found in configuration for >>>>dfs.data.transfer.protection >>>> 19/06/27 15:59:10 INFO MemoryStore: Block broadcast_0 stored as >>>>values in memory (estimated size 288.9 KB, free 366.0 MB) >>>> 19/06/27 15:59:10 DEBUG BlockManager: Put block broadcast_0 locally >>>>took 115 ms >>>> 19/06/27 15:59:10 DEBUG BlockManager: Putting block broadcast_0 >>>>without replication took 117 ms >>>> 19/06/27 15:59:10 INFO MemoryStore: Block broadcast_0_piece0 stored >>>>as bytes in memory (estimated size 23.8 KB, free 366.0 MB) >>>> 19/06/27 15:59:10 INFO BlockManagerInfo: Added broadcast_0_piece0 in >>>>memory on master:39189 (size: 23.8 KB, free: 366.3 MB) >>>> 19/06/27 15:59:10 DEBUG BlockManagerMaster: Updated info of block >>>>broadcast_0_piece0 >>>> 19/06/27 15:59:10 DEBUG BlockManager: Told master about block >>>>broadcast_0_piece0 >>>> 19/06/27 15:59:10 DEBUG BlockManager: Put block broadcast_0_piece0 >>>>locally took 6 ms >>>> 19/06/27 15:59:10 DEBUG BlockManager: Putting block >>>>broadcast_0_piece0 without replication took 6 ms >>>> 19/06/27 15:59:10 INFO SparkContext: Created broadcast 0 from >>>>newAPIHadoopFile at TeraSort.scala:60 >>>> 19/06/27 15:59:10 DEBUG Client: The ping interval is 60000 ms. 
>>>> 19/06/27 15:59:10 DEBUG Client: Connecting to >>>>NameNode-1/192.168.3.7:54310 >>>> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection >>>>to NameNode-1/192.168.3.7:54310 from hduser: starting, having >>>>connections 1 >>>> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection >>>>to NameNode-1/192.168.3.7:54310 from hduser sending #0 >>>> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection >>>>to NameNode-1/192.168.3.7:54310 from hduser got value #0 >>>> 19/06/27 15:59:10 DEBUG ProtobufRpcEngine: Call: getFileInfo took >>>>31ms >>>> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection >>>>to NameNode-1/192.168.3.7:54310 from hduser sending #1 >>>> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection >>>>to NameNode-1/192.168.3.7:54310 from hduser got value #1 >>>> 19/06/27 15:59:10 DEBUG ProtobufRpcEngine: Call: getListing took 5ms >>>> 19/06/27 15:59:10 DEBUG FileInputFormat: Time taken to get >>>>FileStatuses: 134 >>>> 19/06/27 15:59:10 INFO FileInputFormat: Total input paths to process >>>>: 2 >>>> 19/06/27 15:59:10 DEBUG FileInputFormat: Total # of splits generated >>>>by getSplits: 2, TimeTaken: 139 >>>> 19/06/27 15:59:10 DEBUG FileCommitProtocol: Creating committer >>>>org.apache.spark.internal.io.HadoopMapReduceCommitProtocol; job 1; >>>>output=hdfs://NameNode-1:54310/tmp/data_sort; dynamic=false >>>> 19/06/27 15:59:10 DEBUG FileCommitProtocol: Using (String, String, >>>>Boolean) constructor >>>> 19/06/27 15:59:10 INFO FileOutputCommitter: File Output Committer >>>>Algorithm version is 1 >>>> 19/06/27 15:59:10 DEBUG DFSClient: /tmp/data_sort/_temporary/0: >>>>masked=rwxr-xr-x >>>> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection >>>>to NameNode-1/192.168.3.7:54310 from hduser sending #2 >>>> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection >>>>to NameNode-1/192.168.3.7:54310 from hduser got value #2 >>>> 19/06/27 15:59:10 DEBUG ProtobufRpcEngine: 
Call: mkdirs took 3ms >>>> 19/06/27 15:59:10 DEBUG ClosureCleaner: Cleaning lambda: >>>>$anonfun$write$1 >>>> 19/06/27 15:59:10 DEBUG ClosureCleaner: +++ Lambda closure >>>>($anonfun$write$1) is now cleaned +++ >>>> 19/06/27 15:59:10 INFO SparkContext: Starting job: runJob at >>>>SparkHadoopWriter.scala:78 >>>> 19/06/27 15:59:10 INFO CrailDispatcher: CrailStore starting version >>>>400 >>>> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.deleteonclose >>>>false >>>> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.deleteOnStart >>>>true >>>> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.preallocate 0 >>>> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.writeAhead 0 >>>> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.debug false >>>> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.serializer >>>>org.apache.spark.serializer.CrailSparkSerializer >>>> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.shuffle.affinity >>>>true >>>> 19/06/27 15:59:10 INFO CrailDispatcher: >>>>spark.crail.shuffle.outstanding 1 >>>> 19/06/27 15:59:10 INFO CrailDispatcher: >>>>spark.crail.shuffle.storageclass 0 >>>> 19/06/27 15:59:10 INFO CrailDispatcher: >>>>spark.crail.broadcast.storageclass 0 >>>> 19/06/27 15:59:10 INFO crail: creating singleton crail file system >>>> 19/06/27 15:59:10 INFO crail: crail.version 3101 >>>> 19/06/27 15:59:10 INFO crail: crail.directorydepth 16 >>>> 19/06/27 15:59:10 INFO crail: crail.tokenexpiration 10 >>>> 19/06/27 15:59:10 INFO crail: crail.blocksize 1048576 >>>> 19/06/27 15:59:10 INFO crail: crail.cachelimit 0 >>>> 19/06/27 15:59:10 INFO crail: crail.cachepath /dev/hugepages/cache >>>> 19/06/27 15:59:10 INFO crail: crail.user crail >>>> 19/06/27 15:59:10 INFO crail: crail.shadowreplication 1 >>>> 19/06/27 15:59:10 INFO crail: crail.debug true >>>> 19/06/27 15:59:10 INFO crail: crail.statistics true >>>> 19/06/27 15:59:10 INFO crail: crail.rpctimeout 1000 >>>> 19/06/27 15:59:10 INFO crail: crail.datatimeout 1000 >>>> 19/06/27 
15:59:10 INFO crail: crail.buffersize 1048576 >>>> 19/06/27 15:59:10 INFO crail: crail.slicesize 65536 >>>> 19/06/27 15:59:10 INFO crail: crail.singleton true >>>> 19/06/27 15:59:10 INFO crail: crail.regionsize 1073741824 >>>> 19/06/27 15:59:10 INFO crail: crail.directoryrecord 512 >>>> 19/06/27 15:59:10 INFO crail: crail.directoryrandomize true >>>> 19/06/27 15:59:10 INFO crail: crail.cacheimpl >>>>org.apache.crail.memory.MappedBufferCache >>>> 19/06/27 15:59:10 INFO crail: crail.locationmap >>>> 19/06/27 15:59:10 INFO crail: crail.namenode.address >>>>crail://192.168.1.164:9060 >>>> 19/06/27 15:59:10 INFO crail: crail.namenode.blockselection >>>>roundrobin >>>> 19/06/27 15:59:10 INFO crail: crail.namenode.fileblocks 16 >>>> 19/06/27 15:59:10 INFO crail: crail.namenode.rpctype >>>>org.apache.crail.namenode.rpc.tcp.TcpNameNode >>>> 19/06/27 15:59:10 INFO crail: crail.namenode.log >>>> 19/06/27 15:59:10 INFO crail: crail.storage.types >>>>org.apache.crail.storage.rdma.RdmaStorageTier >>>> 19/06/27 15:59:10 INFO crail: crail.storage.classes 1 >>>> 19/06/27 15:59:10 INFO crail: crail.storage.rootclass 0 >>>> 19/06/27 15:59:10 INFO crail: crail.storage.keepalive 2 >>>> 19/06/27 15:59:10 INFO crail: buffer cache, allocationCount 0, >>>>bufferCount 1024 >>>> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.interface eth0 >>>> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.port 50020 >>>> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.storagelimit >>>>4294967296 >>>> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.allocationsize >>>>1073741824 >>>> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.datapath >>>>/dev/hugepages/rdma >>>> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.localmap true >>>> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.queuesize 32 >>>> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.type passive >>>> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.backlog 100 >>>> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.connecttimeout 1000 
>>>> 19/06/27 15:59:10 INFO narpc: new NaRPC server group v1.0, >>>>queueDepth 32, messageSize 512, nodealy true >>>> 19/06/27 15:59:10 INFO crail: crail.namenode.tcp.queueDepth 32 >>>> 19/06/27 15:59:10 INFO crail: crail.namenode.tcp.messageSize 512 >>>> 19/06/27 15:59:10 INFO crail: crail.namenode.tcp.cores 1 >>>> 19/06/27 15:59:10 INFO crail: connected to namenode(s) >>>>/192.168.1.164:9060 >>>> 19/06/27 15:59:10 INFO CrailDispatcher: creating main dir /spark >>>> 19/06/27 15:59:10 INFO crail: lookupDirectory: path /spark >>>> 19/06/27 15:59:10 INFO CrailDispatcher: creating main dir /spark >>>> 19/06/27 15:59:10 INFO crail: createNode: name /spark, type >>>>DIRECTORY, storageAffinity 0, locationAffinity 0 >>>> 19/06/27 15:59:10 INFO crail: CoreOutputStream, open, path /, fd 0, >>>>streamId 1, isDir true, writeHint 0 >>>> 19/06/27 15:59:10 INFO crail: passive data client >>>> 19/06/27 15:59:10 INFO disni: creating RdmaProvider of type 'nat' >>>> 19/06/27 15:59:10 INFO disni: jverbs jni version 32 >>>> 19/06/27 15:59:10 INFO disni: sock_addr_in size mismatch, jverbs >>>>size 28, native size 16 >>>> 19/06/27 15:59:10 INFO disni: IbvRecvWR size match, jverbs size 32, >>>>native size 32 >>>> 19/06/27 15:59:10 INFO disni: IbvSendWR size mismatch, jverbs size >>>>72, native size 128 >>>> 19/06/27 15:59:10 INFO disni: IbvWC size match, jverbs size 48, >>>>native size 48 >>>> 19/06/27 15:59:10 INFO disni: IbvSge size match, jverbs size 16, >>>>native size 16 >>>> 19/06/27 15:59:10 INFO disni: Remote addr offset match, jverbs size >>>>40, native size 40 >>>> 19/06/27 15:59:10 INFO disni: Rkey offset match, jverbs size 48, >>>>native size 48 >>>> 19/06/27 15:59:10 INFO disni: createEventChannel, objId >>>>139811924587312 >>>> 19/06/27 15:59:10 INFO disni: passive endpoint group, maxWR 32, >>>>maxSge 4, cqSize 64 >>>> 19/06/27 15:59:10 INFO disni: launching cm processor, cmChannel 0 >>>> 19/06/27 15:59:10 INFO disni: createId, id 139811924676432 >>>> 19/06/27 15:59:10 INFO 
disni: new client endpoint, id 0, idPriv 0 >>>> 19/06/27 15:59:10 INFO disni: resolveAddr, addres >>>>/192.168.3.100:4420 >>>> 19/06/27 15:59:10 INFO disni: resolveRoute, id 0 >>>> 19/06/27 15:59:10 INFO disni: allocPd, objId 139811924679808 >>>> 19/06/27 15:59:10 INFO disni: setting up protection domain, context >>>>467, pd 1 >>>> 19/06/27 15:59:10 INFO disni: setting up cq processor >>>> 19/06/27 15:59:10 INFO disni: new endpoint CQ processor >>>> 19/06/27 15:59:10 INFO disni: createCompChannel, context >>>>139810647883744 >>>> 19/06/27 15:59:10 INFO disni: createCQ, objId 139811924680688, ncqe >>>>64 >>>> 19/06/27 15:59:10 INFO disni: createQP, objId 139811924691192, >>>>send_wr size 32, recv_wr_size 32 >>>> 19/06/27 15:59:10 INFO disni: connect, id 0 >>>> 19/06/27 15:59:10 INFO disni: got event type + UNKNOWN, srcAddress >>>>/192.168.3.13:43273, dstAddress /192.168.3.100:4420 >>>> 19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: >>>>Registered executor NettyRpcEndpointRef(spark-client://Executor) >>>>(192.168.3.11:35854) with ID 0 >>>> 19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: >>>>Registered executor NettyRpcEndpointRef(spark-client://Executor) >>>>(192.168.3.12:44312) with ID 1 >>>> 19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: >>>>Registered executor NettyRpcEndpointRef(spark-client://Executor) >>>>(192.168.3.8:34774) with ID 4 >>>> 19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: >>>>Registered executor NettyRpcEndpointRef(spark-client://Executor) >>>>(192.168.3.9:58808) with ID 2 >>>> 19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for >>>>192.168.3.11 >>>> 19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block >>>>manager 192.168.3.11:41919 with 366.3 MB RAM, BlockManagerId(0, >>>>192.168.3.11, 41919, None) >>>> 19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for >>>>192.168.3.12 >>>> 19/06/27 15:59:11 INFO 
BlockManagerMasterEndpoint: Registering block >>>>manager 192.168.3.12:46697 with 366.3 MB RAM, BlockManagerId(1, >>>>192.168.3.12, 46697, None) >>>> 19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for >>>>192.168.3.8 >>>> 19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block >>>>manager 192.168.3.8:37281 with 366.3 MB RAM, BlockManagerId(4, >>>>192.168.3.8, 37281, None) >>>> 19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for >>>>192.168.3.9 >>>> 19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block >>>>manager 192.168.3.9:43857 with 366.3 MB RAM, BlockManagerId(2, >>>>192.168.3.9, 43857, None) >>>> 19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: >>>>Registered executor NettyRpcEndpointRef(spark-client://Executor) >>>>(192.168.3.10:40100) with ID 3 >>>> 19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for >>>>192.168.3.10 >>>> 19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block >>>>manager 192.168.3.10:38527 with 366.3 MB RAM, BlockManagerId(3, >>>>192.168.3.10, 38527, None) >>>> 19/06/27 15:59:20 DEBUG Client: IPC Client (1998371610) connection >>>>to NameNode-1/192.168.3.7:54310 from hduser: closed >>>> 19/06/27 15:59:20 DEBUG Client: IPC Client (1998371610) connection >>>>to NameNode-1/192.168.3.7:54310 from hduser: stopped, remaining >>>>connections 0 >>>> >>>> >>>> Regards, >>>> >>>> David >>>> >>>