crail-dev mailing list archives

From David Crespi <david.cre...@storedgesystems.com>
Subject RE: Setting up storage class 1 and 2
Date Tue, 02 Jul 2019 00:35:27 GMT
Bounced on the first attempt.

Regards,

           David
From: David Crespi <david.crespi@storedgesystems.com>
Sent: Monday, July 1, 2019 5:27 PM
To: dev@crail.apache.org; Jonas Pfefferle <pepperjo@japf.ch>
Subject: RE: Setting up storage class 1 and 2


Jonas,

Just wanted to be sure I'm doing things correctly.  It runs okay without adding in the NVMf datanode (i.e. it completes teragen).  When I add the NVMf node in, even without using it on the run, it hangs during the terasort, with nothing being written to the datanode – only the metadata is created (i.e. /spark).



My config is:

1 namenode container

1 rdma datanode storage class 1 container

1 nvmf datanode storage class 1 container.



The namenode is showing that both datanodes are starting up as type 0 in storage class 0... is that correct?



NameNode log at startup:

19/07/01 17:18:16 INFO crail: initalizing namenode

19/07/01 17:18:16 INFO crail: crail.version 3101

19/07/01 17:18:16 INFO crail: crail.directorydepth 16

19/07/01 17:18:16 INFO crail: crail.tokenexpiration 10

19/07/01 17:18:16 INFO crail: crail.blocksize 1048576

19/07/01 17:18:16 INFO crail: crail.cachelimit 0

19/07/01 17:18:16 INFO crail: crail.cachepath /dev/hugepages/cache

19/07/01 17:18:16 INFO crail: crail.user crail

19/07/01 17:18:16 INFO crail: crail.shadowreplication 1

19/07/01 17:18:16 INFO crail: crail.debug true

19/07/01 17:18:16 INFO crail: crail.statistics false

19/07/01 17:18:16 INFO crail: crail.rpctimeout 1000

19/07/01 17:18:16 INFO crail: crail.datatimeout 1000

19/07/01 17:18:16 INFO crail: crail.buffersize 1048576

19/07/01 17:18:16 INFO crail: crail.slicesize 65536

19/07/01 17:18:16 INFO crail: crail.singleton true

19/07/01 17:18:16 INFO crail: crail.regionsize 1073741824

19/07/01 17:18:16 INFO crail: crail.directoryrecord 512

19/07/01 17:18:16 INFO crail: crail.directoryrandomize true

19/07/01 17:18:16 INFO crail: crail.cacheimpl org.apache.crail.memory.MappedBufferCache

19/07/01 17:18:16 INFO crail: crail.locationmap

19/07/01 17:18:16 INFO crail: crail.namenode.address crail://minnie:9060?id=0&size=1

19/07/01 17:18:16 INFO crail: crail.namenode.blockselection roundrobin

19/07/01 17:18:16 INFO crail: crail.namenode.fileblocks 16

19/07/01 17:18:16 INFO crail: crail.namenode.rpctype org.apache.crail.namenode.rpc.tcp.TcpNameNode

19/07/01 17:18:16 INFO crail: crail.namenode.log

19/07/01 17:18:16 INFO crail: crail.storage.types org.apache.crail.storage.nvmf.NvmfStorageTier,org.apache.crail.storage.rdma.RdmaStorageTier

19/07/01 17:18:16 INFO crail: crail.storage.classes 2

19/07/01 17:18:16 INFO crail: crail.storage.rootclass 1

19/07/01 17:18:16 INFO crail: crail.storage.keepalive 2

19/07/01 17:18:16 INFO crail: round robin block selection

19/07/01 17:18:16 INFO crail: round robin block selection

19/07/01 17:18:16 INFO narpc: new NaRPC server group v1.0, queueDepth 32, messageSize 512, nodealy true, cores 2

19/07/01 17:18:16 INFO crail: crail.namenode.tcp.queueDepth 32

19/07/01 17:18:16 INFO crail: crail.namenode.tcp.messageSize 512

19/07/01 17:18:16 INFO crail: crail.namenode.tcp.cores 2

19/07/01 17:18:17 INFO crail: new connection from /192.168.1.164:39260

19/07/01 17:18:17 INFO narpc: adding new channel to selector, from /192.168.1.164:39260

19/07/01 17:18:17 INFO crail: adding datanode /192.168.3.100:4420 of type 0 to storage class 0

19/07/01 17:18:17 INFO crail: new connection from /192.168.1.164:39262

19/07/01 17:18:17 INFO narpc: adding new channel to selector, from /192.168.1.164:39262

19/07/01 17:18:18 INFO crail: adding datanode /192.168.3.100:50020 of type 0 to storage class 0
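A minimal sketch for pulling the datanode registrations out of a namenode log, so the reported type and storage class of each datanode can be compared at a glance. The two sample lines are copied from the log above with the wrapped lines joined; the regular expression and the `registrations` helper are illustrative assumptions, not part of Crail. Port 50020 matches the `crail.storage.rdma.port` setting of the RDMA datanode, so the entry on port 4420 is presumably the NVMf datanode.

```python
import re

# Sample lines copied from the namenode log above (wrapped lines joined).
LOG = """\
19/07/01 17:18:17 INFO crail: adding datanode /192.168.3.100:4420 of type 0 to storage class 0
19/07/01 17:18:18 INFO crail: adding datanode /192.168.3.100:50020 of type 0 to storage class 0
"""

# Assumed message layout, inferred only from the two lines above.
PATTERN = re.compile(
    r"adding datanode /(?P<addr>[\d.]+):(?P<port>\d+)\s+"
    r"of type (?P<type>\d+) to storage class\s+(?P<cls>\d+)"
)


def registrations(log_text):
    """Return (address, port, storage type, storage class) per registered datanode."""
    return [
        (m["addr"], int(m["port"]), int(m["type"]), int(m["cls"]))
        for m in PATTERN.finditer(log_text)
    ]


for addr, port, stype, sclass in registrations(LOG):
    print(f"{addr}:{port} type={stype} class={sclass}")
```

Run against this log, both datanodes come back with type 0 and storage class 0, which is the behaviour being asked about.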



The RDMA datanode – it is set to have 4x1GB hugepages:

19/07/01 17:18:17 INFO crail: crail.version 3101

19/07/01 17:18:17 INFO crail: crail.directorydepth 16

19/07/01 17:18:17 INFO crail: crail.tokenexpiration 10

19/07/01 17:18:17 INFO crail: crail.blocksize 1048576

19/07/01 17:18:17 INFO crail: crail.cachelimit 0

19/07/01 17:18:17 INFO crail: crail.cachepath /dev/hugepages/cache

19/07/01 17:18:17 INFO crail: crail.user crail

19/07/01 17:18:17 INFO crail: crail.shadowreplication 1

19/07/01 17:18:17 INFO crail: crail.debug true

19/07/01 17:18:17 INFO crail: crail.statistics false

19/07/01 17:18:17 INFO crail: crail.rpctimeout 1000

19/07/01 17:18:17 INFO crail: crail.datatimeout 1000

19/07/01 17:18:17 INFO crail: crail.buffersize 1048576

19/07/01 17:18:17 INFO crail: crail.slicesize 65536

19/07/01 17:18:17 INFO crail: crail.singleton true

19/07/01 17:18:17 INFO crail: crail.regionsize 1073741824

19/07/01 17:18:17 INFO crail: crail.directoryrecord 512

19/07/01 17:18:17 INFO crail: crail.directoryrandomize true

19/07/01 17:18:17 INFO crail: crail.cacheimpl org.apache.crail.memory.MappedBufferCache

19/07/01 17:18:17 INFO crail: crail.locationmap

19/07/01 17:18:17 INFO crail: crail.namenode.address crail://minnie:9060

19/07/01 17:18:17 INFO crail: crail.namenode.blockselection roundrobin

19/07/01 17:18:17 INFO crail: crail.namenode.fileblocks 16

19/07/01 17:18:17 INFO crail: crail.namenode.rpctype org.apache.crail.namenode.rpc.tcp.TcpNameNode

19/07/01 17:18:17 INFO crail: crail.namenode.log

19/07/01 17:18:17 INFO crail: crail.storage.types org.apache.crail.storage.rdma.RdmaStorageTier

19/07/01 17:18:17 INFO crail: crail.storage.classes 1

19/07/01 17:18:17 INFO crail: crail.storage.rootclass 1

19/07/01 17:18:17 INFO crail: crail.storage.keepalive 2

19/07/01 17:18:17 INFO disni: creating  RdmaProvider of type 'nat'

19/07/01 17:18:17 INFO disni: jverbs jni version 32

19/07/01 17:18:17 INFO disni: sock_addr_in size mismatch, jverbs size 28, native size 16

19/07/01 17:18:17 INFO disni: IbvRecvWR size match, jverbs size 32, native size 32

19/07/01 17:18:17 INFO disni: IbvSendWR size mismatch, jverbs size 72, native size 128

19/07/01 17:18:17 INFO disni: IbvWC size match, jverbs size 48, native size 48

19/07/01 17:18:17 INFO disni: IbvSge size match, jverbs size 16, native size 16

19/07/01 17:18:17 INFO disni: Remote addr offset match, jverbs size 40, native size 40

19/07/01 17:18:17 INFO disni: Rkey offset match, jverbs size 48, native size 48

19/07/01 17:18:17 INFO disni: createEventChannel, objId 140349068383088

19/07/01 17:18:17 INFO disni: passive endpoint group, maxWR 32, maxSge 4, cqSize 3200

19/07/01 17:18:17 INFO disni: createId, id 140349068429968

19/07/01 17:18:17 INFO disni: new server endpoint, id 0

19/07/01 17:18:17 INFO disni: launching cm processor, cmChannel 0

19/07/01 17:18:17 INFO disni: bindAddr, address /192.168.3.100:50020

19/07/01 17:18:17 INFO disni: listen, id 0

19/07/01 17:18:17 INFO disni: allocPd, objId 140349068679808

19/07/01 17:18:17 INFO disni: setting up protection domain, context 100, pd 1

19/07/01 17:18:17 INFO disni: PD value 1

19/07/01 17:18:17 INFO crail: crail.storage.rdma.interface enp94s0f1

19/07/01 17:18:17 INFO crail: crail.storage.rdma.port 50020

19/07/01 17:18:17 INFO crail: crail.storage.rdma.storagelimit 4294967296

19/07/01 17:18:17 INFO crail: crail.storage.rdma.allocationsize 1073741824

19/07/01 17:18:17 INFO crail: crail.storage.rdma.datapath /dev/hugepages/rdma

19/07/01 17:18:17 INFO crail: crail.storage.rdma.localmap true

19/07/01 17:18:17 INFO crail: crail.storage.rdma.queuesize 32

19/07/01 17:18:17 INFO crail: crail.storage.rdma.type passive

19/07/01 17:18:17 INFO crail: crail.storage.rdma.backlog 100

19/07/01 17:18:17 INFO crail: crail.storage.rdma.connecttimeout 1000

19/07/01 17:18:17 INFO narpc: new NaRPC server group v1.0, queueDepth 32, messageSize 512, nodealy true

19/07/01 17:18:17 INFO crail: crail.namenode.tcp.queueDepth 32

19/07/01 17:18:17 INFO crail: crail.namenode.tcp.messageSize 512

19/07/01 17:18:17 INFO crail: crail.namenode.tcp.cores 2

19/07/01 17:18:17 INFO crail: rdma storage server started, address /192.168.3.100:50020, persistent false, maxWR 32, maxSge 4, cqSize 3200

19/07/01 17:18:17 INFO disni: starting accept

19/07/01 17:18:18 INFO crail: connected to namenode(s) minnie/192.168.1.164:9060

19/07/01 17:18:18 INFO crail: datanode statistics, freeBlocks 1024

19/07/01 17:18:18 INFO crail: datanode statistics, freeBlocks 2048

19/07/01 17:18:19 INFO crail: datanode statistics, freeBlocks 3072

19/07/01 17:18:19 INFO crail: datanode statistics, freeBlocks 4096

19/07/01 17:18:19 INFO crail: datanode statistics, freeBlocks 4096



NVMf datanode is showing 1TB.

19/07/01 17:23:57 INFO crail: datanode statistics, freeBlocks 1048576
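As a sanity check on the reported capacities, the block counts can be worked out from the logged settings. This is a sketch that assumes `freeBlocks` counts blocks of `crail.blocksize` bytes; the helper names are hypothetical.

```python
# Values copied from the logs above.
BLOCK_SIZE = 1048576               # crail.blocksize (bytes)
RDMA_STORAGE_LIMIT = 4294967296    # crail.storage.rdma.storagelimit (bytes)
RDMA_ALLOCATION_SIZE = 1073741824  # crail.storage.rdma.allocationsize (bytes)
NVMF_FREE_BLOCKS = 1048576         # freeBlocks reported by the NVMf datanode


def blocks_for(num_bytes, block_size=BLOCK_SIZE):
    """Number of whole blocks that fit in num_bytes."""
    return num_bytes // block_size


def bytes_for(num_blocks, block_size=BLOCK_SIZE):
    """Capacity in bytes represented by num_blocks."""
    return num_blocks * block_size


print(blocks_for(RDMA_ALLOCATION_SIZE))     # 1024 blocks per 1 GiB allocation
print(blocks_for(RDMA_STORAGE_LIMIT))       # 4096 blocks for the 4 GiB limit
print(bytes_for(NVMF_FREE_BLOCKS) / 2**40)  # 1.0, i.e. 1 TiB
```

Under that assumption the numbers line up with the logs: the RDMA datanode's freeBlocks grows in steps of 1024 up to 4096, and the NVMf datanode's 1048576 free blocks correspond to 1 TiB.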





Regards,



           David



