hadoop-hdfs-user mailing list archives

From David Pavlis <david.pav...@javlin.eu>
Subject Re: How-to use DFSClient's BlockReader from Java
Date Mon, 09 Jan 2012 17:56:17 GMT
Hi Denny,

Thanks a lot. I was able to make my code work.

I am posting a small example below, in case somebody in the future has a
similar need ;-)
(It does not handle replica datablocks.)

David.

***************************************************************************
public static void main(String args[]) {
	String filename = "/user/hive/warehouse/sample_07/sample_07.csv";
	int DATANODE_PORT = 50010;
	int NAMENODE_PORT = 8020;
	String HOST_IP = "192.168.1.230";

	byte[] buf = new byte[1000];

	try {
		// Ask the NameNode for the list of blocks making up the file.
		ClientProtocol client = DFSClient.createNamenode(
				new InetSocketAddress(HOST_IP, NAMENODE_PORT), new Configuration());

		LocatedBlocks located = client.getBlockLocations(filename, 0, Long.MAX_VALUE);

		// Read each block directly from the DataNode.
		for (LocatedBlock block : located.getLocatedBlocks()) {
			Socket sock = SocketFactory.getDefault().createSocket();
			InetSocketAddress targetAddr = new InetSocketAddress(HOST_IP, DATANODE_PORT);
			NetUtils.connect(sock, targetAddr, 10000);
			sock.setSoTimeout(10000);

			BlockReader reader = BlockReader.newBlockReader(sock, filename,
					block.getBlock().getBlockId(), block.getBlockToken(),
					block.getBlock().getGenerationStamp(), 0,
					block.getBlockSize(), 1000);

			// Read until the BlockReader reports end of block. (Do not break
			// on a short read - a read shorter than the buffer does not
			// necessarily mean the block is exhausted.)
			int length;
			while ((length = reader.read(buf, 0, buf.length)) > 0) {
				//System.out.print(new String(buf, 0, length, "UTF-8"));
			}
			reader.close();
			sock.close();
		}
	} catch (IOException ex) {
		ex.printStackTrace();
	}
}
***************************************************************************
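As a side note on what `getBlockLocations` hands back: each block covers a fixed-size slice of the file, with only the last block possibly shorter. The sketch below is plain Java (no Hadoop classes; the names are made up for illustration) showing the (offset, length) layout the loop above iterates over:

```java
// Sketch (not Hadoop API): how a file of a given length is laid out as
// fixed-size blocks, mirroring the per-block offsets and lengths that
// getBlockLocations returns.
public class BlockLayout {
	// Returns {startOffset, length} for each block of the file.
	static long[][] blockRanges(long fileLength, long blockSize) {
		int numBlocks = (int) ((fileLength + blockSize - 1) / blockSize);
		long[][] ranges = new long[numBlocks][2];
		for (int i = 0; i < numBlocks; i++) {
			long start = i * blockSize;
			ranges[i][0] = start;
			// The last block may be shorter than blockSize.
			ranges[i][1] = Math.min(blockSize, fileLength - start);
		}
		return ranges;
	}

	public static void main(String[] args) {
		// A 150 MB file with a 64 MB block size splits into 3 blocks.
		long mb = 1024L * 1024L;
		long[][] r = blockRanges(150 * mb, 64 * mb);
		System.out.println(r.length + " blocks, last block "
				+ r[r.length - 1][1] + " bytes");
	}
}
```

The example above hardcodes one DataNode address; with replicas, you would pick one of the block's locations instead.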



From:  Denny Ye <dennyy99@gmail.com>
Reply-To:  <hdfs-user@hadoop.apache.org>
Date:  Mon, 9 Jan 2012 16:29:18 +0800
To:  <hdfs-user@hadoop.apache.org>
Subject:  Re: How-to use DFSClient's BlockReader from Java


hi David
    Please refer to the method "DFSInputStream#blockSeekTo"; it has the
same purpose as your code.

***************************************************************************
        LocatedBlock targetBlock = getBlockAt(target, true);
        assert (target==this.pos) : "Wrong postion " + pos + " expect " + target;
        long offsetIntoBlock = target - targetBlock.getStartOffset();

        DNAddrPair retval = chooseDataNode(targetBlock);
        chosenNode = retval.info;
        InetSocketAddress targetAddr = retval.addr;

        try {
          s = socketFactory.createSocket();
          NetUtils.connect(s, targetAddr, socketTimeout);
          s.setSoTimeout(socketTimeout);
          Block blk = targetBlock.getBlock();
          Token<BlockTokenIdentifier> accessToken = targetBlock.getBlockToken();

          blockReader = BlockReader.newBlockReader(s, src, blk.getBlockId(),
              accessToken,
              blk.getGenerationStamp(),
              offsetIntoBlock, blk.getNumBytes() - offsetIntoBlock,
              buffersize, verifyChecksum, clientName);
***************************************************************************
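The key difference from the first example is that blockSeekTo starts mid-block: it computes an offset into the block from the target file position and passes the remaining byte count to newBlockReader. A minimal plain-Java illustration of that arithmetic (no Hadoop classes; names invented for the sketch):

```java
// Sketch of the offset arithmetic in blockSeekTo, in plain Java.
public class SeekMath {
	// Offset of the target file position within a block that starts
	// at blockStart (target - targetBlock.getStartOffset() above).
	static long offsetIntoBlock(long target, long blockStart) {
		return target - blockStart;
	}

	// Bytes left in the block from that offset, matching the
	// blk.getNumBytes() - offsetIntoBlock argument above.
	static long bytesRemaining(long blockNumBytes, long offsetIntoBlock) {
		return blockNumBytes - offsetIntoBlock;
	}

	public static void main(String[] args) {
		long blockStart = 64L * 1024 * 1024; // block begins 64 MB into the file
		long target = blockStart + 4096;     // seek 4 KB past the block start
		long off = offsetIntoBlock(target, blockStart);
		System.out.println("offset " + off + ", remaining "
				+ bytesRemaining(64L * 1024 * 1024, off));
	}
}
```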


-Regards
Denny Ye

2012/1/6 David Pavlis <david.pavlis@javlin.eu>

Hi,

I am relatively new to Hadoop and I am trying to utilize HDFS for my own
application, where I want to take advantage of the data partitioning HDFS
performs.

The idea is that I get the list of individual blocks - the BlockLocations
of a particular file - and then read those directly (going to the individual
DataNodes). So far I have found org.apache.hadoop.hdfs.DFSClient.BlockReader
to be the way to go.

However, I am struggling with instantiating the BlockReader class, namely
with creating the "Token<BlockTokenIdentifier>".

Is there example Java code showing how to access the individual blocks of a
particular file stored on HDFS?

Thanks in advance,

David.
