hadoop-common-user mailing list archives

From Doss_IPH <d...@intellipowerhive.com>
Subject Re: Need Info
Date Thu, 22 Oct 2009 03:49:17 GMT

You can use this pseudocode for loading data into HDFS:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

/**
 * @author: Arockia Doss S
 * @emailto: doss@intellipowerhive.com
 * @url: http://www.intellipowerhive.com, http://www.dossinfotech.com
 * @comments: You can use and modify this code for your own use.
 * @about: This code works on the hadoop-0.19.0 platform. To test it,
 *         put the Hadoop libraries on your classpath and set the
 *         required parameters (Hadoop path, host, users) before running.
 */

public class HadoopConfiguration {
    // Hadoop absolute path
    private static final String CLUSTERPATH = "/home/hadoop-0.19.0/";
    private static final String SITEFILE = "conf/hadoop-site.xml";
    private static final String DEFAULTFILE = "conf/hadoop-default.xml";
    // Hadoop NameNode host
    private static final String HADOOPHOST = "";
    // Hadoop root and its users list
    private static final String HOSTUSERS = "root,doss";
    private static Configuration conf = new Configuration();
    private static DistributedFileSystem dfs = new DistributedFileSystem();

    public HadoopConfiguration() throws java.lang.Exception {
        // Load the cluster's configuration files into the Configuration
        Path sitepath = new Path(CLUSTERPATH + SITEFILE);
        Path defaultpath = new Path(CLUSTERPATH + DEFAULTFILE);
        getConf().addResource(defaultpath);
        getConf().addResource(sitepath);
        getConf().set("hadoop.job.ugi", HOSTUSERS);
        dfs.initialize(new URI("hdfs://" + HADOOPHOST + ":9000/"), conf);
    }

    public static Configuration getConf() {
        return conf;
    }

    public static void main(String[] args) {
        try {
            HadoopConfiguration h = new HadoopConfiguration();
            FileSystem fs = FileSystem.get(h.getConf());
            // Copy sample.xls to HDFS; the local file is still there
            // after copying. (The HDFS destination paths are assumed.)
            fs.copyFromLocalFile(new Path("/home/sample.xls"),
                    new Path("/home/xls/sample.xls"));
            // Move sample.doc to HDFS; the local file is gone after moving.
            fs.moveFromLocalFile(new Path("/home/sample.doc"),
                    new Path("/home/doc/sample.doc"));
            // List the files under /home/xls on HDFS
            FileStatus[] fileStatus = fs.listStatus(new Path("/home/xls"));
            for (int i = 0; i < fileStatus.length; i++) {
                Path path = fileStatus[i].getPath();
                System.out.println(path);
            }
        } catch (java.lang.Exception e) {
            e.printStackTrace();
        }
    }
}
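
Since the web service also has to send the requested file back to the
user, here is a minimal sketch of the reverse direction, streaming a
file out of HDFS to the local disk. It assumes the HadoopConfiguration
class above is on the classpath; the source and destination paths are
placeholders.

import java.io.FileOutputStream;
import java.io.OutputStream;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsRead {
    public static void main(String[] args) throws Exception {
        new HadoopConfiguration(); // loads the cluster config as above
        FileSystem fs = FileSystem.get(HadoopConfiguration.getConf());
        // Open the HDFS file and stream it into a local file
        FSDataInputStream in = fs.open(new Path("/home/xls/sample.xls"));
        OutputStream out = new FileOutputStream("/tmp/sample.xls");
        // copyBytes copies the stream and closes both ends when the
        // last argument is true
        IOUtils.copyBytes(in, out, HadoopConfiguration.getConf(), true);
    }
}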

shwitzu wrote:
> Thanks for responding.
> I read about HDFS and understood how it works. I also installed Hadoop
> on my Windows machine using Cygwin, tried a sample driver, and made
> sure it works.
> But my concern is: given the problem statement, how should I proceed?
> Could you please give me a clue, some pseudocode, or a design?
> Thanks in anticipation.
> Doss_IPH wrote:
>> First and foremost, you need to understand the Hadoop platform
>> infrastructure.
>> I am currently working on a real-time application using Hadoop, and I
>> think Hadoop will fit your requirements.
>> Hadoop is mainly for three things:
>> 1. Scalability: no limit on storage.
>> 2. Processing petabytes of data in distributed, parallel mode.
>> 3. Fault tolerance (automatic block replication): recovering data
>>    from failure (see the sketch after this list).
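>>
>> For point 3, a minimal sketch of how the replication factor can be
>> controlled per file through the FileSystem API (same imports as the
>> code above; the path is a placeholder):
>>
>>     FileSystem fs = FileSystem.get(new Configuration());
>>     // Ask HDFS to keep 3 copies of each block of this file
>>     fs.setReplication(new Path("/home/xls/sample.xls"), (short) 3);
>>     // Read back the replication the NameNode actually records
>>     short actual = fs.getFileStatus(
>>             new Path("/home/xls/sample.xls")).getReplication();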
>> shwitzu wrote:
>>> Hello Sir!
>>> I am new to Hadoop. I have a project based on web services. My
>>> information is in 4 databases, with different kinds of files in each:
>>> say, images in one, and video, documents, etc. in the others. My task
>>> is to develop a web service which accepts a keyword from the client,
>>> processes the request, and sends the requested file back to the user.
>>> I have to use the Hadoop Distributed File System in this project.
>>> I have the following questions:
>>> 1) How should I start with the design?
>>> 2) Should I upload all the files and create Map, Reduce and Driver
>>>    code, and once I run my application, will it automatically go to
>>>    the file system and get the results back to me?
>>> 3) How do I handle binary data? I want to store binary-format data
>>>    using MTOM in my database.
>>> Please let me know how I should proceed. I don't know much about
>>> Hadoop and I am searching for some help. It would be great if you
>>> could assist me. Thanks again.

