flume-user mailing list archives

From kashif khan <drkashif8...@gmail.com>
Subject Re: Automatically upload files into HDFS
Date Tue, 20 Nov 2012 10:40:10 GMT
Hi M Tariq,


I am trying the following program to create a directory in HDFS and copy a
file into it, but I am getting the errors below.



Program:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyFile {

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // The value needs the hdfs:// scheme; without it the client may not
        // resolve the NameNode correctly.
        conf.set("fs.default.name", "hdfs://hadoop1.example.com:8020");
        FileSystem dfs = FileSystem.get(conf);

        // Create <working dir>/Test1 in HDFS.
        String dirName = "Test1";
        Path dst = new Path(dfs.getWorkingDirectory() + "/" + dirName);
        dfs.mkdirs(dst);

        // Copy the local file into the new HDFS directory.
        Path localSrc = new Path("/usr/Eclipse/Output.csv");
        dfs.copyFromLocalFile(localSrc, dst);
    }
}


    Exception in thread "main" org.apache.hadoop.ipc.RemoteException:
Server IPC version 7 cannot communicate with client version 4
    at org.apache.hadoop.ipc.Client.call(Client.java:1070)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
    at CopyFile.main(CopyFile.java:11)



I am using CDH4.1. I downloaded the Hadoop 1.0.4 source and imported its jar
files into Eclipse, so I suspect a client/server version mismatch: as far as
I can tell, CDH4 is based on Hadoop 2.0, whose RPC protocol (version 7 in the
trace above) is not compatible with the Hadoop 1.x client (version 4). Could
you please let me know which client version is correct for CDH4.1?
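
From what I have read, the matching client jars ship with CDH itself (for
example under /usr/lib/hadoop/client/ on a cluster node), so I will try those
instead of the hadoop-1.0.4 jars. To double-check which client version
Eclipse actually picks up, here is a small check I am using (only a sketch;
VersionInfo exists in both Hadoop 1 and Hadoop 2, and for CDH4.1 I would
expect it to report a 2.0.0-cdh4.1.x string rather than 1.0.4):

import org.apache.hadoop.util.VersionInfo;

public class PrintClientVersion {
    public static void main(String[] args) {
        // Reports the version of the Hadoop client jars on the classpath;
        // it should match the version the cluster is running.
        System.out.println("Client version: " + VersionInfo.getVersion());
    }
}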

Many thanks





On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <dontariq@gmail.com> wrote:

> It should work; the same code is working fine for me. Try creating some other
> directory in your HDFS and use it as your output path. Also see if you find
> something in the datanode logs.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <drkashif8310@gmail.com> wrote:
>
>> The input path is fine; the problem is the output path. I am just wondering
>> why it copies the data onto the local disk (/user/root/) and not into HDFS.
>> Did we give the correct statement to point to HDFS?
>>
>> Thanks
>>
>>
>>
>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>
>>> Try this as your input file path
>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>
>>>> When I run the command
>>>>
>>>> $ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv
>>>>
>>>> it works fine and the file is browsable in HDFS. But I don't know why it
>>>> does not work in the program.
>>>>
>>>> Many thanks for your cooperation.
>>>>
>>>> Best regards,
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>
>>>>> It would be good if I could have a look at the files. In the meantime, try
>>>>> some other directories. Also, check the directory permissions once.
>>>>>
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> I have tried it as the root user and made the following changes:
>>>>>>
>>>>>>
>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>
>>>>>> No result. The following is the log output. The log shows the
>>>>>> destination is null.
>>>>>>
>>>>>>
>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true ugi=dr.who (auth:SIMPLE) ip=/134.91.36.41 cmd=getfileinfo src=/user dst=null perm=null
>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true ugi=dr.who (auth:SIMPLE) ip=/134.91.36.41 cmd=listStatus src=/user dst=null perm=null
>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true ugi=hbase (auth:SIMPLE) ip=/134.91.36.41 cmd=listStatus src=/hbase/.oldlogs dst=null perm=null
>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true ugi=dr.who (auth:SIMPLE) ip=/134.91.36.41 cmd=getfileinfo src=/user/root dst=null perm=null
>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true ugi=dr.who (auth:SIMPLE) ip=/134.91.36.41 cmd=listStatus src=/user/root dst=null perm=null
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>
>>>>>>> Yeah, my cluster is running. When I browse
>>>>>>> http://hadoop1.example.com:50070/dfshealth.jsp I get the main page. Then
>>>>>>> I click on "Browse the filesystem" and I see the following:
>>>>>>>
>>>>>>> hbase
>>>>>>> tmp
>>>>>>> user
>>>>>>>
>>>>>>> And when I click on user, I get:
>>>>>>>
>>>>>>> beeswax
>>>>>>> huuser (I have created)
>>>>>>> root (I have created)
>>>>>>>
>>>>>>> Would you like to see my configuration files? I did not change anything;
>>>>>>> everything is at the defaults. I have installed CDH4.1 and it is running
>>>>>>> on VMs.
>>>>>>>
>>>>>>> Many thanks
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>>
>>>>>>>> Is your cluster running fine? Are you able to browse HDFS through
>>>>>>>> the HDFS web console at 50070?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>     Mohammad Tariq
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Many thanks.
>>>>>>>>>
>>>>>>>>> I have changed the program accordingly. It does not show any error,
>>>>>>>>> just one warning, but when I browse the HDFS folder, the file has not
>>>>>>>>> been copied.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> public class CopyData {
>>>>>>>>>     public static void main(String[] args) throws IOException {
>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>         // Load the cluster configuration so the client talks to HDFS
>>>>>>>>>         // rather than the local file system.
>>>>>>>>>         conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>         conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>         FileSystem fs = FileSystem.get(conf);
>>>>>>>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>         Path outputFile = new Path("/user/hduser/Output1.csv");
>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>         fs.close();
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>>>>>>> WARNING: Unable to load native-hadoop library for your platform... using
>>>>>>>>> builtin-java classes where applicable
>>>>>>>>>
>>>>>>>>> Any idea?
>>>>>>>>>
>>>>>>>>> Many thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> If it is just copying the files without any processing or change,
>>>>>>>>>> you can use something like this:
>>>>>>>>>>
>>>>>>>>>> public class CopyData {
>>>>>>>>>>
>>>>>>>>>>     public static void main(String[] args) throws IOException {
>>>>>>>>>>         Configuration configuration = new Configuration();
>>>>>>>>>>         configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>         configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>         Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>         fs.close();
>>>>>>>>>>     }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> Obviously you have to modify it as per your requirements, like
>>>>>>>>>> continuously polling the targeted directory for new files.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
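
For the "continuously polling" part Mohammad suggests above: once the copy
itself works, this is roughly the loop I plan to try for my real goal (a new
~3 MB file every 3-5 seconds). Only a sketch; the directory names, the
5-second interval, the delete-after-upload behaviour, and the
DirectoryUploader class itself are my own assumptions, not tested code:

import java.io.File;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DirectoryUploader {

    public static void main(String[] args)
            throws IOException, InterruptedException {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);

        File watchDir = new File("/usr/datastorage");      // local source folder
        Path hdfsDir = new Path("/user/root/datastorage"); // HDFS target folder
        fs.mkdirs(hdfsDir);

        while (true) {
            File[] files = watchDir.listFiles();
            if (files != null) {
                for (File f : files) {
                    if (!f.isFile()) {
                        continue;
                    }
                    // delSrc=true deletes the local file after a successful
                    // copy, so the same file is not uploaded twice.
                    fs.copyFromLocalFile(true, new Path(f.getAbsolutePath()),
                            new Path(hdfsDir, f.getName()));
                }
            }
            Thread.sleep(5000); // poll every 5 seconds
        }
    }
}

This assumes each file is already complete when it appears in the folder (for
example, written elsewhere and then moved in); otherwise the loop could pick
up a half-written file.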
>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks M Tariq
>>>>>>>>>>>
>>>>>>>>>>> As I am new to Java and Hadoop and do not have much experience, I am
>>>>>>>>>>> first trying to write a simple program to upload data into HDFS and
>>>>>>>>>>> will gradually move forward. I have written the following simple
>>>>>>>>>>> program to upload a file into HDFS, but I don't know why it does not
>>>>>>>>>>> work. Could you please check it if you have time?
>>>>>>>>>>>
>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>> import java.io.File;
>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>
>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>
>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>
>>>>>>>>>>>     public static void main(String[] args) throws IOException {
>>>>>>>>>>>         try {
>>>>>>>>>>>             Configuration conf = new Configuration();
>>>>>>>>>>>             conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>             conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>             FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>
>>>>>>>>>>>             String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>             String dest = "/user/hduser/input/";
>>>>>>>>>>>             String fileName = "Output1.csv";
>>>>>>>>>>>
>>>>>>>>>>>             if (dest.charAt(dest.length() - 1) != '/') {
>>>>>>>>>>>                 dest = dest + "/" + fileName;
>>>>>>>>>>>             } else {
>>>>>>>>>>>                 dest = dest + fileName;
>>>>>>>>>>>             }
>>>>>>>>>>>             Path path = new Path(dest);
>>>>>>>>>>>
>>>>>>>>>>>             if (fileSystem.exists(path)) {
>>>>>>>>>>>                 System.out.println("File " + dest + " already exists");
>>>>>>>>>>>                 return; // do not overwrite the existing file
>>>>>>>>>>>             }
>>>>>>>>>>>
>>>>>>>>>>>             // Stream the local file into the new HDFS file.
>>>>>>>>>>>             FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>             InputStream in = new BufferedInputStream(
>>>>>>>>>>>                     new FileInputStream(new File(source)));
>>>>>>>>>>>             byte[] b = new byte[4096];
>>>>>>>>>>>             int numBytes;
>>>>>>>>>>>             // read() returns -1 at end of file; testing > 0 also
>>>>>>>>>>>             // avoids an endless loop on an empty file.
>>>>>>>>>>>             while ((numBytes = in.read(b)) > 0) {
>>>>>>>>>>>                 out.write(b, 0, numBytes);
>>>>>>>>>>>             }
>>>>>>>>>>>             in.close();
>>>>>>>>>>>             out.close();
>>>>>>>>>>>             fileSystem.close();
>>>>>>>>>>>         } catch (Exception e) {
>>>>>>>>>>>             System.out.println(e.toString());
>>>>>>>>>>>         }
>>>>>>>>>>>     }
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks again,
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>>
>>>>>>>>>>> KK
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
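
A side note on the read/write loop in my program above: if I understand the
API correctly, IOUtils.copyBytes from org.apache.hadoop.io does the same job
and also closes both streams. A sketch only; the StreamCopy wrapper class is
my own invention:

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class StreamCopy {

    // Copy a local file into HDFS, letting IOUtils handle the byte loop.
    static void copy(FileSystem fileSystem, String source, Path path)
            throws IOException {
        InputStream in = new BufferedInputStream(new FileInputStream(source));
        FSDataOutputStream out = fileSystem.create(path);
        // 4096-byte buffer; "true" closes both streams when the copy ends.
        IOUtils.copyBytes(in, out, 4096, true);
    }
}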
>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> You can set your cronjob to execute the program after every 5 sec.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Well, I want to upload the files automatically, as they are
>>>>>>>>>>>>> generated about every 3-5 sec and each file is about 3 MB.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is it possible to automate this using the put or cp command?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have read about Flume and WebHDFS, but I am not sure whether
>>>>>>>>>>>>> they will work.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <wget.null@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Why don't you use HDFS-related tools like put or cp?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> > Hi,
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > I am generating files continuously in a local folder on my base
>>>>>>>>>>>>>> > machine. How can I now use Flume to stream the generated files
>>>>>>>>>>>>>> > from the local folder to HDFS?
>>>>>>>>>>>>>> > I don't know exactly how to configure the sources, the sinks and
>>>>>>>>>>>>>> > HDFS.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > 1) Location of the folder where files are generated: /usr/datastorage/
>>>>>>>>>>>>>> > 2) Name node address: hdfs://hadoop1.example.com:8020
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Please help me.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
