flume-user mailing list archives

From Mohammad Tariq <donta...@gmail.com>
Subject Re: Automatically upload files into HDFS
Date Tue, 20 Nov 2012 14:19:55 GMT
Hello Kashif,

     You are correct. This is because of a version mismatch between the
Hadoop 1.0.4 client jars and the cluster. I am not using CDH personally, but
AFAIK CDH4 uses Hadoop 2.x.
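
To see which Hadoop jars a program like CopyFile actually picks up, a minimal
sketch along the following lines may help. It only inspects the JVM classpath
and lists the entries whose file name starts with "hadoop". The class and
method names (HadoopJarCheck, hadoopJars) are made up for illustration and are
not part of Hadoop.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: lists Hadoop-looking jars on a classpath.
class HadoopJarCheck {

    /** Returns the file names of classpath entries that look like Hadoop jars. */
    public static List<String> hadoopJars(String classpath) {
        List<String> found = new ArrayList<>();
        // Classpath entries are separated by ":" on Unix and ";" on Windows.
        for (String entry : classpath.split(File.pathSeparator)) {
            String name = new File(entry).getName();
            if (name.startsWith("hadoop") && name.endsWith(".jar")) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Prints one jar name per line for the classpath of the running JVM.
        for (String jar : hadoopJars(System.getProperty("java.class.path"))) {
            System.out.println(jar);
        }
    }
}
```

Run it from the same Eclipse project (same build path) as the failing program.
If the listed jars carry a 1.x version while the cluster runs CDH4, that would
be consistent with the IPC version error quoted below.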

Regards,
    Mohammad Tariq



On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <drkashif8310@gmail.com> wrote:

> HI M Tariq
>
>
> I am trying the following program to create a directory and copy a file to
> HDFS, but I am getting the following errors.
>
>
>
> Program:
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import java.io.IOException;
>
> public class CopyFile {
>
>
>         public static void main(String[] args) throws IOException{
>         Configuration conf = new Configuration();
>          conf.set("fs.default.name", "hadoop1.example.com:8020");
>         FileSystem dfs = FileSystem.get(conf);
>         String dirName = "Test1";
>         Path src = new Path(dfs.getWorkingDirectory() + "/" + dirName);
>         dfs.mkdirs(src);
>         Path scr1 = new Path("/usr/Eclipse/Output.csv");
>         Path dst = new Path(dfs.getWorkingDirectory() + "/Test1/");
>         dfs.copyFromLocalFile(scr1, dst);
>
>         }
>         }
>
>
>     Exception in thread "main" org.apache.hadoop.ipc.RemoteException:
> Server IPC version 7 cannot communicate with client version 4
>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at $Proxy1.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>     at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>     at CopyFile.main(CopyFile.java:11)
>
>
>
> I am using CDH4.1. I have downloaded the source of hadoop-1.0.4 and
> imported the jar files into Eclipse. I think it is due to a version problem.
> Could you please let me know what the correct version for CDH4.1 would be?
>
> Many thanks
>
>
>
>
>
>
> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <dontariq@gmail.com>wrote:
>
>> It should work. The same code is working fine for me. Try creating some
>> other directory in your HDFS and use it as your output path. Also see if
>> you find something in the datanode logs.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <drkashif8310@gmail.com>wrote:
>>
>>> The input path is fine. The problem is in the output path. I am just
>>> wondering why it copies the data to the local disk (/user/root/) and not
>>> into HDFS. Are we giving the correct statement to point to HDFS?
>>>
>>> Thanks
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <dontariq@gmail.com>wrote:
>>>
>>>> Try this as your input file path
>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>
>>>> Regards,
>>>>     Mohammad Tariq
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <drkashif8310@gmail.com>wrote:
>>>>
>>>>> When I run the command
>>>>>
>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv
>>>>>
>>>>> it works fine and the file can be browsed in HDFS. But I don't know why
>>>>> it does not work in the program.
>>>>>
>>>>> Many thanks for your cooperation.
>>>>>
>>>>> Best regards,
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <dontariq@gmail.com>wrote:
>>>>>
>>>>>> It would be good if I could have a look at the files. In the meantime,
>>>>>> try some other directories. Also, check the directory permissions once.
>>>>>>
>>>>>> Regards,
>>>>>>     Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <drkashif8310@gmail.com>wrote:
>>>>>>
>>>>>>>
>>>>>>> I have tried through root user and made the following changes:
>>>>>>>
>>>>>>>
>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>
>>>>>>> No result. The following is the log output. The log shows the
>>>>>>> destination is null.
>>>>>>>
>>>>>>>
>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>
>>>>>>>> Yes, my cluster is running. When I browse
>>>>>>>> http://hadoop1.example.com:50070/dfshealth.jsp I get the main page. When I
>>>>>>>> then click on "Browse file system", I get the following:
>>>>>>>>
>>>>>>>> hbase
>>>>>>>> tmp
>>>>>>>> user
>>>>>>>>
>>>>>>>> And when I click on user, I get:
>>>>>>>>
>>>>>>>> beeswax
>>>>>>>> huuser (I have created)
>>>>>>>> root (I have created)
>>>>>>>>
>>>>>>>> Would you like to see my configuration file? I did not change
>>>>>>>> anything; everything is at the defaults. I have installed CDH4.1 and it
>>>>>>>> is running on VMs.
>>>>>>>>
>>>>>>>> Many thanks
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Is your cluster running fine? Are you able to browse HDFS through
>>>>>>>>> the HDFS Web Console at 50070?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>     Mohammad Tariq
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <
>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Many thanks.
>>>>>>>>>>
>>>>>>>>>> I have changed the program accordingly. It does not show any
>>>>>>>>>> error, only one warning, but when I browse the HDFS folder, the
>>>>>>>>>> file has not been copied.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> public class CopyData {
>>>>>>>>>>     public static void main(String[] args) throws IOException {
>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>         //Configuration configuration = new Configuration();
>>>>>>>>>>         //configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>         //configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>
>>>>>>>>>>         conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>         conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>         FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>         Path outputFile = new Path("/user/hduser/Output1.csv");
>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>         fs.close();
>>>>>>>>>>     }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>>>>>>>> WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>>>>>>>>
>>>>>>>>>> Do you have any idea?
>>>>>>>>>>
>>>>>>>>>> Many thanks
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <
>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> If it is just copying the files without any processing or
>>>>>>>>>>> change, you can use something like this:
>>>>>>>>>>>
>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>
>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>
>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>
>>>>>>>>>>>     public static void main(String[] args) throws IOException {
>>>>>>>>>>>
>>>>>>>>>>>         // Load the cluster settings so that FileSystem.get() resolves to HDFS
>>>>>>>>>>>         Configuration configuration = new Configuration();
>>>>>>>>>>>         configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>         configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>         // Local source file and its destination path in HDFS
>>>>>>>>>>>         Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>         fs.close();
>>>>>>>>>>>     }
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> Obviously you have to modify it as per your requirements, like
>>>>>>>>>>> continuously polling the targeted directory for new files.
>>>>>>>>>>>
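The polling mentioned above could be sketched with the standard library alone,
roughly as follows. This is only an illustration under assumptions: the class
name DirectoryPoller and its method pollNewFiles are hypothetical, the 5 second
interval and the /usr/datastorage default are taken from this thread, and the
upload step is left as a placeholder where a call such as
fs.copyFromLocalFile(...) from the program above would go.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;   // java.nio Path, not org.apache.hadoop.fs.Path
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: remembers which files it has already reported.
class DirectoryPoller {

    private final Path dir;
    private final Set<Path> seen = new HashSet<>();

    public DirectoryPoller(Path dir) {
        this.dir = dir;
    }

    /** Returns the regular files in the directory that no earlier call returned. */
    public List<Path> pollNewFiles() throws IOException {
        List<Path> fresh = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                // Set.add returns false for paths that were already seen.
                if (Files.isRegularFile(p) && seen.add(p)) {
                    fresh.add(p);
                }
            }
        }
        return fresh;
    }

    public static void main(String[] args) throws Exception {
        // Directory to watch; the default is the folder named in this thread.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/usr/datastorage");
        DirectoryPoller poller = new DirectoryPoller(dir);
        while (true) {
            for (Path p : poller.pollNewFiles()) {
                // Placeholder: upload p to HDFS here.
                System.out.println("new file: " + p);
            }
            Thread.sleep(5000); // poll every 5 seconds
        }
    }
}
```

One caveat of this simple approach: a file that is still being written may be
picked up before it is complete, so the producer may need to write under a
temporary name and rename the file when done.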
>>>>>>>>>>> Regards,
>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks M  Tariq
>>>>>>>>>>>>
>>>>>>>>>>>> I am new to Java and Hadoop and do not have much experience. I am
>>>>>>>>>>>> trying to first write a simple program to upload data into HDFS and
>>>>>>>>>>>> gradually move forward. I have written the following simple program to
>>>>>>>>>>>> upload a file into HDFS, but I don't know why it is not working. Could
>>>>>>>>>>>> you please check it, if you have time?
>>>>>>>>>>>>
>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>> import java.io.BufferedOutputStream;
>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>> import java.io.FileOutputStream;
>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>> import java.io.OutputStream;
>>>>>>>>>>>> import java.nio.*;
>>>>>>>>>>>> //import java.nio.file.Path;
>>>>>>>>>>>>
>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> public static void main(String[] args) throws IOException {
>>>>>>>>>>>>     try {
>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>         conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>         conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>         FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>         String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>         String dest = "/user/hduser/input/";
>>>>>>>>>>>>
>>>>>>>>>>>>         //String fileName = source.substring(source.lastIndexOf('/') + source.length());
>>>>>>>>>>>>         String fileName = "Output1.csv";
>>>>>>>>>>>>
>>>>>>>>>>>>         if (dest.charAt(dest.length() - 1) != '/') {
>>>>>>>>>>>>             dest = dest + "/" + fileName;
>>>>>>>>>>>>         } else {
>>>>>>>>>>>>             dest = dest + fileName;
>>>>>>>>>>>>         }
>>>>>>>>>>>>         Path path = new Path(dest);
>>>>>>>>>>>>
>>>>>>>>>>>>         if (fileSystem.exists(path)) {
>>>>>>>>>>>>             System.out.println("File" + dest + " already exists");
>>>>>>>>>>>>         }
>>>>>>>>>>>>
>>>>>>>>>>>>         FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>         InputStream in = new BufferedInputStream(new FileInputStream(new File(source)));
>>>>>>>>>>>>         File myfile = new File(source);
>>>>>>>>>>>>         byte[] b = new byte[(int) myfile.length()];
>>>>>>>>>>>>         int numbytes = 0;
>>>>>>>>>>>>         while ((numbytes = in.read(b)) >= 0) {
>>>>>>>>>>>>             out.write(b, 0, numbytes);
>>>>>>>>>>>>         }
>>>>>>>>>>>>         in.close();
>>>>>>>>>>>>         out.close();
>>>>>>>>>>>>         //bos.close();
>>>>>>>>>>>>         fileSystem.close();
>>>>>>>>>>>>     } catch (Exception e) {
>>>>>>>>>>>>         System.out.println(e.toString());
>>>>>>>>>>>>     }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>
>>>>>>>>>>>> KK
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> You can set your cronjob to execute the program after every 5 sec.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Well, I want to upload the files automatically, as the files are
>>>>>>>>>>>>>> generated about every 3-5 sec and each file is about 3 MB in size.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is it possible to automate the system using the put or cp command?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I read about Flume and WebHDFS, but I am not sure whether they will
>>>>>>>>>>>>>> work or not.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <wget.null@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Why don't you use HDFS-related tools like put or cp?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> > Hi,
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > I am generating files continuously in a local folder on my base machine.
>>>>>>>>>>>>>>> > How can I use Flume to stream the generated files from the local folder
>>>>>>>>>>>>>>> > to HDFS? I don't know exactly how to configure the sources, sinks and
>>>>>>>>>>>>>>> > HDFS.
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > 1) Location of the folder where files are generated: /usr/datastorage/
>>>>>>>>>>>>>>> > 2) Name node address: hdfs://hadoop1.example.com:8020
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > Please help me.
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
