flume-user mailing list archives

From shekhar sharma <shekhar2...@gmail.com>
Subject Re: Automatically upload files into HDFS
Date Mon, 26 Nov 2012 16:42:32 GMT
Hello Kashif,
Sorry for the late reply. Are you done, or are you still struggling?

mail me: shekhar2581@gmail.com
Regards,
Som Shekhar Sharma



On Wed, Nov 21, 2012 at 6:06 PM, kashif khan <drkashif8310@gmail.com> wrote:

> Dear Shekhar Sharma,
>
> I am using Eclipse as my IDE. I have no idea how to create the project as a
> Maven project. I downloaded Maven 2, but it gave me some strange errors.
> So if you can help me, I will try Maven. Actually, I am trying to
> automatically upload the files into HDFS and will then apply some
> algorithms to analyze the data. The algorithms will be implemented in
> MapReduce. So if you think Maven will be good for me, please let me know
> how I can create the project as a Maven project.
>
>
> Many thanks
>
> Best regards,
>
> KK
>
>
>
>  On Tue, Nov 20, 2012 at 7:06 PM, shekhar sharma <shekhar2581@gmail.com>wrote:
>
>> By the way, how are you building and running your project? Are you running
>> it from an IDE?
>> The best practices you can follow:
>>
>> (1) Create your project as a Maven project and declare a dependency on
>> hadoop-X.Y.Z. Your project will then automatically have all the
>> necessary jars, and I am sure you will not face these kinds of errors.
>> (2) In your HADOOP_CLASSPATH, provide the path to $HADOOP_LIB.
>>
>> Regards,
>> Som
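
To illustrate point (1) above, a minimal pom.xml fragment for such a Maven project could look roughly like the sketch below. The repository, artifact, and version shown are assumptions for illustration (a CDH4.1 cluster would want the matching CDH build of hadoop-client, while a plain Apache cluster would need the matching Apache release); they are not a tested configuration.

<!-- Hypothetical pom.xml fragment: pulls in the Hadoop client jars. -->
<repositories>
  <repository>
    <id>cloudera</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <!-- assumed version for a CDH4.1 cluster; adjust to your distribution -->
    <version>2.0.0-cdh4.1.2</version>
  </dependency>
</dependencies>

With a dependency like this in place, the guava and HDFS classes that the stack traces later in this thread complain about should come in transitively.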
>>
>>
>>
>>
>>
>> On Tue, Nov 20, 2012 at 9:52 PM, kashif khan <drkashif8310@gmail.com>wrote:
>>
>>> Dear Tariq
>>>
>>> Many thanks, finally I have created the directory and uploaded the file.
>>>
>>> Once again many thanks
>>>
>>> Best regards
>>>
>>>
>>> On Tue, Nov 20, 2012 at 3:04 PM, kashif khan <drkashif8310@gmail.com>wrote:
>>>
>>>> Dear Tariq, many thanks.
>>>>
>>>>
>>>> I have downloaded the jar file and added it to the project. Now I am
>>>> getting another error:
>>>>
>>>> log4j:WARN No appenders could be found for logger
>>>> (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
>>>> log4j:WARN Please initialize the log4j system properly.
>>>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
>>>> Exception in thread "main" java.io.IOException: No FileSystem for
>>>> scheme: hdfs
>>>>     at
>>>> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2206)
>>>>     at
>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2213)
>>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
>>>>     at
>>>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
>>>>
>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>>     at CopyFile.main(CopyFile.java:14)
>>>>
>>>> Do you have any idea about this?
>>>>
>>>> Thanks again
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Nov 20, 2012 at 2:53 PM, Mohammad Tariq <dontariq@gmail.com>wrote:
>>>>
>>>>> You can download the jar here :
>>>>> http://search.maven.org/remotecontent?filepath=com/google/guava/guava/13.0.1/guava-13.0.1.jar
>>>>>
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Nov 20, 2012 at 8:06 PM, kashif khan <drkashif8310@gmail.com>wrote:
>>>>>
>>>>>> Could you please let me know the name of the jar file and its location?
>>>>>>
>>>>>> Many thanks
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 20, 2012 at 2:33 PM, Mohammad Tariq <dontariq@gmail.com>wrote:
>>>>>>
>>>>>>> Download the required jar and include it in your project.
>>>>>>>
>>>>>>> Regards,
>>>>>>>     Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 20, 2012 at 7:57 PM, kashif khan <drkashif8310@gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Dear Tariq, thanks.
>>>>>>>>
>>>>>>>> I have added the jar files from CDH, downloaded the CDH4 Eclipse
>>>>>>>> plugin, and copied it into the Eclipse plugins folder. The previous error is,
>>>>>>>> I think, sorted out, but now I am getting another strange error.
>>>>>>>>
>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>> com/google/common/collect/Maps
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.metrics2.lib.MetricsRegistry.<init>(MetricsRegistry.java:42)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:87)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:133)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2373)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2365)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2233)
>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>>>>>>     at CopyFile.main(CopyFile.java:14)
>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>> com.google.common.collect.Maps
>>>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>>>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>>>>>>>     ... 13 more
>>>>>>>>
>>>>>>>> Do you have any idea about this error?
>>>>>>>>
>>>>>>>> Many thanks
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Nov 20, 2012 at 2:19 PM, Mohammad Tariq <dontariq@gmail.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> Hello Kashif,
>>>>>>>>>
>>>>>>>>>      You are correct. This is because of a version mismatch. I am
>>>>>>>>> not using CDH personally, but AFAIK CDH4 uses Hadoop 2.x.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>     Mohammad Tariq
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <
>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi M Tariq,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I am trying the following program to create a directory and
>>>>>>>>>> copy a file to HDFS, but I am getting the following errors:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Program:
>>>>>>>>>>
>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>
>>>>>>>>>> public class CopyFile {
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>         public static void main(String[] args) throws IOException{
>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>         conf.set("fs.default.name", "hdfs://hadoop1.example.com:8020");
>>>>>>>>>>         FileSystem dfs = FileSystem.get(conf);
>>>>>>>>>>         String dirName = "Test1";
>>>>>>>>>>         Path src = new Path(dfs.getWorkingDirectory() + "/" +
>>>>>>>>>> dirName);
>>>>>>>>>>         dfs.mkdirs(src);
>>>>>>>>>>         Path src1 = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>         Path dst = new Path(dfs.getWorkingDirectory() +
>>>>>>>>>> "/Test1/");
>>>>>>>>>>         dfs.copyFromLocalFile(src1, dst);
>>>>>>>>>>
>>>>>>>>>>         }
>>>>>>>>>>         }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>     Exception in thread "main"
>>>>>>>>>> org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot
>>>>>>>>>> communicate with client version 4
>>>>>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>>>>>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>>>>     at $Proxy1.getProtocolVersion(Unknown Source)
>>>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>>>>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>>>>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>>>>>>>>>     at CopyFile.main(CopyFile.java:11)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I am using CDH4.1. I have downloaded the
>>>>>>>>>> hadoop-1.0.4 release and imported the jar files into Eclipse. I think it is due to a
>>>>>>>>>> version problem. Could you please let me know what the correct version
>>>>>>>>>> for CDH4.1 would be?
>>>>>>>>>>
>>>>>>>>>> Many thanks
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <
>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> It should work. The same code is working fine for me. Try creating
>>>>>>>>>>> some other directory in your HDFS and use it as your output path. Also see
>>>>>>>>>>> if you find something in the datanode logs.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <
>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> The input path is fine. The problem is in the output path. I just
>>>>>>>>>>>> wonder why it copies the data onto the local disk (/user/root/), not into HDFS.
>>>>>>>>>>>> I don't know why. Are we giving the correct statement to point to HDFS?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <
>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Try this as your input file path
>>>>>>>>>>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <
>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> When I apply the command
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv
>>>>>>>>>>>>>> /user/root/Output.csv.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> it works fine and the file shows up when browsing HDFS. But I don't know
>>>>>>>>>>>>>> why it does not work in the program.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Many thanks for your cooperation.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <
>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It would be good if I could have a look at the files.
>>>>>>>>>>>>>>> Meanwhile, try some other directories. Also, check the directory permissions
>>>>>>>>>>>>>>> once.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <
>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have tried it as the root user and made the following
>>>>>>>>>>>>>>>> changes:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> No result. The following is the log output. The log shows
>>>>>>>>>>>>>>>> the destination is null.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>>>>>>>>>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>>>>>>>>>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>>>>>>>>>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <
>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yes, my cluster is running. When I browse
>>>>>>>>>>>>>>>>> http://hadoop1.example.com:50070/dfshealth.jsp, I
>>>>>>>>>>>>>>>>> get the main page. Then I click on "Browse the filesystem" and get the
>>>>>>>>>>>>>>>>> following:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> hbase
>>>>>>>>>>>>>>>>> tmp
>>>>>>>>>>>>>>>>> user
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> And when I click on user, I get:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> beeswax
>>>>>>>>>>>>>>>>> huuser (I have created)
>>>>>>>>>>>>>>>>> root (I have created)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Would you like to see my configuration files? I did not
>>>>>>>>>>>>>>>>> change anything; everything is at its defaults. I have installed CDH4.1, running on
>>>>>>>>>>>>>>>>> VMs.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is your cluster running fine? Are you able to browse Hdfs
>>>>>>>>>>>>>>>>>> through the Hdfs Web Console at 50070?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <
>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Many thanks.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have changed the program accordingly. It does not show
>>>>>>>>>>>>>>>>>>> any error, just one warning, but when I browse the HDFS folder, the file is
>>>>>>>>>>>>>>>>>>> not copied.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>>>>>> public static void main(String[] args) throws
>>>>>>>>>>>>>>>>>>> IOException{
>>>>>>>>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>>         //Configuration configuration = new
>>>>>>>>>>>>>>>>>>> Configuration();
>>>>>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>         conf.addResource(new
>>>>>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>         conf.addResource(new Path
>>>>>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>>>>>> Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>>>>>>         Path outputFile = new
>>>>>>>>>>>>>>>>>>> Path("/user/hduser/Output1.csv");
>>>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 19-Nov-2012 13:50:32
>>>>>>>>>>>>>>>>>>> org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>>>>>>>>>>>>>>>>> WARNING: Unable to load native-hadoop library for your
>>>>>>>>>>>>>>>>>>> platform... using builtin-java classes where applicable
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Do you have any idea?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> If it is just copying the files without any processing
>>>>>>>>>>>>>>>>>>>> or change, you can use something like this:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>  public class CopyData {
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>     public static void main(String[] args) throws
>>>>>>>>>>>>>>>>>>>> IOException{
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>         Configuration configuration = new
>>>>>>>>>>>>>>>>>>>> Configuration();
>>>>>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Obviously you have to modify it as per your
>>>>>>>>>>>>>>>>>>>> requirements like continuously polling the targeted directory for new files.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
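
A minimal sketch of the kind of polling loop mentioned above is given below. It reuses the copyFromLocalFile approach from the snippet in this thread; the watched local directory, the HDFS target path, the 5-second interval, and the idea of remembering already-uploaded file names are illustrative assumptions rather than anything from the original message.

import java.io.File;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DirectoryUploader {

    public static void main(String[] args) throws IOException, InterruptedException {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);

        File watched = new File("/usr/datastorage");        // local directory being watched (assumption)
        Path hdfsDir = new Path("/user/root/datastorage");   // HDFS target directory (assumption)
        fs.mkdirs(hdfsDir);

        Set<String> uploaded = new HashSet<String>();        // remember what has already been copied
        while (true) {
            File[] files = watched.listFiles();
            if (files != null) {
                for (File f : files) {
                    if (f.isFile() && !uploaded.contains(f.getName())) {
                        // copy the new local file into the HDFS directory
                        fs.copyFromLocalFile(new Path(f.getAbsolutePath()),
                                             new Path(hdfsDir, f.getName()));
                        uploaded.add(f.getName());
                    }
                }
            }
            Thread.sleep(5000); // poll every 5 seconds
        }
    }
}

One practical caveat: a file that is still being written when the loop sees it would be copied half-finished, so in practice files should only land in the watched directory once they are complete (or be moved in atomically with a rename).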
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <
>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks, M Tariq.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> As I am new to Java and Hadoop and do not have much
>>>>>>>>>>>>>>>>>>>>> experience, I am trying first to write a simple program to upload data into
>>>>>>>>>>>>>>>>>>>>> HDFS and gradually move forward. I have written the following simple
>>>>>>>>>>>>>>>>>>>>> program to upload the file into HDFS, but I don't know why it is not working.
>>>>>>>>>>>>>>>>>>>>> Could you please check it, if you have time?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>>>>>>>>>>> import java.io.BufferedOutputStream;
>>>>>>>>>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>>>>>>>>>>> import java.io.FileOutputStream;
>>>>>>>>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>>>>>>>>>>> import java.io.OutputStream;
>>>>>>>>>>>>>>>>>>>>> import java.nio.*;
>>>>>>>>>>>>>>>>>>>>> //import java.nio.file.Path;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> public static void main(String [] args) throws
>>>>>>>>>>>>>>>>>>>>> IOException
>>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>>     try{
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>     Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>>>>     conf.addResource(new
>>>>>>>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>>     conf.addResource(new Path
>>>>>>>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>>>>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>     //String fileName =
>>>>>>>>>>>>>>>>>>>>> source.substring(source.lastIndexOf('/') + source.length());
>>>>>>>>>>>>>>>>>>>>>     String fileName = "Output1.csv";
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>         dest = dest + "/" +fileName;
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>     else
>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>         dest = dest + fileName;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>     Path path = new Path(dest);
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>     if(fileSystem.exists(path))
>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>         System.out.println("File" + dest + " already
>>>>>>>>>>>>>>>>>>>>> exists");
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>>>>>>>>>>    InputStream in = new BufferedInputStream(new
>>>>>>>>>>>>>>>>>>>>> FileInputStream(new File(source)));
>>>>>>>>>>>>>>>>>>>>>    File myfile = new File(source);
>>>>>>>>>>>>>>>>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>>>>>>>>>>>>>>>>    int numbytes = 0;
>>>>>>>>>>>>>>>>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>>>>>>>>>        out.write(b,0,numbytes);
>>>>>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>>>>>    in.close();
>>>>>>>>>>>>>>>>>>>>>    out.close();
>>>>>>>>>>>>>>>>>>>>>    //bos.close();
>>>>>>>>>>>>>>>>>>>>>    fileSystem.close();
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>     catch(Exception e)
>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>         System.out.println(e.toString());
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> KK
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> You can set your cronjob to execute the program after
>>>>>>>>>>>>>>>>>>>>>> every 5 sec.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
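
A small caveat on the cron suggestion: standard cron only fires at one-minute granularity, so a 5-second cycle has to be driven either by a loop in a wrapper script or from inside the JVM. A hedged in-JVM sketch (the class name and the printed placeholder are illustrative, not from this thread) could look like this:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class UploadScheduler {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Run one upload pass every 5 seconds; the body would call the
        // copyFromLocalFile logic shown elsewhere in this thread.
        scheduler.scheduleAtFixedRate(new Runnable() {
            public void run() {
                System.out.println("upload pass would run here");
            }
        }, 0, 5, TimeUnit.SECONDS);
    }
}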
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Well, I want to upload the files automatically, as
>>>>>>>>>>>>>>>>>>>>>>> the files are generated about every 3-5 seconds and each file is about
>>>>>>>>>>>>>>>>>>>>>>> 3 MB in size.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Is it possible to automate the system using the put or
>>>>>>>>>>>>>>>>>>>>>>> cp command?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I read about Flume and WebHDFS, but I am not sure
>>>>>>>>>>>>>>>>>>>>>>> whether they will work or not.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Best regards
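
Since Flume is mentioned here (and this is the flume-user list), a minimal agent configuration for this pattern, using the spooling directory source (available from Flume 1.3.0 onward) feeding an HDFS sink, might look roughly like the sketch below. The agent, channel, and sink names, the local directory, and the HDFS path are assumptions pieced together from the details given later in this thread; this is not a tested configuration.

# spool-to-hdfs.properties (hypothetical example)
agent.sources  = spool
agent.channels = mem
agent.sinks    = tohdfs

# watch the local directory for newly completed files
agent.sources.spool.type     = spooldir
agent.sources.spool.spoolDir = /usr/datastorage
agent.sources.spool.channels = mem

agent.channels.mem.type     = memory
agent.channels.mem.capacity = 10000

# write the events out to HDFS as plain data files
agent.sinks.tohdfs.type              = hdfs
agent.sinks.tohdfs.hdfs.path         = hdfs://hadoop1.example.com:8020/user/root/datastorage
agent.sinks.tohdfs.hdfs.fileType     = DataStream
agent.sinks.tohdfs.hdfs.rollInterval = 30
agent.sinks.tohdfs.channel           = mem

# started with something like:
#   flume-ng agent -n agent -c conf -f spool-to-hdfs.properties

One caveat: the spooling directory source expects each file to be complete and immutable once it lands in the directory, which fits the "one small file every few seconds" pattern described in this message.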
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander
>>>>>>>>>>>>>>>>>>>>>>> Alten-Lorenz <wget.null@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Why don't you use HDFS-related tools like put or
>>>>>>>>>>>>>>>>>>>>>>>> cp?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <
>>>>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> > Hi,
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > I am generating files continuously in a local
>>>>>>>>>>>>>>>>>>>>>>>> > folder on my base machine. How can I now use Flume to
>>>>>>>>>>>>>>>>>>>>>>>> > stream the generated files from the local folder to
>>>>>>>>>>>>>>>>>>>>>>>> > HDFS?
>>>>>>>>>>>>>>>>>>>>>>>> > I don't know exactly how to configure the sources,
>>>>>>>>>>>>>>>>>>>>>>>> > sinks and HDFS.
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > 1) location of the folder where files are generated:
>>>>>>>>>>>>>>>>>>>>>>>> /usr/datastorage/
>>>>>>>>>>>>>>>>>>>>>>>> > 2) name node address: hdfs://
>>>>>>>>>>>>>>>>>>>>>>>> hadoop1.example.com:8020
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > Please help me.
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
