flume-user mailing list archives

From shekhar sharma <shekhar2...@gmail.com>
Subject Re: Automatically upload files into HDFS
Date Tue, 20 Nov 2012 19:06:55 GMT
By the way, how are you building and running your project? Are you running it from an IDE?
The best practices you can follow:

(1) Create your project as a Maven project and declare a dependency on hadoop-X.Y.Z. Your project will then automatically have all the necessary jars, and I am sure you will not face these kinds of errors (a sketch follows below).
(2) In your HADOOP_CLASSPATH, provide the path for $HADOOP_LIB.
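
For example, a minimal pom.xml dependency sketch (illustrative only: the hadoop-client artifact and the CDH4-style version string are assumptions here; use whatever matches the Hadoop release your cluster runs, and note that CDH artifacts also require Cloudera's Maven repository to be declared):

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <!-- assumption: a CDH4-era version; replace with your cluster's version -->
      <version>2.0.0-cdh4.1.1</version>
    </dependency>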

Regards,
SOm





On Tue, Nov 20, 2012 at 9:52 PM, kashif khan <drkashif8310@gmail.com> wrote:

> Dear Tariq
>
> Many thanks, finally I have created the directory and uploaded the file.
>
> Once again many thanks
>
> Best regards
>
>
> On Tue, Nov 20, 2012 at 3:04 PM, kashif khan <drkashif8310@gmail.com> wrote:
>
>> Dear Many thanks
>>
>>
>> I have downloaded the jar file and added to project. Now getting another
>> error as:
>>
>> log4j:WARN No appenders could be found for logger
>> (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
>> log4j:WARN Please initialize the log4j system properly.
>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
>> more info.
>> Exception in thread "main" java.io.IOException: No FileSystem for scheme:
>> hdfs
>>     at
>> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2206)
>>     at
>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2213)
>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
>>     at
>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
>>
>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>     at CopyFile.main(CopyFile.java:14)
>>
>> Have any idea about this?
>>
>> Thanks again
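
"No FileSystem for scheme: hdfs" usually means the jar that registers the hdfs:// implementation (hadoop-hdfs in Hadoop 2.x / CDH4) is missing from the classpath, or its service-loader metadata is not being picked up. A minimal sketch of a check and workaround, assuming the CDH4 client jars are available and the NameNode address used elsewhere in this thread:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class HdfsSchemeCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Assumption: the NameNode address mentioned elsewhere in this thread.
            conf.set("fs.defaultFS", "hdfs://hadoop1.example.com:8020");
            // Workaround for "No FileSystem for scheme: hdfs": name the HDFS
            // implementation explicitly (hadoop-hdfs must still be on the classpath).
            conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
            FileSystem fs = FileSystem.get(conf);
            System.out.println("Connected to: " + fs.getUri());
            fs.close();
        }
    }

If the class still cannot be loaded, the hadoop-hdfs jar itself is not on the project's build path.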
>>
>>
>>
>>
>>
>> On Tue, Nov 20, 2012 at 2:53 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>
>>> You can download the jar here :
>>> http://search.maven.org/remotecontent?filepath=com/google/guava/guava/13.0.1/guava-13.0.1.jar
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Tue, Nov 20, 2012 at 8:06 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>
>>>> Could please let me know the name of jar file and location
>>>>
>>>> Many thanks
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On Tue, Nov 20, 2012 at 2:33 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>
>>>>> Download the required jar and include it in your project.
>>>>>
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Nov 20, 2012 at 7:57 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>
>>>>>> Dear Tariq Thanks
>>>>>>
>>>>>> I have added the jar files from CDH and downloaded the CDH4 Eclipse plugin and copied
>>>>>> it into the Eclipse plugins folder. The previous error is, I think, sorted out, but now
>>>>>> I am getting another strange error.
>>>>>>
>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>> com/google/common/collect/Maps
>>>>>>     at
>>>>>> org.apache.hadoop.metrics2.lib.MetricsRegistry.<init>(MetricsRegistry.java:42)
>>>>>>     at
>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:87)
>>>>>>     at
>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:133)
>>>>>>     at
>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
>>>>>>     at
>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
>>>>>>     at
>>>>>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
>>>>>>     at
>>>>>> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
>>>>>>     at
>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2373)
>>>>>>     at
>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2365)
>>>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2233)
>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>>>>     at CopyFile.main(CopyFile.java:14)
>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>> com.google.common.collect.Maps
>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>>>>>     ... 13 more
>>>>>>
>>>>>> Have any idea about this error?
>>>>>>
>>>>>> Many thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 20, 2012 at 2:19 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>
>>>>>>> Hello Kashif,
>>>>>>>
>>>>>>>      You are correct. This is because of a version mismatch. I am not using CDH
>>>>>>> personally, but AFAIK CDH4 uses Hadoop 2.x.
>>>>>>>
>>>>>>> Regards,
>>>>>>>     Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>
>>>>>>>> HI M Tariq
>>>>>>>>
>>>>>>>>
>>>>>>>> I am trying the following program to create a directory and copy a file to HDFS,
>>>>>>>> but I am getting the following errors:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Program:
>>>>>>>>
>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>> import java.io.IOException;
>>>>>>>>
>>>>>>>> public class CopyFile {
>>>>>>>>
>>>>>>>>
>>>>>>>>         public static void main(String[] args) throws IOException{
>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>         conf.set("fs.default.name", "hadoop1.example.com:8020");
>>>>>>>>         FileSystem dfs = FileSystem.get(conf);
>>>>>>>>         String dirName = "Test1";
>>>>>>>>         Path src = new Path(dfs.getWorkingDirectory() + "/" + dirName);
>>>>>>>>         dfs.mkdirs(src);
>>>>>>>>         Path scr1 = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>         Path dst = new Path(dfs.getWorkingDirectory() + "/Test1/");
>>>>>>>>         dfs.copyFromLocalFile(src, dst);
>>>>>>>>
>>>>>>>>         }
>>>>>>>>         }
>>>>>>>>
>>>>>>>>
>>>>>>>>     Exception in thread "main" org.apache.hadoop.ipc.RemoteException: Server IPC
>>>>>>>> version 7 cannot communicate with client version 4
>>>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>>>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>>     at $Proxy1.getProtocolVersion(Unknown Source)
>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>>>>>>>     at CopyFile.main(CopyFile.java:11)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I am using CDH4.1. I have downloaded the source of hadoop-1.0.4 and imported the jar
>>>>>>>> files into Eclipse. I think the error is due to a version problem. Could you please
>>>>>>>> let me know what the correct version for CDH4.1 is?
>>>>>>>>
>>>>>>>> Many thanks
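
For reference: in the program as posted, copyFromLocalFile is passed src (the HDFS directory that was just created) instead of scr1 (the local file), and the fs.default.name value has no hdfs:// scheme; the IPC exception itself, as noted earlier in the thread, comes from mixing hadoop-1.0.4 client jars with a CDH4 (Hadoop 2.x) cluster. A corrected sketch, assuming client jars that match the CDH4 cluster:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CopyFile {
        public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            // Full URI including the scheme (assumption: the NameNode address from this thread).
            conf.set("fs.defaultFS", "hdfs://hadoop1.example.com:8020");
            FileSystem dfs = FileSystem.get(conf);

            Path dir = new Path(dfs.getWorkingDirectory() + "/Test1");
            dfs.mkdirs(dir);

            // Copy the local file (not the freshly created HDFS directory) into Test1.
            Path localFile = new Path("/usr/Eclipse/Output.csv");
            dfs.copyFromLocalFile(localFile, dir);

            dfs.close();
        }
    }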
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> It should work; the same code is working fine for me. Try creating some other
>>>>>>>>> directory in your HDFS and use it as your output path. Also see if you find
>>>>>>>>> something in the datanode logs.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>     Mohammad Tariq
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> The input path is fine; the problem is in the output path. I am just wondering why
>>>>>>>>>> it copies the data into the local disk (/user/root/) and not into HDFS. I don't
>>>>>>>>>> know why. Are we giving the correct statement to point to HDFS?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Try this as your input file path
>>>>>>>>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> When I apply the command
>>>>>>>>>>>>
>>>>>>>>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv
>>>>>>>>>>>>
>>>>>>>>>>>> it works fine and the file is browsable in HDFS. But I don't know why it does
>>>>>>>>>>>> not work in the program.
>>>>>>>>>>>>
>>>>>>>>>>>> Many thanks for your cooperation.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> It would be good if I could have a look at the files. Meanwhile, try some other
>>>>>>>>>>>>> directories. Also, check the directory permissions once.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have tried it as the root user and made the following changes:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> No result. The following is the log output. The log shows the destination is null.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true  ugi=dr.who (auth:SIMPLE)  ip=/134.91.36.41  cmd=getfileinfo  src=/user  dst=null  perm=null
>>>>>>>>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true  ugi=dr.who (auth:SIMPLE)  ip=/134.91.36.41  cmd=listStatus  src=/user  dst=null  perm=null
>>>>>>>>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true  ugi=hbase (auth:SIMPLE)  ip=/134.91.36.41  cmd=listStatus  src=/hbase/.oldlogs  dst=null  perm=null
>>>>>>>>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true  ugi=dr.who (auth:SIMPLE)  ip=/134.91.36.41  cmd=getfileinfo  src=/user/root  dst=null  perm=null
>>>>>>>>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true  ugi=dr.who (auth:SIMPLE)  ip=/134.91.36.41  cmd=listStatus  src=/user/root  dst=null  perm=null
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yeah, my cluster is running. When I browse
>>>>>>>>>>>>>>> http://hadoop1.example.com:50070/dfshealth.jsp I get the main page. Then, after
>>>>>>>>>>>>>>> clicking on "Browse the filesystem", I get the following:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> hbase
>>>>>>>>>>>>>>> tmp
>>>>>>>>>>>>>>> user
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And when I click on user I get:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> beeswax
>>>>>>>>>>>>>>> huuser (I have created)
>>>>>>>>>>>>>>> root (I have created)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Would you like to see my configuration file? I did not change anything; it is
>>>>>>>>>>>>>>> all at the defaults. I have installed CDH4.1 and it is running on VMs.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is your cluster running fine? Are you able to browse Hdfs through the Hdfs
>>>>>>>>>>>>>>>> Web Console at 50070?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Many thanks.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have changed the program accordingly. It does not show any error, only one
>>>>>>>>>>>>>>>>> warning, but when I browse the HDFS folder the file is not copied.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>>>> public static void main(String[] args) throws IOException{
>>>>>>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>         //Configuration configuration = new Configuration();
>>>>>>>>>>>>>>>>>         //configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>         //configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>         conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>         conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>>>>         Path outputFile = new Path("/user/hduser/Output1.csv");
>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>>>>>>>>>>>>>>> WARNING: Unable to load native-hadoop library for your platform... using
>>>>>>>>>>>>>>>>> builtin-java classes where applicable
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Have any idea?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Many thanks
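
A small sketch to check where the copy actually went, reusing the same configuration approach as the program above (the output path is the one used there); if it prints file:///, the cluster XMLs were not found at those paths and copyFromLocalFile writes to the local filesystem instead of HDFS:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class FsCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
            conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
            FileSystem fs = FileSystem.get(conf);
            // Which filesystem did the Configuration actually resolve to?
            System.out.println("Default filesystem: " + fs.getUri());
            System.out.println("Output1.csv in HDFS? "
                    + fs.exists(new Path("/user/hduser/Output1.csv")));
            fs.close();
        }
    }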
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If it is just copying the files without any processing or change, you can
>>>>>>>>>>>>>>>>>> use something like this:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>  public class CopyData {
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>     public static void main(String[] args) throws IOException{
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>         Configuration configuration = new Configuration();
>>>>>>>>>>>>>>>>>>         configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>         configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>>>>>>>>         Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Obviously you have to modify it as per your requirements, like continuously
>>>>>>>>>>>>>>>>>> polling the targeted directory for new files.
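
A rough sketch of such a poller (assumptions: the watched folder and HDFS target below are placeholders taken from this thread, the cluster configs live under /etc/hadoop/conf, and files are complete by the time they appear in the folder):

    import java.io.File;
    import java.util.HashSet;
    import java.util.Set;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DirectoryUploader {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
            conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
            FileSystem fs = FileSystem.get(conf);

            File watched = new File("/usr/datastorage");      // local source folder (placeholder)
            Path target = new Path("/user/root/datastorage"); // HDFS destination (placeholder)
            fs.mkdirs(target);

            Set<String> seen = new HashSet<String>();
            while (true) {
                File[] files = watched.listFiles();
                if (files != null) {
                    for (File f : files) {
                        // Upload each new file exactly once per run of the poller.
                        if (f.isFile() && seen.add(f.getName())) {
                            fs.copyFromLocalFile(new Path(f.getAbsolutePath()), target);
                        }
                    }
                }
                Thread.sleep(5000); // poll every 5 seconds
            }
        }
    }

Running it as a long-lived process avoids the one-minute granularity of cron; in practice you would also move or delete local files after a successful upload so that restarts do not re-upload everything.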
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks M  Tariq
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> As I am new to Java and Hadoop and do not have much experience, I am trying
>>>>>>>>>>>>>>>>>>> first to write a simple program to upload data into HDFS and gradually move
>>>>>>>>>>>>>>>>>>> forward. I have written the following simple program to upload a file into
>>>>>>>>>>>>>>>>>>> HDFS, but I don't know why it is not working. Could you please check it if
>>>>>>>>>>>>>>>>>>> you have time?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>>>>>>>>> import java.io.BufferedOutputStream;
>>>>>>>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>>>>>>>>> import java.io.FileOutputStream;
>>>>>>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>>>>>>>>> import java.io.OutputStream;
>>>>>>>>>>>>>>>>>>> import java.nio.*;
>>>>>>>>>>>>>>>>>>> //import java.nio.file.Path;
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> public static void main(String [] args) throws IOException
>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>     try{
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>>     conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>     conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     //String fileName = source.substring(source.lastIndexOf('/') + source.length());
>>>>>>>>>>>>>>>>>>>     String fileName = "Output1.csv";
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>         dest = dest + "/" +fileName;
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>     else
>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>         dest = dest + fileName;
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>     Path path = new Path(dest);
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     if(fileSystem.exists(path))
>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>         System.out.println("File" + dest + " already exists");
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>>>>>>>>    InputStream in = new BufferedInputStream(new FileInputStream(new File(source)));
>>>>>>>>>>>>>>>>>>>    File myfile = new File(source);
>>>>>>>>>>>>>>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>>>>>>>>>>>>>>    int numbytes = 0;
>>>>>>>>>>>>>>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>>>>>>>        out.write(b,0,numbytes);
>>>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>>>    in.close();
>>>>>>>>>>>>>>>>>>>    out.close();
>>>>>>>>>>>>>>>>>>>    //bos.close();
>>>>>>>>>>>>>>>>>>>    fileSystem.close();
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>     catch(Exception e)
>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>         System.out.println(e.toString());
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> KK
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> You can set your cronjob to execute the program after every 5 sec.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Well, I want to automatically upload the files, as the files are generated
>>>>>>>>>>>>>>>>>>>>> about every 3-5 sec and each file has a size of about 3 MB.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Is it possible to automate the system using the put or cp command?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I read about Flume and WebHDFS but I am not sure whether it will work or not.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <wget.null@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Why don't you use HDFS-related tools like put or cp?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> > HI,
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > I am generating files continuously in a local folder of my base machine.
>>>>>>>>>>>>>>>>>>>>>> > How can I now use Flume to stream the generated files from the local
>>>>>>>>>>>>>>>>>>>>>> > folder to HDFS? I don't know how exactly to configure the sources, sinks
>>>>>>>>>>>>>>>>>>>>>> > and HDFS.
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > 1) location of the folder where files are generated: /usr/datastorage/
>>>>>>>>>>>>>>>>>>>>>> > 2) name node address: hdfs://hadoop1.example.com:8020
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > Please help me.
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
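
For the Flume route asked about in the original question (streaming new files from /usr/datastorage/ to HDFS on hadoop1.example.com:8020), a minimal agent configuration sketch; this assumes a Flume release that ships the spooling directory source, that files are complete and immutable once they land in the spool directory, and that the agent and component names are arbitrary:

    # flume.conf (illustrative)
    agent1.sources  = spool-src
    agent1.channels = mem-ch
    agent1.sinks    = hdfs-sink

    agent1.sources.spool-src.type     = spooldir
    agent1.sources.spool-src.spoolDir = /usr/datastorage
    agent1.sources.spool-src.channels = mem-ch

    agent1.channels.mem-ch.type     = memory
    agent1.channels.mem-ch.capacity = 10000

    agent1.sinks.hdfs-sink.type              = hdfs
    agent1.sinks.hdfs-sink.channel           = mem-ch
    agent1.sinks.hdfs-sink.hdfs.path         = hdfs://hadoop1.example.com:8020/user/root/datastorage
    agent1.sinks.hdfs-sink.hdfs.fileType     = DataStream
    agent1.sinks.hdfs-sink.hdfs.rollInterval = 60

Such an agent would typically be started with something like "flume-ng agent --conf-file flume.conf --name agent1"; the exact invocation and the HDFS sink roll settings depend on the installation.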
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
