hadoop-common-user mailing list archives

From "Dhruba Borthakur" <dhr...@yahoo-inc.com>
Subject RE: Calling FsShell.doMain() hold so many threads
Date Wed, 25 Jul 2007 06:19:52 GMT
Please try the attached patch and let me know if it works.

Thanks,
dhruba

-----Original Message-----
From: KrzyCube [mailto:yuxh312@gmail.com] 
Sent: Tuesday, July 24, 2007 6:19 PM
To: hadoop-user@lucene.apache.org
Subject: Re: Calling FsShell.doMain() hold so many threads


First of all, thanks, Raghu.

Here's the exception info:
------------------------------------------------------------------------
Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Unknown Source)
at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:116)
at org.apache.hadoop.dfs.DistributedFileSystem$RawDistributedFileSystem.initialize(DistributedFileSystem.java:67)
at org.apache.hadoop.fs.FilterFileSystem.initialize(FilterFileSystem.java:57)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:160)
at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:119)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:91)
at org.apache.hadoop.fs.FsShell.init(FsShell.java:41)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:809)
at kingsoft.lab.duba.CustomInterface.CreateDir(CustomInterface.java:138)
at kingsoft.lab.duba.CustomInterface.main(CustomInterface.java:155)
------------------------------------------------------------------------

Is there a recommended API for this kind of use?
By "this" I mean uploading or downloading files and creating directories
programmatically, including under concurrent operation.
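
The trace above ends in Thread.start() called from the DFSClient constructor, reached through FsShell.init() from FsShell.run(), which suggests that each run() call may build a new client with its own background thread. The following is a minimal, Hadoop-free Java sketch of that pattern, not Hadoop's actual code: BackgroundClient, its mkdir() method, and ThreadLeakSketch are hypothetical stand-ins. It compares creating one client per call against sharing a single client and closing it.

```java
// Hypothetical stand-in for a client whose constructor starts a background
// thread, as the trace shows DFSClient.<init> doing. This is not Hadoop code.
final class BackgroundClient implements AutoCloseable {
    static final String WORKER_NAME = "background-client-worker";

    private volatile boolean running = true;
    private final Thread worker;

    BackgroundClient() {
        worker = new Thread(() -> {
            while (running) {
                try {
                    Thread.sleep(1000); // idle loop standing in for periodic work
                } catch (InterruptedException e) {
                    return; // close() interrupted us: let the thread exit
                }
            }
        }, WORKER_NAME);
        worker.setDaemon(true);
        worker.start();
    }

    // Placeholder for a real operation such as creating a directory.
    void mkdir(String path) {
    }

    // Stops the background thread and waits for it to finish.
    @Override
    public void close() {
        running = false;
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

final class ThreadLeakSketch {
    // Counts live threads that were started by BackgroundClient instances.
    static int liveWorkers() {
        int count = 0;
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && BackgroundClient.WORKER_NAME.equals(t.getName())) {
                count++;
            }
        }
        return count;
    }

    // One new client per call, none closed until the end: threads pile up.
    static int workersWithClientPerCall(int calls) {
        BackgroundClient[] clients = new BackgroundClient[calls];
        for (int i = 0; i < calls; i++) {
            clients[i] = new BackgroundClient();
            clients[i].mkdir("/dir" + i);
        }
        int live = liveWorkers();
        for (BackgroundClient c : clients) {
            c.close(); // clean up so the sketch itself does not leak
        }
        return live;
    }

    // One shared client for all calls, closed when done: a single thread.
    static int workersWithSharedClient(int calls) {
        try (BackgroundClient client = new BackgroundClient()) {
            for (int i = 0; i < calls; i++) {
                client.mkdir("/dir" + i);
            }
            return liveWorkers();
        }
    }

    public static void main(String[] args) {
        System.out.println("client per call: " + workersWithClientPerCall(100));
        System.out.println("shared client:   " + workersWithSharedClient(100));
        System.out.println("after close:     " + liveWorkers());
    }
}
```

In this sketch the per-call variant holds one live thread for every client until each is closed, while the shared variant holds a single thread throughout. Whether the real FsShell or DFSClient can be shared or closed in this way is an assumption that this thread does not confirm.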


Raghu Angadi wrote:
> 
> 
> Can you get the stack trace of the threads that are left? It was not 
> obvious from the code where a thread is started. It might be 'trash 
> handler'.
> 
> You could add sleep(10sec) to give you enough time to get the trace.
> 
> FsShell might not be designed for this use, but seems like a pretty 
> useful feature.
> 
> Raghu.
> 
> KrzyCube wrote:
>> I have tried the way TestDFSShell.java does,
>> here's my code:
>> 
>> ------------------------------------------------------------
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.fs.FsShell;
>>
>> public class CustomInterface
>> {
>>     Configuration conf;
>>     FsShell fs;
>>
>>     public CustomInterface()
>>     {
>>         conf = new Configuration();
>>         fs = new FsShell();
>>         fs.setConf(conf);
>>     }
>>
>>     public int createDir(String strDirName, String strPath)
>>     {
>>         // exception handling omitted
>>         strPath += strDirName;
>>         // same arguments as "-mkdir <path>" on the command line
>>         String[] strCmd = new String[] { "-mkdir", strPath };
>>         return fs.run(strCmd);
>>     }
>> }
>> ------------------------------------------------------------
>> 
>> Then I just call the createDir method:
>> 
>> for(int i =0 ; i < 100000 ; i ++)
>> {
>>     custom.createDir("someName");
>> }
>> 
>> This causes the JVM process to hold many threads,
>> and these threads eat memory.
>> Once the JVM heap is used up, exceptions are thrown.
>> A larger heap size only holds more threads; it does not fix the problem.
>> 
>> thanks.
>> 
>> 
>> Dhruba Borthakur wrote:
>>> One example of programmatically using FsShell is in
>>> src/test/org/apache/hadoop/dfs/TestDFSShell.java
>>>
>>> Thanks,
>>> dhruba
>>>
>>> -----Original Message-----
>>> From: KrzyCube [mailto:yuxh312@gmail.com] 
>>> Sent: Monday, July 23, 2007 7:49 PM
>>> To: hadoop-user@lucene.apache.org
>>> Subject: Calling FsShell.doMain() hold so many threads
>>>
>>>
>>> Hi there,
>>>
>>> I have two questions.
>>>
>>> Q1:
>>>     I am trying to call FsShell.doMain() from my own code, which is only
>>> a simple wrapper around FsShell.
>>> But when I try to create many directories (10000, for example), an
>>> exception like "Not enough memory for more threads" is thrown, even though
>>> I have set -Xmx512m.
>>>     I then watched the process while the program was running and found
>>> that more and more threads are started, using more and more memory, and
>>> none of them ever exit.
>>>     Then I looked at the source code and found that FsShell.main(), the
>>> entry point for terminal use, contains the line
>>> "System.exit(return_value_of_doMain)".
>>> Does that mean that ToolBase.run(), as implemented in FsShell.java, always
>>> creates a new thread that has to be forcibly terminated by System.exit()
>>> killing the process?
>>>     If so, how can I write my own code to use Hadoop through FsShell in
>>> multi-threaded mode, or is there another way to do this?
>>>
>>> Q2:
>>>      I checked out the code from svn and run it in Eclipse (I mention
>>> Eclipse only to describe my environment) under Ubuntu 7.04.
>>>      Out of curiosity, I wanted to see how much time FsShell.doMain()
>>> takes, so I use "new Date()" and compute the interval with
>>> "DateEnd.getTime() - DateBeg.getTime()".
>>>      I found that even mkdir takes more than 1000 (as reported by
>>> getTime()). With no arguments it takes 25, but even if I just give it a
>>> wrong argument, such as "-sl", it takes more than 1000. Does that mean
>>> the argument check accounts for most of the time?
>>>
>>> -- 
>>> View this message in context:
>>> http://www.nabble.com/Calling-FsShell.doMain%28%29-hold-so-many-threads-tf4133557.html#a11756139
>>> Sent from the Hadoop Users mailing list archive at Nabble.com.
>>>
>>>
>>>
>>>
>> 
> 
> 
> 

-- 
View this message in context:
http://www.nabble.com/Calling-FsShell.doMain%28%29-hold-so-many-threads-tf4133557.html#a11774684
Sent from the Hadoop Users mailing list archive at Nabble.com.

