uima-user mailing list archives

From "rohan rai" <hiroha...@gmail.com>
Subject Re: import location over Hadoop
Date Thu, 12 Jun 2008 07:50:15 GMT
Hi Thilo

Sorry for asking such a simple question, but under which topic on the wiki
should I add this info?

Regards
Rohan

On Thu, Jun 12, 2008 at 2:21 AM, Thilo Goetz <twgoetz@gmx.de> wrote:

> Hi Rohan,
>
> I'm glad you got it to work.  This is useful information.  It would
> be great if you could put it up on the UIMA Wiki:
> http://cwiki.apache.org/UIMA/
>
> --Thilo
>
>
> rohan rai wrote:
>
>> I think I got it. Thanks for all the help, everyone. To make a simple
>> UIMA app work over Hadoop (I did it in a pseudo-distributed
>> environment), four factors come together:
>>
>> 1) The UIMA app, the mapper, the reducer, your job main file, and the
>> resources should all be contained within the job jar you create.
>>
>> 2) All imports in the descriptor should probably be imports by name (I
>> haven't verified whether this works with import by location).
>>
>> 3) Any resource read in any of your classes should be loaded via the
>> ClassLoader, e.g.
>>
>>   XMLInputSource in = new XMLInputSource(
>>       ClassLoader.getSystemResourceAsStream(aeXmlDescriptor), null);
>>
>> 4) When an AnalysisEngine (or any similar UIMA component) is being
>> produced (I am doing it in the mapper), a ResourceManager should be
>> used, e.g.
>>
>>   ResourceManager rMng = UIMAFramework.newDefaultResourceManager();
>>   // Here str is the path to any of the resources, which can be obtained
>>   // via ClassLoader.getSystemResource(aeXmlDescriptor).getPath()
>>   rMng.setExtensionClassPath(str, true);
>>   rMng.setDataPath(str);
>>   aEngine = UIMAFramework.produceAnalysisEngine(aSpecifier, rMng, null);
>>
>> This fourth point matters because, when we read an XML file without
>> using the classloader, it is read by default from the temporary task
>> directory, e.g.
>>
>> /tmp/hadoop-root/mapred/local/taskTracker/jobcache/job_200806112341_0002/task_200806112341_0002_m_000000_0/
>>
>> But all the resources and classes get unjarred in the directory
>>
>> /tmp/hadoop-root/mapred/local/taskTracker/jobcache/job_200806112341_0002/work
>>
>> So, to tell the system to look for the resources in the correct
>> directory when not using the classloader (which is what UIMA's
>> XMLInputSource does), we have to use the resource manager.
>>
>> Regards
>> Rohan
>>
> ...
>
>
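
The difference between points 3 and 4 in the quoted message can be tried out without Hadoop or UIMA. The sketch below is only an illustration under stated assumptions: it uses nothing but the JDK, a temporary directory stands in for the job's unpacked "work" directory, and the class name, method names, and the descriptor file name are all hypothetical. It shows that a resource is found by name through a class loader even though a relative file path from the current working directory does not reach it, and that the resource's URL can be turned back into a directory path of the kind passed as str above.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; none of these names come from Hadoop or UIMA.
class ClasspathResourceDemo {

    // Creates a temp directory (a stand-in for the unpacked job directory),
    // writes one resource into it, and returns a class loader rooted there.
    static ClassLoader loaderForUnpackedDir(String name, String content) {
        try {
            Path dir = Files.createTempDirectory("unpacked-job");
            Files.write(dir.resolve(name), content.getBytes(StandardCharsets.UTF_8));
            return new URLClassLoader(new URL[] { dir.toUri().toURL() });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads a resource by name through the class loader.
    // Returns null when the loader cannot find the resource.
    static String readViaClassLoader(ClassLoader loader, String name) {
        try (InputStream in = loader.getResourceAsStream(name)) {
            if (in == null) {
                return null;
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            for (int n = in.read(buf); n != -1; n = in.read(buf)) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Resolves the directory that holds the resource, or null if not found.
    // Assumes the resource sits in a plain directory (a file: URL), not in a jar.
    static String directoryOf(ClassLoader loader, String name) {
        URL url = loader.getResource(name);
        if (url == null) {
            return null;
        }
        try {
            return new File(url.toURI()).getParent();
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String name = "hypothetical-ae-descriptor.xml";
        ClassLoader loader = loaderForUnpackedDir(name, "<descriptor/>");

        // A relative path is resolved against the working directory.
        System.out.println("relative file exists: " + new File(name).exists());
        // The class loader resolves the same name against its own roots.
        System.out.println("via class loader:     " + readViaClassLoader(loader, name));
        System.out.println("resource directory:   " + directoryOf(loader, name));
    }
}
```

In a real job, the directory found this way would play the role of str in the setExtensionClassPath and setDataPath calls quoted above. Whether that is enough for a given descriptor depends on the UIMA and Hadoop versions in use, so treat it as a starting point.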
