uima-user mailing list archives

From "rohan rai" <hiroha...@gmail.com>
Subject Re: import location over Hadoop
Date Wed, 11 Jun 2008 13:01:22 GMT
Thanks Thilo. Well, if I do that, all sorts of invalid XML exceptions get
thrown:

org.apache.uima.util.InvalidXMLException: Invalid descriptor at
<unknown source>.
	at org.apache.uima.util.impl.XMLParser_impl.parse(XMLParser_impl.java:193)
	at org.apache.uima.util.impl.XMLParser_impl.parseResourceSpecifier(XMLParser_impl.java:365)
	at org.apache.uima.util.impl.XMLParser_impl.parseResourceSpecifier(XMLParser_impl.java:346)
	at org.ziva.dq.hadoop.DQHadoopMain$Map.dQFile(DQHadoopMain.java:45)
	at org.ziva.dq.hadoop.DQHadoopMain$Map.map(DQHadoopMain.java:37)
	at org.ziva.dq.hadoop.DQHadoopMain$Map.map(DQHadoopMain.java:1)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2084)
Caused by: org.xml.sax.SAXParseException: Content is not allowed in prolog.
	at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
	at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
	at org.apache.uima.util.impl.XMLParser_impl.parse(XMLParser_impl.java:176)
	... 8 more
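
For what it's worth, the root cause in the trace is the SAXParseException rather than UIMA itself: the JDK parser reports "Content is not allowed in prolog" when it meets non-whitespace text before the first markup in the stream. The thread does not say what the stream actually contained here, so the following is only a sketch of how one might check the bytes locally with the JDK's own SAX parser before handing them to UIMA. The class name PrologCheck and the sample inputs are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of UIMA or Hadoop.
class PrologCheck {

    // Returns true if the bytes are well-formed XML according to the JDK SAX parser.
    static boolean parsesAsXml(byte[] bytes) {
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new ByteArrayInputStream(bytes), new DefaultHandler());
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // DefaultHandler rethrows fatal errors, so parse failures end up here.
            System.err.println("Not well-formed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] good = "<a/>".getBytes(StandardCharsets.UTF_8);
        // Assumed failure mode: stray text in front of the XML.
        byte[] bad = "junk<a/>".getBytes(StandardCharsets.UTF_8);
        System.out.println("good parses: " + parsesAsXml(good));
        System.out.println("bad parses:  " + parsesAsXml(bad));
    }
}
```

Running the same check on the bytes actually read inside the map task (instead of the sample arrays) should show whether the descriptor content is what you expect.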



On Wed, Jun 11, 2008 at 6:08 PM, Thilo Goetz <twgoetz@gmx.de> wrote:

> You need to use import by name instead of import
> by location in your descriptor.  Then things get
> loaded via the classpath and you should be ok
> (provided that you stick your descriptors in the
> jar of course).  I suggest you test this locally
> first by moving your application to a different
> machine where you don't have any descriptors
> lying around.  It'll be easier to debug than in
> hadoop.
>
> --Thilo
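
The difference between the two import styles comes down to how the descriptor is located. As I understand the UIMA convention, an import by name such as name="a.b.C" is looked up as the resource a/b/C.xml on the classpath (or datapath), so it works from inside a jar, whereas an import by location is resolved as a URL relative to the importing descriptor. The sketch below only imitates the by-name lookup with plain JDK calls; it is not UIMA's code, and the class ImportByName and the dotted name (derived from the path in the FileNotFoundException quoted below) are assumptions.

```java
import java.net.URL;

// Hypothetical illustration of a by-name lookup; not UIMA's implementation.
class ImportByName {

    // Turn a dotted name into a classpath resource path: dots become slashes, ".xml" is appended.
    static String toResourcePath(String name) {
        return name.replace('.', '/') + ".xml";
    }

    // Look the resource up through the class loader, so it is found inside a jar as well.
    static URL find(String name) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        return cl.getResource(toResourcePath(name)); // null if nothing matches
    }

    public static void main(String[] args) {
        // Assumed name, based on ./descriptors/annotators/RecordCandidateAnnotator.xml
        String name = "descriptors.annotators.RecordCandidateAnnotator";
        System.out.println(toResourcePath(name));
        System.out.println(find(name)); // prints null unless the descriptor is on the classpath
    }
}
```

If find returns null on a machine that has only the jar, the descriptors are probably not packaged where the by-name import expects them.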
>
>
> rohan rai wrote:
>
>> Well, the question is about running UIMA over Hadoop: how do I do that,
>> given that UIMA's XML descriptors contain relative URLs and locations,
>> which throw exceptions?
>>
>> But I can probably do without that answer
>>
>> Simplifying the problem
>>
>> I create a jar for my application and I am trying to run a map reduce job
>>
>> In the map I am trying to read an XML resource, which gives this kind of
>> exception:
>>
>> java.io.FileNotFoundException: /tmp/hadoop-root/mapred/local/taskTracker/jobcache/job_200806102252_0028/task_200806102252_0028_m_000000_0/./descriptors/annotators/RecordCandidateAnnotator.xml (No such file or directory)
>>        at java.io.FileInputStream.open(Native Method)
>>        at java.io.FileInputStream.<init>(FileInputStream.java:106)
>>        at java.io.FileInputStream.<init>(FileInputStream.java:66)
>>        at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:70)
>>        at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:161)
>>        at java.net.URL.openStream(URL.java:1009)
>>        at org.apache.uima.util.XMLInputSource.<init>(XMLInputSource.java:83)
>>
>> I think I need to pass the contents of the jar, which contains the
>> resource XML and the classes (other than the JOB class), to each and every
>> taskXXXXXXX that gets created.
>>
>> How can I do that?
>>
>> Regards
>> Rohan
>>
>>
>>
>>
>> On Wed, Jun 11, 2008 at 5:12 PM, Michael Baessler <mba@michael-baessler.de> wrote:
>>
>>> rohan rai wrote:
>>>> Hi
>>>> A simple thing such as a name annotator which has an import location of
>>>> type starts throwing exceptions when I create a jar of the application I
>>>> am developing and run it over Hadoop.
>>>>
>>>> If I have to do it in a Java class file, then I can use
>>>> XMLInputSource in = new XMLInputSource(ClassLoader.getSystemResourceAsStream(aeXmlDescriptor),null);
>>>>
>>>> But the relative paths in annotators, analysis engines, etc. start
>>>> throwing exceptions.
>>>>
>>>> Please help
>>>>
>>>> Regards
>>>> Rohan
>>>
>>> I'm not sure I understand your question, but I think you need some help
>>> with the exceptions you get.
>>> Can you provide the exception stack trace?
>>>
>>> -- Michael
