uima-user mailing list archives

From Christian Mauceri <mauc...@hermeneute.com>
Subject Re: UIMA Apache is a real pain
Date Thu, 30 Aug 2007 15:05:53 GMT
Hi Marshall,

Like a good soldier, I'm following Thylo's recommendations on Eclipse
compiler settings
(http://cwiki.apache.org/UIMA/eclipse-compiler-settings.html), which is
why deprecations show up as errors for me. But yes, I was unfair on this
point: I could have gone back to a more lenient configuration, but since
migrations like this make me very nervous I try to stay as strict as
possible. My first suggestion concerning the Eclipse side of UIMA would
be to add the famous list to the plugin manifest automatically when
switching to UIMA mode:
Eclipse-BuddyPolicy: registered
Eclipse-RegisterBuddy:
org.apache.uima.debug,
org.apache.uima.desceditor,
org.apache.uima.jcas.jcasgenp,
org.apache.uima.pear,
org.apache.uima.runtime
The great advantage is that people would no longer waste time on
class-not-found problems. Indeed, if you build a descriptor on the fly
and try to use it without these magic lines in your manifest, there is
no way for UIMA to load the classes referred to in the generated
descriptor.
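
As a rough illustration of the "add the list automatically" suggestion, here is a minimal sketch that uses only the JDK's java.util.jar.Manifest class to set the two buddy headers on an existing manifest. The class and method names (BuddyManifest, withUimaBuddies) and the bundle name com.example.corpus are hypothetical; nothing here is part of UIMA or Eclipse, and a real tool would work on the plugin's META-INF/MANIFEST.MF file rather than on a string.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Hypothetical helper, not part of UIMA or Eclipse. */
final class BuddyManifest {

    /** The UIMA plugin list quoted in this message, joined with commas. */
    static final String UIMA_BUDDIES = String.join(",",
            "org.apache.uima.debug",
            "org.apache.uima.desceditor",
            "org.apache.uima.jcas.jcasgenp",
            "org.apache.uima.pear",
            "org.apache.uima.runtime");

    /**
     * Returns the manifest text with both buddy headers set, overwriting any
     * previous values. The input needs a Manifest-Version header and must end
     * with a newline, otherwise the JDK cannot read or write it reliably.
     */
    static String withUimaBuddies(String manifestText) throws IOException {
        Manifest mf = new Manifest(new ByteArrayInputStream(
                manifestText.getBytes(StandardCharsets.UTF_8)));
        Attributes main = mf.getMainAttributes();
        main.putValue("Eclipse-BuddyPolicy", "registered");
        main.putValue("Eclipse-RegisterBuddy", UIMA_BUDDIES);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        mf.write(out); // long header values are folded onto continuation lines
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // com.example.corpus is an invented bundle name for the demo.
        String original = "Manifest-Version: 1.0\n"
                + "Bundle-SymbolicName: com.example.corpus\n";
        System.out.print(withUimaBuddies(original));
    }
}
```

The JDK folds the long Eclipse-RegisterBuddy value onto continuation lines, so the output will not look exactly like the hand-written list above, but it reads back as the same header value.
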
With that in place, integrating UIMA into Eclipse plugins is almost
transparent. For my part, I store the generated descriptors and CPEs in a
dedicated folder and use Eclipse actions to launch them. The reason for
this choice is that it is difficult to ask an end user to do this
himself, while on the other hand we need some flexibility to create
different projects; I think it's a good tradeoff. To be more precise,
imagine you have an application that computes statistics on a corpus
according to criteria described in script files (regexps, dictionaries,
etc.). The corpus and these files are parameters you do not know a
priori, even if your programs know how to use them. The solution I use
is the following: Eclipse wizards collect this information from the
user and then build the correct descriptors. For instance:
.
.
.
        <configurationParameterSettings>
            <nameValuePair>
                <name>InputDirectory</name>
                <value>
                    <string>C:\Documents and Settings\Administrator\Desktop\These\RuntimeWorkspace3.3\Aziyadé II\Res\Corpus</string>
                </value>
            </nameValuePair>
        </configurationParameterSettings>
.
.
.
This is the part of the collection reader descriptor that tells it where
the corpus is. It is written using

            // Load the stored collection reader descriptor.
            String readerPath =
                root.getLocation().append(FileSystemCollectionReaderPath).toOSString();
            XMLInputSource in = new XMLInputSource(readerPath);
            CollectionReaderDescription cdr =
                UIMAFramework.getXMLParser().parseCollectionReaderDescription(in);
            // Point InputDirectory at the corpus chosen in the wizard.
            ConfigurationParameterSettings ps =
                cdr.getMetaData().getConfigurationParameterSettings();
            ps.setParameterValue("InputDirectory",
                root.getLocation().append(corpusPath).toOSString());
            cdr.getMetaData().setConfigurationParameterSettings(ps);
            // Write the descriptor back, closing the stream even on failure.
            OutputStream out = new FileOutputStream(readerPath);
            try {
                cdr.toXML(out);
            } finally {
                out.close();
            }

in the wizard, and so on... So it is a little painful at the beginning,
but the results are nice: if you hide it as a .resource in the project,
the user does not even know he or she is using UIMA and focuses only on
the real stuff: the grammar, the dictionaries, the corpus, the
statistical parameters, the KWICs, etc. That is what I really like in
UIMA, on top of what we all know from an architectural point of view: it
is humble and *we can forget it!!!!!* It's a fantastic fence; I love to
forget it once I get what I want. To conclude with an analogy: you can
write SVG by hand to build fancy figures, or you can generate SVG files
using the DOM API to produce fancy charts; users always prefer the
second solution.
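
In the spirit of that analogy, the wizard step described above (rewriting one parameter in a stored descriptor) can also be sketched with nothing but the JDK's DOM API. This is only an illustration under assumptions: it treats the descriptor as plain XML and bypasses UIMA's own parser and validation, which the UIMA-based code earlier in this message relies on. The class and method names (DescriptorParam, setStringParam) and the sample path are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper; plain DOM editing, no UIMA classes involved. */
final class DescriptorParam {

    /** Sets the string value of the nameValuePair called paramName; false if absent. */
    static boolean setStringParam(Document doc, String paramName, String newValue) {
        NodeList pairs = doc.getElementsByTagName("nameValuePair");
        for (int i = 0; i < pairs.getLength(); i++) {
            Element pair = (Element) pairs.item(i);
            NodeList names = pair.getElementsByTagName("name");
            NodeList strings = pair.getElementsByTagName("string");
            if (names.getLength() > 0 && strings.getLength() > 0
                    && paramName.equals(names.item(0).getTextContent().trim())) {
                strings.item(0).setTextContent(newValue);
                return true;
            }
        }
        return false;
    }

    /** Parses XML text into a DOM document (not namespace-aware). */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Serializes the document back to text without an XML declaration. */
    static String serialize(Document doc) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter w = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(w));
        return w.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<configurationParameterSettings><nameValuePair>"
                + "<name>InputDirectory</name><value><string>old</string></value>"
                + "</nameValuePair></configurationParameterSettings>");
        // C:\corpus is an invented path for the demo.
        setStringParam(doc, "InputDirectory", "C:\\corpus");
        System.out.println(serialize(doc));
    }
}
```

Going through the UIMA API, as in the wizard code, remains the safer route because the framework checks the descriptor as it parses it; the DOM version only shows how little the end user needs to see.
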
Thanks for your fantastic effort, and once again, it was frustration
that dictated that acrimonious note. Pascal can tell you I was even
worse when we used to work together at the IBM Scientific Center in
Paris; fortunately we are getting old. ;-)
By the way, I believe we could write something funny about this in the
Eclipse Corner to promote UIMA.



Marshall Schor wrote:
> Hi Christian -
>
> I see you are doing some fancy Eclipse plugin programming :-)  We
> appreciate your comments - you can probably teach us a few tricks here, too!
>
> Things that we deprecate are not (at least not intentionally :-) removed
> until we're pretty sure it won't affect our users; we value having our
> users be able to depend on keeping things stable/working, where possible.
>
> We use deprecation to signal that new work using these APIs should use
> the new method(s); but previous code should still run (unless there is
> some very unusual circumstance).
>
> When you say "... are no longer accepted by the compiler" - did you mean
> it compiled, but you got a deprecation warning?  If so, it still should
> have worked, I think.  If the deprecation messages bother you, you can
> turn them off in Eclipse (you probably know how to do this already - but
> for others reading this note:  menu in Eclipse 3.3: Window ->
> Preferences -> Java -> Compiler -> Errors/Warnings, then scroll down to
> "Deprecated and restricted API").
>
> Finally, because you're doing here some advanced techniques (e.g.
> building up a descriptor inside an Eclipse plugin, at run-time), you're
> venturing into areas of the framework that are perhaps less well
> documented - so please feel free to ask questions (and perhaps suggest
> improvements to the docs).
>
> -Marshall
>
> Christian Mauceri wrote:
>   
>> Hi Marshall,
>>
>> Sorry, it was late and I was tired. I eventually found the solution.
>> The problem came from the deprecation of setDescriptor; for instance,
>> the statements:
>>
>>            CpeIntegratedCasProcessor basf =
>>                CpeDescriptorFactory.produceCasProcessor("BasicForms");
>>            basf.setDescriptor(
>>                root.getLocation().append(BasicFormPath).toOSString());
>>
>>
>> are no longer accepted by the compiler. I replaced them with things like:
>>
>>            CpeIntegratedCasProcessor basf =
>>                CpeDescriptorFactory.produceCasProcessor("BasicForms");
>>            CpeComponentDescriptor ccd =
>>                UIMAFramework.getResourceSpecifierFactory().createDescriptor();
>>            ccd.setSourceUrl(new URL(
>>                "file://" + root.getLocation().append(BasicFormPath).toOSString()));
>>            basf.setCpeComponentDescriptor(ccd);
>>
>> That was a mistake; the change should have been:
>>
>>            CpeIntegratedCasProcessor basf =
>>                CpeDescriptorFactory.produceCasProcessor("BasicForms");
>>            CpeComponentDescriptor ccd =
>>                UIMAFramework.getResourceSpecifierFactory().createDescriptor();
>>            //ccd.setSourceUrl(new URL(
>>            //    "file://" + root.getLocation().append(BasicFormPath).toOSString()));
>>            CpeInclude cpeInclude =
>>                UIMAFramework.getResourceSpecifierFactory().createInclude();
>>            cpeInclude.set(root.getLocation().append(BasicFormPath).toOSString());
>>            ccd.setInclude(cpeInclude);
>>            basf.setCpeComponentDescriptor(ccd);
>>
>> Another very important point to highlight: do not forget (as you
>> taught me some months ago) to replace in the plugin manifest:
>>
>> Eclipse-BuddyPolicy: registered
>> Eclipse-RegisterBuddy: com.ibm.uima.debug,
>> com.ibm.uima.desceditor,
>> com.ibm.uima.jcas.jcasgenp,
>> com.ibm.uima.pear,
>> com.ibm.uima.runtime
>>
>> by :
>>
>> Eclipse-BuddyPolicy: registered
>> Eclipse-RegisterBuddy:
>> org.apache.uima.debug,
>> org.apache.uima.desceditor,
>> org.apache.uima.jcas.jcasgenp,
>> org.apache.uima.pear,
>> org.apache.uima.runtime
>>
>> so that UIMA recognizes the plugin's classes. On this topic, I
>> noticed that only org.apache.uima.runtime contains the instruction
>> 'Eclipse-BuddyPolicy: registered' in its manifest. I added it to the
>> other plugins because I believe its absence could provoke error
>> messages when editing the generated descriptors in the application
>> project (even if that is not its purpose).
>>
>> So, sorry for this fit of bad mood. I do not regret having chosen
>> UIMA; you guys have done great work!
>>
>>
>> Marshall Schor wrote:
>>     
>>> Hi -
>>>
>>> Sorry to hear you're having such a frustrating time!
>>> It's a little hard to figure out what might be helpful here without some
>>> further details.  I don't think anything changed in the implementation
>>> that would alter the behavior you describe regarding CPEs.   Can you
>>> describe what's going wrong?
>>> We're continually trying to balance going forward with keeping backwards
>>> compatibility.  When moving to Apache UIMA, there was a need to change
>>> the package names (to org.apache.uima...) - that was the biggest change
>>> that required users to change their code and recompile.  We included a
>>> utility that attempted to update the source for these changes - were you
>>> able to make use of it?
>>>
>>> -Marshall
>>>
>>>
>>> Christian Mauceri wrote:
>>>  
>>>       
>>>> I spent some hours trying to port my old IBM UIMA application to
>>>> the Apache version and it's a real pain you know where. I do not
>>>> understand why things were changed at this point, making things so
>>>> difficult for others. I do not see the benefit for anybody; one
>>>> can imagine that people who decide to use UIMA do not want to spend
>>>> all their time trying to understand the deprecated functions, the
>>>> PATH rules, etc. Something becomes a standard because it is supposed
>>>> to be useful and to make people's lives easier. To my deepest
>>>> regret, that is not the case for this version of UIMA. Among other
>>>> things, I cannot understand why it is not possible to embed
>>>> descriptors and CPEs in a plugin in a simple way and forget the
>>>> machinery behind them; imagine if, for instance, EMF produced such
>>>> a headache.
>>>> In the IBM version it was possible to generate a CPE, put it in a
>>>> folder with the other descriptors, and have an Eclipse action do
>>>> something like:
>>>>
>>>> CpeDescription cpeDesc = UIMAFramework.getXMLParser()
>>>>                .parseCpeDescription(
>>>>                        new XMLInputSource(cpeFile.getLocation().toOSString()));
>>>> CollectionProcessingEngine cpe =
>>>>                UIMAFramework.produceCollectionProcessingEngine(cpeDesc);
>>>>
>>>> then something like
>>>>
>>>>                        monitor.beginTask("Starting CPE", nod);
>>>>                        //Create and register a Status Callback Listener
>>>>                        StatusCallbackListenerImpl cbl =
>>>>                            new StatusCallbackListenerImpl(monitor);
>>>>                        cpe.addStatusCallbackListener(cbl);
>>>>                        cpe.process();
>>>>                        while (!cbl.isFinished()) {
>>>>                            if (monitor.isCanceled()) {
>>>>                                cpe.stop();
>>>>                                return Status.CANCEL_STATUS;
>>>>                            }
>>>>                        }
>>>>
>>>> without worrying about the CLASSPATH or I do not know what. Why is
>>>> it that difficult now? Because we have to suffer before earning the
>>>> right to use this so wonderful framework?
>>>>
>>>> I'm 1 month from a crucial deadline, I need the Eclipse 3.3
>>>> version, and I regret my first choice, deeply!
>>>>

-- 
Cordialement/Regards
Christian Mauceri
http://hermeneute.com/Christian

