I can see only one attached file: TextSimplifier.xml.
Could you send me the input file, the rules and the type systems?
Peter
On 19.11.2012 13:45, Monaghan, Fergal wrote:
>
> I've attached here the descriptor ("TextSimplifier.xml": configuration
> for TextMarkerEngine), the test input data ("random01.txt.xmi":
> Cleartk[OpenNLP] annotated), the rules file ("rules.tm": with 1 rule,
> my first partial attempt at the text simplification process) and the
> current output ("1.xmi": one additional tag has been created by the
> rule), if this helps,
>
> Thanks again,
>
> Fergal.
>
> *From:* fergal.monaghan@sap.com
> *Sent:* 19 November 2012 09:56
> *To:* 'user@uima.apache.org'
> *Subject:* TextMarker language workthrough for text simplification
> example?
>
> Hi all (and especially the good folks working on TextMarker in the
> sandbox),
>
> 1. I am interested in implementing the type of text simplification
> rules set out in this paper [1].
>
> 2. I would prefer to use TextMarker (and its language) natively in
> UIMA than use the UIMA<->GATE integration and JAPE rules.
>
> 3. I have cloned TextMarker from the repo and have configured an
> analysis engine descriptor to run TextMarkerEngine using custom rules.
>
> 4. I have switched off the TextMarkerEngine seed annotations as I am
> testing on pre-processed XMI files that have been pre-annotated with
> the Cleartk type systems (up to and including TreebankNodes... OpenNLP
> used under the hood if that's of interest).
>
> 5. Things are building and unit tests running fine on simple rules.
> Yay! Good work guys :)
>
> Now I am focussing on customising the rules for the text
> simplification application. I have been studying the TextMarker
> language documentation here [2] as well as TextMarker's unit tests in
> the sandbox to get things working so far, but am now asking for your
> help to complete one of the example rules I'd like to implement. This
> is the example from [1]:
>
> Input (original):
>
> "The jury also commented on the Fulton court, which has been under
> fire for its practices in the appointment of appraisers, guardians and
> administrators."
>
> Output (simplified):
>
> "The jury also commented on the Fulton court." "The Fulton court has
> been under fire for its practices in the appointment of appraisers,
> guardians and administrators."
>
> Rule I want to implement in the TextMarker language:
>
> V W:NP_ant, Rel Clause(X:Rel Pr Y), Z. -> V W Z. W Y.
>
> which can be interpreted as "If a sentence consists of any text V
> followed by the antecedent noun phrase W, a relative clause
> (consisting of a relative pronoun X and a sequence of words Y)
> enclosed in commas and a sequence of words Z, then the embedded clause
> can be made into a new sentence with W as the subject NP".
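
To make the intended effect of the rule concrete, here is a minimal sketch in Python (not TextMarker) of the string-level transformation "V W, RelPr Y, Z. -> V W Z. W Y.". It assumes the spans V, W, X, Y and Z have already been identified, for example from the TreebankNode annotations; the function name and its parameters are hypothetical and only mirror the letters used in the rule. In the example sentence nothing follows the relative clause, so Z is empty.

```python
def split_relative_clause(v, w, rel_pronoun, y, z=""):
    """Apply the rule  V W, RelPr Y, Z.  ->  V W Z.  W Y.

    All arguments are plain strings without surrounding whitespace or
    sentence-final punctuation. `rel_pronoun` (X in the rule) is accepted
    only to mirror the rule; it is dropped from the output.
    """
    # First sentence: the original with the relative clause removed.
    main = " ".join(part for part in (v, w, z) if part) + "."
    # Second sentence: the antecedent NP W becomes the subject of Y.
    clause = w[0].upper() + w[1:] + " " + y + "."
    return main, clause


main, clause = split_relative_clause(
    v="The jury also commented on",
    w="the Fulton court",
    rel_pronoun="which",
    y="has been under fire for its practices in the appointment of "
      "appraisers, guardians and administrators",
)
print(main)    # The jury also commented on the Fulton court.
print(clause)  # The Fulton court has been under fire for its practices ...
```

The sketch only covers producing the two output sentences; finding the spans themselves is the part the TextMarker rule would have to do.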
>
> So far I have gotten to this in the TextMarker language (please see
> below the contents of my rules.tm file that I'm running through
> TextMarker). Please note this itself is not an attempt at the final
> complete rule, but some intermediate attempt that is the furthest I've
> been able to get on my own which still passes unit tests:
>
> ===============================================
>
> PACKAGE org.cleartk.syntax.constituent.type;
>
> (TreebankNode{FEATURE("nodeType","NP")}
> TerminalTreebankNode{FEATURE("nodeType",",")}
> TerminalTreebankNode{FEATURE("nodeType","WDT")}
> TreebankNode{FEATURE("nodeType","S")}){->MARK(com.sap.research.bd.ta.AdjectivalOrRelativeClause)};
>
> ===============================================
>
> Can someone complete this rule to get me closer to the example above?
> I lack understanding of the TextMarker language, but I feel that if I
> had an example of a rule slightly more complex than those in the unit
> tests/documentation, I would be able to work out the rest of the
> rules I want to implement.
>
> Thanks very much for reading, and for any help you can provide,
>
> *Fergal Monaghan*
> B.E., Ph.D. | Research Specialist | SAP Research
> *SAP (UK) Limited* | The Concourse | Queen's Road |
> Belfast BT3 9DT
>
> T: +44 (0)28 9078-5705 | M: +44 (0)79 2076-6281 | F: +44
> (0)28 9078-5777
>
> mailto:fergal.monaghan@sap.com | www.sap.com/research
>
> [1] http://homepages.abdn.ac.uk/advaith/pages/LEC02.pdf
>
> [2] http://tmwiki.informatik.uni-wuerzburg.de/Wiki.jsp?page=Introduction
>