uima-user mailing list archives

From Bonnie MacKellar <bkmackel...@gmail.com>
Subject Re: XML files as input to UIMA?
Date Fri, 22 Feb 2019 12:17:05 GMT
Thanks so much!

Bonnie MacKellar

On Fri, Feb 22, 2019 at 7:03 AM Erik Fäßler <erik.faessler@uni-jena.de>
wrote:

> Hey,
>
> just wanted to say that I haven't gotten around to making the component
> available yet; I will do so first thing next week!
>
> Best,
>
> Erik
>
> > On 20. Feb 2019, at 19:47, Bonnie MacKellar <bkmackellar@gmail.com>
> wrote:
> >
> > Hi,
> > Yes, we are using that format. I have a parser that I wrote, but it isn't
> > integrated into UIMA. It runs separately and loads the full clinical trial
> > data into a triplestore (Stardog). I would be interested in your system
> > since I am not really familiar with how to write file readers in the UIMA
> > framework. Perhaps I can merge my parser into it and end up with just the
> > right thing. If you can make it available, I would definitely be
> > interested. I will take a look at the other links as well. Thanks!!
> >
> > Bonnie MacKellar
> >
> > On Wed, Feb 20, 2019 at 3:54 AM Erik Fäßler <erik.faessler@uni-jena.de>
> > wrote:
> >
> >> Dear Bonnie,
> >>
> >> are you talking about the clinical trial XML format used by
> >> ClinicalTrials.gov <http://clinicaltrials.org/> by any chance?
> >> If so, I did create a UIMA reader for these data. It's not perfect, but it
> >> is perhaps enough for your purposes, and you might want to enhance it.
> >> Please let me know if you would be interested in that. I have not gotten
> >> around to making it publicly available yet but could do so quickly.
> >>
> >> To answer the general question to the best of my knowledge:
> >> There is no general XML reader built into the UIMA framework. For all
> >> non-trivial formats, a specific reader is necessary. The same holds for
> >> the type system the reader populates.
> >> That being said, there are UIMA readers that try to serve as a general XML
> >> reading facility, e.g. the “XML Reader” from our lab (JULIELab,
> >> https://github.com/JULIELab/jcore-base/tree/master/jcore-xml-reader).
> >> However, in my experience XML inputs come in many different forms that
> >> are often not suited to a generic approach, which is why I wrote quite a
> >> few UIMA readers for specific XML formats in the past.
> >>
> >> Hope that helps,
> >>
> >> Erik
> >>
> >>> On 20. Feb 2019, at 01:13, Bonnie MacKellar <bkmackellar@gmail.com>
> >> wrote:
> >>>
> >>> This is probably a very naive question, but I can't seem to find anything
> >>> about this. I currently have a lot of XML files (clinical trial
> >>> descriptions). My current workflow is to run a preprocessor that parses the
> >>> XML and generates text files in a simple format. I then run these files in
> >>> a UIMA pipeline, using FileCollectionReader to load the text files, RUTA to
> >>> parse the simple format, the MetaMap annotator to do some UMLS annotations,
> >>> and finally a writer that generates RDF triples from the UIMA
> >>> annotations and loads the triples into a database. This has worked but is
> >>> clunky, especially the preprocessing. I feel like there has to be a better
> >>> way. Is there any support for reading XML files, or do I need to write my
> >>> own CollectionReader? Are there any other tools within UIMA for handling
> >>> XML text?
> >>>
> >>> thanks,
> >>> Bonnie MacKellar
> >>
> >>
>
>
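
For readers of this thread who want to fold the XML preprocessing into a
format-specific reader as discussed above, the following is a minimal sketch
of the parsing step only, not the component mentioned in the thread. It uses
just the JDK's XML parser, so it runs without UIMA on the classpath. The class
name TrialXmlSketch and the element names in FIELDS are assumptions made for
illustration; check them against your own files. In a UIMA CollectionReader,
getNext(CAS) could call a method like extractText and pass the result to
CAS.setDocumentText.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TrialXmlSketch {

    // Hypothetical element names; adjust them to the schema of your files.
    static final String[] FIELDS = {"brief_title", "brief_summary", "criteria"};

    /** Collects the text of the selected elements, one line per element. */
    public static String extractText(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Reject DOCTYPE declarations so untrusted input cannot load external
        // entities. Remove this line if your files legitimately carry a DOCTYPE.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        List<String> parts = new ArrayList<>();
        for (String field : FIELDS) {
            NodeList nodes = doc.getElementsByTagName(field);
            for (int i = 0; i < nodes.getLength(); i++) {
                // getTextContent() also includes the text of nested child elements.
                String text = nodes.item(i).getTextContent().trim().replaceAll("\\s+", " ");
                if (!text.isEmpty()) {
                    parts.add(text);
                }
            }
        }
        return String.join("\n", parts);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<clinical_study>"
            + "<brief_title>Example trial</brief_title>"
            + "<brief_summary><textblock>  A short\n summary. </textblock></brief_summary>"
            + "</clinical_study>";
        System.out.println(extractText(xml));
    }
}
```

Keeping the extraction in a plain static method means it can be tested without
a UIMA pipeline; the reader class then only has to iterate over the files and
fill the CAS.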
