uima-user mailing list archives

From Ravindra <ravindra.baj...@gmail.com>
Subject Re: Read file name in an annotator
Date Thu, 10 Jul 2014 11:39:24 GMT
Maybe this helps. In the collection reader, when each file is read:

    // Also store location of source document in CAS. This information is critical
    // if CAS Consumers will need to know where the original document contents are located.
    // For example, the Semantic Search CAS Indexer writes this information into the
    // search index that it creates, which allows applications that use the search index to
    // locate the documents that satisfy their semantic queries.
    SourceDocumentInformation srcDocInfo = new SourceDocumentInformation(jcas);
    srcDocInfo.setUri(file.getAbsoluteFile().toURL().toString());
    srcDocInfo.setOffsetInSource(0);
    srcDocInfo.setDocumentSize((int) file.length());
    srcDocInfo.setLastSegment(mCurrentIndex == mFiles.size());
    srcDocInfo.addToIndexes();


followed by this, in the component that needs the file name (annotator or CAS consumer):
    // retrieve the filename of the input file from the CAS
    FSIterator it = jcas.getAnnotationIndex(SourceDocumentInformation.type).iterator();
    File outFile = null;
    if (it.hasNext()) {
      SourceDocumentInformation fileLoc = (SourceDocumentInformation) it.next();
      File inFile;
      try {
        inFile = new File(new URL(fileLoc.getUri()).getPath());
        String outFileName = inFile.getName();
        if (fileLoc.getOffsetInSource() > 0) {
          outFileName += ("_" + fileLoc.getOffsetInSource());
        }
        outFileName += ".xmi";
        outFile = new File(mOutputDir, outFileName);
        modelFileName = mOutputDir.getAbsolutePath() + "/" + inFile.getName() + ".ecore";
      } catch (MalformedURLException e1) {
        // invalid URL, use default processing below
      }
    }

Look for SourceDocumentInformation in the UIMA examples.
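
If it helps to see the name-handling step on its own, here is a minimal sketch of just that part, with UIMA left out so it can be run standalone. The class name SourceFileNames and the two helper methods are made up for illustration and are not part of UIMA. In a real annotator the URI string and offset would come from fileLoc.getUri() and fileLoc.getOffsetInSource() as in the snippet above. It uses java.net.URL the same way the snippet does.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper class (not part of UIMA): isolates the
// URI-to-file-name logic used in the snippet above.
class SourceFileNames {

    // Returns the bare file name for a URI string such as the one stored in
    // SourceDocumentInformation, or null if the string is not a valid URL.
    public static String fileNameFromUri(String uri) {
        try {
            return new File(new URL(uri).getPath()).getName();
        } catch (MalformedURLException e) {
            return null; // caller falls back to its default processing
        }
    }

    // Builds the output name the same way the snippet does: the input name,
    // an "_<offset>" suffix only when the offset is positive, then ".xmi".
    public static String xmiNameFor(String uri, int offsetInSource) {
        String name = fileNameFromUri(uri);
        if (name == null) {
            return null;
        }
        if (offsetInSource > 0) {
            name += "_" + offsetInSource;
        }
        return name + ".xmi";
    }

    public static void main(String[] args) {
        String uri = "file:/data/docs/report.txt"; // example value only
        System.out.println(fileNameFromUri(uri));  // report.txt
        System.out.println(xmiNameFor(uri, 0));    // report.txt.xmi
        System.out.println(xmiNameFor(uri, 512));  // report.txt_512.xmi
    }
}
```

Returning null for a malformed URI mirrors the empty catch block above, where outFile simply stays null and the caller falls back to its default handling.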


--
Ravi.
*''We do not inherit the earth from our ancestors, we borrow it from our
children.'' PROTECT IT !*


On Thu, Jul 10, 2014 at 4:49 PM, Debbie Zhang <debbie.d.zhang@gmail.com>
wrote:

> Thanks Thomas. May I ask if there is any sample code of UIMA readers that
> can provide file name information for developing annotation? I was looking
> on the internet today, but couldn't find one. Thanks again for your help -
> much appreciated!
>
> Regards,
>
> Debbie Zhang
>
> > -----Original Message-----
> > From: Thomas Ginter [mailto:thomas.ginter@utah.edu]
> > Sent: Thursday, 10 July 2014 5:00 AM
> > To: user@uima.apache.org
> > Subject: Re: Read file name in an annotator
> >
> > Hi Debbie,
> >
> > The file name is not provided by default in UIMA although I believe the
> > UIMA FileReader does populate a SourceDocumentInformation annotation
> > with this information.  Our group has a set of readers that populate
> > our own annotation type to provide location data and other meta-
> > information for each record (CAS) being processed.  In short you will
> > be better off writing your reader to provide that information for you.
> >
> > Thanks,
> >
> > Thomas Ginter
> > 801-448-7676
> > thomas.ginter@utah.edu
> >
> >
> >
> >
> > On Jul 9, 2014, at 5:41, Debbie Zhang <debbie.d.zhang@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > Can anyone tell me how to read the file name in an annotator using
> > the
> > > JCas? It seems the DocumentAnnotation doesn't contain the file name. Thank
> > > you!
> > >
> > > Best regards,
> > >
> > > Debbie Zhang
>
>
>
