uima-user mailing list archives

From Thilo Goetz <twgo...@gmx.de>
Subject Re: Non-matching filter?
Date Sat, 05 Jan 2008 07:53:17 GMT
jonathan doklovic wrote:
> Hi,
> 
> I have been looking at Constraints and Filters.
> I understand how to use them to get an iterator that matches a certain
> type, but I want to do the opposite....
> 
> I have annotations for 3 types: City, State, and Location (where
> location contains a city and a state)
> 
> Now I want to create a filtered iterator that basically returns any city
> annotations that are NOT already within a Location annotation.
> 
> Is there any way to do this?
> 
> Thanks,
> 
> - Jonathan

Jonathan,

First, let me make sure I understand what it is that you need.  So for example,
for a sentence "the exhibition will visit New York, NY, and Paris, France" you
might have city annotations for "New York" and "Paris", a state annotation
for "NY", and a location annotation for "New York, NY".  You would want to find
the city annotation for Paris, but not the one for New York.

If this is what you're trying to do, I don't know of an easy answer.  The fastest
method would involve iterating over locations and cities in parallel, but that
gets really messy and there are a ton of boundary cases to consider.  So here's
something that's a bit less efficient, but still ok performance-wise.
Unfortunately, it still involves some relatively advanced use of CAS iterators.

Please note: I just typed this in.  It compiles, but has never run.  If you
can't get it to work, I'll need a real example ;-)  And if this is not the
problem you're trying to solve, also let us know.  I'll stick the method here
in the text, and the complete file in an attachment.

HTH,
Thilo

   public List<AnnotationFS> findOrphanedCities(CAS cas) {
     // Obtain type system info; replace with correct type names
     Type cityType = cas.getTypeSystem().getType("city");
     Type locationType = cas.getTypeSystem().getType("location");
     Feature beginFeat = cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_BEGIN);
     Feature endFeat = cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_END);
     // Create an empty location annotation to position the location iterator
     AnnotationFS locationSearch = cas.createAnnotation(locationType, 0, 0);
     // Obtain city and location iterators
     FSIterator cityIterator = cas.getAnnotationIndex(cityType).iterator();
     FSIterator locationIterator = cas.getAnnotationIndex(locationType).iterator();
     // Result list
     List<AnnotationFS> list = new ArrayList<AnnotationFS>();
     // Iterate over all cities and collect those that are not covered by a location
     for (cityIterator.moveToFirst(); cityIterator.isValid(); cityIterator.moveToNext()) {
       AnnotationFS city = (AnnotationFS) cityIterator.get();
       // Set the search location to the position of the current city
       locationSearch.setIntValue(beginFeat, city.getBegin());
       locationSearch.setIntValue(endFeat, city.getEnd());
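       // Move the location iterator to the search position: the first location that
       // sorts at or after it (or an invalid position if there is none)
       locationIterator.moveTo(locationSearch);
       // The annotation index sorts by ascending begin, then descending end, so a
       // covering location sits at this position or directly before it.  This
       // assumes that location annotations do not overlap each other.
       boolean covered = false;
       if (locationIterator.isValid()) {
         AnnotationFS loc = (AnnotationFS) locationIterator.get();
         covered = (loc.getBegin() <= city.getBegin()) && (loc.getEnd() >= city.getEnd());
         locationIterator.moveToPrevious();
       } else {
         locationIterator.moveToLast();
       }
       if (!covered && locationIterator.isValid()) {
         AnnotationFS loc = (AnnotationFS) locationIterator.get();
         covered = (loc.getBegin() <= city.getBegin()) && (loc.getEnd() >= city.getEnd());
       }
       // Keep only the cities that no location covers
       if (!covered) {
         list.add(city);
       }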
     }
     return list;
   }
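
For comparison, here is a minimal sketch of the parallel iteration mentioned
above, written against plain Java lists rather than the CAS so that it runs on
its own.  The Span class, findOrphans and spanOf are hypothetical stand-ins,
not UIMA API.  It assumes both lists are sorted by begin offset, that spans are
non-empty, and that locations do not overlap each other; real annotation data
may break these assumptions, which is where the boundary cases come from.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OrphanSweep {

  /** Hypothetical stand-in for an annotation: a character span [begin, end). */
  public static final class Span {
    public final int begin;
    public final int end;

    public Span(int begin, int end) {
      this.begin = begin;
      this.end = end;
    }
  }

  /**
   * Returns the cities that are not covered by any location.
   * Assumes both lists are sorted by begin offset, spans are non-empty,
   * and locations do not overlap each other.
   */
  public static List<Span> findOrphans(List<Span> cities, List<Span> locations) {
    List<Span> orphans = new ArrayList<>();
    int li = 0; // index of the first location that might still cover a city
    for (Span city : cities) {
      // Locations ending at or before this city's begin cannot cover it,
      // nor any later city, so they are skipped for good.
      while (li < locations.size() && locations.get(li).end <= city.begin) {
        li++;
      }
      // With non-overlapping locations, this is the only remaining candidate.
      boolean covered = li < locations.size()
          && locations.get(li).begin <= city.begin
          && locations.get(li).end >= city.end;
      if (!covered) {
        orphans.add(city);
      }
    }
    return orphans;
  }

  /** Hypothetical helper: the span of the first occurrence of phrase in text. */
  public static Span spanOf(String text, String phrase) {
    int begin = text.indexOf(phrase);
    return new Span(begin, begin + phrase.length());
  }

  public static void main(String[] args) {
    String text = "the exhibition will visit New York, NY, and Paris, France";
    List<Span> cities = Arrays.asList(spanOf(text, "New York"), spanOf(text, "Paris"));
    List<Span> locations = Arrays.asList(spanOf(text, "New York, NY"));
    for (Span orphan : findOrphans(cities, locations)) {
      System.out.println(text.substring(orphan.begin, orphan.end)); // prints "Paris"
    }
  }
}
```

Each list is walked only once, so the cost is linear in the number of cities
plus locations, at the price of the sorting and non-overlap assumptions.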

