From Richard Eckart de Castilho <eckar...@tk.informatik.tu-darmstadt.de>
Subject 2nd UIMA@GSCL Workshop - Final Call for Participation
Date Wed, 16 Sep 2009 21:42:00 GMT

Final Call for Participation

Unstructured Information Management Architecture (UIMA)
2nd UIMA@GSCL Workshop

October 1st, 2009
Potsdam, Germany




09:00 - 10:00	-	UIMA Tutorial, Graham Wilcock

10:00 - 10:30	-	Coffee Break

10:30 - 10:45	-	Opening

10:45 - 11:15	-	ClearTK: A Framework for Statistical Natural Language  
Processing (Philip V. Ogren, Philipp G. Wetzler, and Steven J. Bethard)
11:15 - 11:45	-	Multimedia Feature Extraction in the SAPIR Project  
(Aaron Kaplan, Jonathan Mamou, Francesco Gallo, and Benjamin Sznajder)
11:45 - 12:15	-	TextMarker: A Tool for Rule-Based Information  
Extraction (Peter Kluegl, Martin Atzmueller, and Frank Puppe)

12:15 - 13:00	-	Lunch Break

13:00 - 13:30	-	LuCas - A Lucene CAS Indexer (Erik Faessler, Rico  
Landefeld, Katrin Tomanek, and Udo Hahn)
13:30 - 14:00	-	Abstracting the types away from a UIMA type system  
(Karin Verspoor, William Baumgartner Jr., Christophe Roeder, and  
Lawrence Hunter)

14:00 - 14:30	-	Poster Session

14:30 - 15:00	-	Round Table/Discussion

Workshop Description

For many decades, NLP has suffered from low software engineering  
standards causing a limited degree of re-usability of code and  
interoperability of different modules within larger NLP systems. While  
this did not really hamper success in limited task areas (such as  
implementing a parser), it caused serious problems for the emerging  
field of language technology where the focus is on building complex  
integrated software systems, e.g., for information extraction or  
machine translation. This lack of integration has led to duplicated  
software development, work-arounds for programs written in different  
(versions of) programming languages, and ad-hoc tweaking of interfaces  
between modules developed at different sites.

In recent years, the Unstructured Information Management Architecture  
(UIMA) framework has been proposed as a middleware platform which  
offers integration by design through common type systems and  
standardized communication methods for components analysing streams of  
unstructured information, such as natural language. The UIMA framework  
offers a solid processing infrastructure that allows developers to  
concentrate on the implementation of the actual analytics components.  
An increasing number of members of the NLP community have thus adopted  
UIMA as a platform facilitating the creation of reusable NLP  
components that can be assembled to address different NLP tasks  
depending on their order, combination and configuration.

This workshop aims at bringing together members of the NLP community  
who are users, developers or providers of either UIMA components or  
UIMA-related tools in order to explore and discuss the opportunities  
and challenges in using UIMA as a platform for modern, well-engineered  
NLP. In the context of an emerging NLP-oriented UIMA community, the  
challenge to create not only reusable, but also interoperable  
components raises particular interest. From a methodological  
perspective, interoperability relies largely on UIMA type systems.  
Technically, it includes issues related to the packaging and  
distribution of UIMA components. Also, tools are important, for  
example to assemble complex processing workflows, to manage the  
bodies of data that are to be analysed and to visualize, explore, and  
further deploy the analysis results. Finally, interoperability is also  
affected by legal issues, such as potentially incompatible licenses  
of components and tools.

The availability of ready-to-use components plays a major role in  
choosing UIMA over other alternatives. To accentuate this, the  
workshop puts a focus on UIMA-based components and tools that are  
freely available for research.


Participants are invited to present applications realized using UIMA,  
general experiences using UIMA as a platform for natural language  
processing, as well as technical papers on particular aspects of the  
UIMA framework. Comparisons of UIMA with alternative frameworks  
(e.g. GATE or LingPipe) are of interest, too. More  
specifically, workshop topics include, but are not limited to:

• UIMA components with a special focus on genericity and type-system  
• repositories of ready-to-use UIMA-based components
• (generic) type systems for UIMA
• distribution of UIMA components: documentation, licensing and  
• sophisticated tools to build and manage complex processing pipelines
• experience reports combining UIMA-based components from different  
sources, as well as solutions to interoperability issues
• processing of very large data collections: scale-out,  
parallelization, and performance optimization
• analysis of results: exploration, evaluation, visualization, and  
statistical analysis
• developing for UIMA: simplified APIs, debugging, unit testing, and  
limitations of UIMA

Organizers and Contact

• JULIE Lab, Friedrich-Schiller-Universität Jena
   • Udo Hahn
   • Katrin Tomanek
• UKP Lab, Technische Universität Darmstadt
   • Iryna Gurevych
   • Richard Eckart de Castilho

Please address any inquiries regarding the workshop to:

Program Committee

• Anni R. Coden, IBM T.J. Watson Research Center, USA
• Branimir K. Boguraev, IBM T.J. Watson Research Center, USA
• Graham Wilcock, University of Helsinki, Finland
• Iryna Gurevych, Technische Universität Darmstadt, Germany
• Katrin Tomanek, Friedrich-Schiller-Universität Jena, Germany
• Leo Ferres, University of Concepcion, Chile
• Michael Tanenblatt, IBM T.J. Watson Research Center, USA
• Nicolas Hernandez, Université de Nantes, France
• Philipp Cimiano, Delft University of Technology, Netherlands
• Richard Eckart de Castilho, Technische Universität Darmstadt, Germany
• Sophia Ananiadou, University of Manchester, Great Britain
• Stefan Geißler, TEMIS GmbH, Germany
• Udo Hahn, Friedrich-Schiller-Universität Jena, Germany

