lucene-general mailing list archives

From Tommaso Teofili <tommaso.teof...@gmail.com>
Subject Re: UIMA without API key
Date Mon, 04 Jul 2011 22:02:39 GMT
No, sorry, maybe my explanation was too abstract.
What I was suggesting is an alternative way of detecting the language, based
on stopword dictionaries: use one DictionaryAnnotator instance per language,
plus a custom Annotator that evaluates which dictionary collected the most
hits.
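
To make the idea concrete, here is a minimal sketch in plain Python. It is not
UIMA code and does not use the DictionaryAnnotator API; it only illustrates the
"count stopword hits per language and pick the winner" logic that the custom
Annotator would implement. The stopword lists and the function names
(count_hits, guess_language) are made up for illustration, and real
dictionaries would be much larger.

```python
import re
from collections import Counter

# Tiny illustrative stopword lists (assumption: real dictionaries are larger).
STOPWORDS = {
    "en": {"the", "and", "of", "to", "is", "in", "that", "it"},
    "es": {"el", "la", "de", "que", "y", "en", "los", "es"},
    "it": {"il", "di", "che", "e", "la", "per", "un", "non"},
}


def count_hits(text, stopwords=STOPWORDS):
    """Count, per language, how many tokens of the text are stopwords.

    This plays the role of one dictionary lookup per language.
    """
    tokens = re.findall(r"\w+", text.lower())
    hits = Counter()
    for lang, words in stopwords.items():
        hits[lang] = sum(1 for token in tokens if token in words)
    return hits


def guess_language(text, stopwords=STOPWORDS):
    """Return the language whose dictionary collected the most hits.

    Returns None when no dictionary matched at all. Ties are resolved
    in favour of the language listed first in the stopword mapping.
    """
    hits = count_hits(text, stopwords)
    lang, best = max(hits.items(), key=lambda item: item[1])
    return lang if best > 0 else None
```

Short texts, or languages that share many stopwords, can easily produce ties or
wrong guesses, so a real setup would probably want a minimum number of hits
before trusting the result.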
In general, language detection with UIMA without an internet connection can
be done in various ways. If you need help with this, however, it may be
better to ask on the UIMA mailing list ( dev@uima.apache.org ).
Another option for language identification, which does not use UIMA but
exploits Tika's capabilities, is being discussed/developed at
https://issues.apache.org/jira/browse/SOLR-1979
Hope this helps,
Tommaso



2011/7/4 PacoPeralta <pacoperalta@hotmail.com>

>
>
> Sorry for my insistence...
> If I have configured the following in the uima_config in solrconfig.xml:
>
> <lst name="type">
>   <str name="name">org.apache.uima.alchemy.ts.language.LanguageFS</str>
>   <lst name="mapping">
>     <str name="feature">language</str>
>     <str name="field">language</str>
>   </lst>
> </lst>
>
> <lst name="type">
>   <str name="name">org.apache.uima.DictionaryEntry</str>
>   <lst name="mapping">
>     <str name="feature">coveredText</str>
>     <str name="field">tag</str>
>   </lst>
> </lst>
>
> And if I follow the steps that you listed, could I extract the language and
> dictionary entries from the indexed documents?
>
> Excuse my ignorance...
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/UIMA-without-API-key-tp3135299p3137478.html
> Sent from the Lucene - General mailing list archive at Nabble.com.
>
