lucene-dev mailing list archives

From "Tommaso Teofili (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-2244) Add Language Identification support
Date Tue, 23 Nov 2010 11:41:15 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934813#action_12934813 ]

Tommaso Teofili commented on SOLR-2244:
---------------------------------------

bq. I'm going to suggest that we rename contrib/extraction to be contrib/tika and that we
just roll all of these things under one area, that way we don't have to muck with libraries,
etc.

nice suggestion

bq. Heck, it might even make sense at this point to just move it into core.

+1

> Add Language Identification support
> -----------------------------------
>
>                 Key: SOLR-2244
>                 URL: https://issues.apache.org/jira/browse/SOLR-2244
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>         Attachments: solr2244.patch
>
>
> For starters, Tika has language identification capabilities that we can likely leverage,
> but moreover, make it easier for people to plug in language identification into the indexing
> process.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



