lucene-solr-dev mailing list archives

From "Akshay K. Ukey (JIRA)" <>
Subject [jira] Updated: (SOLR-1358) Integration of Tika and DataImportHandler
Date Wed, 09 Dec 2009 13:26:18 GMT


Akshay K. Ukey updated SOLR-1358:

    Attachment: SOLR-1358.patch

> Integration of Tika and DataImportHandler
> -----------------------------------------
>                 Key: SOLR-1358
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - DataImportHandler
>            Reporter: Sascha Szott
>            Assignee: Noble Paul
>         Attachments: SOLR-1358.patch, SOLR-1358.patch, SOLR-1358.patch
> At the moment, it's impossible to configure Solr such that it builds up documents by using
data that comes from both PDF documents and database table columns. Currently, to accomplish
this task, it's up to the user to add a preprocessing step that converts PDF files into plain
text files. Therefore, I would like to see an integration of Solr Cell into DIH that makes
this preprocessing obsolete.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
