lucene-solr-dev mailing list archives

From "Noble Paul (JIRA)" <>
Subject [jira] Updated: (SOLR-1358) Integration of Tika and DataImportHandler
Date Wed, 09 Dec 2009 04:47:18 GMT


Noble Paul updated SOLR-1358:

    Attachment: SOLR-1358.patch

onError implemented

> Integration of Tika and DataImportHandler
> -----------------------------------------
>                 Key: SOLR-1358
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - DataImportHandler
>            Reporter: Sascha Szott
>            Assignee: Noble Paul
>         Attachments: SOLR-1358.patch, SOLR-1358.patch, SOLR-1358.patch
> At the moment, it's impossible to configure Solr so that it builds documents from data
> that comes from both PDF documents and database table columns. Currently, to accomplish
> this, the user has to add a preprocessing step that converts PDF files into plain
> text files. Therefore, I would like to see an integration of Solr Cell into DIH that makes
> this preprocessing obsolete.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
