lucene-solr-dev mailing list archives

From "Akshay K. Ukey (JIRA)" <j...@apache.org>
Subject [jira] Updated: (SOLR-1358) Integration of Tika and DataImportHandler
Date Tue, 08 Dec 2009 14:11:18 GMT

     [ https://issues.apache.org/jira/browse/SOLR-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akshay K. Ukey updated SOLR-1358:
---------------------------------

    Attachment: SOLR-1358.patch

First cut patch. Not tested.

> Integration of Tika and DataImportHandler
> -----------------------------------------
>
>                 Key: SOLR-1358
>                 URL: https://issues.apache.org/jira/browse/SOLR-1358
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - DataImportHandler
>            Reporter: Sascha Szott
>            Assignee: Noble Paul
>         Attachments: SOLR-1358.patch
>
>
> At the moment, it is impossible to configure Solr so that it builds documents from
> data that comes from both PDF documents and database table columns. Currently, to accomplish
> this, the user has to add a preprocessing step that converts PDF files into plain
> text files. Therefore, I would like to see an integration of Solr Cell into DIH that makes
> this preprocessing obsolete.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

