jackrabbit-oak-dev mailing list archives

From Chetan Mehrotra <chetan.mehro...@gmail.com>
Subject Re: Full text indexing with Solr
Date Thu, 26 Sep 2013 10:59:57 GMT
Thanks for the details, Tommaso. I will look at the code for the
implementation details.
Chetan Mehrotra


On Wed, Sep 25, 2013 at 2:33 AM, Tommaso Teofili
<tommaso.teofili@gmail.com> wrote:
> Hi Chetan,
>
> I think that currently the complete binary is sent, and on the Solr side
> you have the ability to choose which field to use for indexing and
> searching properties of Binary type via the
> OakSolrConfiguration#getFieldNameFor(Type<?> propertyType) [1] method.
>
> Currently the default configuration and implementation use a Solr dynamic
> field of binary type, so that a binary property called propname is indexed
> in a Solr field called propname_bin of type BinaryField [2]. However, my
> plan is to implement some dedicated analyzers that use Apache Tika
> to extract the text and index that instead (or as well).
>
> Regards,
> Tommaso
>
> [1] :
> http://svn.apache.org/repos/asf/jackrabbit/oak/trunk/oak-solr-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/solr/OakSolrConfiguration.java
> [2] :
> http://lucene.apache.org/solr/4_4_0/solr-core/org/apache/solr/schema/BinaryField.html
>
>
>
> 2013/9/24 Chetan Mehrotra <chetan.mehrotra@gmail.com>
>
>> Hi,
>>
>> When Oak uses Solr, do we send the complete binary to Solr for
>> full text indexing, or do we extract the content on the Oak side and send
>> the extracted content?
>>
>> And if we send the complete binary content, do we send it inline, or is it
>> first uploaded to Solr and a reference to it passed?
>>
>> Chetan Mehrotra
>>
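
The naming convention described in the quoted reply (a binary property called propname ends up in a Solr field called propname_bin) can be sketched roughly as below. This is only an illustration and does not use the real Oak or Solr APIs: the class FieldNameSketch, the enum PropertyType and the method solrFieldFor are hypothetical stand-ins for OakSolrConfiguration#getFieldNameFor(Type<?> propertyType), and passing non-binary types through unchanged is an assumption, not something stated in the thread.

```java
import java.util.EnumMap;
import java.util.Map;

class FieldNameSketch {

    // Hypothetical stand-in for Oak's Type<?>; not the real Oak class.
    enum PropertyType { STRING, LONG, BINARY }

    // Suffix per property type. Only the BINARY -> "_bin" mapping comes
    // from the thread; leaving other types unmapped is an assumption.
    private static final Map<PropertyType, String> SUFFIXES =
            new EnumMap<>(PropertyType.class);

    static {
        SUFFIXES.put(PropertyType.BINARY, "_bin");
    }

    // Returns the Solr field name for a property: the property name plus
    // the type's suffix, or the bare property name when no suffix is set.
    static String solrFieldFor(String propertyName, PropertyType type) {
        String suffix = SUFFIXES.get(type);
        return suffix == null ? propertyName : propertyName + suffix;
    }

    public static void main(String[] args) {
        // prints "propname_bin"
        System.out.println(solrFieldFor("propname", PropertyType.BINARY));
    }
}
```

On the Solr side, a name such as propname_bin would presumably be matched by a dynamic field pattern of type BinaryField [2]; the exact schema pattern is not given in the thread.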
