manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: How to import data from Oracle to Solr
Date Wed, 18 Jul 2012 10:09:11 GMT
The way you create an enhancement request is through Jira, at
https://issues.apache.org/jira.  Just create a request for an
"improvement", and be sure to list any specific details that are
important to you.

Thanks,
Karl

On Wed, Jul 18, 2012 at 5:46 AM, Wolfgang Schreiber
<Wolfgang.Schreiber@isb-ag.de> wrote:
> Hi Karl,
> hi ManifoldCF team members,
>
>
> Using Solr's copyField element we managed to create separate fields for the
> different database columns:
>
> <field name="city" type="cityType" indexed="true" stored="true" />
> ...
> <copyField source="text" dest="city"/>
> ...
> <fieldType name="cityType" class="solr.TextField">
>         <analyzer>
>                 <tokenizer class="solr.PatternTokenizerFactory"
>                  pattern=".+city:(.+);.*" group="1" />
>         </analyzer>
> </fieldType>
>
> However, this solution has some drawbacks; e.g. the newly created fields are
> all text fields.
> In particular, numeric and date fields are also copied to text fields, so we
> cannot use Solr's type-specific functions.
>
> So, coming back to the offer in your first mail: would it be possible for you
> to create a JDBC connector enhancement to support metadata?
> Is there a special request process we must follow?
>
> Best regards
> Wolfgang
>
>
>
>
> -----Original Message-----
> From: Karl Wright [mailto:daddywri@gmail.com]
> Sent: Tue 17.07.2012 15:13
> To: user@manifoldcf.apache.org
> Subject: Re: How to import data from Oracle to Solr
>
> "So if I understand correctly ...
>
> 1) ... all mappings added to the "Solr Field Mapping" tab are ignored in case
> of a JDBC resource connector?"
>
> Not exactly - the mappings aren't ignored, there just isn't any
> metadata associated with a JDBC connector document, so the mappings
> never apply.
>
> Regardless, I am glad you got the rest worked out.
>
> Karl
>
>
> On Tue, Jul 17, 2012 at 9:09 AM, Wolfgang Schreiber
> <Wolfgang.Schreiber@isb-ag.de> wrote:
>> Hello Karl,
>>
>> thank you very much for your quick answer!
>>
>> So if I understand correctly ...
>>
>> 1) ... all mappings added to the "Solr Field Mapping" tab are ignored in
>> case of a JDBC resource connector?
>>
>> 2) Our data query must look something like this (given that || is Oracle's
>> concatenation operator):
>>    SELECT ID AS "$(IDCOLUMN)", ADDRESS_URL AS "$(URLCOLUMN)",
>>    'ZIP:' || ZIP || ';city:' || CITY || ';street:' || STREET
>>    AS "$(DATACOLUMN)" FROM ADDRESS WHERE ID IN $(IDLIST)
>>
>>    This would result in DATACOLUMN values like:
>>    ZIP:70173;city:Stuttgart;street:Heilbronner
>>
>> We tried this statement and we got the data into the text field of our Solr
>> index.
>> It seems we are one step further!
>>
>> Thank you for your help! Best regards
>> Wolfgang
>>
>>
>> -----Original Message-----
>> From: Karl Wright [mailto:daddywri@gmail.com]
>> Sent: Tue 17.07.2012 12:42
>> To: user@manifoldcf.apache.org
>> Subject: Re: How to import data from Oracle to Solr
>>
>> Hi Wolfgang,
>>
>> ManifoldCF is meant to handle a binary document and its metadata.  You
>> must provide the document.  Metadata is optional.
>>
>> The JDBC connector does not currently support metadata.  In order to
>> index this, therefore, you will need to decide what should go into
>> your "binary document" from your database fields.  You can append
>> together multiple fields into one document by means of SQL, e.g. the
>> CONCAT operator or its Oracle equivalent.  This would go into one
>> field in Solr, then, which is what you'd search on.
>>
>> Alternatively, if you really need separate indexed fields in Solr for
>> search reasons, you can request a JDBC connector enhancement to add
>> metadata support.  You'd still need a binary document, although you
>> could return a blank value for that.
>>
>> So I guess the answer depends on what you are trying to do on the whole.
>>
>> Karl
>>
>>
>> On Tue, Jul 17, 2012 at 6:27 AM, Wolfgang Schreiber
>> <Wolfgang.Schreiber@isb-ag.de> wrote:
>>> Hello,
>>>
>>> we are trying to ingest data from an Oracle database into Solr.
>>> We managed to insert docs into Solr but only document IDs are inserted and
>>> no other data fields.
>>>
>>> Can you provide an example of how to set up the import job in ManifoldCF?
>>>
>>>
>>> Assume we have the following initial situation:
>>>
>>> 1) Our Oracle table looks something like:
>>>
>>> ADDRESS
>>> --------------------------
>>> ID                      NUMBER
>>> ZIP                     NUMBER
>>> CITY                    VARCHAR(2)
>>> STREET                  VARCHAR(2)
>>>
>>>
>>> 2) In Solr's schema.xml we added the following fields for the database
>>> columns
>>> ...
>>>         <field name="ZIP" type="int" indexed="true" stored="true" />
>>>         <field name="City" type="string" indexed="true" stored="true" />
>>>         <field name="Street" type="string" indexed="true" stored="true" />
>>> ...
>>>
>>>
>>> So here are our questions:
>>>
>>> * How do we have to set up the queries for the ManifoldCF job?
>>>   In particular, what exactly must the seeding query and the data query
>>>   look like?
>>>
>>> * What do the Solr field mappings look like?
>>>
>>>
>>> We read your online documentation as well as your MEAP book but could not
>>> find a working example for a successful import between Oracle and Solr.
>>> Any help is welcome!
>>>
>>> Best regards
>>> Wolfgang
>>
>
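
The tokenizer pattern quoted in the thread can be tried out outside Solr. The
sketch below is not part of the original messages. It uses Python's re module
as a stand-in for the Java regular expressions that Solr's
PatternTokenizerFactory is built on (an assumption; for a pattern this simple
the two should behave alike), and the helper names build_data_column and
extract_city are made up for illustration.

```python
import re

# Pattern copied from the cityType fieldType quoted in the thread.
CITY_PATTERN = re.compile(r".+city:(.+);.*")

# Alternative that stops at the next semicolon. This is an assumption of this
# sketch, not something proposed in the thread or tested against Solr.
STRICT_CITY_PATTERN = re.compile(r"city:([^;]+)")


def build_data_column(zip_code, city, street):
    """Mimic the string concatenation done by the data query (hypothetical helper)."""
    return f"ZIP:{zip_code};city:{city};street:{street}"


def extract_city(text, pattern=CITY_PATTERN):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


value = build_data_column(70173, "Stuttgart", "Heilbronner")
print(value)                # ZIP:70173;city:Stuttgart;street:Heilbronner
print(extract_city(value))  # Stuttgart

# The (.+) group is greedy, so a further field after street changes the result.
longer = value + ";country:DE"
print(extract_city(longer))                       # Stuttgart;street:Heilbronner
print(extract_city(longer, STRICT_CITY_PATTERN))  # Stuttgart

# The pattern is case-sensitive: a capitalised "City:" label does not match.
print(extract_city("ZIP:70173;City:Stuttgart;Street:Heilbronner"))  # None
```

Two things follow from this for the concatenated DATACOLUMN approach: the
labels written by the data query and the labels in the pattern must agree in
case, and the greedy pattern only isolates the city as long as exactly one
semicolon follows it.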
