lucene-solr-user mailing list archives

From Dileepa Jayakody <dileepajayak...@gmail.com>
Subject Re: Need additional data processing in Data Import Handler prior to indexing
Date Wed, 30 Oct 2013 05:56:54 GMT
Thanks guys for your ideas.

I will go through them and come back with questions.

Regards,
Dileepa


On Wed, Oct 30, 2013 at 7:00 AM, Erick Erickson <erickerickson@gmail.com> wrote:

> Third time tonight I've been able to paste this link....
>
> Also, you can consider just moving to SolrJ and
> taking DIH out of the process, see:
> http://searchhub.org/2012/02/14/indexing-with-solrj/
>
> Whichever approach fits your needs of course.
>
> Best,
> Erick
>
>
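The client-side route Erick mentions (read the rows yourself, enrich them, and send them to Solr without DIH) can be sketched roughly as follows. The linked article uses SolrJ, which is Java; this sketch uses Python only to stay short and dependency-free, so read it as an illustration of the flow and not of the SolrJ API. The `enhance` callable stands in for the Stanbol request, and the field names (`content`, `persons`) and the update URL are assumptions that do not come from the thread.

```python
import json
import urllib.request
from typing import Callable, Iterable


def build_docs(rows: Iterable[dict], enhance: Callable[[str], list]) -> list:
    """Turn database rows into Solr documents with an extra `persons` field.

    `enhance` is a placeholder for the call to Stanbol: it takes the text
    of a row and returns the person names found in it.
    """
    docs = []
    for row in rows:
        doc = dict(row)  # copy so the caller's row is not modified
        doc["persons"] = enhance(row.get("content") or "")
        docs.append(doc)
    return docs


def post_docs(docs: list, update_url: str) -> int:
    """POST the documents as a JSON array to a Solr update handler.

    `update_url` is an assumption, for example something like
    http://localhost:8983/solr/<core>/update?commit=true
    """
    request = urllib.request.Request(
        update_url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

In practice the rows would come from a MySQL cursor. Because the enhancer is passed in as a function, `build_docs` can be tried with a fake enhancer before Stanbol or Solr is running.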
> On Tue, Oct 29, 2013 at 7:15 PM, Alexandre Rafalovitch <arafalov@gmail.com> wrote:
>
> > It's also possible to combine an Update Request Processor (URP) with DIH.
> > That way, if a debug entry needs to be inserted, it can go through the
> > same Stanbol process.
> >
> > Just define a processing chain for the DIH handler and write a custom URP
> > to call out to the Stanbol web service. You have access to the full record
> > in the URP, so you can add, delete, or change fields at will.
> >
> > Regards,
> >    Alex.
> >
> >
> >
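The processing chain Alex describes can be pictured with a small sketch. Solr's real update request processors are Java classes configured in solrconfig.xml; the Python below is not that API, only an illustration of the pattern. Every record, whether it comes from DIH or is inserted by hand for debugging, passes through the same ordered list of steps, and each step can add, change, or delete fields. All names here (`run_chain`, `add_persons`, `drop_field`, and the `content` and `persons` fields) are hypothetical.

```python
from typing import Callable, List, Optional

Record = dict
Processor = Callable[[Record], Optional[Record]]


def run_chain(record: Record, chain: List[Processor]) -> Optional[Record]:
    """Pass a record through each processor in order.

    A processor returns the (possibly modified) record, or None to drop it.
    """
    current: Optional[Record] = dict(record)  # work on a copy
    for step in chain:
        current = step(current)
        if current is None:
            return None
    return current


def add_persons(enhance: Callable[[str], list]) -> Processor:
    """Build a step that adds a `persons` field; `enhance` stands in for Stanbol."""
    def step(record: Record) -> Record:
        record["persons"] = enhance(record.get("content") or "")
        return record
    return step


def drop_field(name: str) -> Processor:
    """Build a step that deletes a field if it is present."""
    def step(record: Record) -> Record:
        record.pop(name, None)
        return record
    return step
```

A chain such as `[add_persons(call_stanbol), drop_field("tmp")]`, where `call_stanbol` is whatever function wraps the web service request, would then treat imported rows and manually added debug records identically.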
> > On Wed, Oct 30, 2013 at 4:09 AM, Michael Della Bitta <michael.della.bitta@appinions.com> wrote:
> >
> > > Hi Dileepa,
> > >
> > > You can write your own Transformers in Java. If it doesn't make sense to
> > > run Stanbol calls in a Transformer, you could set up a web service that
> > > grabs a record out of MySQL, sends the data to Stanbol, and returns the
> > > results, and then use it in conjunction with HttpDataSource rather than
> > > JdbcDataSource.
> > >
> > > http://wiki.apache.org/solr/DIHCustomTransformer
> > >
> > > http://wiki.apache.org/solr/DataImportHandler#Usage_with_XML.2FHTTP_Datasource
> > >
> > > Michael Della Bitta
> > >
> > >
> > > On Tue, Oct 29, 2013 at 4:47 PM, Dileepa Jayakody <dileepajayakody@gmail.com> wrote:
> > >
> > > > Hi All,
> > > >
> > > > I'm a newbie to Solr, and I have a requirement to import data from a
> > > > MySQL database, enhance the imported content to identify the Persons
> > > > mentioned, and index them as a separate field in Solr along with the
> > > > other fields defined for the original db query.
> > > >
> > > > I'm using Apache Stanbol [1] for the content enhancement requirement.
> > > > I can get 'Person' type data in the content as the enhancement result.
> > > >
> > > > The data flow will be:
> > > > mysql-db > Solr data-import handler > Stanbol enhancer > Solr index
> > > >
> > > > For the above requirement I need to perform additional processing at the
> > > > data-import handler prior to indexing, to send a request to Stanbol and
> > > > process the enhancement response. I found some related examples on
> > > > modifying the MySQL data import handler to customize the query results in
> > > > db-data-config.xml by using a transformer script.
> > > > As per my requirement, in the data-import handler I need to send a
> > > > request to Stanbol and process the response prior to indexing. But I'm
> > > > not sure if this can be achieved using a simple JavaScript script.
> > > >
> > > > Is there a better way of achieving my requirement? Maybe writing a
> > > > custom filter in Solr?
> > > > Please share your thoughts. I appreciate any pointers, as I'm a beginner
> > > > with Solr.
> > > >
> > > > Thanks,
> > > > Dileepa
> > > >
> > > >
> > > > [1] https://stanbol.apache.org
> > > >
> > >
> >
>
