lucene-dev mailing list archives

From Tommaso Teofili <tommaso.teof...@gmail.com>
Subject Re: PipeLine for Solr
Date Mon, 18 Apr 2011 11:44:00 GMT
Hello Roland,

I think a good option would be UIMA [1], which provides a pipeline
architecture for analyzing unstructured information.
With it you can use CollectionReaders to fetch documents from various
sources, Annotators to extract metadata from those documents [2], and
then a Solr CAS Consumer to write everything to Solr [3].
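
To make the three roles concrete, here is a minimal plain-Java sketch of the
reader -> annotators -> consumer flow. It does not use the UIMA API: Doc,
Reader, Annotator, Consumer and PipelineSketch are hypothetical names that
only mirror the roles of a CollectionReader, an Annotator and a CAS Consumer,
and the in-memory list stands in for a real Solr write.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of a pipeline; none of these types are UIMA classes. */
public class PipelineSketch {

    /** Hypothetical stand-in for the unit of work (UIMA passes a CAS instead). */
    static final class Doc {
        final String text;
        final Map<String, String> metadata = new LinkedHashMap<>();
        Doc(String text) { this.text = text; }
    }

    /** Role of a CollectionReader: pulls documents from some source. */
    interface Reader { List<Doc> read(); }

    /** Role of an Annotator: enriches a document with metadata. */
    interface Annotator { void process(Doc doc); }

    /** Role of a CAS Consumer: writes the enriched document out. */
    interface Consumer { void consume(Doc doc); }

    /** Sends every document through all annotators, in order, then to the consumer. */
    static void run(Reader reader, List<Annotator> annotators, Consumer consumer) {
        for (Doc doc : reader.read()) {
            for (Annotator annotator : annotators) {
                annotator.process(doc);
            }
            consumer.consume(doc);
        }
    }

    /** Runs the sketch on in-memory texts and returns what the consumer collected. */
    public static List<Map<String, String>> demo(List<String> texts) {
        Reader reader = () -> {
            List<Doc> docs = new ArrayList<>();
            for (String t : texts) docs.add(new Doc(t));
            return docs;
        };
        // Two toy annotators: a whitespace word count and a digit check.
        Annotator wordCount = d -> {
            String trimmed = d.text.trim();
            int n = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
            d.metadata.put("words", String.valueOf(n));
        };
        Annotator hasDigits = d ->
            d.metadata.put("hasDigits", String.valueOf(d.text.matches(".*\\d.*")));

        // Stand-in for the Solr write: collect one field map per document.
        List<Map<String, String>> sink = new ArrayList<>();
        Consumer consumer = d -> {
            Map<String, String> fields = new LinkedHashMap<>();
            fields.put("text", d.text);
            fields.putAll(d.metadata);
            sink.add(fields);
        };

        run(reader, Arrays.asList(wordCount, hasDigits), consumer);
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(demo(Arrays.asList("Solr 3.1 was released", "hello world")));
    }
}
```

In a real setup each role would be a UIMA component configured through its
descriptor, and the consumer would send the fields to Solr rather than to a
list.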

You could also use the UIMA integration already committed as a dedicated
Solr contrib module [4][5], which hooks in through a custom
UpdateRequestProcessor.

Hope this helps,
Tommaso

[1] : http://uima.apache.org
[2] : http://uima.apache.org/d/uimaj-2.3.1/overview_and_setup.html#ugr.ovv.conceptual.graduating_to_collection_processing
[3] : http://uima.apache.org/sandbox.html#solrcas.consumer
[4] : http://wiki.apache.org/solr/SolrUIMA
[5] : http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/contrib/uima/

2011/4/18 Roland Villemoes <rv@alpha-solutions.dk>

> Hi All,
>
>
>
> I know this question may have been asked before – but I really did not find
> any usable answers browsing the archives. So I have to try the developer
> list here.
>
>
>
> We at Alpha Solutions often need a Pipeline for handling crawling,
> analyzing and routing before we hit the UpdateRequestHandler in Solr. I know
> we could actually use the UpdateRequestHandler for this - but often we like
> to perform all these tasks before hitting Solr.
>
> We have been using OpenPipeline, which also offers a GUI that makes it
> rather nice to administer (if you tweak the GUI a bit!). It does seem,
> though, that OpenPipeline is not really going anywhere. Nothing happens,
> there is not really any community around it, and it doesn't seem that the
> people behind it will ever move it further.
>
>
>
> So we are looking around towards other “pipeline” projects that can work
> well with Solr.
>
>
>
> So, do any of you have ideas on this? Any recommendations? Or are there
> any plans for this in Solr?
>
>
>
> Thanks a lot
>
> *Best regards*
>
> *Roland Villemoes*
> *Tel:* (+45) 22 69 59 62
> *E-mail:* rv@alpha-solutions.dk
>
> *Alpha Solutions A/S*
> Borgergade 2, 3.sal, DK-1300 Copenhagen K
> *Tel:* (+45) 70 20 65 38
> *Web:* www.alpha-solutions.dk
>
>
> ** This message including any attachments may contain confidential and/or
> privileged information
> intended only for the person or entity to which it is addressed. If you are
> not the intended recipient
> you should delete this message. Any printing, copying, distribution or
> other use of this message is strictly prohibited.
> If you have received this message in error, please notify the sender
> immediately by telephone
> or e-mail and delete all copies of this message and any attachments from
> your system. Thank you.
>
>
>
