lucene-solr-user mailing list archives

From Charlie Hull <char...@flax.co.uk>
Subject Re: Many patterns against many sentences, storing all results
Date Wed, 06 Jan 2016 10:17:54 GMT
On 05/01/2016 16:05, Allison, Timothy B. wrote:
> Might want to look into:
>
> https://github.com/flaxsearch/luwak

Yes, this sounds like a very good fit for Luwak. We built it originally 
for media monitoring applications where one also needs just a hit/no-hit 
result. It's running in production at much larger scale than this.
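
For readers unfamiliar with the stored-query approach, here is a minimal Python sketch of the idea: patterns are registered up front, and each incoming sentence is checked against them to give a plain hit/no-hit result per pattern. It is a toy illustration only, not Luwak's API. The class `PhraseMonitor`, its method names, and the simple tokenizer are hypothetical, and it handles exact phrases only.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split into word tokens (a deliberately simple analyzer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


class PhraseMonitor:
    """Toy stored-query matcher (hypothetical, for illustration only)."""

    def __init__(self):
        self.patterns = {}                      # pattern id -> list of tokens
        self.by_first_token = defaultdict(set)  # token -> ids of patterns starting with it

    def register(self, pattern_id, phrase):
        """Add a pattern, or replace it if the id is already registered."""
        tokens = tokenize(phrase)
        if not tokens:
            raise ValueError("empty pattern")
        self.unregister(pattern_id)
        self.patterns[pattern_id] = tokens
        self.by_first_token[tokens[0]].add(pattern_id)

    def unregister(self, pattern_id):
        """Remove a pattern if present."""
        tokens = self.patterns.pop(pattern_id, None)
        if tokens:
            self.by_first_token[tokens[0]].discard(pattern_id)

    def match(self, sentence):
        """Return the sorted ids of every pattern occurring as an exact phrase."""
        tokens = tokenize(sentence)
        hits = set()
        for i, tok in enumerate(tokens):
            # Only patterns whose first token appears here can match at position i.
            for pid in self.by_first_token.get(tok, ()):
                pat = self.patterns[pid]
                if tokens[i:i + len(pat)] == pat:
                    hits.add(pid)
        return sorted(hits)


monitor = PhraseMonitor()
monitor.register("p1", "unemployment has fallen")
monitor.register("p2", "crime is rising")
print(monitor.match("The minister said unemployment has fallen again."))
```

Looking patterns up by their first token means each sentence is verified only against the few patterns that could match at that position, rather than against all of them. A pattern that is added or changed later would still need to be run over the sentences already stored, which this sketch does not cover.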

Best

Charlie

>
> or
>   https://github.com/OpenSextant/SolrTextTagger
>
> -----Original Message-----
> From: Will Moy [mailto:will@fullfact.org]
> Sent: Tuesday, January 05, 2016 11:02 AM
> To: solr-user@lucene.apache.org
> Subject: Many patterns against many sentences, storing all results
>
> Hello
>
> Please may I have your advice as to whether Solr is a good tool for this job?
>
> We have (per year) –
> Up to 50,000,000 sentences
> And about 5,000 search patterns (i.e. queries)
>
> Our task is to identify all matches between any sentence and any search pattern.
>
> That list of detections must be kept up to date as patterns are added or updated (a handful an hour), and as new sentences are added.
>
> Some of the sentences will be added in real time, at probably max 100 / second and usually much less. The detections on these should be provided within 3 seconds.
>
> It's an unusual application in that we want all results in an external DB, and also in that every sentence is either a hit or not. We don't care about scoring results, only about matches for the exact search pattern entered.
>
> The application is automatically detecting instances of factchecked statements.
>
> The smaller-scale prototype was done with postgres full text searching, but that can't do exact phrase matching or other more sophisticated searches, so it's out.
>
> Thanks very much
>
> Will
>


-- 
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk
