Mailing-List: contact solr-user-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: solr-user@lucene.apache.org
MIME-Version: 1.0
In-Reply-To: 
 <CAOpr1fnn0ZmzghGSTZ-kmKi5_WZ12GGYGOJUYi9WA5=BA7GWAw@mail.gmail.com>
References: 
 <CAOpr1fmnD7R9V_1dt8YwY6NippdmYUbEDLks=ZQ4OqvixmFO4A@mail.gmail.com>
	<CAEFAe-Fg6C-068zX0kMYbE8xbW-f83jhRdosD3vz82cy=FFz0g@mail.gmail.com>
	<CAOpr1fnn0ZmzghGSTZ-kmKi5_WZ12GGYGOJUYi9WA5=BA7GWAw@mail.gmail.com>
Date: Thu, 10 Sep 2015 21:08:03 -0700
Message-ID: 
 <CAN4YXvc5mf-6vWEvOr+-KnXK4K5YaEmPO0SobB9QYa7oVaD=2g@mail.gmail.com>
Subject: Re: Detect term occurrences
From: Erick Erickson <erickerickson@gmail.com>
To: solr-user@lucene.apache.org
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

_Assuming_ this isn't a high-throughput use case _and_ the leaflet text isn't
too big...

Index the thesaurus and fire all the terms of the query in a big OR
clause against the index as a _query_. Perhaps turn highlighting on
and highlight the entire leaflet text.
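
A minimal sketch of what such a request could look like, in Python. The
field name "text" and the sample terms are placeholders I made up, not
anything from your setup; `hl`, `hl.fl` and `hl.fragsize` are standard
Solr highlighting parameters. It only builds the parameters and does not
contact a server.

```python
from urllib.parse import urlencode


def or_query(terms, field):
    """Join terms into one big OR clause on `field`, quoting each as a phrase."""
    clauses = []
    for term in terms:
        # Escape backslashes and double quotes so each term stays inside its phrase.
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'{field}:"{escaped}"')
    return " OR ".join(clauses)


def select_params(terms, field="text"):
    """Parameters for a Solr select request with highlighting turned on.

    `field` is a hypothetical field name; use whatever your schema defines.
    """
    return {
        "q": or_query(terms, field),
        "hl": "true",        # turn highlighting on
        "hl.fl": field,      # highlight this field
        "hl.fragsize": "0",  # 0 = treat the whole field value as one fragment
        "wt": "json",
    }


# Hypothetical terms, for illustration only.
params = select_params(["aspirin", "heart attack"])
query_string = urlencode(params)
```

One caveat: Solr caps the number of boolean clauses per query
(maxBooleanClauses, 1024 by default), so a very long term list would
probably have to be sent in batches or the limit raised, and a query
that long is better sent as a POST than in the URL.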

Note, this is just "off the top of my head", I really haven't thought
it through too far and a lot depends on how many leaflets you have to
process and how often....

Best,
Erick

On Thu, Sep 10, 2015 at 7:21 PM, Francisco Andrés Fernández
<franaf@gmail.com> wrote:
> Yes.
> I have many drug product leaflets, each corresponding to one product. On the
> other hand, we have a medical dictionary with about 10^5 terms.
> I want to detect all the occurrences of those terms in any leaflet
> document.
> Could you give me a clue about the best way to do this?
> Perhaps, the best way is (as Walter suggests) to do all the queries every
> time, as needed.
> Regards,
>
> Francisco
>
> On Thu, Sep 10, 2015 at 11:14 AM, Alexandre Rafalovitch <
> arafalov@gmail.com> wrote:
>
>> Can you tell us a bit more about the business case? Not the current
>> technical one. Because it is entirely possible Solr can solve the
>> higher level problem out of the box without you doing manual term
>> comparisons. In that case, your problem scope is not quite right.
>>
>> Regards,
>>    Alex.
>> ----
>> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
>> http://www.solr-start.com/
>>
>>
>> On 10 September 2015 at 09:58, Francisco Andrés Fernández
>> <franaf@gmail.com> wrote:
>> > Hi all, I'm new to Solr.
>> > I want to detect all occurrences of terms existing in a thesaurus in one or
>> > more documents.
>> > What's the best strategy for doing this?
>> > Doing a query for each term doesn't seem to be the best way.
>> > Many thanks,
>> >
>> > Francisco
>>