lucene-solr-user mailing list archives

From Alexandre Rafalovitch <arafa...@gmail.com>
Subject Re: get the position of matched word in the response
Date Sun, 04 Aug 2019 12:39:46 GMT
What happens if they search for "hello monkey" and match against
"hello my monkeys"? What should it return? Why does your database not
contain "hello" instead of 199?

I am asking because if your clients are truly searching for just one
word, then Solr may be overkill for you. Perhaps you are looking
for just "indexOf" within a string, with a parallel offset->OCR data
structure. So, there is a hidden question in there of "why did you
choose Solr".
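
As a rough illustration of that "indexOf plus parallel structure" idea, here is
a minimal Python sketch. Everything in it is an assumption made for the
example: the function name word_positions, the ocr_by_position mapping, the
0-based word positions, and the made-up bounding boxes. It is not Solr code.

```python
import re

def word_positions(text, query):
    """Return the 0-based word positions at which `query` occurs in `text`."""
    # Split on word characters and compare case-insensitively.
    # This is exact matching only: no stemming, so "monkey" != "monkeys".
    words = re.findall(r"\w+", text.lower())
    target = query.lower()
    return [i for i, w in enumerate(words) if w == target]

# Hypothetical parallel structure: word position -> OCR bounding box.
ocr_by_position = {
    0: (10, 12, 48, 30),
    1: (52, 12, 70, 30),
    2: (74, 12, 130, 30),
    3: (10, 40, 48, 58),
}

text = "hello my monkeys hello"
hits = word_positions(text, "hello")
boxes = [ocr_by_position[p] for p in hits]
```

Note that an exact lookup like this finds nothing for "monkey" in
"hello my monkeys", which is the kind of case the first question above is
about.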

Then, there is the point that Solr searches words/numbers/geospatial data
but returns documents. So, sometimes, you need to understand what a
"document" is for your business case, and transform your content to
fit. E.g. if you are really just searching for one word, then maybe
you index your whole book as a bunch of documents, each containing a
word, its OCR offset information, and its book id. And if it is a couple
of words, maybe you have a secondary field with the context of that
sentence (in index-only form).
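
The one-document-per-word layout could be prepared along these lines. This is
only a sketch under stated assumptions: the field names (id, book_id, word,
position, ocr, context), the id format, and the context window size are all
hypothetical, and the step of actually sending the documents to Solr is left
out.

```python
def book_to_word_docs(book_id, words, ocr_boxes, context_size=2):
    """Turn one book into a list of per-word documents (plain dicts)."""
    docs = []
    for pos, (word, box) in enumerate(zip(words, ocr_boxes)):
        # Neighbouring words, intended for an index-only context field.
        start = max(0, pos - context_size)
        context = " ".join(words[start:pos + context_size + 1])
        docs.append({
            "id": f"{book_id}_{pos}",   # hypothetical unique key
            "book_id": book_id,
            "word": word,
            "position": pos,            # word position, not char offset
            "ocr": ",".join(str(n) for n in box),
            "context": context,
        })
    return docs

docs = book_to_word_docs(
    "the-story-jungle",
    ["hello", "my", "monkeys"],
    [(1, 1, 1, 1), (2, 2, 2, 2), (3, 3, 3, 3)],
)
```

With documents shaped like this, a match on the word field would come back
with its position and OCR data already attached, so no separate offset lookup
would be needed.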

Don't be afraid to abandon your first schema. Your business
requirement is different enough.

Regards,
   Alex.


On Sun, 4 Aug 2019 at 07:46, eli chen <eli.c.new@gmail.com> wrote:
>
> every content field is actually the content of a book.
> so let's say someone searches for the word "hello" and I find this word in the
> book "the story jungle" at position 199 (counting by word, not by char).
>
> now I can look in my database and check the OCR of this word in this book
> (and show a highlight on the picture, etc.)
>
> my db is kind of like this (simplified):
>
> book     word     ocr
> ------     -------     ---------
> th....     199        1,1,1,1
>
> that is the reason I need the offset of the word.
>
> and btw the content field is just a big text_general field
>
> thx again
>
> On Sun, 4 Aug 2019 at 14:30, Erick Erickson <erickerickson@gmail.com> wrote:
>
> > Eli:
> >
> > What problem are you trying to solve? There’s no really convenient way to
> > do this that I know of, although it could be done, probably with some
> > Lucene-level code.
> >
> > This may be an XY problem, where you're asking how to do X (find the
> > position of the matched word) because you think it’ll help solve some
> > problem Y. What’s “Y”? Perhaps there’s an easier way to solve that problem
> > if we knew what it was…
> >
> > Best,
> > Erick
> >
> > > On Aug 4, 2019, at 6:55 AM, eli chen <eli.c.new@gmail.com> wrote:
> > >
> > > hi, I'm new to Solr so please be patient.
> > > how can I get the position of the matched word in the results?
> > >
> > > and no, I'm not talking about highlighting the words. I'm talking about
> > > getting the position of the word in the content.
> > >
> > > I have a field named content, which I query with q=content:"some_word"
> > >
> > > the content field is not stored, but it is
> > > Indexed + Tokenized + Multivalued + TermVector Stored + Store Offset With
> > > TermVector + Store Position With TermVector
> > >
> > > thx for the help
> >
> >
