lucene-solr-user mailing list archives

From Tamás Barta <bartata...@gmail.com>
Subject Re: Sorting question
Date Mon, 04 Apr 2016 14:05:03 GMT
Hi,

FYI: the final solution I found is that I created a custom
"listpos(fieldName, listId)" function and now I can display a sorted list
via:

fq=listid_s:378
sort=listpos(listpos_s,378) asc
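
The post does not show how listpos is implemented or how listpos_s is encoded, so the following is only a minimal sketch of the lookup such a function might perform. It assumes the "listId:position" entries ("1:23", "2:4") described in the quoted thread below. The class and method names are hypothetical, and a real Solr function would additionally need to be wrapped in a plugin (for example via Solr's ValueSourceParser extension point), which is not shown here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;

// Hypothetical helper, not part of Solr: finds a document's position in one
// list from entries of the assumed form "listId:position", e.g. "1:23".
final class ListPosSketch {

    // Returns the position stored for listId, or empty if the document
    // has no entry for that list.
    static OptionalInt position(List<String> entries, int listId) {
        String prefix = listId + ":";
        for (String entry : entries) {
            if (entry.startsWith(prefix)) {
                // Everything after "listId:" is the position number.
                return OptionalInt.of(Integer.parseInt(entry.substring(prefix.length())));
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        List<String> listpos = Arrays.asList("1:23", "2:4");
        System.out.println(position(listpos, 2));   // entry for list 2
        System.out.println(position(listpos, 378)); // no entry for list 378
    }
}
```

Because the sort then happens inside Solr, the standard start and rows parameters can page through the list (for example start=100 and rows=50) without fetching the whole list.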

Regards,
Tamas

On Fri, Apr 1, 2016 at 8:55 PM, John Bickerstaff <john@johnbickerstaff.com>
wrote:

> Tamas,
>
> This feels a bit like a "user favorites" problem.
>
> I did a little searching and found this...  Don't know if it will help, but
> when I'm looking for stuff like this I find it helps to try to come up with
> generic or different descriptions of my problem and go search those as
> well...
>
>
> http://stackoverflow.com/questions/3931827/solr-merging-results-of-2-cores-into-only-those-results-that-have-a-matching-fie
>
> On Fri, Apr 1, 2016 at 12:40 PM, John Bickerstaff <
> john@johnbickerstaff.com>
> wrote:
>
> > Tamas,
> >
> > I'm brainstorming here - not being careful, just throwing out ideas...
> >
> > One thing that comes up is a separate document in SOLR - one doc for each
> > list.
> >
> > If a user adds a doc to their list, that doc's id gets added to this
> > other type of document...
> >
> > So, a document with the title "List 1" would have a multivalued field of
> > IDs and list positions, like so:
> >
> > ID          List Position
> > _________________________
> > doc1 ID     1
> > doc2 ID     2
> > doc3 ID     3
> >
> > and so on...  The big problem I see with this is keeping it organized
> > correctly.  More code would have to be written to handle this when the
> > user does any kind of "crud" on the list...
> >
> > I'm pretty sure there's a way to write a query that uses that list to
> > properly order the items returned by your primary search, although I
> > haven't written such a query yet.
> >
> > If you have the luxury of NOT being in production yet with this system,
> > I'd seriously consider pushing to keep application metadata OUT of your
> > product information store.  This particular problem (of ordering the
> > results based on arbitrary user choices) might be more easily handled
> > via a separate step that queries a relational database to handle list
> > order - once Solr gives you the documents that match the query and the
> > user's list number...
> >
> > Even if you can't use another relational data store - keeping that
> > metadata out of your individual product documents could be argued to be a
> > good design idea...
> >
> > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> > Here's an alternative brainstorm...
> >
> > Where does the user data live?  What about putting the information about
> > the order of document IDs in the User's lists with the User?  Then you
> > can get all documents that match the search terms and are on List X from
> > Solr - and then sort them by ID based on the data associated with the
> > User (a list of IDs, in order).
> >
> > There is even a way to write a plugin that will go after external data to
> > help sort Solr documents, although I'm guessing you'd rather avoid
> > that...
> >
> >
> >
> > On Fri, Apr 1, 2016 at 11:59 AM, John Bickerstaff <
> > john@johnbickerstaff.com> wrote:
> >
> >> OK - I get it.  List order is totally arbitrary and cannot be tied to a
> >> hard data point.
> >>
> >> I'll have to think - Perhaps billnbell's solution will help, although
> >> I'm not totally sure I understand that suggestion yet.
> >>
> >> At this point, you could get all the documents for List X that match the
> >> search terms.  The next problem is sorting.  If you have the listpos
> >> field too, you could use that, and some regex to find the proper order
> >> for these documents before displaying them (in code I mean) but of
> >> course that means you need some kind of "interceptor" to deal with this
> >> before the results are displayed.
> >>
> >> If I had enough control to do this in code, behind the scenes, I'd grab
> >> that second part of the listpos field, put it into a variable on each
> >> object and then sort by that.  Then I'd return the entire list to the
> >> UI.
> >>
> >> I understand that if you could get SOLR to do it all, that would be
> >> ideal...  There is the possibility of writing some new code and
> >> plugging it in to Solr, but I'm guessing you don't want to go that
> >> far.  As a final step in the process, with custom code to consume the
> >> listpos entry, sorting these would be fairly straightforward.  I'm not
> >> sure how you get away from the listpos multivalue field however...
> >>
> >> I'll keep thinking...
> >>
> >> On Fri, Apr 1, 2016 at 11:26 AM, Tamás Barta <bartatamas@gmail.com>
> >> wrote:
> >>
> >>> So, the list order is determined by the user. The user creates a list,
> >>> adds products to it, and I have to display these lists using filters
> >>> and pagination.
> >>>
> >>> Let's assume there is a list with 10000 products in it. On the website
> >>> where I display the list, only 50 products are displayed per page. So
> >>> if I could query Solr to give me products from list X, ordered as the
> >>> user defined, but only products matching some criteria (status, amount,
> >>> ..), from a given offset and with 50 rows, then it would be perfect and
> >>> fast. If the ordering were done outside of Solr, then I would have to
> >>> retrieve almost all 10000 documents from Solr (a bit fewer if filtered)
> >>> to order them and display the page of 50 products.
> >>>
> >>> On Apr 1, 2016 at 19:15, "John Bickerstaff" <john@johnbickerstaff.com>
> >>> wrote:
> >>>
> >>> > Just to be clear - I don't mean who requests the list (application or
> >>> > user) I mean what "rule" determines the ordering of the list?
> >>> >
> >>> > Or, is there even a rule of any kind?
> >>> >
> >>> > In other words, does a user arbitrarily decide that documentA,
> >>> > documentF, and documentW should be on a list of their own?  For
> >>> > reasons known only to the user?
> >>> >
> >>> > Or - does the ordering of the list depend on some piece of data?
> >>> > (like a date, or a manufacturer, or a price range or any other piece
> >>> > of "hard" data)
> >>> >
> >>> > ===
> >>> >
> >>> > To give an example from what I'm working on right now --
> >>> >
> >>> > My subject matter experts have given me a rule that says:
> >>> >
> >>> > *Documents of content_type "bar" should come higher in the results
> >>> > than documents of content_type "foo".*
> >>> >
> >>> > Pseudocode: If (content_type == bar) then put this doc highest in the
> >>> > results.  If (content_type == foo) put those docs after the "bar"
> >>> > content_type docs.
> >>> >
> >>> >
> >>> > This is an example of the ordering being tied to a specific piece of
> >>> > data which I can manipulate in a "sub query"  (that's probably the
> >>> > wrong term...)
> >>> >
> >>> >
> >>> > This isn't exactly what you're doing, but it's close -- IF you have
> >>> > rules you can express clearly in this way...
> >>> >
> >>> > ---
> >>> >
> >>> > Also, I'm confused a little by your statement that SOLR does the
> >>> > filtering and pagination, thus you can't sort the documents after
> >>> > Solr returns them...
> >>> >
> >>> > My mental model is that you ask Solr for all the documents that match
> >>> > certain criteria.  Solr returns that "set" of documents and then for
> >>> > your list, you sort those document titles or IDs according to some
> >>> > rule -- possibly in the javascript on the web page...  But perhaps
> >>> > I'm not understanding your situation well enough...
> >>> >
> >>> > Oh - are you perhaps saying that your ONLY criterion for getting these
> >>> > documents is the list number?  That would make sense, although there
> >>> > may still be room for sorting based on some kind of logic / data
> >>> > point outside of Solr.  You could get all the documents associated to
> >>> > list #4, and then sort them based on some hard data point they all
> >>> > contain.  At the very least, your listpos "array" becomes simpler...
> >>> >
> >>> > What does your query currently look like?
> >>> >
> >>> > On Fri, Apr 1, 2016 at 10:51 AM, Tamás Barta <bartatamas@gmail.com>
> >>> > wrote:
> >>> >
> >>> > > Some of the lists are created by users and some are generated by
> >>> > > applications; it doesn't matter.
> >>> > >
> >>> > > It would be good to solve it in Solr because Solr does the work of
> >>> > > filtering and pagination. If sorting were done outside, then I
> >>> > > would have to read every document from Solr to sort them. That is
> >>> > > not an option; I have to query only one page.
> >>> > >
> >>> > > I don't understand how to solve it using subqueries.
> >>> > >
> >>> > > On Apr 1, 2016 at 18:42, "John Bickerstaff"
> >>> > > <john@johnbickerstaff.com> wrote:
> >>> > >
> >>> > > > Specifically, what drives the position in the list?  Is it
> >>> > > > arbitrary or is it driven by some piece of data?
> >>> > > >
> >>> > > > If data-driven - code could do the sorting based on that data...
> >>> > > > separate from SOLR...
> >>> > > >
> >>> > > > Alternatively, if the data point exists in SOLR, a "sub-query"
> >>> > > > might be used to get the right sort order on the items returned
> >>> > > > by the "main" search...  Possibly without having to resort to the
> >>> > > > clunky-feeling listpos multivalued field...
> >>> > > >
> >>> > > > On Fri, Apr 1, 2016 at 10:32 AM, Tamás Barta
> >>> > > > <bartatamas@gmail.com> wrote:
> >>> > > >
> >>> > > > > For example, I have to display sellable products which are in
> >>> > > > > list X in the correct order.
> >>> > > > >
> >>> > > > > If I add "status" and "list" (multivalued) fields to every
> >>> > > > > document (product), then I can execute a query: status:sellable
> >>> > > > > AND list:X, where X is the ID of the list. The list field
> >>> > > > > contains the IDs of the lists that the product is in.
> >>> > > > >
> >>> > > > > The problem is that I can't sort the result. A product has a
> >>> > > > > different index for every list.
> >>> > > > >
> >>> > > > > Is it clear now?
> >>> > > > >
> >>> > > > > Earlier I added a "listpos" field with multivalued content, for
> >>> > > > > example:
> >>> > > > >
> >>> > > > > 1:23
> >>> > > > > 2:4
> >>> > > > >
> >>> > > > > Which means that this product is in position 23 in list 1 and
> >>> > > > > in position 4 in list 2. After that I created a custom
> >>> > > > > comparator which parses the field values to get the index for
> >>> > > > > the specified list and sorts by that index.
> >>> > > > >
> >>> > > > > But I didn't like that solution much. I wish there were a
> >>> > > > > better solution. In SolrJ, unfortunately, I can't find an API
> >>> > > > > to set a custom comparator like I did in Lucene. So I don't
> >>> > > > > know how to solve this problem in Solr.
> >>> > > > >
> >>> > > > > Thanks,
> >>> > > > > Tamás
> >>> > > > >
> >>> > > > > On Apr 1, 2016 at 17:25, "Alessandro Benedetti"
> >>> > > > > <abenedetti@apache.org> wrote:
> >>> > > > >
> >>> > > > > > I think this is a classic XY Problem: you are trying to
> >>> > > > > > solve X with Y, and you are asking us about Y.
> >>> > > > > > Could you describe your X problem for us? What are you
> >>> > > > > > trying to do with these ordered lists?
> >>> > > > > >
> >>> > > > > > If not, I would add a field to the product called
> >>> > > > > > list_position (or a similar name) of type geo point (x,y).
> >>> > > > > > X could be your list ID,
> >>> > > > > > Y the position.
> >>> > > > > > Then you can play with spatial search to get what you want.
> >>> > > > > >
> >>> > > > > > But again, let's try to solve X.
> >>> > > > > >
> >>> > > > > > Cheers
> >>> > > > > >
> >>> > > > >
> >>> > > >
> >>> > >
> >>> >
> >>>
> >>
> >>
> >
>
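
For contrast with the in-Solr sort at the top of this message, the client-side alternative discussed in the quoted thread can be sketched roughly as follows. This is only an illustration under assumptions: the Item type and the page method are hypothetical, and the list positions are assumed to have already been extracted for every matching document. It shows why this approach needs all matching documents in memory before a single page can be cut out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical client-side paging: sort every matching document by its
// list position, then slice out one page.
final class ClientSidePagingSketch {

    // Hypothetical pair of a document id and its position in the user's list.
    static final class Item {
        final String id;
        final int position;

        Item(String id, int position) {
            this.id = id;
            this.position = position;
        }
    }

    // allMatches must contain EVERY document that passed the filters,
    // which is the cost the thread is trying to avoid.
    static List<Item> page(List<Item> allMatches, int offset, int rows) {
        List<Item> sorted = new ArrayList<>(allMatches);
        sorted.sort(Comparator.comparingInt((Item item) -> item.position));
        int from = Math.min(offset, sorted.size());
        int to = Math.min(from + rows, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }
}
```

With a 10000-product list and 50 rows per page, page(...) still receives close to 10000 items on every request, whereas sorting inside Solr lets the server return only the 50 that are displayed.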
