lucene-solr-user mailing list archives

From John Bickerstaff <j...@johnbickerstaff.com>
Subject Re: Sorting question
Date Mon, 04 Apr 2016 15:09:43 GMT
Thanks for sharing the solution Tamas -- I was hoping you'd let us know...

On Mon, Apr 4, 2016 at 8:05 AM, Tamás Barta <bartatamas@gmail.com> wrote:

> Hi,
>
> FYI: the final solution I found is that I created a custom
> "listpos(fieldName, listId)" function and now I can display a sorted list
> via:
>
> fq=listid_s:378
> sort=listpos(listpos_s,378) asc
>
> Regards,
> Tamas
>
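
The two parameters above could be combined with paging along these lines. This is only a sketch: `listpos` is the custom function described in this message (it is not part of stock Solr), the helper name `list_page_params` and the `q=*:*` query are assumptions, and the page size of 50 comes from earlier in the thread.

```python
from urllib.parse import urlencode

def list_page_params(list_id, page, rows=50):
    """Build Solr request parameters for one page of a user-defined list.

    Assumes the custom listpos(field, listId) function from this thread is
    installed on the Solr side; it is not a stock Solr function.
    """
    return {
        "q": "*:*",                                   # assumption: match all, filter below
        "fq": f"listid_s:{list_id}",                  # restrict to one list
        "sort": f"listpos(listpos_s,{list_id}) asc",  # order by position in that list
        "start": page * rows,                         # offset of the first row
        "rows": rows,                                 # page size
    }

# Third page (zero-based page index 2) of list 378.
query_string = urlencode(list_page_params(378, page=2))
print(query_string)
```

Because filtering, sorting and paging all happen inside Solr, only the 50 rows of the requested page travel back to the client.
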
> On Fri, Apr 1, 2016 at 8:55 PM, John Bickerstaff <john@johnbickerstaff.com>
> wrote:
>
> > Tamas,
> >
> > This feels a bit like a "user favorites" problem.
> >
> > I did a little searching and found this...  Don't know if it will help, but
> > when I'm looking for stuff like this I find it helps to try to come up with
> > generic or different descriptions of my problem and go search those as
> > well...
> >
> > http://stackoverflow.com/questions/3931827/solr-merging-results-of-2-cores-into-only-those-results-that-have-a-matching-fie
> >
> > On Fri, Apr 1, 2016 at 12:40 PM, John Bickerstaff <
> > john@johnbickerstaff.com>
> > wrote:
> >
> > > Tamas,
> > >
> > > I'm brainstorming here - not being careful, just throwing out ideas...
> > >
> > > One thing that comes up is a separate document in SOLR - one doc for each
> > > list.
> > >
> > > If a user adds a doc to their list, that doc's id gets added to this other
> > > type of document...
> > >
> > > So, a document with the title "List 1" would have a multivalue field of
> > > ID's and the list order number like so:
> > >
> > > ID          List Position
> > > _________________________
> > > doc1 ID     1
> > > doc2 ID     2
> > > doc3 ID     3
> > >
> > > and so on...  The big problem I see with this is keeping it organized
> > > correctly.  More code would have to be written to handle this when the
> > > user does any kind of "crud" on the list...
> > >
> > > I'm pretty sure there's a way to write a query that uses that list to
> > > properly order the items returned by your primary search, although I
> > > haven't written such a query yet.
> > >
> > > If you have the luxury of NOT being in production yet with this system,
> > > I'd seriously consider pushing to keep application metadata OUT of your
> > > product information store.  This particular problem (of ordering the
> > > results based on arbitrary user choices) might be more easily handled via
> > > a separate step that queries a relational database to handle list order -
> > > once Solr gives you the documents that match the query and the user's list
> > > number...
> > >
> > > Even if you can't use another relational data store - keeping that
> > > metadata out of your individual product documents could be argued to be a
> > > good design idea...
> > >
> > > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >
> > > Here's an alternative brainstorm...
> > >
> > > Where does the user data live?  What about putting the information about
> > > the order of document ID's in the User's lists with the User?  Then you can
> > > get all documents that match the search terms and are on List X from Solr -
> > > and then sort them by ID based on the data associated with the User (a list
> > > of ID's, in order)
> > >
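
The idea above (ordering kept with the user, sorting done after Solr returns the matches) might look roughly like this. It is a sketch only: the document shape, the `id` field name and the helper name are assumptions for illustration.

```python
def sort_by_user_order(docs, ordered_ids):
    """Sort Solr result docs by their position in a user's ordered ID list."""
    rank = {doc_id: i for i, doc_id in enumerate(ordered_ids)}
    # Docs missing from the list go last (an assumption); sorted() is stable,
    # so they keep the order Solr returned them in.
    return sorted(docs, key=lambda d: rank.get(d["id"], len(rank)))

user_list = ["doc7", "doc2", "doc9"]                          # order stored with the user
solr_docs = [{"id": "doc2"}, {"id": "doc9"}, {"id": "doc7"}]  # as returned by Solr
print([d["id"] for d in sort_by_user_order(solr_docs, user_list)])
# → ['doc7', 'doc2', 'doc9']
```

Note that this approach needs every matching document on the client before a page can be cut out of the sorted result.
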
> > > There is even a way to write a plugin that will go after external data to
> > > help sort Solr documents, although I'm guessing you'd rather avoid that...
> > >
> > >
> > >
> > > On Fri, Apr 1, 2016 at 11:59 AM, John Bickerstaff <
> > > john@johnbickerstaff.com> wrote:
> > >
> > >> OK - I get it.  List order is totally arbitrary and cannot be tied to a
> > >> hard data point.
> > >>
> > >> I'll have to think - Perhaps billnbell's solution will help, although I'm
> > >> not totally sure I understand that suggestion yet.
> > >>
> > >> At this point, you could get all the documents for List X that match the
> > >> search terms.  The next problem is sorting.  If you have the listpos field
> > >> too, you could use that, and some regex to find the proper order for these
> > >> documents before displaying them (in code I mean) but of course that means
> > >> you need some kind of "interceptor" to deal with this before the results
> > >> are displayed.
> > >>
> > >> If I had enough control to do this in code, behind the scenes, I'd grab
> > >> that second part of the listpos field, put it into a variable on each
> > >> object and then sort by that.  Then I'd return the entire list to the UI.
> > >>
> > >> I understand that if you could get SOLR to do it all, that would be
> > >> ideal...  There is the possibility of writing some new code and plugging it
> > >> in to Solr, but I'm guessing you don't want to go that far.  As a final
> > >> step in the process, with custom code to consume the listpos entry, sorting
> > >> these would be fairly straightforward.  I'm not sure how you get away from
> > >> the listpos multivalue field however...
> > >>
> > >> I'll keep thinking...
> > >>
> > >> On Fri, Apr 1, 2016 at 11:26 AM, Tamás Barta <bartatamas@gmail.com>
> > >> wrote:
> > >>
> > >>> So, the list order is determined by the user. The user creates a list,
> > >>> adds products to it, and I have to display these lists using filters and
> > >>> pagination.
> > >>>
> > >>> Let's assume there is a list with 10000 products in it. On the website
> > >>> where I display the list, only 50 products are displayed per page. So if I
> > >>> could query Solr to give me products from list X, ordered as the user
> > >>> defined, but only products matching some criteria (status, amount, ..),
> > >>> from an offset and 50 rows, then it would be perfect and fast. If ordering
> > >>> were done outside of Solr, then I would have to retrieve almost all 10000
> > >>> documents from Solr (a bit fewer if filtered) to order them and display
> > >>> the page of 50 products.
> > >>> On Apr 1, 2016, 19:15, "John Bickerstaff" <john@johnbickerstaff.com>
> > >>> wrote:
> > >>>
> > >>> > Just to be clear - I don't mean who requests the list (application or
> > >>> > user) I mean what "rule" determines the ordering of the list?
> > >>> >
> > >>> > Or, is there even a rule of any kind?
> > >>> >
> > >>> > In other words, does a user arbitrarily decide that documentA,
> > >>> > documentF, and documentW should be on a list of their own?  For reasons
> > >>> > known only to the user?
> > >>> >
> > >>> > Or - does the ordering of the list depend on some piece of data?  (like
> > >>> > a date, or a manufacturer, or a price range or any other piece of "hard"
> > >>> > data)
> > >>> >
> > >>> > ===
> > >>> >
> > >>> > To give an example from what I'm working on right now --
> > >>> >
> > >>> > My subject matter experts have given me a rule that says:
> > >>> >
> > >>> > *Documents of content_type "bar" should come higher in the results than
> > >>> > documents of content_type "foo".*
> > >>> >
> > >>> > Pseudocode: If (content_type == bar) then put this doc highest in the
> > >>> > results.  If (content_type == foo) put those docs after the "bar"
> > >>> > content_type docs.
> > >>> >
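
The pseudocode above could be expressed as a client-side sort key, roughly as follows. This is a sketch under assumptions: the document shape and the rank table are illustrative, and in practice the rule might be applied inside Solr instead.

```python
# Lower rank sorts first; the ranks themselves are illustrative.
CONTENT_TYPE_RANK = {"bar": 0, "foo": 1}

def rank_by_content_type(docs):
    """Put "bar" docs ahead of "foo" docs, keeping the incoming order within each type."""
    unknown = len(CONTENT_TYPE_RANK)  # any other content_type sorts last (assumption)
    return sorted(docs, key=lambda d: CONTENT_TYPE_RANK.get(d["content_type"], unknown))

docs = [
    {"id": "a", "content_type": "foo"},
    {"id": "b", "content_type": "bar"},
    {"id": "c", "content_type": "foo"},
    {"id": "d", "content_type": "bar"},
]
print([d["id"] for d in rank_by_content_type(docs)])
# → ['b', 'd', 'a', 'c']
```
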
> > >>> >
> > >>> > This is an example of the ordering being tied to a specific piece of
> > >>> > data which I can manipulate in a "sub query"  (that's probably the wrong
> > >>> > term...)
> > >>> >
> > >>> >
> > >>> > This isn't exactly what you're doing, but it's close -- IF you have
> > >>> > rules you can express clearly in this way...
> > >>> >
> > >>> > ---
> > >>> >
> > >>> > Also, I'm confused a little by your statement that SOLR does the
> > >>> > filtering and pagination, thus you can't sort the documents after Solr
> > >>> > returns them...
> > >>> >
> > >>> > My mental model is that you ask Solr for all the documents that match a
> > >>> > certain criteria.  Solr returns that "set" of documents and then for
> > >>> > your list, you sort those document titles or ID's according to some rule
> > >>> > -- possibly in the javascript on the web page...  But perhaps I'm not
> > >>> > understanding your situation well enough...
> > >>> >
> > >>> > Oh - are you perhaps saying that your ONLY criteria for getting these
> > >>> > documents is the list number?  That would make sense, although there may
> > >>> > still be room for sorting based on some kind of logic / data point
> > >>> > outside of Solr.  You could get all the documents associated to list #4,
> > >>> > and then sort them based on some hard data point they all contain.  At
> > >>> > the very least, your listpos "array" becomes simpler...
> > >>> >
> > >>> > What does your query currently look like?
> > >>> >
> > >>> > On Fri, Apr 1, 2016 at 10:51 AM, Tamás Barta <bartatamas@gmail.com>
> > >>> > wrote:
> > >>> >
> > >>> > > Some of the lists are created by users and some are generated by
> > >>> > > applications, it doesn't matter.
> > >>> > >
> > >>> > > It would be good to solve it in Solr because Solr does the work of
> > >>> > > filtering and pagination. If sorting were done outside, then I would
> > >>> > > have to read every document from Solr to sort them. That is not an
> > >>> > > option; I have to query only one page.
> > >>> > >
> > >>> > > I don't understand how to solve it using subqueries.
> > >>> > > On Apr 1, 2016, 18:42, "John Bickerstaff" <john@johnbickerstaff.com>
> > >>> > > wrote:
> > >>> > >
> > >>> > > > Specifically, what drives the position in the list?  Is it arbitrary
> > >>> > > > or is it driven by some piece of data?
> > >>> > > >
> > >>> > > > If data-driven - code could do the sorting based on that data...
> > >>> > > > separate from SOLR...
> > >>> > > >
> > >>> > > > Alternatively, if the data point exists in SOLR, a "sub-query" might
> > >>> > > > be used to get the right sort order on the items returned by the
> > >>> > > > "main" search...  Possibly without having to resort to the
> > >>> > > > clunky-feeling listpos multivalued field...
> > >>> > > >
> > >>> > > > On Fri, Apr 1, 2016 at 10:32 AM, Tamás Barta <bartatamas@gmail.com>
> > >>> > > > wrote:
> > >>> > > >
> > >>> > > > > For example I have to display sellable products which are in list X
> > >>> > > > > in the correct order.
> > >>> > > > >
> > >>> > > > > If I add "status" and "list" (multivalued) fields to every document
> > >>> > > > > (product), then I can execute a query: status:sellable AND list:X,
> > >>> > > > > where X is the ID of the list. The list field contains the IDs of
> > >>> > > > > the lists the product is in.
> > >>> > > > >
> > >>> > > > > The problem is that I can't sort the result. A product has a
> > >>> > > > > different index for every list.
> > >>> > > > >
> > >>> > > > > Is it clear now?
> > >>> > > > >
> > >>> > > > > Earlier I added a "listpos" field with multivalue content, for
> > >>> > > > > example:
> > >>> > > > >
> > >>> > > > > 1:23
> > >>> > > > > 2:4
> > >>> > > > >
> > >>> > > > > Which means that this product is in position 23 in list 1 and it is
> > >>> > > > > in position 4 in list 2. After that I created a custom comparator
> > >>> > > > > which parses field values to get the index for the specified list
> > >>> > > > > and sorts by that index.
> > >>> > > > >
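
The parsing step such a comparator performs on `listId:position` values can be sketched as below. This is an illustration in Python, not the Lucene comparator from the message: the function names, the `listpos` key on each document, and the choice to sort products missing from the list last are all assumptions.

```python
def list_position(listpos_values, list_id):
    """Return the position for list_id from values like "1:23", or None if absent."""
    for value in listpos_values:
        lid, _, pos = value.partition(":")
        if lid == str(list_id):
            return int(pos)
    return None

def sort_for_list(docs, list_id):
    """Sort docs by their position in list_id; docs not in the list go last (assumption)."""
    def key(doc):
        pos = list_position(doc.get("listpos", []), list_id)
        return (pos is None, pos if pos is not None else 0)
    return sorted(docs, key=key)

product = {"id": "p1", "listpos": ["1:23", "2:4"]}
print(list_position(product["listpos"], 1))  # → 23
print(list_position(product["listpos"], 2))  # → 4
```

The same product therefore sorts differently depending on which list ID is passed in, which is why a single static sort field is not enough.
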
> > >>> > > > > But I didn't like that solution much. I wish there were a better
> > >>> > > > > solution. In SolrJ unfortunately I can't find an API to set a
> > >>> > > > > custom comparator like I did in Lucene. So I don't know how to
> > >>> > > > > solve this problem in Solr.
> > >>> > > > >
> > >>> > > > > Thanks,
> > >>> > > > > Tamás
> > >>> > > > > On Apr 1, 2016, 17:25, "Alessandro Benedetti"
> > >>> > > > > <abenedetti@apache.org> wrote:
> > >>> > > > >
> > >>> > > > > > I think this is a classic XY Problem: you are trying to solve X
> > >>> > > > > > with Y, and you are asking us about Y.
> > >>> > > > > > Could you describe what your X problem is? What are you trying
> > >>> > > > > > to do with these ordered lists?
> > >>> > > > > >
> > >>> > > > > > If not I would add a field to the product called:
> > >>> > > > > > list_position (or a similar name) of type geo point (x,y).
> > >>> > > > > > X could be your list ID
> > >>> > > > > > Y the position.
> > >>> > > > > > Then you can play with spatial search, to get what you want.
> > >>> > > > > >
> > >>> > > > > > But again, let's try to solve X.
> > >>> > > > > >
> > >>> > > > > > Cheers
> > >>> > > > > >
> > >>> > > > >
> > >>> > > >
> > >>> > >
> > >>> >
> > >>>
> > >>
> > >>
> > >
> >
>
