lucene-solr-user mailing list archives

From nolim <>
Subject saving user actions on item in solr for later retrieval
Date Mon, 28 Apr 2014 18:48:40 GMT
We are using Solr in a production system with around 500 users, and we
handle around 10,000 queries per day.
Our users' search topics are static most of the time and repeat themselves
over time.

Our system has an option to specify a "specific search subject" (we
also call it a "specific information need"), and most of our users use
this option.
Our system logs keep each query and each document retrieved for each
"information need",
and the user can also give feedback on whether a document is relevant to his
"information need".

We also have a special query expansion technique and a diversity algorithm
based on MMR.

We want to use this information from the logs as a data set for training our
ranking system
and performing "Learning To Rank" for each "information need" or cluster of
"information needs".
We also want to give the user the option to filter by "relevant" and "read",
based on his own actions or his friends' actions in the same topic.
When he runs the same query again, or a similar one, he can then skip
documents he has already read. That's an important requirement for our users.
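
To make the log-derived data set mentioned above more concrete, here is a minimal sketch of grouping relevance feedback by "information need" so that each group can later feed a ranker. The log record layout (the keys "need", "query", "doc_id" and "relevant") and the sample values are assumptions for illustration, not our actual log format.

```python
from collections import defaultdict


def group_judgments(log_entries):
    """Group (query, doc_id, label) judgments by information need.

    The label is 1 when the user marked the document relevant, else 0.
    """
    by_need = defaultdict(list)
    for entry in log_entries:
        label = 1 if entry["relevant"] else 0
        by_need[entry["need"]].append((entry["query"], entry["doc_id"], label))
    return dict(by_need)


# Hypothetical log entries; real ones would come from the system logs.
log_entries = [
    {"need": "need-42", "query": "solar panels", "doc_id": "doc-1", "relevant": True},
    {"need": "need-42", "query": "solar panels", "doc_id": "doc-2", "relevant": False},
    {"need": "need-7", "query": "wind turbines", "doc_id": "doc-3", "relevant": True},
]

judgments = group_judgments(log_entries)
```

Each group could then be turned into feature vectors for whichever ranking model is chosen; that step is left out here because it depends on the features available.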

We are considering 2 possible ways to implement this:
1. Updating each item in solr and creating 2 fields named: "read",
Each field is a multivalued field holding the labels of the corresponding
"information needs".
When the user reads a document, an update is sent to Solr and the field
"read" gets a label for
the "information need" the user is working on.
This will cause an update every time a user reads an item (still nothing
compared to the new items coming in each day).
We would also be saving information that "belongs" to the application in
Solr, which may be the wrong architecture.
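
A minimal sketch of what option 1 could look like on the wire, assuming Solr's atomic update syntax (the "add" modifier on a multivalued field) and a negative filter query. The field name "read" comes from the description above; the document id, the "information need" label, the "text" field and the helper names are hypothetical. Atomic updates also assume the update log is enabled and the other fields are stored, which would need checking against our schema.

```python
import json


def build_read_update(doc_id, need_label, field="read"):
    """Atomic update body: append need_label to the multivalued field."""
    return [{"id": doc_id, field: {"add": need_label}}]


def build_unread_query(user_query, need_label, rows=10, field="read"):
    """Select params: top `rows` hits, excluding docs already labelled.

    Note: need_label is not escaped here; labels containing quotes
    would need escaping first.
    """
    return {
        "q": user_query,
        "fq": '-{}:"{}"'.format(field, need_label),
        "rows": rows,
        "wt": "json",
    }


# Hypothetical document id, label and query.
update_body = json.dumps(build_read_update("doc-123", "need-42"))
params = build_unread_query("text:solar panels", "need-42")
```

The update body would be POSTed as JSON to the core's /update handler and the params sent to /select, so "top 10 unread for this information need" stays a single Solr request.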

2. Save the information in a DB and then perform filtering on the
retrieved results.
This option is much more complicated (we would now have "fields" that aren't
in Solr but that the user uses for search). We won't get facets, autocomplete
and other nice stuff that a regular field in Solr can have.
It also costs performance: we can't easily retrieve "give me the top 10
documents that answer the query and are unread for this information need",
and the code becomes more complicated to maintain.
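
A rough sketch of why option 2 is awkward: to return 10 unread documents, the application has to keep paging through Solr results and dropping the ones the DB says were read. The function names, page size and the in-memory stand-ins for Solr and the DB are all hypothetical.

```python
def top_unread(search_page, read_ids, wanted=10, page_size=50, max_pages=20):
    """Page through ranked results, dropping read docs, until `wanted` remain."""
    unread = []
    for page in range(max_pages):
        hits = search_page(start=page * page_size, rows=page_size)
        if not hits:
            break  # ran out of results
        for doc_id in hits:
            if doc_id not in read_ids:
                unread.append(doc_id)
                if len(unread) == wanted:
                    return unread
    return unread


# Stand-in for a Solr call: a fixed ranking of 200 document ids.
ranking = ["doc-%d" % i for i in range(200)]


def fake_search_page(start, rows):
    return ranking[start:start + rows]


# Stand-in for a DB lookup: the user has already read the top 60 documents.
read_ids = {"doc-%d" % i for i in range(60)}

result = top_unread(fake_search_page, read_ids)
```

In this toy run the first page of 50 hits is entirely discarded and a second request is needed before any unread document shows up, which is the performance cost described above.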

3. Do you have more ideas?

Which of these options is better?

Thanks in advance!
