lucene-java-user mailing list archives

From Nader Henein
Subject Re: Alerts for search results
Date Sun, 06 Mar 2005 12:35:56 GMT
Well, since you're doing it by keyword, it's a little tricky, because if
you want to batch like searches with each other there won't be much
similarity, especially if you allow the user to select Any Word | All
the Words | Exact phrase. We run 45,000 search agents which dispatch an
email to users daily to inform them of new jobs added to the site. We
store the date of the run and the Lucene search string, and then the
scheduler runs hourly to fetch the results (or the result count) and send
the mailer to each person whose email is due. Since we allow for 19
separate search criteria, batching has not helped us much, because the
probability of enough users having the same search criteria is low. You
should still try it, though: allow your users to start adding alerts,
then do your calculations to see whether batching would help or not.
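
A rough sketch of the pull scenario with batching, in Python rather than
against the Lucene API. Everything named here (Alert, run_due_alerts,
search_count) is made up for illustration; search_count stands in for
whatever runs the stored query string against your index and counts hits
added since a given date. Alerts that share both the query string and the
last run date are grouped so the search runs only once for the group.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Alert:
    email: str
    query: str           # stored search string
    last_run: datetime   # date of the previous run
    interval: timedelta  # how often this user wants a mail


def run_due_alerts(alerts, now, search_count):
    """Run one search per distinct (query, last_run) among the due alerts.

    search_count(query, since) is a hypothetical callable returning the
    number of new hits for `query` added after `since`.
    Returns a list of (email, query, hit_count) tuples to be mailed.
    """
    # Batch due alerts that would issue exactly the same search.
    batches = defaultdict(list)
    for alert in alerts:
        if now - alert.last_run >= alert.interval:
            batches[(alert.query, alert.last_run)].append(alert)

    to_mail = []
    for (query, since), group in batches.items():
        hits = search_count(query, since)  # one search for the whole batch
        for alert in group:
            to_mail.append((alert.email, query, hits))
            alert.last_run = now           # remember the date of this run
    return to_mail
```

Counting the batches against the number of due alerts on real data is one
way to do the "would batching help" calculation before committing to it.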

That was the pull scenario. There is also the push scenario, using a
hashing function: when a new article is added, its words are compared
to the words that your users have flagged in their alerts, and an email
is dispatched to interested users. Be aware that this only makes sense
if the total number of distinct words in the document is less than the
number of distinct keywords your users have flagged, which becomes the
case when you accumulate a lot of agents.
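
A rough sketch of the push scenario, again in plain Python and not our
actual code. The function names and the simple regex tokenizer are
assumptions for illustration; in practice you would want the same analysis
your index uses. It covers only "Any Word" matching. "All the Words" and
"Exact phrase" alerts would need an extra check after the lookup.

```python
import re
from collections import defaultdict


def build_keyword_index(alerts):
    """Map each flagged keyword to the set of users who flagged it.

    alerts: iterable of (email, keywords) pairs.
    """
    index = defaultdict(set)
    for email, keywords in alerts:
        for keyword in keywords:
            index[keyword.lower()].add(email)
    return index


def users_to_notify(article_text, index):
    """Return the users with at least one keyword in the new article."""
    # Distinct words only: one hash lookup per distinct word.
    words = set(re.findall(r"[a-z0-9]+", article_text.lower()))
    interested = set()
    for word in words:
        interested |= index.get(word, set())
    return interested
```

The work per article is one dictionary lookup per distinct word, regardless
of how many alerts exist.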

Here's a quick and dirty calculation:
Pull scenario: you will run as many searches as there are distinct alerts.
Push scenario: you will run as many searches as there are distinct
words per document.
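
The same comparison as a tiny Python function, taking the two counts above
at face value. The function name is made up, and the 500 distinct words in
the usage lines is an invented figure for illustration only.

```python
def cheaper_strategy(distinct_alerts, distinct_words_per_doc):
    """Pick the scenario that needs fewer searches, per the rule of thumb."""
    pull_cost = distinct_alerts          # one search per distinct alert
    push_cost = distinct_words_per_doc   # one lookup per distinct word
    return "push" if push_cost < pull_cost else "pull"


# With many agents and a short document, push wins; with few, pull wins.
print(cheaper_strategy(45000, 500))
print(cheaper_strategy(100, 500))
```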

Hope this helps.

Nader Henein

Ben wrote:

>I would like to allow users on my site to create alerts based
>on search keywords for every newly posted article. It would be
>good to send out the alert emails hourly, daily, weekly, etc.
>Does anyone have any experience in this area? What are the best
>practices when implementing such a feature? I would imagine it's
>going to take a lot of resources to do a search for each keyword.
>Any guidance is greatly appreciated. Thanks!