From: "Peter Keegan"
To: java-user@lucene.apache.org
Date: Mon, 30 Oct 2006 12:35:08 -0400
Subject: Re: Announcement: Lucene powering Monster job search index (Beta)

KEGan,

>When you search by "4. Sort by Miles", I suppose the sorting by relevance
>(of the search keyword) is lost? Since this is implemented using a custom
>SortComparatorSource.

Sorting by miles becomes the primary sort key; score and date become
secondary sort fields (used in the case of ties).

>Also, I suppose, if FunctionQuery were used, we can make "job distance by
>miles" part of the relevancy of the search results?

Yes, this is my understanding of the power of FunctionQuery.

Peter

On 10/30/06, KEGan wrote:
>
> Peter,
>
> Congratulations on the beta launch :)
>
> If you don't mind, I would like to ask you more about the feature "4. Sort
> by Miles".
>
> When you search by "4. Sort by Miles", I suppose the sorting by relevance
> (of the search keyword) is lost? Since this is implemented using a custom
> SortComparatorSource.
>
> Also, I suppose, if FunctionQuery were used, we can make "job distance by
> miles" part of the relevancy of the search results?
>
> Could you comment on or confirm my assertion? Thanks :)
>
>
> On 10/28/06, Peter Keegan wrote:
> >
> > On 10/27/06, Chris Lu wrote:
> > >
> > > Hi, Peter,
> > >
> > > Really great job!
> >
> > Thanks. (I'll tell the team)
> >
> > > I am interested to know how you implemented "4. Sort by 'Miles'".
> > > For example, if starting from a zip code, how to match items within 20
> > > miles?
> >
> > I can tell you how we use Lucene to accomplish this.
> > At indexing time, each job's location is indexed as a special field. How
> > you represent the location is up to you. Each time a new index is built,
> > the location data for all documents in the index are fetched via TermEnum
> > and TermDocs. This is practical because the searcher refresh is done at
> > predictable times. At query time, a custom SortComparatorSource is
> > created, using the 'reference' location (the zip/radius). The 'compare'
> > method performs the calculation to compare the 2 documents' location
> > values (saved from above) to the reference location.
> >
> > I believe this can also be accomplished with Solr's FunctionQuery, but I
> > haven't tried that yet.
> >
> > Peter
> >
> > > --
> > > Chris Lu
> > > -------------------------
> > > Instant Full-Text Search On Any Database/Application
> > > site: http://www.dbsight.net
> > > demo: http://search.dbsight.com
> > >
> > > On 10/27/06, Peter Keegan wrote:
> > > > I am pleased to announce the launch of Monster's new job search Beta
> > > > web site, powered by Lucene, at: http://jobsearch.beta.monster.com
> > > > (notice the Lucene logo at the bottom of the page!).
> > > >
> > > > The jobs index is implemented with Java Lucene 2.0 on 64-bit Windows
> > > > (AMD and Intel processors).
> > > >
> > > > Here are some of the new features:
> > > >
> > > > 1. 'Improve your search by'...
> > > >
> > > > The job search results page allows you to browse and 'drill down'
> > > > through the results by job category, status, type and salary. The
> > > > number of matching jobs in each facet is displayed. There will likely
> > > > be many more facets to browse by in the future.
> > > >
> > > > This feature is currently implemented with a custom HitCollector and
> > > > the DocSet class from Solr.
> > > >
> > > > 2. 'More like this'
> > > >
> > > > Find more jobs like the one you see by clicking on the 'MORE LIKE
> > > > THIS' link, which is visible when you hover the mouse over the job
> > > > title.
> > > >
> > > > This feature is implemented with Lucene's term vectors and the
> > > > 'MoreLikeThis' contribution class. If you are in 'detailed view', the
> > > > term vectors from the job description are used. In 'brief' view, the
> > > > job title's term vectors are used.
> > > >
> > > > 3. 'Related Titles'
> > > >
> > > > When you do a 'keywords' search, click on a 'related titles' link to
> > > > filter your search by similar job titles.
> > > >
> > > > This feature is implemented via a separate Lucene.Net index.
> > > >
> > > > 4. Sort by 'Miles'
> > > >
> > > > Find jobs close to you via zip code/radius search. In the search
> > > > results page, click on the 'Miles' column to sort the results by
> > > > distance from your zip code/radius.
> > > >
> > > > This custom sorting feature is implemented via Lucene's
> > > > 'SortComparatorSource' interface.
> > > >
> > > > 5. Search by date, salary, distance.
> > > >
> > > > Find jobs posted in the last day (or 2, 3, etc.) or by salary range or
> > > > distance.
> > > >
> > > > Numeric range search is one of Lucene's weak points (performance-wise),
> > > > so we have implemented this with a custom HitCollector and an
> > > > extension to the Lucene index files that stores the numeric field
> > > > values for all documents.
> > > >
> > > > It is important to point out that this has all been implemented with
> > > > the stock Lucene 2.0 library. No code changes were made to the Lucene
> > > > core.
> > > >
> > > > If you have any feedback regarding the UI, please use the link on the
> > > > web page ("send us your feedback"). You can hit me with any other
> > > > questions/comments.
> > > >
> > > > Peter
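
The ordering described in this thread (miles as the primary sort key, with score and date as tie-breakers) can be illustrated with a rough sketch in plain Java. It does not use the Lucene API and is not Monster's code: JobHit, MilesSortSketch, the per-document lat/lon arrays and the haversine distance formula are all assumptions made for illustration, since the thread only says that the location representation "is up to you". In Lucene 2.0 the comparison would live inside a custom SortComparatorSource, with the arrays filled once per index refresh.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a search hit; this is not a Lucene class. */
final class JobHit {
    final int docId;          // index into the per-document location arrays
    final float score;        // relevance score of the hit
    final long postedMillis;  // posting date as epoch milliseconds

    JobHit(int docId, float score, long postedMillis) {
        this.docId = docId;
        this.score = score;
        this.postedMillis = postedMillis;
    }
}

/** Sketch only: orders hits by distance from a reference point. */
final class MilesSortSketch {
    static final double EARTH_RADIUS_MILES = 3958.8;

    /** Great-circle distance in miles (haversine formula, assumed here). */
    static double miles(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1.0 to guard against rounding error before asin.
        return 2 * EARTH_RADIUS_MILES * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /**
     * lat[docId] and lon[docId] hold each document's location, loaded once
     * per index refresh (assumed layout). Nearest hits sort first; ties fall
     * back to higher score, then to the more recent posting date.
     */
    static Comparator<JobHit> byMiles(double[] lat, double[] lon,
                                      double refLat, double refLon) {
        return Comparator
            .comparingDouble((JobHit h) ->
                miles(refLat, refLon, lat[h.docId], lon[h.docId]))
            .thenComparing(
                Comparator.comparingDouble((JobHit h) -> h.score).reversed())
            .thenComparing(
                Comparator.comparingLong((JobHit h) -> h.postedMillis).reversed());
    }
}
```

A caller would build the comparator once per query from the reference location and pass it to List.sort. Recomputing the distance on every comparison keeps the sketch short; caching one distance per document would avoid the repeated trigonometry on large result sets.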