lucene-dev mailing list archives

From Kunal Wku <wku_ku...@yahoo.com>
Subject Regarding Lucene & Nutch
Date Fri, 07 Sep 2007 16:49:57 GMT
Hello Everyone,
   
  I am using Lucene and Nutch in my project to search the content of webpages.
For a webpage or any other document, Lucene indexes all the words in the page and
returns the page whenever a search matches any of those words.
   
  Let's say I have two webpages, as shown below:
   
  Webpage1
----------------------------------------------------------------------
This is the course page of Computer Science Department
  Subject: Operating System I
Professor: Qi Li
  Details:
The course Operating System I deals with the basics of the operating system. The three main
topics covered are process management, storage management and memory management, etc. ...
----------------------------------------------------------------------
   
  Webpage2
----------------------------------------------------------------------
This is the home page of Computer Science Department
  The computer science department offers courses at undergraduate level and
graduate level. The core courses for the graduate students are Mathematical Foundations of
Computer Science, Compilers, Advanced Database, Analysis of Algorithms and Operating Systems,
etc. ...
----------------------------------------------------------------------
   
  Now, if I search for the phrase "operating system", the results show both webpages
(Webpage1 and Webpage2), since the phrase "operating system" occurs in both of them.
   
  But my requirement is different. I want to search for "Operating System" only where it
appears in the Subject field, as in Webpage1, so that the result shows only Webpage1.
How can I achieve this?
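
  To make the requirement concrete, here is a rough sketch in plain Java of the behaviour I am
after. It does not use the Lucene or Nutch API. The class FieldedIndex, its methods add, search
and demo, and the field names "subject" and "body" are all made up for illustration, and the
matching is a simple case-insensitive substring test rather than real tokenized indexing.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy illustration only: hypothetical class, NOT Lucene's or Nutch's API.
public class FieldedIndex {
    // field name -> (document id -> lower-cased text stored for that field)
    private final Map<String, Map<String, String>> fields = new LinkedHashMap<>();

    // Store the text of one named field of one document.
    public void add(String docId, String field, String text) {
        fields.computeIfAbsent(field, f -> new LinkedHashMap<>())
              .put(docId, text.toLowerCase(Locale.ROOT));
    }

    // Return the ids of documents whose given field contains the phrase
    // (case-insensitive substring match, in insertion order).
    public List<String> search(String field, String phrase) {
        String needle = phrase.toLowerCase(Locale.ROOT);
        Map<String, String> docs =
            fields.getOrDefault(field, Collections.<String, String>emptyMap());
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : docs.entrySet()) {
            if (e.getValue().contains(needle)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    // Build an index holding the two example pages from this message.
    public static FieldedIndex demo() {
        FieldedIndex idx = new FieldedIndex();
        idx.add("Webpage1", "subject", "Operating System I");
        idx.add("Webpage1", "body",
            "The course operating system I deals with the basics of the operating system.");
        // Webpage2 has no Subject line, so it gets only a body field.
        idx.add("Webpage2", "body",
            "The core courses for the graduate students are Mathematical Foundations of "
            + "Computer Science, Compilers, Advanced Database, Analysis of Algorithms "
            + "and Operating Systems.");
        return idx;
    }

    public static void main(String[] args) {
        FieldedIndex idx = demo();
        // Searching the whole page text matches both pages (what I get today).
        System.out.println(idx.search("body", "operating system"));    // [Webpage1, Webpage2]
        // Searching only the subject field matches Webpage1 alone (what I want).
        System.out.println(idx.search("subject", "Operating System")); // [Webpage1]
    }
}
```

  In other words, the page text would have to be split into named parts at indexing time, and
the search would have to be restricted to one of those parts.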
   
  Please help me in this regard.
  Thanks & Regards,
Kunal Gosar


       