Return-Path: Delivered-To: apmail-jakarta-lucene-user-archive@apache.org Received: (qmail 61610 invoked from network); 5 Mar 2003 15:53:34 -0000 Received: from exchange.sun.com (192.18.33.10) by daedalus.apache.org with SMTP; 5 Mar 2003 15:53:34 -0000 Received: (qmail 19212 invoked by uid 97); 5 Mar 2003 15:55:13 -0000 Delivered-To: qmlist-jakarta-archive-lucene-user@nagoya.betaversion.org Received: (qmail 19205 invoked from network); 5 Mar 2003 15:55:13 -0000 Received: from daedalus.apache.org (HELO apache.org) (208.185.179.12) by nagoya.betaversion.org with SMTP; 5 Mar 2003 15:55:13 -0000 Received: (qmail 61331 invoked by uid 500); 5 Mar 2003 15:53:30 -0000 Mailing-List: contact lucene-user-help@jakarta.apache.org; run by ezmlm Precedence: bulk List-Unsubscribe: List-Subscribe: List-Help: List-Post: List-Id: "Lucene Users List" Reply-To: "Lucene Users List" Delivered-To: mailing list lucene-user@jakarta.apache.org Received: (qmail 61320 invoked from network); 5 Mar 2003 15:53:30 -0000 Received: from web20705.mail.yahoo.com (216.136.226.178) by daedalus.apache.org with SMTP; 5 Mar 2003 15:53:30 -0000 Message-ID: <20030305155332.81605.qmail@web20705.mail.yahoo.com> Received: from [64.215.87.3] by web20705.mail.yahoo.com via HTTP; Wed, 05 Mar 2003 07:53:32 PST Date: Wed, 5 Mar 2003 07:53:32 -0800 (PST) From: Pinky Iyer Subject: Re: Regarding Setup Lucine for my site To: Lucene Users List In-Reply-To: <003701c2e2f9$5a96edd0$0201a8c0@catalin> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="0-1461834584-1046879612=:80200" X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N --0-1461834584-1046879612=:80200 Content-Type: text/plain; charset=us-ascii Thanks, for the info, even I would be intrested to see the zip code esplly for indexer. This discussion has been a wonderful source of info esplly for we starters. Thanks to one and all. I guess once in a while such a discussion helps us too , to get to the level usually the discussion is! I would appreciate if anybody could tell me the documentation which was mentioned earliar which sheds light on the complete understanding of lucene. Thanks again! Catalin wrote:hi there ! we have almost the same configuration (site, index, paths, etc) like you. we used for our search on the site another approach. eg: use a small crawler to index some feeded urls, make the lucene index, make the web search app to use that index. for crawling: http://cvs.cabanova.ro/viewcvs.cgi/indexer/ for webapp: http://cvs.cabanova.ro/viewcvs.cgi/wsearch/ running online: http://www.anet.ro/search?query=star+wars the code of the indexer is based on i2a websearch application demo that is listed on lucene jakarta site. take a look, maybe you might find something usefull ! there is no .zip available for download. but if somebody requests the .zip we can put it online. have fun ! Catalin ----- Original Message ----- From: Samuel Alfonso Vel�zquez D�az To: Lucene Users List Sent: Wednesday, March 05, 2003 3:16 AM Subject: Re: Regarding Setup Lucine for my site Yes I have 1.- The directory with the files to index: C:/filesToIndex/www/ 2.- A path where the index files from the search engine will be created, lets say C:/index/ 3.- I have an internet domain whose name is: www.mysite.com 4.- A web application context that runs at http://www.mysite.com/search Once I have set all the above things I want to be able to use the search aplication: http://www.mysite.com/search/search.jsp And I dont want that the results that I get from the index (step 2) give me results like Your file is at C:/filesToIndex/www/some_html/my_doc.html The results should be: Your file is at http://www.mysite.com/some_html/my_doc.html For the comments I have read (THANK YOU VERY MUTCH) I conclude that there is no way to generate the index with some custom prefix (as http://www.mysite.com/ for the documents at C:/filesToIndex/www/). It seems that I have to modify my web application (http://www.mysite.com/search/search.jsp) to include some logic to repalce "C:/filesToIndex/www/" to "http://www.mysite.com/". If you could point me to the source code of lucene to include this logic and this way fix it once and for all, will appreciate a lot. The command I used to generate this index was: java org.apache.lucene.demo.IndexHTML -create -index index C:\index C:\filesToIndex\ www\ Now in the web application I have to modify IndexSearcher searcher; Query query; Hits hits; // some code after... hits = searcher.search(query); for ( /* search through the hit list*/) Document doc = hits.doc(i); String doctitle = doc.get("title"); String url = doc.get("url"); I have to do some thing like url = "http://www.mysite.com/" + url.substring("C:/filesToIndex/www/".length); Regards!!! And thanks again Pinky Iyer wrote: I dont understand the explanantion. When I try and index the documents as mentioned in the examples, and then when i run the app and do a sample search, it does point to the directory structure say "c:/filesToIndex/www/" instead of "http://localhost:8080/www/". So how can this be changed to reflect the website domain as mentioned by you. Could you explain again. Say my docs are under a directory c:/filesToIndex/www/ and the wesite is as you said http://localhost:8080/ , then how to proceed! Thanks in advance! Samuel Alfonso Vel�zquez D�az wrote: Oh ok, I thougth it was going to be some thing like the egothor search engine (A java based search engine). When you create the Index, you issue a command like: java org.egothor.indexer.mirror.DoTanker /tmp/my_www Project/Egothor/var/www as http://localhost:8080 /thmp/my_www: Is the path to the directory where the index is to be created Project/Egothor/var/www: is the path to the local file system files to be indexed. and as http://localhost:8080 is the prefix that the index will keep on the hit list. This way the index will be relative to http://localhost:8080. Even if your production site may be an other site. Thanks for your comments, any way now I know that I have to modify code to do this. Regards! Jeff Linwood wrote:Hi, I'm not a hundred percent sure I understand what you are asking, but when you get the results back from Lucene (the hits) it's up to you to format them to display on a web page - you can always do the modification there when you display the links to the results. Jeff ----- Original Message ----- From: "Samuel Alfonso Vel�zquez D�az" To: "Lucene Users List" Sent: Tuesday, March 04, 2003 11:33 AM Subject: Regarding Setup Lucine for my site > > The documentation says: > > Once you've gotten this far you're probably itching to go. Let's start by creating the index you'll need for the web examples. Since you've already set your classpath in the previous examples, all you need to do is type "java org.apache.lucene.demo.IndexHTML -create -index {index-dir} ..". You'll need to do this from a (any) subdirectory of your {tomcat}/webapps directory (make sure you didn't leave off the ".." or you'll get a null pointer exception). {index-dir} should be a directory that Tomcat has permission to read and write, but is outside of a web accessible context. By default the webapp is configured to look in /opt/lucene/index for this index. > > A copy of my site is in: > > C:\CopiaSite20030228\ > > My web application runs on > > http://mydomain.com/search/index.jsp > > how can I make the lucene index map the URLs of the indexed files to: > > http://mydomain.com/ > > > > Please help! > > > Samuel Alfonso Vel�zquez D�az > http://www.geocities.com/samuelvd > samuelvd@yahoo.com > > > --------------------------------- > Do you Yahoo!? > Yahoo! Tax Center - forms, calculators, tips, and more --------------------------------------------------------------------- To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org For additional commands, e-mail: lucene-user-help@jakarta.apache.org Samuel Alfonso Vel�zquez D�az http://www.geocities.com/samuelvd samuelvd@yahoo.com --------------------------------- Do you Yahoo!? Yahoo! Tax Center - forms, calculators, tips, and more --------------------------------- Do you Yahoo!? Yahoo! Tax Center - forms, calculators, tips, and more Samuel Alfonso Vel�zquez D�az http://www.geocities.com/samuelvd samuelvd@yahoo.com --------------------------------- Do you Yahoo!? Yahoo! Tax Center - forms, calculators, tips, and more --------------------------------- Do you Yahoo!? Yahoo! Tax Center - forms, calculators, tips, and more --0-1461834584-1046879612=:80200--