From: "Catalin"
To: "Lucene Users List"
Subject: Re: Regarding Setup Lucine for my site
Date: Wed, 5 Mar 2003 18:26:43 +0200

hi there all !

the .zip is available (by request) at:
http://dev.cabanova.ro/java/lucene/

have fun !
Catalin

----- Original Message -----
From: maurits van wijland
To: Lucene Users List
Sent: Wednesday, March 05, 2003 6:17 PM
Subject: Re: Regarding Setup Lucine for my site

Catalin,

could you send me a zip file with your implementation?

Thanks,
maurits

----- Original Message -----
From: "Catalin"
To: "Lucene Users List"
Sent: Wednesday, March 05, 2003 10:26 AM
Subject: Re: Regarding Setup Lucine for my site

hi there !

we have almost the same configuration (site, index, paths, etc.) as you. we used another approach for the search on our site: use a small crawler to index a supplied list of urls, build the lucene index, and point the web search app at that index.

for crawling: http://cvs.cabanova.ro/viewcvs.cgi/indexer/
for webapp: http://cvs.cabanova.ro/viewcvs.cgi/wsearch/
running online: http://www.anet.ro/search?query=star+wars

the code of the indexer is based on the i2a websearch application demo that is listed on the lucene jakarta site.

take a look, maybe you will find something useful !

there is no .zip available for download, but if somebody requests the .zip we can put it online.

have fun !
Catalin

----- Original Message -----
From: Samuel Alfonso Velázquez Díaz
To: Lucene Users List
Sent: Wednesday, March 05, 2003 3:16 AM
Subject: Re: Regarding Setup Lucine for my site

Yes, I have:

1.- The directory with the files to index: C:/filesToIndex/www/
2.- A path where the search engine's index files will be created, let's say C:/index/
3.- An internet domain whose name is: www.mysite.com
4.- A web application context that runs at http://www.mysite.com/search

Once all of the above is set up, I want to be able to use the search application:

http://www.mysite.com/search/search.jsp

And I don't want the results I get from the index (step 2) to look like:

Your file is at C:/filesToIndex/www/some_html/my_doc.html

The results should be:

Your file is at http://www.mysite.com/some_html/my_doc.html

From the comments I have read (THANK YOU VERY MUCH) I conclude that there is no way to generate the index with a custom prefix (such as http://www.mysite.com/ for the documents at C:/filesToIndex/www/). It seems that I have to modify my web application (http://www.mysite.com/search/search.jsp) to include some logic that replaces "C:/filesToIndex/www/" with "http://www.mysite.com/".

If you could point me to the place in the lucene source code where I could add this logic, and so fix it once and for all, I would appreciate it a lot.

The command I used to generate this index was:

java org.apache.lucene.demo.IndexHTML -create -index index C:\index C:\filesToIndex\www\

Now in the web application I have to modify:

    IndexSearcher searcher;
    Query query;
    Hits hits;
    // some code after...
    hits = searcher.search(query);
    for (int i = 0; i < hits.length(); i++) {  // walk through the hit list
        Document doc = hits.doc(i);
        String doctitle = doc.get("title");
        String url = doc.get("url");
    }

I have to do something like:

    url = "http://www.mysite.com/" + url.substring("C:/filesToIndex/www/".length());

Regards!!! And thanks again

Pinky Iyer wrote:

I don't understand the explanation.
When I try to index the documents as mentioned in the examples, and then run the app and do a sample search, the results point to the directory structure, say "c:/filesToIndex/www/", instead of "http://localhost:8080/www/". So how can this be changed to reflect the website domain you mentioned? Could you explain again? Say my docs are under a directory c:/filesToIndex/www/ and the website is, as you said, http://localhost:8080/; how do I proceed?

Thanks in advance!

Samuel Alfonso Velázquez Díaz wrote:

Oh ok, I thought it was going to be something like the egothor search engine (a Java-based search engine). When you create the index, you issue a command like:

java org.egothor.indexer.mirror.DoTanker /tmp/my_www Project/Egothor/var/www as http://localhost:8080

/tmp/my_www: is the path to the directory where the index is to be created
Project/Egothor/var/www: is the path to the local file system files to be indexed
as http://localhost:8080: is the prefix that the index will keep on the hit list

This way the index will be relative to http://localhost:8080, even if your production site is a different site.

Thanks for your comments; anyway, now I know that I have to modify code to do this.

Regards!

Jeff Linwood wrote:

Hi,

I'm not a hundred percent sure I understand what you are asking, but when you get the results back from Lucene (the hits) it's up to you to format them to display on a web page. You can always do the modification there, when you display the links to the results.

Jeff

----- Original Message -----
From: "Samuel Alfonso Velázquez Díaz"
To: "Lucene Users List"
Sent: Tuesday, March 04, 2003 11:33 AM
Subject: Regarding Setup Lucine for my site

> The documentation says:
>
> Once you've gotten this far you're probably itching to go. Let's start by creating the index you'll need for the web examples.
> Since you've already set your classpath in the previous examples, all you need to do is type "java org.apache.lucene.demo.IndexHTML -create -index {index-dir} ..". You'll need to do this from a (any) subdirectory of your {tomcat}/webapps directory (make sure you didn't leave off the ".." or you'll get a null pointer exception). {index-dir} should be a directory that Tomcat has permission to read and write, but is outside of a web accessible context. By default the webapp is configured to look in /opt/lucene/index for this index.
>
> A copy of my site is in:
>
> C:\CopiaSite20030228\
>
> My web application runs on
>
> http://mydomain.com/search/index.jsp
>
> how can I make the lucene index map the URLs of the indexed files to:
>
> http://mydomain.com/
>
> Please help!
>
> Samuel Alfonso Velázquez Díaz
> http://www.geocities.com/samuelvd
> samuelvd@yahoo.com

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
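
The display-time rewrite discussed in this thread (swap the local directory prefix stored in the "url" field for the site's URL prefix) could be sketched roughly as below. This is only an illustration, not code from Lucene or from anyone in the thread: the class name UrlMapper and the method toUrl are hypothetical, and the sketch assumes the stored value is a plain file path under a single root directory. It also normalizes backslashes and compares the root case-insensitively, which is an assumption suited to Windows paths.

```java
// Hypothetical helper, not part of Lucene: maps a stored file path to a public URL.
class UrlMapper {

    /**
     * Returns urlPrefix + (path relative to localRoot), or the original path
     * unchanged when it does not live under localRoot.
     */
    static String toUrl(String path, String localRoot, String urlPrefix) {
        // Treat "C:\a\b" and "C:/a/b" the same way.
        String p = path.replace('\\', '/');
        String root = localRoot.replace('\\', '/');
        if (!root.endsWith("/")) {
            root += "/";
        }
        String prefix = urlPrefix.endsWith("/") ? urlPrefix : urlPrefix + "/";

        // Assumption: Windows-style paths, so the root is matched ignoring case.
        if (!p.regionMatches(true, 0, root, 0, root.length())) {
            return path; // not under the indexed root: leave it alone
        }
        return prefix + p.substring(root.length());
    }

    public static void main(String[] args) {
        System.out.println(toUrl(
                "C:/filesToIndex/www/some_html/my_doc.html",
                "C:/filesToIndex/www/",
                "http://www.mysite.com/"));
        // prints "http://www.mysite.com/some_html/my_doc.html"
    }
}
```

In the search page, the value returned by doc.get("url") would be passed through toUrl before the link is written out. The sketch does not escape spaces or other special characters in file names, so those would still need handling.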