lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: a complete solution for building a website search with lucene
Date Sat, 09 Jan 2010 04:20:48 GMT
Nutch is written in Java, so Nutch itself *should* work on any non-Linux OS that the JVM
supports.
But it does contain some shell scripts, as does Hadoop, which Nutch uses.  Oh, I guess Windows
people run it under Cygwin?
 Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
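
For readers following the thread below, the split it describes (a crawler fetches pages, an indexing library makes them searchable) can be sketched in plain Java. This is only a toy inverted index under stated assumptions: it does NOT use Lucene's or Nutch's API, the class name SiteSearchSketch and its methods are hypothetical, and the "crawled" pages are a hardcoded map standing in for real crawler output.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy stand-in for the "index + search" half of a site search.
// NOT Lucene's API; it only illustrates the idea of an inverted index.
class SiteSearchSketch {
    // term -> sorted set of page URLs that contain the term
    private final Map<String, Set<String>> postings = new LinkedHashMap<>();

    // Crudely strip HTML tags and split into lowercase terms.
    // A real setup would use a proper HTML parser and a Lucene Analyzer.
    static List<String> tokenize(String html) {
        String text = html.replaceAll("<[^>]*>", " ").toLowerCase(Locale.ROOT);
        List<String> terms = new ArrayList<>();
        for (String t : text.split("[^a-z0-9]+")) {
            if (!t.isEmpty()) terms.add(t);
        }
        return terms;
    }

    // Called once per page the (hypothetical) crawler has fetched.
    void addPage(String url, String html) {
        for (String term : tokenize(html)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(url);
        }
    }

    // Return URLs containing every query term (AND semantics), sorted by URL.
    List<String> search(String query) {
        Set<String> result = null;
        for (String term : tokenize(query)) {
            Set<String> hits = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) result = new TreeSet<>(hits);
            else result.retainAll(hits);
        }
        return result == null ? new ArrayList<>() : new ArrayList<>(result);
    }

    public static void main(String[] args) {
        // Hypothetical crawler output: URL -> fetched HTML.
        Map<String, String> crawled = new LinkedHashMap<>();
        crawled.put("/about.html", "<html><body><h1>About</h1>We use Lucene.</body></html>");
        crawled.put("/faq.html", "<html><body>Nutch crawls; Lucene indexes.</body></html>");

        SiteSearchSketch index = new SiteSearchSketch();
        crawled.forEach(index::addPage);

        System.out.println(index.search("lucene"));
        System.out.println(index.search("nutch lucene"));
    }
}
```

In a real deployment the map of crawled pages would come from a crawler such as Nutch, and the indexing and query parts would be handled by Lucene (or Solr) rather than by hand-written code like this.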



----- Original Message ----
> From: "jyzhou817@yahoo.com" <jyzhou817@yahoo.com>
> To: java-user@lucene.apache.org
> Sent: Fri, January 8, 2010 5:03:41 AM
> Subject: Re: a complete solution for building a website search with lucene
> 
> Hi Paul,
> 
> Thanks. 
> So: use Nutch to do the crawling, and integrate Lucene into the web application so that 
> it can serve searches online.
> 
> BTW, Nutch seems to have only a Linux version, but my development is on Windows. 
> Am I right?
> 
> Zhou
> 
> --- On Fri, 8/1/10, Paul Libbrecht wrote:
> 
> From: Paul Libbrecht 
> Subject: Re: a complete solution for building a website search with lucene
> To: java-user@lucene.apache.org
> Date: Friday, 8 January, 2010, 4:27 PM
> 
> Zhou,
> 
> Lucene is a back-end library. It is very useful for developers, but it is not a 
> complete site search engine.
> Nutch is a Lucene-based site search engine, and it does crawl.
> Solr provides similar functions, with a lot of thought put into flexible 
> integration; rather than crawling, it acquires content through feeds or other 
> acquisition methods (see DIH, the DataImportHandler, for example).
> 
> paul
> 
> 
> 
> 
> On 8 Jan 2010, at 08:08, wrote:
> 
> > Hi,
> > 
> > I am new to Lucene.
> > 
> > To build a web search function, I need a back-end indexing function. 
> > But before that, should I run a crawler? Lucene indexes HTML 
> > documents, and a crawler can turn the website's pages into HTML documents. Am I 
> > right?
> > 
> > If so, can anyone suggest a crawler? Something like Nutch?
> > Thanks
> > Zhou
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 



