lucene-java-user mailing list archives

From Paul Libbrecht <p...@activemath.org>
Subject Re: a complete solution for building a website search with lucene
Date Fri, 08 Jan 2010 08:27:37 GMT
Zhou,

Lucene is a back-end library. It is very useful for developers, but it is
not a complete site-search engine.
Nutch is a Lucene-based site-search engine, and it does crawl.
Solr provides similar functionality, with a lot of thought put into
flexible integration; rather than crawling, it typically acquires content
through feeds or other acquisition methods (see DIH, the DataImportHandler,
for example).
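
To make the division of labor concrete, here is a minimal sketch using only the JDK. It is not Lucene's API: the class SiteSearchSketch and its methods stripTags, buildIndex and search are hypothetical names, and the hard-coded map of pages stands in for what a crawler such as Nutch would fetch. The indexing step is a toy inverted index, playing the role Lucene would play in a real setup.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only: a stand-in for the crawl -> index -> search pipeline.
public class SiteSearchSketch {

    // Crude tag removal; a real setup would use a proper HTML parser.
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", " ");
    }

    // Builds a term -> set-of-URLs map (the part a library like Lucene handles).
    static Map<String, Set<String>> buildIndex(Map<String, String> pages) {
        Map<String, Set<String>> index = new HashMap<>();
        for (Map.Entry<String, String> page : pages.entrySet()) {
            String text = stripTags(page.getValue()).toLowerCase();
            for (String token : text.split("[^a-z0-9]+")) {
                if (!token.isEmpty()) {
                    index.computeIfAbsent(token, k -> new TreeSet<>()).add(page.getKey());
                }
            }
        }
        return index;
    }

    // Single-term lookup; returns the URLs whose text contains the term.
    static Set<String> search(Map<String, Set<String>> index, String term) {
        return index.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        // Hypothetical crawler output: URL -> raw HTML.
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("/index.html", "<html><body><h1>Welcome</h1><p>Lucene search demo</p></body></html>");
        pages.put("/about.html", "<html><body><p>About this site</p></body></html>");

        Map<String, Set<String>> index = buildIndex(pages);
        System.out.println(search(index, "lucene")); // prints [/index.html]
    }
}
```

The crawler and the indexer are separate concerns here: swapping the hard-coded map for a real fetcher would not change buildIndex or search.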

paul




On 08 Jan 2010, at 08:08, <jyzhou817@yahoo.com> wrote:

> Hi,
>
> I am new to Lucene.
>
> To build a web search function, I need a back-end indexing
> function. But before that, should I run a crawler? Lucene
> indexes HTML documents, while a crawler can turn the website's
> pages into HTML documents. Am I right?
>
> If so, could anyone suggest a crawler, such as Nutch?
> Thanks
> Zhou



