lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From <jyzhou...@yahoo.com>
Subject Re: a complete solution for building a website search with lucene
Date Mon, 11 Jan 2010 00:39:29 GMT
Thanks.

--- On Sat, 9/1/10, Simon Willnauer <simon.willnauer@googlemail.com> wrote:

From: Simon Willnauer <simon.willnauer@googlemail.com>
Subject: Re: a complete solution for building a website search with lucene
To: java-user@lucene.apache.org
Date: Saturday, 9 January, 2010, 6:16 PM

I don't know that much about nutch but hadoop shouldn't really run
under windows in production. If you use windows for development this
should not be a big issue.
Oatis is right you should use cygwin together with hadoop. look at
http://wiki.apache.org/hadoop/FAQ for initial info.

simon

On Sat, Jan 9, 2010 at 5:20 AM, Otis Gospodnetic
<otis_gospodnetic@yahoo.com> wrote:
> Nutch is written in Java, so Nutch itself *should* work on other non-Linux OSs that the
JVM supports.
> But it does contain some shell scripts, as does Hadoop that Nutch uses.  Oh, I guess
Windows people run it under Cygwin?
>  Otis
> --
> Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
>
>
>
> ----- Original Message ----
>> From: "jyzhou817@yahoo.com" <jyzhou817@yahoo.com>
>> To: java-user@lucene.apache.org
>> Sent: Fri, January 8, 2010 5:03:41 AM
>> Subject: Re: a complete solution for building a website search with lucene
>>
>> Hi Paul,
>>
>> Thanks.
>> Use Nutch to do crawling. and integrate Lucene to the web application, so that
>> can do search online.
>>
>> BTW, Nutch seems to have only Linux version, what my development is on Windows.
>> Am i right?
>>
>> Zhou
>>
>> --- On Fri, 8/1/10, Paul Libbrecht wrote:
>>
>> From: Paul Libbrecht
>> Subject: Re: a complete solution for building a website search with lucene
>> To: java-user@lucene.apache.org
>> Date: Friday, 8 January, 2010, 4:27 PM
>>
>> Zhou,
>>
>> Lucene is a back-end library, it's very useful for developer but it is not a
>> complete site-search-engine.
>> A lucene-based site-search-engine is Nutch, it does crawl.
>> Solr also provides functions close to these with a large amount of thoughts on
>> flexible integration; crawling methods are rather based on feeds or other
>> acquisition methods (see DIH for example).
>>
>> paul
>>
>>
>>
>>
>> Le 08-janv.-10 à 08:08, a écrit :
>>
>> > Hi ,
>> >
>> > I am new in Lucene.
>> >
>> > To build a web search function, it need to have a backendc indexing function.
>> But, before that, should run a Crawler? because Lucene index based on Html
>> documents, while Crawler can change the website pages to Html documents. Am i
>> right?
>> >
>> > If so, please anyone suggest to me a Crawler? like Nutch?
>> > Thanks
>> > Zhou
>> >
>> >
>> >
>> >
>> >      New Email names for you!
>> > Get the Email name you've always wanted on the new @ymail and @rocketmail.
>> > Hurry before someone else does!
>> > http://mail.promotions.yahoo.com/newdomains/sg/
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>
>>
>>       New Email names for you!
>> Get the Email name you've always wanted on the new @ymail and @rocketmail.
>> Hurry before someone else does!
>> http://mail.promotions.yahoo.com/newdomains/sg/
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org




      
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message