Mailing list archives: April 2007

Message list
Ken Krugler Re: Nutch encoding problem Mon, 30 Apr, 13:49
Ken Krugler Re: Nutch encoding problem Mon, 30 Apr, 19:08
Ken Krugler Re: Nutch encoding problem Mon, 30 Apr, 23:43
Lauren Massa Lochridge 0.9 ClassCastException: org.apache.hadoop.io.Text Sun, 22 Apr, 22:58
Lauren Massa Lochridge Re: 0.9 ClassCastException: org.apache.hadoop.io.Text Tue, 24 Apr, 02:42
Marcin Okraszewski Can I make a custom web searcher with Nutch? Wed, 25 Apr, 20:41
Marcin Okraszewski Can I make a custom web searcher with Nutch? Wed, 25 Apr, 20:42
Matze Crawling only Links Fri, 13 Apr, 12:26
Meryl Silverburgh Using nutch as a web crawler Wed, 04 Apr, 02:42
Meryl Silverburgh Re: Using nutch as a web crawler Thu, 05 Apr, 05:45
Meryl Silverburgh Trying to setup Nutch Fri, 06 Apr, 19:08
Meryl Silverburgh Re: Trying to setup Nutch Sat, 07 Apr, 00:54
Meryl Silverburgh Re: Trying to setup Nutch Sat, 07 Apr, 01:02
Meryl Silverburgh Re: Trying to setup Nutch Sat, 07 Apr, 01:12
Meryl Silverburgh Re: Trying to setup Nutch Sat, 07 Apr, 01:27
Meryl Silverburgh NullPointerException during Fetch Sat, 07 Apr, 02:23
Meryl Silverburgh Re: NullPointerException during Fetch Sat, 07 Apr, 16:29
Meryl Silverburgh Re: NullPointerException during Fetch Tue, 10 Apr, 03:24
Meryl Silverburgh Re: Trying to setup Nutch Wed, 11 Apr, 05:10
Meryl Silverburgh How to config nutch just crawl html links? Thu, 12 Apr, 01:48
Meryl Silverburgh How to dump all the valid links which has been crawled? Thu, 12 Apr, 03:53
Meryl Silverburgh Re: How to dump all the valid links which has been crawled? Thu, 12 Apr, 04:15
Meryl Silverburgh Re: How to config nutch just crawl html links? Fri, 13 Apr, 04:27
Meryl Silverburgh how to use craw-urlfilter.txt Fri, 13 Apr, 04:32
Meryl Silverburgh Crawl www.yahoo.com with nutch Mon, 16 Apr, 03:32
Meryl Silverburgh Re: Crawl www.yahoo.com with nutch Mon, 16 Apr, 04:07
Meryl Silverburgh Re: Crawl www.yahoo.com with nutch Mon, 16 Apr, 04:15
Meryl Silverburgh Re: Re: Crawl www.yahoo.com with nutch Mon, 16 Apr, 18:35
Meryl Silverburgh regex.RegexURLNormalizer - can't find rules for scope 'outlink', using default Tue, 17 Apr, 04:08
Meryl Silverburgh Re: Nutch Crawl Question Wed, 18 Apr, 02:12
Meryl Silverburgh Re: Nutch Crawl Question Wed, 18 Apr, 03:40
Meryl Silverburgh Re: Nutch Crawl Question Wed, 18 Apr, 04:04
Meryl Silverburgh Re: incremental crawling Wed, 18 Apr, 18:55
Meryl Silverburgh Re: incremental crawling Thu, 19 Apr, 15:04
Meryl Silverburgh Re: How to dump all the valid links which has been crawled? Fri, 20 Apr, 03:49
Michael McDougall updating crawls with Nutch 0.9 Mon, 23 Apr, 21:40
Michael Wechner Re: Using nutch as a web crawler Wed, 04 Apr, 08:32
Michael Wechner Re: Trying to setup Nutch Tue, 10 Apr, 08:06
Mike Brzozowski Nutch crawl crashing during merge with ArrayIndexOutOfBoundsException Fri, 27 Apr, 17:51
Mike Brzozowski Re: Iterate through stored pages Mon, 30 Apr, 15:46
Neal Whitley Re: Long URL's in results Sat, 14 Apr, 22:03
Nuther nutch-09 start problem Thu, 12 Apr, 06:56
Nuther nutch-0.9.release: Odd Fetcher behaviour Thu, 19 Apr, 06:29
Nuther Re: nutch-0.9.release: Odd Fetcher behaviour Thu, 19 Apr, 06:46
Nuther Nutch admin GUI for 0.9 Thu, 19 Apr, 08:08
Nuther nutch freegen bug? Thu, 26 Apr, 06:20
Nuther Problems during Merging Indexes Fri, 27 Apr, 07:06
Paul Liddelow Nutch changes 0.9.txt Fri, 06 Apr, 06:45
Paul Liddelow Re: Nutch changes 0.9.txt Fri, 06 Apr, 10:58
Paul Liddelow Long URL's in results Sat, 14 Apr, 08:01
Paul Liddelow Re: Long URL's in results Sun, 15 Apr, 07:10
Paul Liddelow Re: Long URL's in results Sun, 15 Apr, 07:12
Paul Liddelow Index compression Sun, 15 Apr, 07:28
RP Re: incremental crawling Thu, 19 Apr, 13:55
Ratnesh,V2Solutions India How to delete already stored indexed fields??? Mon, 02 Apr, 07:47
Ratnesh,V2Solutions India Can we store field as subcollection name??? Mon, 02 Apr, 10:20
Ratnesh,V2Solutions India How to prevent a page from being index during crawl or after crawl?? Mon, 02 Apr, 11:34
Ratnesh,V2Solutions India Re: How to delete already stored indexed fields??? Tue, 03 Apr, 05:04
Ratnesh,V2Solutions India Re: How to delete already stored indexed fields??? Tue, 03 Apr, 05:29
Ratnesh,V2Solutions India how to get rid of some of the fields that are indexed by default eg. content,title,url etc. Tue, 03 Apr, 13:08
Ratnesh,V2Solutions India Re: ERROR org.apache.nutch.protocol.http.Http:?java.net.SocketTimeoutException: Read timed out Wed, 04 Apr, 11:41
Ratnesh,V2Solutions India WARN mapred.LocalJobRunner - job_fajjx6 Wed, 04 Apr, 11:53
Ratnesh,V2Solutions India WARN mapred.LocalJobRunner - job_fajjx6 Wed, 04 Apr, 11:55
Ratnesh,V2Solutions India Re: ERROR org.apache.nutch.protocol.http.Http:?java.net.SocketTimeoutException: Read timed out Thu, 05 Apr, 07:18
Ratnesh,V2Solutions India Re: NullPointerException during Fetch Sat, 07 Apr, 10:02
Ratnesh,V2Solutions India Re: NullPointerException during Fetch Mon, 09 Apr, 04:36
Ratnesh,V2Solutions India Re: Running Nutch on Windows Thu, 12 Apr, 10:12
Ratnesh,V2Solutions India Re: ParseException while crawling Thu, 12 Apr, 10:14
Ratnesh,V2Solutions India Re: Have anybody thought of replacing CrawlDb with any kind of Rational DB? Thu, 12 Apr, 11:27
Ratnesh,V2Solutions India Re: How to config nutch just crawl html links? Thu, 12 Apr, 11:38
Ratnesh,V2Solutions India Re: nutch-09 start problem Thu, 12 Apr, 13:13
Ratnesh,V2Solutions India Re: How to config nutch just crawl html links? Fri, 13 Apr, 05:12
Ratnesh,V2Solutions India Can anybody tell me how the Nutch-0.9 is different than nutch-0.8.1 Fri, 20 Apr, 06:09
Ratnesh,V2Solutions India Re: having problems with search reading word docs and pdf's in 0.8.1 Fri, 20 Apr, 06:25
Ratnesh,V2Solutions India Re: How to delete already stored indexed fields??? Fri, 20 Apr, 11:46
Ratnesh,V2Solutions India Re: How to delete already stored indexed fields??? Mon, 23 Apr, 04:36
Ratnesh,V2Solutions India Can any body explain me the new features of nutch-0.9 Mon, 23 Apr, 05:49
Ravi Chintakunta Re: Query on regular expression Wed, 04 Apr, 13:52
Sami Siren Re: Crawling + Indexing staging vs. production and URL conflict Sun, 01 Apr, 19:38
Sami Siren Re: Fetcher2 too many spinWaiting, How to tune? Mon, 02 Apr, 16:29
Sami Siren Re: How to recude the tmp disk space usage during linkdb process? Wed, 11 Apr, 16:48
Sami Siren Re: Snippet size Thu, 12 Apr, 15:24
Sami Siren Re: Classpath and plugins question Thu, 19 Apr, 14:14
Sami Siren Re: Can anybody tell me how the Nutch-0.9 is different than nutch-0.8.1 Fri, 20 Apr, 14:14
Sami Siren Re: Nutch and running crawls within a container. Mon, 30 Apr, 15:35
Sean Dean Re: How to recude the tmp disk space usage during linkdb process? Wed, 11 Apr, 13:33
Sean Dean Re: How to recude the tmp disk space usage during linkdb process? Wed, 11 Apr, 15:18
Sean Dean Re: Index compression Sun, 15 Apr, 07:55
Sean Dean Re: Hardware Crashes and Garbage Collection on Nutch/Hadoop Sat, 21 Apr, 06:45
Siddharth Jonathan Re: How to delete already stored indexed fields??? Tue, 03 Apr, 05:02
Siddharth Jonathan Re: How to delete already stored indexed fields??? Tue, 03 Apr, 05:25
Somnath Banerjee Crawling fixed set of urls (newbie question) Mon, 30 Apr, 15:12
Somnath Banerjee Re: Crawling fixed set of urls (newbie question) Tue, 01 May, 06:46
Sridhar Teegala ParseException while crawling Wed, 11 Apr, 20:48
Sridhar Teegala Running Nutch on Windows Wed, 11 Apr, 20:56
Stephen Wilkinson having problems with search reading word docs and pdf's in 0.8.1 Thu, 19 Apr, 13:58
Stjepan Marjanovic Nutch - incorrect JavaScript url Wed, 04 Apr, 14:06
TCXO crystal Sun, 29 Apr, 08:18
Tomi N/A Re: Crawling + Indexing staging vs. production and URL conflict Sun, 01 Apr, 14:38
Tomi N/A Re: Index updates between machines Tue, 03 Apr, 17:42