Mailing list archives: April 2007

Message list« Previous · 1 · 2 · 3 · 4 · Next »Thread · Author · Date
Meryl Silverburgh Re: incremental crawling Thu, 19 Apr, 15:04
Briggs Nutch and Crawl Frequency Thu, 19 Apr, 19:02
Gal Nitzan RE: Nutch and Crawl Frequency Thu, 19 Apr, 20:26
Briggs Re: Nutch and Crawl Frequency Thu, 19 Apr, 20:47
Briggs Re: Forcing update of some URLs Thu, 19 Apr, 21:55
Briggs Re: How to dump all the valid links which has been crawled? Thu, 19 Apr, 21:57
Tomi N/A Re: Fetching outside the domain ? Thu, 19 Apr, 23:03
Tomi N/A Re: Nutch and Crawl Frequency Thu, 19 Apr, 23:16
Antony Bowesman Re: Classpath and plugins question Fri, 20 Apr, 01:43
Antony Bowesman Office 2007 + XML parser Fri, 20 Apr, 02:08
David Xiao Re: Office 2007 + XML parser Fri, 20 Apr, 03:04
Antony Bowesman Re: Office 2007 + XML parser Fri, 20 Apr, 03:29
Meryl Silverburgh Re: How to dump all the valid links which has been crawled? Fri, 20 Apr, 03:49
Ratnesh,V2Solutions India Can anybody tell me how the Nutch-0.9 is different than nutch-0.8.1 Fri, 20 Apr, 06:09
Ratnesh,V2Solutions India Re: having problems with search reading word docs and pdf's in 0.8.1 Fri, 20 Apr, 06:25
Andrzej Bialecki Re: Fetching outside the domain ? Fri, 20 Apr, 06:41
franklinb4u Re: How to delete already stored indexed fields??? Fri, 20 Apr, 11:39
Ratnesh,V2Solutions India Re: How to delete already stored indexed fields??? Fri, 20 Apr, 11:46
franklinb4u Re: How to delete already stored indexed fields??? Fri, 20 Apr, 13:38
Sami Siren Re: Can anybody tell me how the Nutch-0.9 is different than nutch-0.8.1 Fri, 20 Apr, 14:14
Briggs Re: How to delete already stored indexed fields??? Fri, 20 Apr, 15:17
Briggs Re: How to dump all the valid links which has been crawled? Fri, 20 Apr, 15:26
derevo Plugin to index categories by url rules Fri, 20 Apr, 23:16
Dennis Kubes Hardware Crashes and Garbage Collection on Nutch/Hadoop Sat, 21 Apr, 00:50
derevo Re: Plugin to index categories by url rules Sat, 21 Apr, 01:43
Sean Dean Re: Hardware Crashes and Garbage Collection on Nutch/Hadoop Sat, 21 Apr, 06:45
franklinb4u Re: How to delete already stored indexed fields??? Sat, 21 Apr, 09:49
Andrzej Bialecki Re: Hardware Crashes and Garbage Collection on Nutch/Hadoop Sat, 21 Apr, 10:20
Dennis Kubes Re: Hardware Crashes and Garbage Collection on Nutch/Hadoop Sat, 21 Apr, 14:06
derevo Re: Plugin to index categories by url rules Sat, 21 Apr, 17:08
Chee Wu Re: Any way for removing pages with same title in index? Sun, 22 Apr, 10:12
Lauren Massa Lochridge 0.9 ClassCastException: org.apache.hadoop.io.Text Sun, 22 Apr, 22:58
Ken Krugler Re: 0.9 ClassCastException: org.apache.hadoop.io.Text Mon, 23 Apr, 02:21
Ratnesh,V2Solutions India Re: How to delete already stored indexed fields??? Mon, 23 Apr, 04:36
Ratnesh,V2Solutions India Can any body explain me the new features of nutch-0.9 Mon, 23 Apr, 05:49
openxu Why Nutch returns 0 results? Mon, 23 Apr, 06:06
qi wu Re: Can any body explain me the new features of nutch-0.9 Mon, 23 Apr, 06:12
Dennis Kubes Re: Why Nutch returns 0 results? Mon, 23 Apr, 07:07
openxu Re: Why Nutch returns 0 results? Mon, 23 Apr, 07:23
openxu Re: Why Nutch returns 0 results? Mon, 23 Apr, 12:23
Trond Andersen Optional terms Mon, 23 Apr, 13:40
Ben Szekely strange URL filter behavior Mon, 23 Apr, 16:04
Michael McDougall updating crawls with Nutch 0.9 Mon, 23 Apr, 21:40
Lauren Massa Lochridge Re: 0.9 ClassCastException: org.apache.hadoop.io.Text Tue, 24 Apr, 02:42
franklinb4u Re: Compile Nutch Tue, 24 Apr, 06:00
Antony Bowesman ExcelExtractor performance Tue, 24 Apr, 09:22
ekoje ekoje Query pdf, etc.. Tue, 24 Apr, 13:01
ekoje ekoje Index Tue, 24 Apr, 13:06
Lourival Júnior Re: Query pdf, etc.. Tue, 24 Apr, 13:07
Briggs Re: Index Tue, 24 Apr, 14:05
ekoje ekoje Re: Index Tue, 24 Apr, 16:15
ekoje ekoje Re: Query pdf, etc.. Tue, 24 Apr, 16:18
Briggs Re: Index Tue, 24 Apr, 16:46
Lourival Júnior Re: Query pdf, etc.. Tue, 24 Apr, 17:00
Annona Keene Nutch 0.9 recrawl Tue, 24 Apr, 21:57
John Kleven Using nutch just for the crawler/fetcher Wed, 25 Apr, 04:57
derevo Re: Plugin to index categories by url rules Wed, 25 Apr, 07:50
Doğacan Güney Re: Plugin to index categories by url rules Wed, 25 Apr, 07:54
Abdelhakim Diab search in more than one index. Wed, 25 Apr, 09:51
Abdelhakim Diab search in more than one index. Wed, 25 Apr, 12:53
Abdelhakim Diab search in more than one index. Wed, 25 Apr, 12:54
Briggs Re: Using nutch just for the crawler/fetcher Wed, 25 Apr, 14:19
John Kleven Re: Using nutch just for the crawler/fetcher Wed, 25 Apr, 17:45
karthik085 nutch-site.xml score Wed, 25 Apr, 17:55
karthik085 nutch-0.9 plugins Wed, 25 Apr, 18:43
Marcin Okraszewski Can I make a custom web searcher with Nutch? Wed, 25 Apr, 20:41
Marcin Okraszewski Can I make a custom web searcher with Nutch? Wed, 25 Apr, 20:42
Antony Bowesman Outlinks during parsing Wed, 25 Apr, 23:03
karthik085 nutch search results problem Thu, 26 Apr, 01:01
karthik085 Re: Why Nutch returns 0 results? Thu, 26 Apr, 01:24
Nuther nutch freegen bug? Thu, 26 Apr, 06:20
John Kleven Re: Using nutch just for the crawler/fetcher Thu, 26 Apr, 06:42
Arun Kaundal Re: Nutch 0.9 recrawl Thu, 26 Apr, 10:28
Ilya Vishnevsky Adding documents to already created distributed index Thu, 26 Apr, 12:03
Ilya Vishnevsky How to reIndex after reCrawl? Thu, 26 Apr, 15:08
karthik085 Case Sensitive Thu, 26 Apr, 23:07
Briggs Re: Case Sensitive Fri, 27 Apr, 00:15
John Kleven Re: Using nutch just for the crawler/fetcher Fri, 27 Apr, 00:37
qi wu Re: Case Sensitive Fri, 27 Apr, 00:51
Nuther Problems during Merging Indexes Fri, 27 Apr, 07:06
franklinb4u Re: [Nutch-general] Removing pages from index immediately Fri, 27 Apr, 12:34
karthik085 Re: Case Sensitive Fri, 27 Apr, 13:10
Briggs Re: [Nutch-general] Removing pages from index immediately Fri, 27 Apr, 16:16
Briggs Re: [Nutch-general] Removing pages from index immediately Fri, 27 Apr, 16:18
Briggs Re: [Nutch-general] Removing pages from index immediately Fri, 27 Apr, 16:24
songjue Re: Problems during Merging Indexes Fri, 27 Apr, 17:49
Mike Brzozowski Nutch crawl crashing during merge with ArrayIndexOutOfBoundsException Fri, 27 Apr, 17:51
karthik085 Ignore Robots meta tag Fri, 27 Apr, 18:47
karthik085 Re: Ignore Robots meta tag Fri, 27 Apr, 19:35
c wanek query filter ordering Fri, 27 Apr, 22:34
TCXO crystal Sun, 29 Apr, 08:18
James liu Question: Crawl web page and parse Mon, 30 Apr, 02:15
Zsolt Horváth Nutch encoding problem Mon, 30 Apr, 07:29
Ken Krugler Re: Nutch encoding problem Mon, 30 Apr, 13:49
Anton Beza Iterate through stored pages Mon, 30 Apr, 14:07
Briggs Nutch and running crawls within a container. Mon, 30 Apr, 14:45
Somnath Banerjee Crawling fixed set of urls (newbie question) Mon, 30 Apr, 15:12
Sami Siren Re: Nutch and running crawls within a container. Mon, 30 Apr, 15:35
Briggs Re: Nutch and running crawls within a container. Mon, 30 Apr, 15:46
Mike Brzozowski Re: Iterate through stored pages Mon, 30 Apr, 15:46