Mailing list archives: April 2009

Grease How to ensure that a particular URL is not crawled (ever) again Thu, 16 Apr, 05:41
Felix Zimmermann How to index segments after converted from Heritrix ARC-files. Thu, 16 Apr, 20:50
Dennis Kubes   Re: How to index segments after converted from Heritrix ARC-files. Thu, 16 Apr, 21:29
Bradford Stephens Seattle / PNW Hadoop + Lucene User Group? Thu, 16 Apr, 22:27
Bradford Stephens   Re: Seattle / PNW Hadoop + Lucene User Group? Sat, 18 Apr, 00:08
Amin Mohammed-Coleman     Re: Seattle / PNW Hadoop + Lucene User Group? Sat, 18 Apr, 06:57
Matthew Hall       Re: Seattle / PNW Hadoop + Lucene User Group? Mon, 20 Apr, 14:22
Bradford Stephens         Re: Seattle / PNW Hadoop + Lucene User Group? Mon, 20 Apr, 23:28
Lauren Cooney           Re: Seattle / PNW Hadoop + Lucene User Group? Tue, 21 Apr, 01:31
Tushar Jain             Re: Seattle / PNW Hadoop + Lucene User Group? Tue, 21 Apr, 06:00
Quoi Nghia Chung     RE: Seattle / PNW Hadoop + Lucene User Group? Sat, 18 Apr, 15:14
Bradford Stephens       Re: Seattle / PNW Hadoop + Lucene User Group? Sat, 18 Apr, 18:11
Gosavi.Shyam Spell checker in nutch 0.9 Fri, 17 Apr, 08:21
Zanzico Gioele nutch search score Fri, 17 Apr, 09:35
Zanzico Gioele nutch multiple site Fri, 17 Apr, 09:37
Felix Zimmermann Odd results and broken docs when indexing converted ARC-files. Fri, 17 Apr, 12:47
Ken Krugler   Re: Odd results and broken docs when indexing converted ARC-files. Fri, 17 Apr, 23:35
Dennis Kubes     Re: Odd results and broken docs when indexing converted ARC-files. Sat, 18 Apr, 04:45
Felix Zimmermann Odd results and broken docs when indexing converted ARC-files (-> link to gif). Fri, 17 Apr, 12:54
Dennis Kubes   Re: Odd results and broken docs when indexing converted ARC-files (-> link to gif). Sat, 18 Apr, 04:58
Ilia chachkhunashvili getting WORDLIST Fri, 17 Apr, 19:35
John Whelan Nutch-based Application for Windows Sat, 18 Apr, 02:44
John Whelan   Re: Nutch-based Application for Windows Sun, 19 Apr, 00:07
Re: fetcher questions
Dennis Kubes   Re: fetcher questions Sat, 18 Apr, 04:56
ML mail Dedup not working any more (Lock obtain timed out) Sun, 19 Apr, 07:53
Raymond Balmès Query-more problem Sun, 19 Apr, 16:09
Raymond Balmès   Re: Query-more problem Sun, 19 Apr, 16:54
Raymond Balmès     Re: Query-more problem Mon, 20 Apr, 17:09
wu fuheng ebook resources - including lucene in action Mon, 20 Apr, 03:58
Grant Ingersoll   Re: ebook resources - including lucene in action Mon, 20 Apr, 16:02
Saurabh Bhutyani Re:ebook resources - including lucene in action Mon, 20 Apr, 05:58
Lukas, Ray   RE: ebook resources - including lucene in action Tue, 21 Apr, 11:49
Anshum     Re: ebook resources - including lucene in action Tue, 21 Apr, 12:03
Filipe Antunes Can't build Nutch Mon, 20 Apr, 10:00
yanky young   Re: Can't build Nutch Mon, 20 Apr, 10:11
Ken Krugler     Re: Can't build Nutch Mon, 20 Apr, 13:02
Goddard, Michael J.   Re: Can't build Nutch Mon, 20 Apr, 14:21
David M. Cole   Re: Can't build Nutch Mon, 20 Apr, 16:31
ianwong how to restrict search result in defined domains? Mon, 20 Apr, 12:56
Dmitry Lihachev   Re: how to restrict search result in defined domains? Wed, 22 Apr, 06:45
Ian.huang     Re: how to restrict search result in defined domains? Thu, 23 Apr, 08:50
Dennis Kubes       Re: how to restrict search result in defined domains? Thu, 23 Apr, 13:02
Re: Multiple "site:" in query
ianwong   Re: Multiple "site:" in query Mon, 20 Apr, 13:22
Ilia chachkhunashvili way to get list of indexed URLS and list of words Mon, 20 Apr, 14:25
Jason Todd Slack-Moehrle Nutch Crawling Questions Mon, 20 Apr, 23:10
Ken Krugler   Re: Nutch Crawling Questions Tue, 21 Apr, 00:46
David M. Cole   Re: Nutch Crawling Questions Tue, 21 Apr, 01:05
Alexander Aristov running two crawlers at the same time Tue, 21 Apr, 12:21
Alex Basa   Re: running two crawlers at the same time Tue, 21 Apr, 14:04
Dennis Kubes   Re: running two crawlers at the same time Tue, 21 Apr, 14:20
Jaime Martín nutch 1.0 Tue, 21 Apr, 21:45
David M. Cole   Re: nutch 1.0 Tue, 21 Apr, 22:25
Raymond Balmès   Re: nutch 1.0 Wed, 22 Apr, 08:38
askNutch hi Kubes:the question about develop environment! Wed, 22 Apr, 05:41
Alexander Aristov   Re: hi Kubes:the question about develop environment! Wed, 22 Apr, 06:12
Dennis Kubes     Re: hi Kubes:the question about develop environment! Wed, 22 Apr, 14:04
Alexander Aristov       Re: hi Kubes:the question about develop environment! Wed, 22 Apr, 17:50
Lukas, Ray         Hadoop thread seems to remain alive Wed, 22 Apr, 20:30
Lukas, Ray           RE: Hadoop thread seems to remain alive Thu, 23 Apr, 11:32
Raymond Balmès             Re: Hadoop thread seems to remain alive Thu, 23 Apr, 12:22
Dennis Kubes               Re: Hadoop thread seems to remain alive Thu, 23 Apr, 12:55
Lukas, Ray               RE: Hadoop thread seems to remain alive Thu, 23 Apr, 13:20
Andrzej Bialecki                 Re: Hadoop thread seems to remain alive Thu, 23 Apr, 14:35
Lukas, Ray                   RE: Hadoop thread seems to remain alive Thu, 23 Apr, 14:47
Lukas, Ray                 RE: Hadoop thread seems to remain alive Thu, 23 Apr, 14:42
Raymond Balmès                   Re: Hadoop thread seems to remain alive Fri, 24 Apr, 06:51
Lukas, Ray                     RE: Hadoop thread seems to remain alive Fri, 24 Apr, 11:54
Lukas, Ray                     RE: Hadoop thread seems to remain alive Fri, 24 Apr, 12:03
Raymond Balmès                       Re: Hadoop thread seems to remain alive Sat, 25 Apr, 09:27
Lukas, Ray                         RE: Hadoop thread seems to remain alive Sat, 25 Apr, 21:53
Dennis Kubes   Re: hi Kubes:the question about develop environment! Wed, 22 Apr, 14:04
askNutch     Re: hi Kubes:the question about develop environment! Thu, 23 Apr, 06:39
Dennis Kubes       Re: hi Kubes:the question about develop environment! Thu, 23 Apr, 12:59
Susam Pal       Re: hi Kubes:the question about develop environment! Thu, 23 Apr, 13:10
Re: AW: Nutch Training Seminar
brainstorm   Re: AW: Nutch Training Seminar Wed, 22 Apr, 10:01
askNutch run nutch on eclipse problem? Thu, 23 Apr, 06:24
Raymond Balmès   Re: run nutch on eclipse problem? Thu, 23 Apr, 08:18
askNutch     Re: run nutch on eclipse problem? Thu, 23 Apr, 09:48
Alejandro Gonzalez       Re: run nutch on eclipse problem? Thu, 23 Apr, 10:09
Sherjeel Niazi How to resume crawler after crash Thu, 23 Apr, 15:02
Lukas, Ray   Using nutchBean Thu, 23 Apr, 20:36
Lukas, Ray     RE: Using nutchBean Thu, 23 Apr, 21:06
Andrzej Bialecki       Re: Using nutchBean Thu, 23 Apr, 21:32
Lukas, Ray         RE: Using nutchBean Thu, 23 Apr, 21:45
Lukas, Ray           RE: Using nutchBean Thu, 23 Apr, 22:26
Dennis Kubes   Re: How to resume crawler after crash Fri, 24 Apr, 04:08
MyD URL Scoring Fri, 24 Apr, 08:14
Dennis Kubes   Re: URL Scoring Fri, 24 Apr, 12:42
sgirao How to get the html that i crawled Mon, 27 Apr, 11:28
Raymond Balmès   Re: How to get the html that i crawled Mon, 27 Apr, 21:11
sgirao     Re: How to get the html that i crawled Tue, 28 Apr, 07:36
Dennis Kubes       Re: How to get the html that i crawled Thu, 30 Apr, 13:46
fa...@butterflycluster.net   Re: How to get the html that i crawled Tue, 28 Apr, 07:40
jqq Searching multiple indexes with Nutch-2 servers,0 segments Mon, 27 Apr, 12:58
kazam Nutch fetch creates too many http sessions Mon, 27 Apr, 16:25
Dennis Kubes   Re: Nutch fetch creates too many http sessions Mon, 27 Apr, 22:28
kazam     Re: Nutch fetch creates too many http sessions Tue, 28 Apr, 22:09
Joel Halbert Unable to register IndexingFilter extesion plugin - N 0.9 Mon, 27 Apr, 17:40
Raymond Balmès   Re: Unable to register IndexingFilter extesion plugin - N 0.9 Mon, 27 Apr, 20:58
Joel Halbert     Re: Unable to register IndexingFilter extesion plugin - N 0.9 Tue, 28 Apr, 09:25
Mayank Kamthan Problem in generating the war file Mon, 27 Apr, 18:47
Raymond Balmès   Re: Problem in generating the war file Mon, 27 Apr, 21:03
Mayank Kamthan     Re: Problem in generating the war file Mon, 27 Apr, 21:38
Raymond Balmès       Re: Problem in generating the war file Mon, 27 Apr, 22:08
Raymond Balmès dual core and crawling Mon, 27 Apr, 21:17
Dennis Kubes   Re: dual core and crawling Mon, 27 Apr, 22:24
Raymond Balmès     Re: dual core and crawling Tue, 28 Apr, 07:24
Dennis Kubes       Re: dual core and crawling Tue, 28 Apr, 15:37
Raymond Balmès         Re: dual core and crawling Tue, 28 Apr, 15:54
Alex Basa           Re: dual core and crawling Tue, 28 Apr, 16:00
Raymond Balmès             Re: dual core and crawling Tue, 28 Apr, 16:44
Raymond Balmès               Re: dual core and crawling Tue, 28 Apr, 21:57
Dennis Kubes                 Re: dual core and crawling Wed, 29 Apr, 03:00
Raymond Balmès                   Re: dual core and crawling Wed, 29 Apr, 11:33
Mayank Kamthan Adding a new class in Nutch and using it in a JSP Mon, 27 Apr, 21:46
zxh116116 in nutch1.0 incread summary problem Tue, 28 Apr, 14:18
N 0.9 - fetcher.threads.per.host
Joel Halbert   N 0.9 - fetcher.threads.per.host Tue, 28 Apr, 16:34
Joel Halbert   N 0.9 - fetcher.threads.per.host Tue, 28 Apr, 16:42
Joel Halbert     Re: N 0.9 - fetcher.threads.per.host Tue, 28 Apr, 17:15
Joel Halbert Possible bug in when fetching page relative links after redirects - N 1.0. Wed, 29 Apr, 09:07
Joel Halbert Possible bug in when fetching relative links after a redirect - N 1.0 Wed, 29 Apr, 09:27
Andrzej Bialecki   Re: Possible bug in when fetching relative links after a redirect - N 1.0 Wed, 29 Apr, 10:15
v...@free.fr Is it possible to avoid Nutch 1.0 from indexing local directories ? Thu, 30 Apr, 09:14
Dennis Kubes   Re: Is it possible to avoid Nutch 1.0 from indexing local directories ? Thu, 30 Apr, 13:42
v...@free.fr   Re: Is it possible to avoid Nutch 1.0 from indexing local directories ? Thu, 30 Apr, 14:56
Rahil Baig General queries Thu, 30 Apr, 15:06
Box list
Dec 2009: 81
Nov 2009: 308
Oct 2009: 258
Sep 2009: 184
Aug 2009: 199
Jul 2009: 312
Jun 2009: 196
May 2009: 163
Apr 2009: 247
Mar 2009: 408
Feb 2009: 214
Jan 2009: 204
Dec 2008: 229
Nov 2008: 193
Oct 2008: 171
Sep 2008: 269
Aug 2008: 165
Jul 2008: 122
Jun 2008: 243
May 2008: 220
Apr 2008: 294
Mar 2008: 209
Feb 2008: 191
Jan 2008: 272
Dec 2007: 145
Nov 2007: 228
Oct 2007: 261
Sep 2007: 273
Aug 2007: 292
Jul 2007: 339
Jun 2007: 392
May 2007: 242
Apr 2007: 309
Mar 2007: 283
Feb 2007: 188
Jan 2007: 370
Dec 2006: 225
Nov 2006: 160
Oct 2006: 251
Sep 2006: 412
Aug 2006: 450
Jul 2006: 315
Jun 2006: 380
May 2006: 232
Apr 2006: 458
Mar 2006: 659
Feb 2006: 581
Jan 2006: 592
Dec 2005: 430
Nov 2005: 398
Oct 2005: 304
Sep 2005: 404
Aug 2005: 278
Jul 2005: 342
Jun 2005: 216
May 2005: 151
Apr 2005: 220
Mar 2005: 167