Mailing list archives: April 2009

dealmaker Re: Hadoop java.io.IOException: Job failed! at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232) while indexing. Fri, 10 Apr, 05:53
John Whelan Sizing Guide? Sat, 11 Apr, 21:46
dealmaker How come getContent returns HTML Entities? Sun, 12 Apr, 05:05
Fadzi Ushewokunze fetcher issues Mon, 13 Apr, 02:52
yanky young Re: fetcher issues Mon, 13 Apr, 03:17
Fadzi Ushewokunze Re: fetcher issues Mon, 13 Apr, 03:33
Dennis Kubes Re: fetcher issues Mon, 13 Apr, 03:44
yanky young Re: fetcher issues Mon, 13 Apr, 03:52
Fadzi Ushewokunze Re: fetcher issues Mon, 13 Apr, 04:23
yanky young Re: fetcher issues Mon, 13 Apr, 04:47
Kunal Wku Multi-Lingual Support in Nutch Mon, 13 Apr, 15:30
Niraj Aswani Null pointer exception Tue, 14 Apr, 14:18
Niraj Aswani null-pointer exception Tue, 14 Apr, 14:18
wku_kunal Re: Language Identifier plugin Tue, 14 Apr, 15:17
dealmaker How does Nutch Fetch Files in Relative Path? Tue, 14 Apr, 20:35
Raymond Balmès Problems with custom field query Wed, 15 Apr, 14:47
Julien Nioche Re: Problems with custom field query Wed, 15 Apr, 15:57
Raymond Balmès Re: Problems with custom field query Wed, 15 Apr, 16:38
Grease How to ensure that a particular URL is not crawled (ever) again Thu, 16 Apr, 05:41
Felix Zimmermann How to index segments after converted from Heritrix ARC-files. Thu, 16 Apr, 20:50
Dennis Kubes Re: How to index segments after converted from Heritrix ARC-files. Thu, 16 Apr, 21:29
Bradford Stephens Seattle / PNW Hadoop + Lucene User Group? Thu, 16 Apr, 22:27
fishg Re: Hadoop java.io.IOException: Job failed! at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232) while indexing. Fri, 17 Apr, 03:24
Gosavi.Shyam Spell checker in nutch 0.9 Fri, 17 Apr, 08:21
Zanzico Gioele nutch search score Fri, 17 Apr, 09:35
Zanzico Gioele nutch multiple site Fri, 17 Apr, 09:37
Felix Zimmermann Odd results and broken docs when indexing converted ARC-files. Fri, 17 Apr, 12:47
Felix Zimmermann Odd results and broken docs when indexing converted ARC-files (-> link to gif). Fri, 17 Apr, 12:54
Ilia chachkhunashvili getting WORDLIST Fri, 17 Apr, 19:35
Ken Krugler Re: Odd results and broken docs when indexing converted ARC-files. Fri, 17 Apr, 23:35
Bradford Stephens Re: Seattle / PNW Hadoop + Lucene User Group? Sat, 18 Apr, 00:08
John Whelan Nutch-based Application for Windows Sat, 18 Apr, 02:44
Dennis Kubes Re: Odd results and broken docs when indexing converted ARC-files. Sat, 18 Apr, 04:45
Dennis Kubes Re: fetcher questions Sat, 18 Apr, 04:56
Dennis Kubes Re: Odd results and broken docs when indexing converted ARC-files (-> link to gif). Sat, 18 Apr, 04:58
Amin Mohammed-Coleman Re: Seattle / PNW Hadoop + Lucene User Group? Sat, 18 Apr, 06:57
Quoi Nghia Chung RE: Seattle / PNW Hadoop + Lucene User Group? Sat, 18 Apr, 15:14
Raymond Balmès Re: Problems with custom field query Sat, 18 Apr, 15:58
Bradford Stephens Re: Seattle / PNW Hadoop + Lucene User Group? Sat, 18 Apr, 18:11
John Whelan Re: Nutch-based Application for Windows Sun, 19 Apr, 00:07
ML mail Dedup not working any more (Lock obtain timed out) Sun, 19 Apr, 07:53
Raymond Balmès Query-more problem Sun, 19 Apr, 16:09
Raymond Balmès Re: Query-more problem Sun, 19 Apr, 16:54
wu fuheng ebook resources - including lucene in action Mon, 20 Apr, 03:58
Saurabh Bhutyani Re: ebook resources - including lucene in action Mon, 20 Apr, 05:58
Filipe Antunes Can't build Nutch Mon, 20 Apr, 10:00
yanky young Re: Can't build Nutch Mon, 20 Apr, 10:11
ianwong how to restrict search result in defined domains? Mon, 20 Apr, 12:56
Ken Krugler Re: Can't build Nutch Mon, 20 Apr, 13:02
ianwong Re: Multiple "site:" in query Mon, 20 Apr, 13:22
Goddard, Michael J. Re: Can't build Nutch Mon, 20 Apr, 14:21
Matthew Hall Re: Seattle / PNW Hadoop + Lucene User Group? Mon, 20 Apr, 14:22
Ilia chachkhunashvili way to get list of indexed URLS and list of words Mon, 20 Apr, 14:25
Grant Ingersoll Re: ebook resources - including lucene in action Mon, 20 Apr, 16:02
David M. Cole Re: Can't build Nutch Mon, 20 Apr, 16:31
Raymond Balmès Re: Query-more problem Mon, 20 Apr, 17:09
Raymond Balmès Re: Problems with custom field query Mon, 20 Apr, 17:16
Jason Todd Slack-Moehrle Nutch Crawling Questions Mon, 20 Apr, 23:10
Bradford Stephens Re: Seattle / PNW Hadoop + Lucene User Group? Mon, 20 Apr, 23:28
Ken Krugler Re: Nutch Crawling Questions Tue, 21 Apr, 00:46
David M. Cole Re: Nutch Crawling Questions Tue, 21 Apr, 01:05
Lauren Cooney Re: Seattle / PNW Hadoop + Lucene User Group? Tue, 21 Apr, 01:31
Tushar Jain Re: Seattle / PNW Hadoop + Lucene User Group? Tue, 21 Apr, 06:00
Lukas, Ray RE: ebook resources - including lucene in action Tue, 21 Apr, 11:49
Anshum Re: ebook resources - including lucene in action Tue, 21 Apr, 12:03
Alexander Aristov running two crawlers at the same time Tue, 21 Apr, 12:21
Alex Basa Re: running two crawlers at the same time Tue, 21 Apr, 14:04
Dennis Kubes Re: running two crawlers at the same time Tue, 21 Apr, 14:20
Jaime Martín nutch 1.0 Tue, 21 Apr, 21:45
David M. Cole Re: nutch 1.0 Tue, 21 Apr, 22:25
askNutch hi Kubes:the question about develop environment! Wed, 22 Apr, 05:41
Alexander Aristov Re: hi Kubes:the question about develop environment! Wed, 22 Apr, 06:12
Dmitry Lihachev Re: how to restrict search result in defined domains? Wed, 22 Apr, 06:45
Raymond Balmès Re: nutch 1.0 Wed, 22 Apr, 08:38
brainstorm Re: AW: Nutch Training Seminar Wed, 22 Apr, 10:01
Dennis Kubes Re: hi Kubes:the question about develop environment! Wed, 22 Apr, 14:04
Dennis Kubes Re: hi Kubes:the question about develop environment! Wed, 22 Apr, 14:04
Alexander Aristov Re: hi Kubes:the question about develop environment! Wed, 22 Apr, 17:50
Lukas, Ray Hadoop thread seems to remain alive Wed, 22 Apr, 20:30
askNutch run nutch on eclipse problem? Thu, 23 Apr, 06:24
askNutch Re: hi Kubes:the question about develop environment! Thu, 23 Apr, 06:39
Raymond Balmès Re: run nutch on eclipse problem? Thu, 23 Apr, 08:18
Ian.huang Re: how to restrict search result in defined domains? Thu, 23 Apr, 08:50
askNutch Re: run nutch on eclipse problem? Thu, 23 Apr, 09:48
Alejandro Gonzalez Re: run nutch on eclipse problem? Thu, 23 Apr, 10:09
Lukas, Ray RE: Hadoop thread seems to remain alive Thu, 23 Apr, 11:32
Raymond Balmès Re: Hadoop thread seems to remain alive Thu, 23 Apr, 12:22
Dennis Kubes Re: Hadoop thread seems to remain alive Thu, 23 Apr, 12:55
Dennis Kubes Re: hi Kubes:the question about develop environment! Thu, 23 Apr, 12:59
Dennis Kubes Re: how to restrict search result in defined domains? Thu, 23 Apr, 13:02
Susam Pal Re: hi Kubes:the question about develop environment! Thu, 23 Apr, 13:10
Lukas, Ray RE: Hadoop thread seems to remain alive Thu, 23 Apr, 13:20
Andrzej Bialecki Re: Hadoop thread seems to remain alive Thu, 23 Apr, 14:35
Lukas, Ray RE: Hadoop thread seems to remain alive Thu, 23 Apr, 14:42
Lukas, Ray RE: Hadoop thread seems to remain alive Thu, 23 Apr, 14:47
Sherjeel Niazi How to resume crawler after crash Thu, 23 Apr, 15:02
Lukas, Ray Using nutchBean Thu, 23 Apr, 20:36
Lukas, Ray RE: Using nutchBean Thu, 23 Apr, 21:06
Andrzej Bialecki Re: Using nutchBean Thu, 23 Apr, 21:32
Lukas, Ray RE: Using nutchBean Thu, 23 Apr, 21:45