Mailing list archives: January 2007

Shrinivas Patwardhan DFS with nutch- 0.72 Fri, 12 Jan, 05:22
kauu   Re: DFS with nutch- 0.72 Fri, 12 Jan, 05:33
yl...@ifrance.com problems to exclude subdirectories in a web site Fri, 12 Jan, 14:16
Alvaro Cabrerizo   Re: problems to exclude subdirectories in a web site Tue, 16 Jan, 15:54
yl...@ifrance.com   Re: Re: problems to exclude subdirectories in a web site Fri, 19 Jan, 14:05
yl...@ifrance.com BUG with error: failure closing block of file with Hadoop 0.9.2 and Nutch 0.8.1 Fri, 12 Jan, 14:26
Andrzej Bialecki   Re: BUG with error: failure closing block of file with Hadoop 0.9.2 and Nutch 0.8.1 Tue, 16 Jan, 11:07
Steve Kallestad Nutch Crawler (.81) picking up strange links Fri, 12 Jan, 20:20
Dennis Kubes   Re: Nutch Crawler (.81) picking up strange links Fri, 12 Jan, 21:44
karthik085 Nutch support for frames Fri, 12 Jan, 21:03
Shrinivas Patwardhan alternative for dmoz rdf ? Sat, 13 Jan, 06:30
Sean Dean   Re: alternative for dmoz rdf ? Sat, 13 Jan, 07:22
Shrinivas Patwardhan     Re: alternative for dmoz rdf ? Sat, 13 Jan, 07:26
Iain     RE: alternative for dmoz rdf ? Sat, 13 Jan, 16:05
Sean Dean   Re: alternative for dmoz rdf ? Sat, 13 Jan, 16:17
Insurance Squared Inc.     Re: alternative for dmoz rdf ? Sat, 13 Jan, 18:45
Iain       RE: alternative for dmoz rdf ? Mon, 15 Jan, 10:07
Sean Dean   Re: alternative for dmoz rdf ? Mon, 15 Jan, 11:27
Iain     RE: alternative for dmoz rdf ? Mon, 15 Jan, 13:23
Shrinivas Patwardhan nutch server Sat, 13 Jan, 09:54
Alexey V. Labunko   Re: nutch server Tue, 16 Jan, 08:22
Mathijs Homminga Redirect source remains unfetched Sat, 13 Jan, 13:34
Eelco Lempsink   Re: Redirect source remains unfetched Sat, 13 Jan, 15:07
Mathijs Homminga     Re: Redirect source remains unfetched Sun, 14 Jan, 14:54
Eelco Lempsink       Re: Redirect source remains unfetched Sun, 14 Jan, 18:53
chee wu Crawling but no indexing.. Sat, 13 Jan, 16:21
visava crawling url list Sun, 14 Jan, 04:49
kauu   Re: crawling url list Sun, 14 Jan, 12:25
visava     Re: crawling url list Sun, 14 Jan, 19:57
kauu       Re: crawling url list Mon, 15 Jan, 01:25
kauu         Re: crawling url list Mon, 15 Jan, 01:27
Shrinivas Patwardhan   Re: crawling url list Mon, 15 Jan, 04:25
visava     Re: crawling url list Mon, 15 Jan, 21:53
kauu       Re: crawling url list Tue, 16 Jan, 08:56
Gal Nitzan Where have all the flowers gone... err... the logs :) Mon, 15 Jan, 08:58
Lukas Vlcek   Re: Where have all the flowers gone... err... the logs :) Mon, 15 Jan, 14:56
termo...@gmail.com Problem finding out the number of crawled pages per domain Mon, 15 Jan, 13:38
kauu   Re: Problem finding out the number of crawled pages per domain Tue, 16 Jan, 09:01
Lukas Vlcek   Re: Problem finding out the number of crawled pages per domain Wed, 17 Jan, 15:30
Alvaro Cabrerizo Problems stressing "./bin/nutch server" command Mon, 15 Jan, 17:24
Brian Whitman checksum error in segment merger Mon, 15 Jan, 17:30
Andrzej Bialecki   Re: checksum error in segment merger Mon, 15 Jan, 18:36
Brian Whitman     Re: checksum error in segment merger Mon, 15 Jan, 18:38
Andrzej Bialecki       Re: checksum error in segment merger Mon, 15 Jan, 18:45
Brian Whitman         Re: checksum error in segment merger Mon, 15 Jan, 19:05
Andrzej Bialecki           Re: checksum error in segment merger Mon, 15 Jan, 19:41
Brian Whitman           Re: checksum error in segment merger Tue, 16 Jan, 16:41
Andrzej Bialecki             Re: checksum error in segment merger Tue, 16 Jan, 17:00
bb...@mail.ru not indexing Mon, 15 Jan, 17:36
Renaud Richardet   Re: not indexing Mon, 15 Jan, 21:22
bb...@mail.ru   Re: not indexing Tue, 16 Jan, 09:01
srinath Issue While Creating Inverted Links Tue, 16 Jan, 06:18
Andrzej Bialecki   Re: Issue While Creating Inverted Links Tue, 16 Jan, 11:02
Libor Štefek Searcher doesn't find what expected Tue, 16 Jan, 06:25
kauu   Re: Searcher doesn't find what expected Tue, 16 Jan, 08:51
Alvaro Cabrerizo     Re: Searcher doesn't find what expected Wed, 17 Jan, 12:25
Libor Štefek       Re: Searcher doesn't find what expected Mon, 22 Jan, 11:33
cesar voulgaris DB_unfetched status Wed, 17 Jan, 04:57
Sean Dean   Re: DB_unfetched status Wed, 17 Jan, 07:02
cesar voulgaris     Re: DB_unfetched status Thu, 18 Jan, 01:02
Andrzej Bialecki       Re: DB_unfetched status Thu, 18 Jan, 08:09
Shailendra Mudgal NameNode throws FileNotFoundException: Parent path does not exist on startup Wed, 17 Jan, 08:26
Sean Dean   Re: NameNode throws FileNotFoundException: Parent path does not exist on startup Wed, 17 Jan, 08:37
Shailendra Mudgal     Re: NameNode throws FileNotFoundException: Parent path does not exist on startup Wed, 17 Jan, 08:48
Shailendra Mudgal       Re: NameNode throws FileNotFoundException: Parent path does not exist on startup Wed, 17 Jan, 11:37
Albert Chern         Re: NameNode throws FileNotFoundException: Parent path does not exist on startup Wed, 17 Jan, 17:15
yo_keller search or Tomcat ill response Wed, 17 Jan, 08:44
Sean Dean   Re: search or Tomcat ill response Wed, 17 Jan, 09:00
yo_keller     Re: search or Tomcat ill response Wed, 17 Jan, 14:28
Shailendra Mudgal How to recover data from filesystem Wed, 17 Jan, 10:28
Andrzej Bialecki   Re: How to recover data from filesystem Wed, 17 Jan, 11:22
Brian Whitman out of memory error at end of indexing Wed, 17 Jan, 16:57
Brian Whitman   Re: out of memory error at end of indexing Wed, 17 Jan, 18:23
Shailendra Mudgal How to stop a slow fetch? Thu, 18 Jan, 05:26
Sean Dean   Re: How to stop a slow fetch? Thu, 18 Jan, 06:46
Shailendra Mudgal     Re: How to stop a slow fetch? Thu, 18 Jan, 06:54
Sean Dean   Re: How to stop a slow fetch? Thu, 18 Jan, 07:07
Sami Siren     Re: How to stop a slow fetch? Thu, 18 Jan, 20:16
termo...@gmail.com Nutch 0.8 cannot find all the links on a page Thu, 18 Jan, 08:30
Andrzej Bialecki   Re: Nutch 0.8 cannot find all the links on a page Thu, 18 Jan, 13:44
Vlador     Re: Nutch 0.8 cannot find all the links on a page Fri, 19 Jan, 09:12
Reduce segment size
Ledio Ago   Reduce segment size Fri, 19 Jan, 01:57
Sean Dean     Re: Reduce segment size Fri, 19 Jan, 07:04
Ledio Ago       RE: Reduce segment size Fri, 19 Jan, 17:56
Ledio Ago         RE: Reduce segment size Fri, 19 Jan, 18:36
Sean Dean     Re: Reduce segment size Fri, 19 Jan, 19:19
Ledio Ago       RE: Reduce segment size Fri, 19 Jan, 19:34
Sean Dean     Re: Reduce segment size Fri, 19 Jan, 20:00
Ledio Ago   Reduce segment size Fri, 19 Jan, 17:53
Andrzej Bialecki     Re: Reduce segment size Fri, 19 Jan, 20:22
Ledio Ago       RE: Reduce segment size Fri, 19 Jan, 21:35
Gal Nitzan notch 0.9 + hadoop 0.10.1 problem Fri, 19 Jan, 09:44
Sean Dean   Re: notch 0.9 + hadoop 0.10.1 problem Fri, 19 Jan, 10:03
Gal Nitzan java.lang.OutOfMemoryError - trunk Fri, 19 Jan, 15:57
Sean Dean   Re: java.lang.OutOfMemoryError - trunk Fri, 19 Jan, 18:24
Gal Nitzan   RE: java.lang.OutOfMemoryError - trunk Fri, 19 Jan, 18:38
Espen Amble Kolstad     Re: java.lang.OutOfMemoryError - trunk Sat, 20 Jan, 12:04
Gal Nitzan   RE: java.lang.OutOfMemoryError - trunk Fri, 19 Jan, 18:41
DS jha how to use PorterStemFilter with NutchDocumentAnalyzer Fri, 19 Jan, 17:14
Alvaro Cabrerizo   Re: how to use PorterStemFilter with NutchDocumentAnalyzer Tue, 23 Jan, 08:34
DS jha     Re: how to use PorterStemFilter with NutchDocumentAnalyzer Tue, 23 Jan, 15:21
Alvaro Cabrerizo       Re: how to use PorterStemFilter with NutchDocumentAnalyzer Mon, 29 Jan, 18:39
yl...@ifrance.com Input directory urls/url-fr.txt in localhost:9000 is invalid with Hadoop 0.4.0patched and Nutch 0.8.1 Fri, 19 Jan, 18:05
Andrzej Bialecki   Re: Input directory urls/url-fr.txt in localhost:9000 is invalid with Hadoop 0.4.0patched and Nutch 0.8.1 Fri, 19 Jan, 20:19
Gal Nitzan Does nutch segments from hadoop .7.1 different from hadoop .10.1 Fri, 19 Jan, 21:28
Bharat Beedu Unique out of memory exception while fetching.. Sat, 20 Jan, 08:58
Vlador Limiting the total number of urls to crawl on a single website Sun, 21 Jan, 17:10
Tobias Zahn Indexing only some filetypes with Nutch Sun, 21 Jan, 17:50
Vlador   Re: Indexing only some filetypes with Nutch Sun, 21 Jan, 20:29
Tobias Zahn     Re: Indexing only some filetypes with Nutch Wed, 24 Jan, 20:04
Sami Siren       Re: Indexing only some filetypes with Nutch Wed, 24 Jan, 20:09
Tobias Zahn         Re: Indexing only some filetypes with Nutch Wed, 24 Jan, 20:18
Dennis Kubes   Re: Indexing only some filetypes with Nutch Mon, 22 Jan, 21:07
Jonathan Hunter Compiling PruneIndexTool trouble Mon, 22 Jan, 05:56
Sami Siren   Re: Compiling PruneIndexTool trouble Mon, 22 Jan, 15:07
Jonathan Hunter     Re: Compiling PruneIndexTool trouble Tue, 23 Jan, 23:44
Renaud Richardet       Re: Compiling PruneIndexTool trouble Wed, 24 Jan, 00:06
Nicolás Lichtmaier "Or" searches in nutch Mon, 22 Jan, 20:51
Scott Green Can I generate nutch index without crawling? Tue, 23 Jan, 17:08
Sean Dean   Re: Can I generate nutch index without crawling? Tue, 23 Jan, 22:51
The Golden Condor !     Re: Can I generate nutch index without crawling? Wed, 24 Jan, 00:31
Scott Green     Re: Can I generate nutch index without crawling? Wed, 24 Jan, 02:53
Enis Soztutar       Re: Can I generate nutch index without crawling? Thu, 25 Jan, 14:13
Nicolás Lichtmaier Boolean searches, again Tue, 23 Jan, 19:08
Enis Soztutar   Re: Boolean searches, again Wed, 24 Jan, 09:08
Nicolás Lichtmaier     Re: Boolean searches, again Wed, 24 Jan, 22:15
Renaud Richardet cannot search by url (url:) with Nutch 0.8 Wed, 24 Jan, 00:34
Denis Pimenov nutch scrawls only relative links Wed, 24 Jan, 15:16
Denis Pimenov   Re: nutch scrawls only relative links Wed, 24 Jan, 15:35
Alan Tanaman   RE: nutch scrawls only relative links Wed, 24 Jan, 18:34
Aïcha exact matches and stemming Wed, 24 Jan, 17:13
Alvaro Cabrerizo   Re: exact matches and stemming Fri, 26 Jan, 08:10