Mailing list archives: July 2009

Saurabh Suman How to index other fields in solr Mon, 27 Jul, 06:34
Saurabh Suman How to add new field in indexing in SolrIndexer.java Wed, 29 Jul, 05:38
Saurabh Suman How fetcher works Thu, 30 Jul, 04:17
Saurabh Suman Meaning of ProtocolStatus.ACCESS_DENIED Thu, 30 Jul, 13:59
Saurabh Suman denied by robots.txt rules Fri, 31 Jul, 03:28
Saurabh Suman denied by robots.txt rules Fri, 31 Jul, 03:29
Sjaiful Bahri Re: recrawling Tue, 14 Jul, 07:30
Sudhi Seshachala Re: Support needed Tue, 28 Jul, 18:45
Sudhi Seshachala Re: Host specific parsing Tue, 28 Jul, 19:11
SunGod Re: Favorite Linux Distribution for Nutch Sat, 04 Jul, 16:21
SunGod Re: how to crawl a page but not index it Mon, 13 Jul, 12:51
SunGod Re: how to crawl a page but not index it Mon, 13 Jul, 12:56
SunGod Re: Job failed help Mon, 13 Jul, 13:00
Susam Pal Re: Authentication Not Occuring Mon, 06 Jul, 12:49
Tomislav Poljak mergesegs disk space Wed, 15 Jul, 16:31
Tomislav Poljak Re: mergesegs disk space Tue, 21 Jul, 18:50
Vijay Optimal size of a segments sub-directory and a couple of other questions relating to Nutch response times Fri, 03 Jul, 01:15
Will Daley indexing meta tags in 1.0 Thu, 16 Jul, 10:12
Xiangjun(XJ) Wang Re: Hoe to search Nutch DB Mon, 06 Jul, 22:52
Xiangjun(XJ) Wang Re: Show db_gone in crawlDB Thu, 09 Jul, 17:31
Yaidel Guedes Beltran how parse chm files Mon, 06 Jul, 13:02
Yaidel Guedes Beltran Problems when index .chm files Mon, 06 Jul, 17:16
Zaihan Search results return 0 Sun, 12 Jul, 17:05
Zaihan Integrating Nutch frontend with Backend. Mon, 13 Jul, 12:57
Zaihan Pages with Specific URLS. Thu, 23 Jul, 13:50
alx...@aim.com Re: Nutch Tutorial 1.0 based off of the French Version Tue, 14 Jul, 01:04
alx...@aim.com Nutch in C++ Thu, 30 Jul, 19:13
alx...@aim.com how to exclude some external links Fri, 31 Jul, 01:15
ben bouzid mohamed Re: Favorite Linux Distribution for Nutch Sat, 04 Jul, 15:16
caezar Nutch crawling status Mon, 27 Jul, 14:27
caezar Re: Nutch crawling status Mon, 27 Jul, 14:41
claus westerkamp Re: Problems when deploy nutch-1.0.war Tue, 07 Jul, 12:17
gunnapranay Ontology-Clearing Cache... Fri, 10 Jul, 21:16
ilayaraja Changing fieldsNorm at query time Sun, 12 Jul, 14:24
johan.sjob...@findwise.se Re: what's the relationship between nutch, solr, lucene, and hadoop Fri, 03 Jul, 19:54
kevin chen Re: dump all outlinks Sat, 18 Jul, 03:06
lei wang Re: How torunning nutch on 2G memory tasknode Thu, 02 Jul, 11:58
lei wang nutch crawldb failed for java heap space Thu, 02 Jul, 16:21
lei wang Re: nutch crawldb failed for java heap space Sat, 04 Jul, 04:45
lei wang Re: nutch crawldb failed for java heap space Sun, 05 Jul, 14:06
lei wang Re: nutch crawldb failed for java heap space Sun, 05 Jul, 14:12
lei wang Arc to segements failed for " Task attempt_200907091108_0001_m_000520_0 failed to report status for 602 seconds. Killing!" Fri, 10 Jul, 01:56
lei wang Re: Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out. Fri, 10 Jul, 08:29
lei wang job failed for "Too many fetch-failures" Sat, 11 Jul, 02:46
lei wang Re: Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out. Sat, 11 Jul, 02:48
lei wang Re: how to allow every url to b accepted Sat, 11 Jul, 02:50
lei wang Too many fether failures Sun, 12 Jul, 06:58
lei wang job failed for "java.io.IOException: Task process exit with nonzero status of 255." Tue, 14 Jul, 11:05
lei wang Re: job failed for "java.io.IOException: Task process exit with nonzero status of 255." Wed, 15 Jul, 00:51
oh...@cox.net Just getting started w/tutorial- errors in crawl.log Tue, 14 Jul, 00:58
oh...@cox.net Re: Just getting started w/tutorial- errors in crawl.log Tue, 14 Jul, 14:04
oh...@cox.net Tutorial followup - Nutch webapp not seeing stuff? Tue, 14 Jul, 15:09
oh...@cox.net Re: Tutorial followup - Nutch webapp not seeing stuff? Tue, 14 Jul, 15:35
oh...@cox.net Re: Tutorial followup - Nutch webapp not seeing stuff? Tue, 14 Jul, 16:53
oh...@cox.net Re: Tutorial followup - Nutch webapp not seeing stuff? Tue, 14 Jul, 18:17
oh...@cox.net Re: Tutorial followup - Nutch webapp not seeing stuff? Tue, 14 Jul, 19:17
oh...@cox.net Re: Tutorial followup - Nutch webapp not seeing stuff? Wed, 15 Jul, 18:08
oh...@cox.net Problem crawling local filesystem Thu, 16 Jul, 17:36
oh...@cox.net Re: Problem crawling local filesystem Thu, 16 Jul, 17:54
oh...@cox.net Question about crawling local filesystem and directories Thu, 16 Jul, 20:57
oh...@cox.net Using Nutch (w/custom plugin) to crawl vs. custom Lucene app Mon, 27 Jul, 19:35
postusenet How to get lastModified or create-date content from html pages? Sat, 04 Jul, 17:26
postusenet call for answer Thu, 09 Jul, 20:40
reinhard schwab Re: A few questions about crawl-urlfilter.txt Thu, 16 Jul, 10:09
reinhard schwab Re: Why cant I inject a google link to the database? Fri, 17 Jul, 12:17
reinhard schwab Re: Why cant I inject a google link to the database? Fri, 17 Jul, 12:26
reinhard schwab Re: Why cant I inject a google link to the database? Fri, 17 Jul, 12:30
reinhard schwab Re: Why cant I inject a google link to the database? Fri, 17 Jul, 12:33
reinhard schwab Re: Why cant I inject a google link to the database? Fri, 17 Jul, 14:15
reinhard schwab Re: Why cant I inject a google link to the database? Fri, 17 Jul, 14:49
reinhard schwab dump all outlinks Fri, 17 Jul, 16:43
reinhard schwab wrong outlinks Fri, 17 Jul, 19:48
reinhard schwab Re: wrong outlinks Fri, 17 Jul, 22:43
reinhard schwab Re: wrong outlinks Fri, 17 Jul, 22:46
reinhard schwab Re: dump all outlinks Sun, 19 Jul, 18:33
reinhard schwab Re: Pages with Specific URLS. Thu, 23 Jul, 14:17
reinhard schwab crawl-tool.xml Sun, 26 Jul, 11:55
reinhard schwab Re: crawl-tool.xml Mon, 27 Jul, 08:28
reinhard schwab Re: question Mon, 27 Jul, 18:40
reinhard schwab Re: Dumping what I have? Tue, 28 Jul, 16:26
reinhard schwab Re: Include/exclude lists Wed, 29 Jul, 09:28
reinhard schwab Re: mergesegs disk space Wed, 29 Jul, 10:11
reinhard schwab Re: mergesegs disk space Wed, 29 Jul, 11:04
reinhard schwab Re: How fetcher works Thu, 30 Jul, 07:29
schroedi How To Generate the JavaDoc Thu, 02 Jul, 18:58
schroedi Re: Problems when deploy nutch-1.0.war Sat, 04 Jul, 09:02
schroedi Favorite Linux Distribution for Nutch Sat, 04 Jul, 14:50
schroedi Re: Running Nutch on VMs Wed, 08 Jul, 15:52
schroedi Show db_gone in crawlDB Thu, 09 Jul, 04:05
schroedi Re: Favorite Linux Distribution for Nutch Thu, 09 Jul, 05:37
schroedi Re: Nutch Tutorial 1.0 based off of the French Version Tue, 14 Jul, 03:55
schroedi Dumping CrawlDB into database Fri, 24 Jul, 14:59
schroedi Re: Dumping what I have? Thu, 30 Jul, 15:19
schroedi Dumping Crawl DB with XML Thu, 30 Jul, 15:19
sf30098 Support needed Mon, 27 Jul, 21:01
stefan.kai...@hartmann.info How to search part of words? Fri, 10 Jul, 12:57
stefan.kai...@hartmann.info How to search for part of words? Fri, 10 Jul, 13:04
wadaley Meta tag plugin for 1.0 Thu, 16 Jul, 19:26
xiao yang what's the relationship between nutch, solr, lucene, and hadoop Fri, 03 Jul, 19:06
xiao yang Problems when deploy nutch-1.0.war Sat, 04 Jul, 07:41