nutch-user mailing list archives: February 2012

alx...@aim.com Solrdedup fails due to date format Wed, 01 Feb, 05:26
Alexander Aristov   Re: Solrdedup fails due to date format Wed, 01 Feb, 05:34
alx...@aim.com     Re: Solrdedup fails due to date format Wed, 01 Feb, 05:50
alx...@aim.com     Re: Solrdedup fails due to date format Thu, 02 Feb, 03:56
remi tassing is it necessary to merge DBs before solrindex? Wed, 01 Feb, 06:23
Re: why nutch dosen't crawl Arabic sites well?
mina   Re: why nutch dosen't crawl Arabic sites well? Wed, 01 Feb, 07:44
remi tassing     Re: why nutch dosen't crawl Arabic sites well? Wed, 01 Feb, 17:55
Re: Focused crawling with nutch
Vijith   Re: Focused crawling with nutch Wed, 01 Feb, 10:53
Lewis John Mcgibbney     Re: Focused crawling with nutch Wed, 01 Feb, 13:08
Vijith       Re: Focused crawling with nutch Thu, 02 Feb, 05:18
Vijith         Re: Focused crawling with nutch Fri, 03 Feb, 10:33
Markus Jelsma           Re: Focused crawling with nutch Fri, 03 Feb, 10:36
Lewis John Mcgibbney           Re: Focused crawling with nutch Fri, 03 Feb, 10:41
Vijith             Re: Focused crawling with nutch Fri, 03 Feb, 11:22
Markus Jelsma               Re: Focused crawling with nutch Fri, 03 Feb, 11:29
Vijith                 Re: Focused crawling with nutch Fri, 03 Feb, 12:34
Markus Jelsma                   Re: Focused crawling with nutch Fri, 03 Feb, 12:35
Vijith                     Re: Focused crawling with nutch Sat, 04 Feb, 05:25
Julien Nioche                       Re: Focused crawling with nutch Mon, 06 Feb, 10:05
Re: why nutch dosen't crawl all links
mina   Re: why nutch dosen't crawl all links Wed, 01 Feb, 13:09
mina Bad Request in nutch when i use parsechecker? Wed, 01 Feb, 13:12
mina   Re: Bad Request in nutch when i use parsechecker? Wed, 01 Feb, 13:14
Markus Jelsma     Re: Bad Request in nutch when i use parsechecker? Wed, 01 Feb, 15:05
mina       Re: Bad Request in nutch when i use parsechecker? Wed, 01 Feb, 15:26
Markus Jelsma         Re: Bad Request in nutch when i use parsechecker? Wed, 01 Feb, 15:57
mina           Re: Bad Request in nutch when i use parsechecker? Thu, 02 Feb, 06:37
Joshua J Pavel Error with solrindex Wed, 01 Feb, 14:00
Markus Jelsma   Re: Error with solrindex Wed, 01 Feb, 14:21
Joshua J Pavel     Re: Error with solrindex Wed, 01 Feb, 16:29
Joshua J Pavel     Re: Error with solrindex Wed, 01 Feb, 18:05
Re: invalid uri with "three dots"
remi tassing   Re: invalid uri with "three dots" Wed, 01 Feb, 18:18
kaveh minooie org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: Thu, 02 Feb, 00:32
mina how can i use patch-with-utf8-encoding.diff in https://issues.apache.org/jira/browse/NUTCH-1098? Thu, 02 Feb, 11:49
Julien Nioche   Re: how can i use patch-with-utf8-encoding.diff in https://issues.apache.org/jira/browse/NUTCH-1098? Thu, 02 Feb, 14:08
Lewis John Mcgibbney Nutch 2.0 Webapp Thu, 02 Feb, 14:06
Dean Pullen Failed fetching Thu, 02 Feb, 16:44
Dean Pullen   Re: Failed fetching Thu, 02 Feb, 17:11
Dean Pullen     Re: Failed fetching Thu, 02 Feb, 17:22
Lewis John Mcgibbney       Re: Failed fetching Thu, 02 Feb, 18:01
Dean Pullen         Re: Failed fetching Fri, 03 Feb, 11:06
tiagorcs           Re: Failed fetching Mon, 06 Feb, 03:31
tiagorcs             Re: Failed fetching Mon, 06 Feb, 04:37
Lewis John Mcgibbney               Re: Failed fetching Fri, 10 Feb, 21:18
remi tassing                 Re: Failed fetching Tue, 14 Feb, 18:03
Lewis John Mcgibbney                   Re: Failed fetching Tue, 14 Feb, 18:08
tiagorcs                     Re: Failed fetching Wed, 15 Feb, 01:46
remi tassing                       Re: Failed fetching Wed, 15 Feb, 09:50
tiagorcs                       Re: Failed fetching Wed, 22 Feb, 01:11
Markus Jelsma       Re: Failed fetching Thu, 02 Feb, 18:17
tiagorcs         Re: Failed fetching Fri, 03 Feb, 10:01
tiagorcs           Re: Failed fetching Fri, 03 Feb, 10:06
Lewis John Mcgibbney             Re: Failed fetching Fri, 03 Feb, 10:11
tiagorcs               Re: Failed fetching Fri, 03 Feb, 10:22
Markus Jelsma                 Re: Failed fetching Fri, 03 Feb, 10:22
tiagorcs                   Re: Failed fetching Fri, 03 Feb, 10:48
Markus Jelsma                     Re: Failed fetching Fri, 03 Feb, 10:49
tiagorcs                       Re: Failed fetching Fri, 03 Feb, 10:57
Markus Jelsma                         Re: Failed fetching Fri, 03 Feb, 11:02
abhayd index-blacklist-whitelist pluign for multiple set of urls Fri, 03 Feb, 00:10
Elisabeth Adler   Re: index-blacklist-whitelist pluign for multiple set of urls Fri, 03 Feb, 08:32
abhayd     Re: index-blacklist-whitelist pluign for multiple set of urls Fri, 03 Feb, 14:58
Matt Poff How parse *only* specific URLs under a domain... -depth 1 -topN 1 does not work as desired Fri, 03 Feb, 00:37
Markus Jelsma   Re: How parse *only* specific URLs under a domain... -depth 1 -topN 1 does not work as desired Fri, 03 Feb, 09:07
Matt Poff     Re: How parse *only* specific URLs under a domain... -depth 1 -topN 1 does not work as desired Fri, 03 Feb, 10:39
Adriana Farina   Re: How parse *only* specific URLs under a domain... -depth 1 -topN 1 does not work as desired Fri, 03 Feb, 09:13
costas0811 Crawling Local Files within Cygwin Fri, 03 Feb, 04:06
Adriana Farina   Re: Crawling Local Files within Cygwin Fri, 03 Feb, 08:59
costas0811     RE: Crawling Local Files within Cygwin Fri, 03 Feb, 13:53
webdev1977       RE: Crawling Local Files within Cygwin Tue, 07 Feb, 12:20
costas0811         RE: Crawling Local Files within Cygwin Tue, 07 Feb, 15:10
webdev1977           RE: Crawling Local Files within Cygwin Tue, 21 Feb, 20:43
costas0811             RE: Crawling Local Files within Cygwin Tue, 21 Feb, 20:45
nutchsolruser Nutch unfetched urls count Fri, 03 Feb, 08:37
Markus Jelsma   Re: Nutch unfetched urls count Mon, 06 Feb, 15:47
Marek Bachmann Is it still possible to create a pure lucene index? Fri, 03 Feb, 18:55
Lewis John Mcgibbney   Re: Is it still possible to create a pure lucene index? Fri, 03 Feb, 19:15
Marek Bachmann     Re: Is it still possible to create a pure lucene index? Sat, 04 Feb, 21:08
Lewis John Mcgibbney       Re: Is it still possible to create a pure lucene index? Sun, 05 Feb, 09:54
SUJIT PAL One WebPage to many NutchDocuments Sat, 04 Feb, 18:59
SUJIT PAL   Re: One WebPage to many NutchDocuments Sat, 04 Feb, 20:17
Markus Jelsma     Re: One WebPage to many NutchDocuments Sat, 04 Feb, 21:16
SUJIT PAL       Re: One WebPage to many NutchDocuments Sat, 04 Feb, 21:57
Joshua J Pavel Custom Plugin - Multiple Title values error Sat, 04 Feb, 19:49
kaveh minooie nutch logs when run over hadoop Sat, 04 Feb, 23:09
Markus Jelsma   Re: nutch logs when run over hadoop Mon, 06 Feb, 15:46
kaveh minooie     Re: nutch logs when run over hadoop Mon, 06 Feb, 20:04
Markus Jelsma       Re: nutch logs when run over hadoop Mon, 06 Feb, 20:08
kaveh minooie         Re: nutch logs when run over hadoop Mon, 06 Feb, 20:28
Markus Jelsma           Re: nutch logs when run over hadoop Mon, 06 Feb, 20:36
kaveh minooie             Thread spinWaiting, utilizing bandwidth and connection time out error Mon, 06 Feb, 22:22
Xiao Li Just fetch a specified URL list Mon, 06 Feb, 05:07
Markus Jelsma   Re: Just fetch a specified URL list Mon, 06 Feb, 15:46
ka...@plutoz.com     Re: Just fetch a specified URL list Wed, 08 Feb, 04:15
Michael Kazekin RSS parser Mon, 06 Feb, 13:10
dspathis   Re: RSS parser Tue, 07 Feb, 15:07
Lewis John Mcgibbney     Re: RSS parser Tue, 07 Feb, 18:39
Michael Kazekin     Re: RSS parser Wed, 08 Feb, 08:44
Lewis John Mcgibbney       Re: RSS parser Wed, 08 Feb, 10:54
Michael Kazekin         Re: RSS parser Tue, 14 Feb, 16:31
Lewis John Mcgibbney           Re: RSS parser Tue, 14 Feb, 18:24
dspathis       Re: RSS parser Wed, 08 Feb, 14:44
Michael Kazekin         Re: RSS parser Fri, 10 Feb, 12:24
Lewis John Mcgibbney           Re: RSS parser Fri, 10 Feb, 20:12
Danicela nutch Too few parsed pages Mon, 06 Feb, 15:44
Markus Jelsma   Re: Too few parsed pages Mon, 06 Feb, 15:45
Re : Re: Too few parsed pages
Danicela nutch   Re : Re: Too few parsed pages Mon, 06 Feb, 16:03
Markus Jelsma     Re: Too few parsed pages Mon, 06 Feb, 16:06
Danicela nutch   Re : Re: Too few parsed pages Fri, 17 Feb, 16:32
how are CSV/TXT files handled
remi tassing   how are CSV/TXT files handled Tue, 07 Feb, 07:16
remi tassing     Re: how are CSV/TXT files handled Tue, 07 Feb, 07:58
remi tassing       Re: how are CSV/TXT files handled Tue, 07 Feb, 08:08
Markus Jelsma         Re: how are CSV/TXT files handled Tue, 07 Feb, 09:17
remi tassing           Re: how are CSV/TXT files handled Wed, 08 Feb, 09:22
Lewis John Mcgibbney             Re: how are CSV/TXT files handled Wed, 08 Feb, 10:50
remi tassing               Re: how are CSV/TXT files handled Wed, 08 Feb, 14:04
Lewis John Mcgibbney                 Re: how are CSV/TXT files handled Fri, 10 Feb, 21:16
remi tassing                   Re: how are CSV/TXT files handled Wed, 15 Feb, 13:33
remi tassing   how are CSV/TXT files handled Tue, 07 Feb, 14:37
conta...@complexityintelligence.com Dump into Cassandra using Nutch 1.x Tue, 07 Feb, 14:12
Julien Nioche   Re: Dump into Cassandra using Nutch 1.x Tue, 07 Feb, 14:38
conta...@complexityintelligence.com   RE: Dump into Cassandra using Nutch 1.x Tue, 07 Feb, 15:15
Peyman Mohajerian     Re: Dump into Cassandra using Nutch 1.x Wed, 08 Feb, 15:43
conta...@complexityintelligence.com   RE: Dump into Cassandra using Nutch 1.x Tue, 14 Feb, 15:13
Lewis John Mcgibbney     Re: Dump into Cassandra using Nutch 1.x Tue, 14 Feb, 15:16
conta...@complexityintelligence.com   RE: Dump into Cassandra using Nutch 1.x Tue, 14 Feb, 16:05
Julien Nioche     Re: Dump into Cassandra using Nutch 1.x Tue, 14 Feb, 16:27
Lewis John Mcgibbney Solandra & Nutch [WAS] Re: Dump into Cassandra using Nutch 1.x Wed, 08 Feb, 16:46
Julien Nioche   Re: Solandra & Nutch [WAS] Re: Dump into Cassandra using Nutch 1.x Wed, 08 Feb, 17:50
Lewis John Mcgibbney     Re: Solandra & Nutch [WAS] Re: Dump into Cassandra using Nutch 1.x Wed, 08 Feb, 18:45
Peyman Mohajerian       Re: Solandra & Nutch [WAS] Re: Dump into Cassandra using Nutch 1.x Wed, 08 Feb, 18:52
Re: Java out of memory error
webdev1977   Re: Java out of memory error Wed, 08 Feb, 19:10
Bai Shen     Re: Java out of memory error Thu, 09 Feb, 17:39
Sudip Datta Seed urls not getting crawled. Thu, 09 Feb, 07:26
Lewis John Mcgibbney   Re: Seed urls not getting crawled. Fri, 10 Feb, 21:00
Haggai R WARN regex.RegexURLNormalizer: Can't load the default rules! during Nutch Crawl Thu, 09 Feb, 08:16
Lewis John Mcgibbney   Re: WARN regex.RegexURLNormalizer: Can't load the default rules! during Nutch Crawl Fri, 10 Feb, 20:46
kaveh minooie generate.count.mode host vs. domain Thu, 09 Feb, 19:57
Markus Jelsma   Re: generate.count.mode host vs. domain Thu, 09 Feb, 20:05
Joshua J Pavel How do "content" and "parseResult" relate? Thu, 09 Feb, 22:22
Joshua J Pavel   Re: How do "content" and "parseResult" relate? Fri, 10 Feb, 15:43
gauravchaudhary Problem in crawling a button (which contains a link) through Nutch Fri, 10 Feb, 07:28
Lewis John Mcgibbney   Re: Problem in crawling a button (which contains a link) through Nutch Fri, 10 Feb, 20:28
kaveh minooie fetch status in hadoop jobtasks.jsp Fri, 10 Feb, 18:58
Markus Jelsma   Re: fetch status in hadoop jobtasks.jsp Sat, 11 Feb, 09:38
kaveh minooie number of map tasks for a fetch job Fri, 10 Feb, 19:24
Julien Nioche   Re: number of map tasks for a fetch job Fri, 10 Feb, 19:29
kaveh minooie     Re: number of map tasks for a fetch job Fri, 10 Feb, 19:38
Lewis John Mcgibbney Understanding NutchConfigration properly Sat, 11 Feb, 22:21
Markus Jelsma   Re: Understanding NutchConfigration properly Sat, 11 Feb, 22:58
Lewis John Mcgibbney     Re: Understanding NutchConfigration properly Sun, 12 Feb, 17:04
Julien Nioche       Re: Understanding NutchConfigration properly Sun, 12 Feb, 17:05
Lewis John Mcgibbney         Re: Understanding NutchConfigration properly Sun, 12 Feb, 17:07
Julien Nioche           Re: Understanding NutchConfigration properly Sun, 12 Feb, 17:48
remi tassing             Re: Understanding NutchConfigration properly Sun, 12 Feb, 17:54
Julien Nioche   Re: Understanding NutchConfigration properly Sun, 12 Feb, 15:03
webdev1977 Stylesheet in plugin not found when run in distributed mode Mon, 13 Feb, 15:48
Julien Nioche   Re: Stylesheet in plugin not found when run in distributed mode Mon, 13 Feb, 16:05
webdev1977     Re: Stylesheet in plugin not found when run in distributed mode Mon, 13 Feb, 16:08
Julien Nioche       Re: Stylesheet in plugin not found when run in distributed mode Mon, 13 Feb, 16:21
webdev1977         Re: Stylesheet in plugin not found when run in distributed mode Mon, 13 Feb, 16:30
webdev1977           Re: Stylesheet in plugin not found when run in distributed mode Wed, 15 Feb, 15:27
webdev1977             Re: Stylesheet in plugin not found when run in distributed mode Thu, 16 Feb, 12:46
kaveh minooie Invalid uri? Mon, 13 Feb, 23:57
Sebastian Nagel   Re: Invalid uri? Tue, 14 Feb, 00:38
remi tassing     Re: Invalid uri? Tue, 14 Feb, 03:34
Sebastian Nagel     Re: Invalid uri? Tue, 14 Feb, 19:48
Puneet Pandey Build a pipeline using nutch Tue, 14 Feb, 05:12
Lewis John Mcgibbney   Re: Build a pipeline using nutch Tue, 14 Feb, 10:36
Puneet Pandey     Re: Build a pipeline using nutch Wed, 15 Feb, 19:29
Markus Jelsma       Re: Build a pipeline using nutch Wed, 15 Feb, 19:40
Magnús Skúlason         Re: Build a pipeline using nutch Wed, 15 Feb, 23:47
Puneet Pandey           Re: Build a pipeline using nutch Thu, 16 Feb, 09:55
Markus Jelsma   Re: Build a pipeline using nutch Wed, 15 Feb, 19:31
remi tassing     Re: Build a pipeline using nutch Thu, 16 Feb, 04:28
Markus Jelsma       Re: Build a pipeline using nutch Thu, 16 Feb, 10:21
Puneet Pandey     Re: Build a pipeline using nutch Thu, 16 Feb, 10:19
Markus Jelsma       Re: Build a pipeline using nutch Thu, 16 Feb, 10:22
Puneet Pandey       Fwd: Build a pipeline using nutch Thu, 16 Feb, 13:36
Akash Ashok Are Injector, Generator, Fetcher and Parser Pluggable? Tue, 14 Feb, 10:54