Mailing list archives: September 2008

getting content from url - encoding problem
Onur Deniz   getting content from url - encoding problem Mon, 01 Sep, 08:36
Onur Deniz     Re:Re: getting content from url - encoding problem Tue, 02 Sep, 13:47
郑世强       Re: Re:Re: getting content from url - encoding problem Tue, 02 Sep, 14:32
Onur Deniz   getting content from url - encoding problem Mon, 01 Sep, 09:00
Onur Deniz   Re: getting content from url - encoding problem Mon, 01 Sep, 12:37
郑世强     Re:Re: getting content from url - encoding problem Tue, 02 Sep, 11:54
convoyer How to Oracle instead of file to fetch url Mon, 01 Sep, 09:48
David Smith Nutch ignoring robots.txt Tue, 02 Sep, 02:59
宫照 can not deal too many files under one folder Tue, 02 Sep, 03:43
Onur Deniz   Re: can not deal too many files under one folder Tue, 02 Sep, 13:25
宫照     Re: can not deal too many files under one folder Thu, 04 Sep, 02:04
Srinivas Gokavarapu   Re: can not deal too many files under one folder Tue, 02 Sep, 13:28
convoyer How to get the search responce as xml or json Tue, 02 Sep, 11:04
Edward Quick invalid urls Tue, 02 Sep, 21:00
karthik085 Skipping certain characters to special urls Tue, 02 Sep, 21:10
Edward Quick FW: invalid urls Tue, 02 Sep, 21:45
zhengsj03   Re: FW: invalid urls Wed, 03 Sep, 01:56
Edward Quick     RE: invalid urls Wed, 03 Sep, 08:05
Mohammad Monirul Hoque problems: crawling specific domain Wed, 03 Sep, 04:53
David Jashi   Re: problems: crawling specific domain Wed, 03 Sep, 10:22
Re: A problem for web site needing username & password
Michael Piccuirro   Re: A problem for web site needing username & password Wed, 03 Sep, 15:10
zhengsj03 User     Re: A problem for web site needing username & password Wed, 03 Sep, 16:29
Edward Quick intranet crawling Thu, 04 Sep, 14:56
David Jashi   Re: intranet crawling Thu, 04 Sep, 15:42
Edward Quick Job failed! Fri, 05 Sep, 08:46
zhengsj03   Re: Job failed! Fri, 05 Sep, 09:28
Edward Quick     RE: Job failed! Fri, 05 Sep, 09:45
Edward Quick     FW: Job failed! Fri, 05 Sep, 21:09
Edward Quick     FW: Job failed! Fri, 05 Sep, 21:49
Edward Quick     FW: Job failed! Fri, 05 Sep, 21:58
Edward Quick     FW: Job failed! Sat, 06 Sep, 07:10
Edward Quick     FW: Job failed! Sun, 07 Sep, 14:41
Edward Quick error parsing Microsoft documents Fri, 05 Sep, 10:09
Looking to count links with Nutch
Kevin MacDonald   Looking to count links with Nutch Fri, 05 Sep, 23:00
Kevin MacDonald   Looking to count links with Nutch Fri, 05 Sep, 23:07
kevin chen     Re: Looking to count links with Nutch Sat, 06 Sep, 15:19
Kevin MacDonald       Re: Looking to count links with Nutch Sat, 06 Sep, 21:57
Dennis Kubes         Re: Looking to count links with Nutch Sun, 07 Sep, 02:13
Kevin MacDonald           Re: Looking to count links with Nutch Sun, 07 Sep, 02:44
Dennis Kubes             Re: Looking to count links with Nutch Sun, 07 Sep, 04:43
Kevin MacDonald               Re: Looking to count links with Nutch Mon, 08 Sep, 21:21
Dennis Kubes                 Re: Looking to count links with Nutch Wed, 10 Sep, 00:34
afan0804 Nutch searcher keeps reading CVS directories Fri, 05 Sep, 23:14
Dennis Kubes   Re: Nutch searcher keeps reading CVS directories Sun, 07 Sep, 02:35
afan0804     Re: Nutch searcher keeps reading CVS directories Mon, 08 Sep, 20:37
Kevin MacDonald Debugging Nutch in Netbeans Mon, 08 Sep, 17:12
Kevin MacDonald   Re: Debugging Nutch in Netbeans Mon, 08 Sep, 22:37
Andrzej Bialecki     Re: Debugging Nutch in Netbeans Mon, 08 Sep, 22:57
Kevin MacDonald Running in 'local' mode Mon, 08 Sep, 21:42
Kevin MacDonald Working with the Link database Tue, 09 Sep, 00:53
Amitabha Banerjee Problems Indexing Tue, 09 Sep, 02:54
Mohammad Monirul Hoque Is it possible to add new urls while nutch crawler is still running? Tue, 09 Sep, 11:18
Dennis Kubes   Re: Is it possible to add new urls while nutch crawler is still running? Wed, 10 Sep, 00:40
Kevin MacDonald Outlinks not being processed Tue, 09 Sep, 17:22
Amitabha Banerjee   Re: Outlinks not being processed Tue, 09 Sep, 17:30
Kevin MacDonald     Re: Outlinks not being processed Tue, 09 Sep, 18:25
Kevin MacDonald       Re: Outlinks not being processed Tue, 09 Sep, 18:57
nutch fetch issue - empty content
Viral Shah   nutch fetch issue - empty content Tue, 09 Sep, 22:09
Viral Shah   nutch fetch issue - empty content Tue, 09 Sep, 23:54
jcze resulting URL isnt really the URL where the keyword is Wed, 10 Sep, 06:11
Edward Quick influencing the page scores Wed, 10 Sep, 10:32
Edward Quick relative urls Wed, 10 Sep, 10:53
Edward Quick   RE: relative urls Wed, 10 Sep, 15:43
Edward Quick     RE: relative urls Wed, 10 Sep, 16:05
Kevin MacDonald       Re: relative urls Wed, 10 Sep, 16:56
Doğacan Güney       Re: relative urls Wed, 10 Sep, 17:06
Andrzej Bialecki         Re: relative urls Wed, 10 Sep, 18:08
Kevin MacDonald Deploying nutch Wed, 10 Sep, 19:36
Kevin MacDonald   Re: Deploying nutch Wed, 10 Sep, 20:22
Andrzej Bialecki     Re: Deploying nutch Wed, 10 Sep, 21:17
Kevin MacDonald       Re: Deploying nutch Wed, 10 Sep, 22:20
zhengping deng         nutch speed problem Thu, 11 Sep, 01:39
zhengping deng         how to improve nutch crawl speed? Thu, 11 Sep, 14:54
Edward Quick           RE: how to improve nutch crawl speed? Thu, 11 Sep, 17:32
Amitabha Banerjee Unable to crawl all links Thu, 11 Sep, 03:29
Kevin MacDonald   Re: Unable to crawl all links Thu, 11 Sep, 06:09
vishal vachhani   Re: Unable to crawl all links Fri, 12 Sep, 07:00
Chetan Patel     Re: Unable to crawl all links Fri, 26 Sep, 13:16
Kevin MacDonald       Re: Unable to crawl all links Fri, 26 Sep, 15:19
Chetan Patel         Re: Unable to crawl all links Sat, 27 Sep, 06:18
Edward Quick           RE: Unable to crawl all links Sat, 27 Sep, 09:01
Chetan Patel             RE: Unable to crawl all links Sat, 27 Sep, 09:48
vishal vachhani               Re: Unable to crawl all links Sat, 27 Sep, 11:49
Edward Quick               RE: Unable to crawl all links Sat, 27 Sep, 11:56
Saurabh Bhutyani   Re:Unable to crawl all links Fri, 12 Sep, 10:28
Kevin MacDonald     Re: Unable to crawl all links Fri, 12 Sep, 22:36
con       Re: Unable to crawl all links Wed, 24 Sep, 06:18
Chetan Patel     Re: Re:Unable to crawl all links Fri, 26 Sep, 13:05
Matthias W. Edit index structure Thu, 11 Sep, 08:53
Raj Malhotra getting exception while creating folder in OPencms Thu, 11 Sep, 14:00
Raj Malhotra   Fwd: getting exception while creating folder in OPencms Thu, 11 Sep, 14:27
Kevin MacDonald Allowing http and https crawling Thu, 11 Sep, 22:39
Kevin MacDonald   Re: Allowing http and https crawling Thu, 11 Sep, 23:07
David Jashi Problems with highlighter Fri, 12 Sep, 07:02
Lyndon Maydwell   Re: Problems with highlighter Fri, 12 Sep, 09:34
David Jashi     Re: Problems with highlighter Fri, 12 Sep, 09:48
Kevin MacDonald Optimizing nutch Sat, 13 Sep, 22:53
Kevin MacDonald   Re: Optimizing nutch Sat, 13 Sep, 23:45
zhengping deng   RE: Optimizing nutch Tue, 16 Sep, 01:55
Crawling password protected pages in NUTCH...
Rout Biswajit-B16078   Crawling password protected pages in NUTCH... Mon, 15 Sep, 11:04
Rout Biswajit-B16078   Crawling password protected pages in NUTCH... Mon, 15 Sep, 11:37
Rout Biswajit-B16078   Crawling password protected pages in NUTCH... Mon, 15 Sep, 11:42
Re: hadoop dfs -ls and nutch generate/fetch commands
Chetan Patel   Re: hadoop dfs -ls and nutch generate/fetch commands Mon, 15 Sep, 11:43
Doğacan Güney     Re: hadoop dfs -ls and nutch generate/fetch commands Mon, 15 Sep, 11:56
Chetan Patel       Re: hadoop dfs -ls and nutch generate/fetch commands Mon, 15 Sep, 12:26
Dennis Kubes         Re: hadoop dfs -ls and nutch generate/fetch commands Mon, 15 Sep, 13:12
Chetan Patel         Re: hadoop dfs -ls and nutch generate/fetch commands Mon, 15 Sep, 13:49
Rout Biswajit-B16078 Not able to crawl password protected pages using NUTCH 0.9 Mon, 15 Sep, 12:37
Kunthar   Re: Not able to crawl password protected pages using NUTCH 0.9 Mon, 15 Sep, 12:57
Susam Pal   Re: Not able to crawl password protected pages using NUTCH 0.9 Mon, 15 Sep, 13:03
biswajit_rout     Re: Not able to crawl password protected pages using NUTCH 0.9 Mon, 15 Sep, 13:20
Susam Pal       Re: Not able to crawl password protected pages using NUTCH 0.9 Mon, 15 Sep, 17:48
biswajit_rout         Re: Not able to crawl password protected pages using NUTCH 0.9 Tue, 16 Sep, 08:03
biswajit_rout           Re: Not able to crawl password protected pages using NUTCH 0.9 Tue, 16 Sep, 08:06
Susam Pal           Re: Not able to crawl password protected pages using NUTCH 0.9 Tue, 16 Sep, 08:07
biswajit_rout             Re: Not able to crawl password protected pages using NUTCH 0.9 Tue, 16 Sep, 12:33
biswajit_rout               Re: Not able to crawl password protected pages using NUTCH 0.9 Tue, 16 Sep, 15:33
Susam Pal                 Re: Not able to crawl password protected pages using NUTCH 0.9 Tue, 16 Sep, 16:38
biswajit_rout                   Re: Not able to crawl password protected pages using NUTCH 0.9 Tue, 16 Sep, 17:24
Susam Pal                     Re: Not able to crawl password protected pages using NUTCH 0.9 Tue, 16 Sep, 17:35
biswajit_rout                       Re: Not able to crawl password protected pages using NUTCH 0.9 Thu, 18 Sep, 13:10
biswajit_rout                         Re: Not able to crawl password protected pages using NUTCH 0.9 Fri, 19 Sep, 05:37
biswajit_rout                         Re: Not able to crawl password protected pages using NUTCH 0.9 Fri, 19 Sep, 05:38
Susam Pal                           Re: Not able to crawl password protected pages using NUTCH 0.9 Fri, 19 Sep, 14:56
biswajit_rout                             Re: Not able to crawl password protected pages using NUTCH 0.9 Mon, 22 Sep, 08:10
Susam Pal                               Re: Not able to crawl password protected pages using NUTCH 0.9 Mon, 22 Sep, 08:16
biswajit_rout                                 Re: Not able to crawl password protected pages using NUTCH 0.9 Thu, 25 Sep, 06:33
Kevin MacDonald Fetcher vs. Fetcher2 Mon, 15 Sep, 16:32
Kevin MacDonald   Re: Fetcher vs. Fetcher2 Mon, 15 Sep, 17:22
David Grandinetti     Re: Fetcher vs. Fetcher2 Mon, 15 Sep, 17:40
Kevin MacDonald       Re: Fetcher vs. Fetcher2 Mon, 15 Sep, 18:08
Kevin MacDonald         Re: Fetcher vs. Fetcher2 Mon, 15 Sep, 18:35
Kevin MacDonald Extracting Content-Length Mon, 15 Sep, 23:07
Srinivas Gokavarapu Re: Temporary storage during crawling Tue, 16 Sep, 05:20
Susam Pal   Re: Temporary storage during crawling Tue, 16 Sep, 05:28
Srinivas Gokavarapu     Re: Temporary storage during crawling Tue, 16 Sep, 16:36
Onur Deniz modifiying a core class (Content.java) using plugins? Tue, 16 Sep, 13:09
Onur Deniz   Re: modifiying a core class (Content.java) using plugins? Wed, 17 Sep, 13:33
Kevin MacDonald Creating custom segment dumps Tue, 16 Sep, 15:58