Mailing list archives: May 2007

Andrzej Bialecki (JIRA) [jira] Updated: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser Mon, 14 May, 14:52
Sami Siren (JIRA) [jira] Resolved: (NUTCH-457) Create top level dist directory and checkin KEYS file to subversion be standard with Lucene Java and Hadoop Mon, 14 May, 15:16
Doğacan Güney (JIRA) [jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser Mon, 14 May, 17:52
Doğacan Güney (JIRA) [jira] Updated: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser Mon, 14 May, 17:54
Andrzej Bialecki (JIRA) [jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser Mon, 14 May, 21:58
Mark Woon (JIRA) [jira] Created: (NUTCH-486) Break searcher dependency on commons-cli Mon, 14 May, 23:36
Andrzej Bialecki (JIRA) [jira] Commented: (NUTCH-486) Break searcher dependency on commons-cli Tue, 15 May, 06:22
Sami Siren (JIRA) [jira] Updated: (NUTCH-161) Change Plain text parser to use parser.character.encoding.default property for fall back encoding Tue, 15 May, 18:32
Sami Siren (JIRA) [jira] Resolved: (NUTCH-161) Change Plain text parser to use parser.character.encoding.default property for fall back encoding Tue, 15 May, 18:32
rubdabadub Re: Issues pending before 0.9 release Thu, 17 May, 04:21
Andrzej Bialecki Re: Issues pending before 0.9 release Fri, 18 May, 07:17
Ilya Vishnevsky bug in SegmentReader Mon, 21 May, 08:42
Marcin Okraszewski Bug (with fix): Neko HTML parser goes on defaults. Mon, 21 May, 10:45
Doğacan Güney Re: Bug (with fix): Neko HTML parser goes on defaults. Mon, 21 May, 13:47
Marcin Okraszewski (JIRA) [jira] Created: (NUTCH-487) Neko HTML parser goes on default settings. Mon, 21 May, 14:06
Marcin Okraszewski (JIRA) [jira] Updated: (NUTCH-487) Neko HTML parser goes on default settings. Mon, 21 May, 14:06
Marcin Okraszewski Re: Bug (with fix): Neko HTML parser goes on defaults. Mon, 21 May, 14:09
Doug Cook (JIRA) [jira] Commented: (NUTCH-25) needs 'character encoding' detector Mon, 21 May, 16:34
Ken Krugler (JIRA) [jira] Commented: (NUTCH-25) needs 'character encoding' detector Mon, 21 May, 18:01
Doğacan Güney (JIRA) [jira] Updated: (NUTCH-25) needs 'character encoding' detector Mon, 21 May, 20:48
Emmanuel Joke (JIRA) [jira] Created: (NUTCH-488) Avoid parsing uneccessary links and get a more relevant outlink list Tue, 22 May, 07:38
Emmanuel Joke (JIRA) [jira] Updated: (NUTCH-488) Avoid parsing uneccessary links and get a more relevant outlink list Tue, 22 May, 07:40
Emmanuel Joke (JIRA) [jira] Updated: (NUTCH-488) Avoid parsing uneccessary links and get a more relevant outlink list Tue, 22 May, 07:42
Emmanuel Joke (JIRA) [jira] Created: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters Tue, 22 May, 08:35
Emmanuel Joke (JIRA) [jira] Updated: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters Tue, 22 May, 08:37
Doğacan Güney (JIRA) [jira] Commented: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters Tue, 22 May, 09:23
Marcin Okraszewski (JIRA) [jira] Created: (NUTCH-490) Extension point with filters for Neko HTML parser (with patch) Tue, 22 May, 12:18
Marcin Okraszewski (JIRA) [jira] Updated: (NUTCH-490) Extension point with filters for Neko HTML parser (with patch) Tue, 22 May, 12:18
Marcin Okraszewski (JIRA) [jira] Updated: (NUTCH-490) Extension point with filters for Neko HTML parser (with patch) Tue, 22 May, 12:20
Vadim Bauer (JIRA) [jira] Commented: (NUTCH-427) protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implmentation. Tue, 22 May, 12:37
Doug Cook (JIRA) [jira] Commented: (NUTCH-25) needs 'character encoding' detector Tue, 22 May, 22:28
Emmanuel Joke (JIRA) [jira] Updated: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters Wed, 23 May, 03:37
Doğacan Güney (JIRA) [jira] Commented: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters Wed, 23 May, 06:10
Otis Gospodnetic IntelliJ & Eclipse Lucene code styles available Wed, 23 May, 06:20
Yakn Get meta name="description" and other meta tags from Content Wed, 23 May, 15:02
Nicolás Lichtmaier (JIRA) [jira] Created: (NUTCH-491) dedup fails with ArrayIndexOutOfBoundsException Wed, 23 May, 16:53
Andrzej Bialecki Re: Get meta name="description" and other meta tags from Content Wed, 23 May, 16:54
Doğacan Güney (JIRA) [jira] Commented: (NUTCH-491) dedup fails with ArrayIndexOutOfBoundsException Thu, 24 May, 11:55
karthik085 NUTCH-348 and Nutch-0.7.2 Thu, 24 May, 14:01
Doug Cutting Re: NUTCH-348 and Nutch-0.7.2 Thu, 24 May, 16:27
Vadim Bauer (JIRA) [jira] Updated: (NUTCH-427) protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implmentation. Fri, 25 May, 21:05
Vadim Bauer (JIRA) [jira] Updated: (NUTCH-427) protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implmentation. Fri, 25 May, 21:09
Nicolás Lichtmaier (JIRA) [jira] Created: (NUTCH-492) java.lang.OutOfMemoryError while indexing. Sat, 26 May, 23:42
Andrzej Bialecki (JIRA) [jira] Work started: (NUTCH-466) Flexible segment format Mon, 28 May, 09:01
Gal Nitzan proposal for committer Mon, 28 May, 12:32
Nicolás Lichtmaier Plugins initialized all the time! Mon, 28 May, 20:47
Nicolás Lichtmaier Re: Plugins initialized all the time! Mon, 28 May, 21:00
Doğacan Güney (JIRA) [jira] Commented: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters Tue, 29 May, 12:22
Enis Soztutar Re: proposal for committer Tue, 29 May, 12:39
prem kumar running nutch without http proxy Tue, 29 May, 14:03
Doğacan Güney Re: Plugins initialized all the time! Tue, 29 May, 15:50
Briggs Re: Plugins initialized all the time! Tue, 29 May, 16:07
Doğacan Güney Re: Plugins initialized all the time! Tue, 29 May, 16:52
Briggs Re: Plugins initialized all the time! Tue, 29 May, 17:16
Nicolás Lichtmaier Re: Plugins initialized all the time! Tue, 29 May, 20:39
Doug Cutting Re: proposal for committer Tue, 29 May, 20:45
Nicolás Lichtmaier Re: Plugins initialized all the time! Tue, 29 May, 21:56
wangxu (JIRA) [jira] Created: (NUTCH-493) contentType parse not correctly,,,,got empty content using readseg -get Wed, 30 May, 00:05
Marcin Okraszewski Re: running nutch without http proxy Wed, 30 May, 06:03
Doğacan Güney Re: Plugins initialized all the time! Wed, 30 May, 06:07
Andrzej Bialecki Re: Plugins initialized all the time! Wed, 30 May, 11:01
Doğacan Güney Re: Plugins initialized all the time! Wed, 30 May, 11:47
Chris Mattmann Committer Wed, 30 May, 13:42
Chris A. Mattmann (JIRA) [jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility Wed, 30 May, 13:55
Manoharam Reddy OutOfMemoryError - Why should the while(1) loop stop? Wed, 30 May, 14:55
Dennis Kubes Re: OutOfMemoryError - Why should the while(1) loop stop? Wed, 30 May, 15:38
Andrzej Bialecki (JIRA) [jira] Resolved: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content Wed, 30 May, 18:37
hud...@lucene.zones.apache.org Build failed in Hudson: Nutch-Nightly #102 Thu, 31 May, 07:00
Manoharam Reddy What is parse-oo and why doesn't parsed PDF content show up in cached.jsp ? Thu, 31 May, 07:07
Manoharam Reddy How is lib-http plugin called? It is not there in plugins.include! Thu, 31 May, 07:10
rubdabadub Re: [jira] Resolved: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content Thu, 31 May, 08:04
Doğacan Güney (JIRA) [jira] Created: (NUTCH-494) FindBugs: CrawlDbReader and DeleteDuplicates Thu, 31 May, 08:52
Doğacan Güney (JIRA) [jira] Updated: (NUTCH-494) FindBugs: CrawlDbReader and DeleteDuplicates Thu, 31 May, 08:52
Andrzej Bialecki Re: [jira] Resolved: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content Thu, 31 May, 10:18
Doğacan Güney Re: Plugins initialized all the time! Thu, 31 May, 14:02
Dennis Kubes Re: How is lib-http plugin called? It is not there in plugins.include! Thu, 31 May, 15:32
Doğacan Güney (JIRA) [jira] Created: (NUTCH-495) Unnecessary delays in Fetcher2 Thu, 31 May, 15:49
Doğacan Güney (JIRA) [jira] Updated: (NUTCH-495) Unnecessary delays in Fetcher2 Thu, 31 May, 15:51
hud...@lucene.zones.apache.org Hudson build is back to normal: Nutch-Nightly #103 Thu, 31 May, 16:56
Nicolás Lichtmaier Re: Plugins initialized all the time! Thu, 31 May, 17:54
Andrzej Bialecki (JIRA) [jira] Updated: (NUTCH-466) Flexible segment format Thu, 31 May, 18:42
Andrzej Bialecki (JIRA) [jira] Resolved: (NUTCH-486) Break searcher dependency on commons-cli Thu, 31 May, 19:01
Doğacan Güney (JIRA) [jira] Commented: (NUTCH-466) Flexible segment format Thu, 31 May, 19:28
Andrzej Bialecki (JIRA) [jira] Updated: (NUTCH-466) Flexible segment format Thu, 31 May, 19:55
Nicolás Lichtmaier Making "Hits" work as a normal List Thu, 31 May, 20:58
Andrzej Bialecki (JIRA) [jira] Resolved: (NUTCH-392) OutputFormat implementations should pass on Progressable Thu, 31 May, 21:25
Nicolás Lichtmaier [PATCH] Moving HitDetails construction to a constructor =) Thu, 31 May, 21:57
Manoharam Reddy How to create patch? Fri, 01 Jun, 06:12