Mailing list archives: May 2007

[jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility
Doğacan Güney (JIRA)   [jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility Thu, 10 May, 12:58
Chris A. Mattmann (JIRA)   [jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility Thu, 10 May, 14:32
nutch.newbie (JIRA)   [jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility Thu, 10 May, 16:54
Doğacan Güney (JIRA)   [jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility Fri, 11 May, 07:59
nutch.newbie (JIRA)   [jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility Fri, 11 May, 09:52
Chris A. Mattmann (JIRA)   [jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility Sun, 13 May, 16:25
Chris A. Mattmann (JIRA)   [jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility Wed, 30 May, 13:55
Mike Brzozowski (JIRA) [jira] Updated: (NUTCH-424) CLONE - Problem persists with Nutch 0.8.1 (Nekohtml 0.9.4) - NekoHTML's DOMFragmentParser hangs on certain URLs Thu, 10 May, 16:16
Mike Brzozowski (JIRA) [jira] Commented: (NUTCH-424) CLONE - Problem persists with Nutch 0.8.1 (Nekohtml 0.9.4) - NekoHTML's DOMFragmentParser hangs on certain URLs Thu, 10 May, 16:16
Sami Siren (JIRA) [jira] Resolved: (NUTCH-456) parse msexcel plugin speedup Thu, 10 May, 16:16
Sami Siren (JIRA) [jira] Assigned: (NUTCH-446) RobotRulesParser should ignore Crawl-delay values of other bots in robots.txt Thu, 10 May, 16:18
Mike Brzozowski (JIRA) [jira] Updated: (NUTCH-424) NekoHTML's DOMFragmentParser hangs on certain URLs (CLONE: Problem persists with Nutch 0.9 and 0.8.1 (Nekohtml 0.9.4)) Thu, 10 May, 16:18
Mike Brzozowski (JIRA) [jira] Commented: (NUTCH-424) NekoHTML's DOMFragmentParser hangs on certain URLs (CLONE: Problem persists with Nutch 0.9 and 0.8.1 (Nekohtml 0.9.4)) Thu, 10 May, 16:27
Sami Siren (JIRA) [jira] Resolved: (NUTCH-446) RobotRulesParser should ignore Crawl-delay values of other bots in robots.txt Thu, 10 May, 16:32
Michael McIntosh Will any Nutch/Lucene folks be at the Enterprise Search Summit in week in New York? Fri, 11 May, 15:17
charlie wanek (JIRA) [jira] Created: (NUTCH-481) http.content.limit is broken in the protocol-httpclient plugin Fri, 11 May, 18:24
charlie wanek (JIRA) [jira] Updated: (NUTCH-481) http.content.limit is broken in the protocol-httpclient plugin Fri, 11 May, 18:41
Sami Siren (JIRA) [jira] Created: (NUTCH-482) Remove redundant plugin lib-log4j Sat, 12 May, 07:54
Sami Siren (JIRA) [jira] Created: (NUTCH-483) remove redundant commons-logging jar from ontology plugin Sat, 12 May, 07:56
Gal Nitzan Site nightly API link is broken Sat, 12 May, 08:00
Sami Siren   Re: Site nightly API link is broken Sat, 12 May, 08:04
Gal Nitzan     RE: Site nightly API link is broken Sat, 12 May, 08:07
Sami Siren       Re: Site nightly API link is broken Sat, 12 May, 08:31
Gal Nitzan (JIRA) [jira] Created: (NUTCH-484) Nutch Nightly API link is broken in site Sat, 12 May, 09:01
Gal Nitzan (JIRA) [jira] Updated: (NUTCH-484) Nutch Nightly API link is broken in site Sat, 12 May, 09:06
Gal Nitzan (JIRA) [jira] Created: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object Sat, 12 May, 19:50
[jira] Updated: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object
Gal Nitzan (JIRA)   [jira] Updated: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object Sat, 12 May, 20:00
Gal Nitzan (JIRA)   [jira] Updated: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object Sun, 13 May, 06:35
Gal Nitzan (JIRA)   [jira] Updated: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object Sun, 13 May, 06:50
Gal Nitzan (JIRA)   [jira] Updated: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object Sun, 13 May, 09:47
Gal Nitzan (JIRA)   [jira] Updated: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object Sun, 13 May, 21:17
[jira] Commented: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object
Andrzej Bialecki (JIRA)   [jira] Commented: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object Sat, 12 May, 21:55
Doğacan Güney (JIRA)   [jira] Commented: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object Sun, 13 May, 09:28
Doğacan Güney (JIRA)   [jira] Commented: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object Sun, 13 May, 20:09
Sami Siren (JIRA) [jira] Resolved: (NUTCH-484) Nutch Nightly API link is broken in site Sun, 13 May, 14:56
Doğacan Güney (JIRA) [jira] Updated: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility Sun, 13 May, 16:01
Chris A. Mattmann (JIRA) [jira] Reopened: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser Sun, 13 May, 16:23
Sami Siren (JIRA) [jira] Resolved: (NUTCH-482) Remove redundant plugin lib-log4j Mon, 14 May, 14:38
Sami Siren (JIRA) [jira] Resolved: (NUTCH-483) remove redundant commons-logging jar from ontology plugin Mon, 14 May, 14:52
Sami Siren (JIRA) [jira] Resolved: (NUTCH-457) Create top level dist directory and checkin KEYS file to subversion be standard with Lucene Java and Hadoop Mon, 14 May, 15:16
Mark Woon (JIRA) [jira] Created: (NUTCH-486) Break searcher dependency on commons-cli Mon, 14 May, 23:36
Andrzej Bialecki (JIRA) [jira] Commented: (NUTCH-486) Break searcher dependency on commons-cli Tue, 15 May, 06:22
Sami Siren (JIRA) [jira] Resolved: (NUTCH-161) Change Plain text parser to use parser.character.encoding.default property for fall back encoding Tue, 15 May, 18:32
Sami Siren (JIRA) [jira] Updated: (NUTCH-161) Change Plain text parser to use parser.character.encoding.default property for fall back encoding Tue, 15 May, 18:32
Re: Issues pending before 0.9 release
rubdabadub   Re: Issues pending before 0.9 release Thu, 17 May, 04:21
Andrzej Bialecki     Re: Issues pending before 0.9 release Fri, 18 May, 07:17
Ilya Vishnevsky bug in SegmentReader Mon, 21 May, 08:42
Marcin Okraszewski Bug (with fix): Neko HTML parser goes on defaults. Mon, 21 May, 10:45
Doğacan Güney   Re: Bug (with fix): Neko HTML parser goes on defaults. Mon, 21 May, 13:47
Marcin Okraszewski     Re: Bug (with fix): Neko HTML parser goes on defaults. Mon, 21 May, 14:09
Marcin Okraszewski (JIRA) [jira] Updated: (NUTCH-487) Neko HTML parser goes on default settings. Mon, 21 May, 14:06
Marcin Okraszewski (JIRA) [jira] Created: (NUTCH-487) Neko HTML parser goes on default settings. Mon, 21 May, 14:06
[jira] Commented: (NUTCH-25) needs 'character encoding' detector
Doug Cook (JIRA)   [jira] Commented: (NUTCH-25) needs 'character encoding' detector Mon, 21 May, 16:34
Ken Krugler (JIRA)   [jira] Commented: (NUTCH-25) needs 'character encoding' detector Mon, 21 May, 18:01
Doug Cook (JIRA)   [jira] Commented: (NUTCH-25) needs 'character encoding' detector Tue, 22 May, 22:28
Doğacan Güney (JIRA) [jira] Updated: (NUTCH-25) needs 'character encoding' detector Mon, 21 May, 20:48
Emmanuel Joke (JIRA) [jira] Created: (NUTCH-488) Avoid parsing uneccessary links and get a more relevant outlink list Tue, 22 May, 07:38
[jira] Updated: (NUTCH-488) Avoid parsing uneccessary links and get a more relevant outlink list
Emmanuel Joke (JIRA)   [jira] Updated: (NUTCH-488) Avoid parsing uneccessary links and get a more relevant outlink list Tue, 22 May, 07:40
Emmanuel Joke (JIRA)   [jira] Updated: (NUTCH-488) Avoid parsing uneccessary links and get a more relevant outlink list Tue, 22 May, 07:42
Emmanuel Joke (JIRA) [jira] Created: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters Tue, 22 May, 08:35
[jira] Updated: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters
Emmanuel Joke (JIRA)   [jira] Updated: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters Tue, 22 May, 08:37
Emmanuel Joke (JIRA)   [jira] Updated: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters Wed, 23 May, 03:37