nutch-dev mailing list archives: December 2005

Message list
Jérôme Charron Re: Urlfilter Patch Thu, 01 Dec, 20:11
Jérôme Charron Re: [Nutch-dev] incremental crawling Thu, 01 Dec, 21:04
Jérôme Charron Re: Urlfilter Patch Thu, 01 Dec, 21:29
Jérôme Charron Re: Google performance bottlenecks ;-) (Re: Lucene performance bottlenecks) Fri, 09 Dec, 09:58
Jérôme Charron Hard-coded Content-type checks Tue, 13 Dec, 13:24
Jérôme Charron Re: Standard metadata property names in the ParseData metadata Tue, 13 Dec, 20:37
Jérôme Charron Re: Standard metadata property names in the ParseData metadata Tue, 13 Dec, 20:45
Jérôme Charron Re: [Fwd: Crawler submits forms?] Tue, 13 Dec, 22:16
Jérôme Charron Re: [Fwd: Crawler submits forms?] Wed, 14 Dec, 11:24
Jérôme Charron Re: vote results. Thu, 15 Dec, 22:27
Jérôme Charron Re: Latest version of Mapred Mon, 19 Dec, 22:43
Jérôme Charron Re: Static initializers Tue, 20 Dec, 14:19
Lutischán Ferenc (JIRA) [jira] Commented: (NUTCH-133) ParserFactory does not work as expected Wed, 07 Dec, 09:41
AJ Chen severe error in fetch Sun, 25 Dec, 22:38
AJ Chen Re: severe error in fetch Sun, 25 Dec, 23:13
AJ Chen Re: severe error in fetch Fri, 30 Dec, 22:21
AJ Chen how to add additional factor at search time to ranking score Sat, 31 Dec, 22:50
American Jeff Bowden Re: IndexSorter optimizer Thu, 22 Dec, 01:07
Andrew McNabb Re: vote for issues to fix in 0.7.2 Wed, 14 Dec, 17:12
Andrew McNabb GNU Getopt Tue, 20 Dec, 07:47
Andrzej Bialecki Re: incremental crawling Fri, 02 Dec, 09:15
Andrzej Bialecki Re: Lucene performance bottlenecks Thu, 08 Dec, 09:04
Andrzej Bialecki Re: Lucene performance bottlenecks Thu, 08 Dec, 16:59
Andrzej Bialecki Re: Lucene performance bottlenecks Thu, 08 Dec, 17:49
Andrzej Bialecki Google performance bottlenecks ;-) (Re: Lucene performance bottlenecks) Fri, 09 Dec, 09:42
Andrzej Bialecki Re: Google performance bottlenecks ;-) (Re: Lucene performance bottlenecks) Mon, 12 Dec, 09:58
Andrzej Bialecki IndexOptimizer (Re: Lucene performance bottlenecks) Mon, 12 Dec, 16:32
Andrzej Bialecki Re: IndexOptimizer (Re: Lucene performance bottlenecks) Mon, 12 Dec, 17:50
Andrzej Bialecki Re: IndexOptimizer (Re: Lucene performance bottlenecks) Tue, 13 Dec, 06:58
Andrzej Bialecki Re: IndexOptimizer (Re: Lucene performance bottlenecks) Tue, 13 Dec, 14:43
Andrzej Bialecki Re: Hard-coded Content-type checks Tue, 13 Dec, 14:56
Andrzej Bialecki Re: Standard metadata property names in the ParseData metadata Tue, 13 Dec, 20:37
Andrzej Bialecki Re: best file system for NDFS? Tue, 13 Dec, 20:43
Andrzej Bialecki Re: [Fwd: Crawler submits forms?] Tue, 13 Dec, 22:28
Andrzej Bialecki Re: [Fwd: Crawler submits forms?] Wed, 14 Dec, 08:34
Andrzej Bialecki Re: IndexOptimizer (Re: Lucene performance bottlenecks) Wed, 14 Dec, 10:06
Andrzej Bialecki Re: IndexOptimizer (Re: Lucene performance bottlenecks) Wed, 14 Dec, 23:16
Andrzej Bialecki Re: IndexOptimizer (Re: Lucene performance bottlenecks) Thu, 15 Dec, 09:53
Andrzej Bialecki Re: Nutch design queries Thu, 15 Dec, 14:22
Andrzej Bialecki Re: vote results. Thu, 15 Dec, 16:50
Andrzej Bialecki Re: IndexOptimizer (Re: Lucene performance bottlenecks) Thu, 15 Dec, 17:10
Andrzej Bialecki Re: [Fwd: Crawler submits forms?] Thu, 15 Dec, 19:06
Andrzej Bialecki Re: IndexOptimizer (Re: Lucene performance bottlenecks) Thu, 15 Dec, 20:10
Andrzej Bialecki Re: [Fwd: Crawler submits forms?] Thu, 15 Dec, 20:11
Andrzej Bialecki Re: version branches / two products Fri, 16 Dec, 00:58
Andrzej Bialecki [VOTE] Commiter access for Stefan Groschupf Fri, 16 Dec, 21:50
Andrzej Bialecki Re: problems http-client Mon, 19 Dec, 18:47
Andrzej Bialecki Re: problems http-client Mon, 19 Dec, 19:05
Andrzej Bialecki Re: GNU Getopt Tue, 20 Dec, 08:02
Andrzej Bialecki Static initializers Tue, 20 Dec, 13:19
Andrzej Bialecki Re: Static initializers Tue, 20 Dec, 13:45
Andrzej Bialecki Re: Static initializers Tue, 20 Dec, 14:34
Andrzej Bialecki Re: [Nutch-dev] distributed search Tue, 20 Dec, 15:39
Andrzej Bialecki IndexSorter optimizer Wed, 21 Dec, 13:14
Andrzej Bialecki Re: IndexSorter optimizer Thu, 22 Dec, 07:07
Andrzej Bialecki Re: Commons HttpClient 3.0 released Thu, 22 Dec, 11:48
Andrzej Bialecki Removing old classes from trunk/ Fri, 23 Dec, 01:16
Andrzej Bialecki Re: severe error in fetch Mon, 26 Dec, 23:46
Andrzej Bialecki Mega-cleanup in trunk/ Thu, 29 Dec, 00:56
Andrzej Bialecki Re: Trunk is broken Fri, 30 Dec, 07:56
Andrzej Bialecki Re: Bug in DeleteDuplicates.java ? Fri, 30 Dec, 08:23
Andrzej Bialecki Re: Trunk is broken Fri, 30 Dec, 11:08
Andrzej Bialecki Adaptive fetch interval & unmodified content detection, episode II Fri, 30 Dec, 16:31
Andrzej Bialecki (JIRA) [jira] Resolved: (NUTCH-114) getting number of urls and links from crawldb Fri, 02 Dec, 08:46
Andrzej Bialecki (JIRA) [jira] Created: (NUTCH-134) Summarizer doesn't select the best snippets Wed, 07 Dec, 14:11
Andrzej Bialecki (JIRA) [jira] Commented: (NUTCH-134) Summarizer doesn't select the best snippets Wed, 07 Dec, 20:11
Andrzej Bialecki (JIRA) [jira] Commented: (NUTCH-135) http header meta data are case insensitive in the real world (e.g. Content-Type or content-type) Fri, 09 Dec, 21:59
Andrzej Bialecki (JIRA) [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata Tue, 20 Dec, 10:21
Andrzej Bialecki (JIRA) [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata Tue, 20 Dec, 11:34
Andrzej Bialecki (JIRA) [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata Tue, 20 Dec, 15:59
Andrzej Bialecki (JIRA) [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata Wed, 21 Dec, 11:10
Andrzej Bialecki (JIRA) [jira] Commented: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content Thu, 22 Dec, 19:46
Andrzej Bialecki (JIRA) [jira] Commented: (NUTCH-95) DeleteDuplicates depends on the order of input segments Wed, 28 Dec, 07:10
Andrzej Bialecki (JIRA) [jira] Commented: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content Wed, 28 Dec, 07:12
Andrzej Bialecki (JIRA) [jira] Closed: (NUTCH-121) SegmentReader for mapred Thu, 29 Dec, 18:38
Andrzej Bialecki (JIRA) [jira] Updated: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content Fri, 30 Dec, 16:07
Arun Kumar Sharma (JIRA) [jira] Created: (NUTCH-154) Unable to add/update new files to fetchlist/fetcher and thus index, when u rerun crawl tool on same db. Wed, 28 Dec, 08:04
Bernhard Messer (JIRA) [jira] Created: (NUTCH-144) corrupt language identifier tri files and bad language recognition for german Sat, 17 Dec, 16:51
Byron Miller Re: [VOTE] Commiter access for Stefan Groschupf Fri, 16 Dec, 21:51
Byron Miller Re: IndexSorter optimizer Wed, 21 Dec, 23:55
Byron Miller failure with crawl using 12/23 trunk Sat, 24 Dec, 04:35
Byron Miller Re: Mega-cleanup in trunk/ Thu, 29 Dec, 01:57
Chris A. Mattmann (JIRA) [jira] Commented: (NUTCH-133) ParserFactory does not work as expected Wed, 07 Dec, 00:33
Chris A. Mattmann (JIRA) [jira] Commented: (NUTCH-133) ParserFactory does not work as expected Wed, 07 Dec, 17:00
Chris A. Mattmann (JIRA) [jira] Commented: (NUTCH-133) ParserFactory does not work as expected Wed, 07 Dec, 21:33
Chris A. Mattmann (JIRA) [jira] Commented: (NUTCH-133) ParserFactory does not work as expected Thu, 08 Dec, 00:55
Chris A. Mattmann (JIRA) [jira] Commented: (NUTCH-34) Parsing different content formats Sun, 11 Dec, 18:11
Chris A. Mattmann (JIRA) [jira] Created: (NUTCH-139) Standard metadata property names in the ParseData metadata Wed, 14 Dec, 04:02
Chris A. Mattmann (JIRA) [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata Wed, 14 Dec, 04:04
Chris A. Mattmann (JIRA) [jira] Updated: (NUTCH-139) Standard metadata property names in the ParseData metadata Wed, 14 Dec, 04:04
Chris A. Mattmann (JIRA) [jira] Created: (NUTCH-140) Add alias capability in parse-plugins.xml file that allows mimeType->extensionId mapping Wed, 14 Dec, 04:10
Chris A. Mattmann (JIRA) [jira] Updated: (NUTCH-139) Standard metadata property names in the ParseData metadata Sat, 17 Dec, 03:03
Chris A. Mattmann (JIRA) [jira] Commented: (NUTCH-140) Add alias capability in parse-plugins.xml file that allows mimeType->extensionId mapping Sat, 17 Dec, 03:11
Chris A. Mattmann (JIRA) [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata Sat, 17 Dec, 20:29
Chris A. Mattmann (JIRA) [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata Tue, 20 Dec, 15:13
Chris A. Mattmann (JIRA) [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata Tue, 20 Dec, 15:24
Chris Mattmann Re: Urlfilter Patch Thu, 01 Dec, 20:56
Chris Mattmann Re: Urlfilter Patch Thu, 01 Dec, 21:16
Chris Mattmann RE: Urlfilter Patch Thu, 01 Dec, 22:04
Chris Mattmann RE: Urlfilter Patch Thu, 01 Dec, 22:06