| Jérôme Charron |
Re: implement thai language indexing and search |
Thu, 16 Nov, 10:52 |
| Jérôme Charron |
Re: implement thai language indexing and search |
Tue, 28 Nov, 21:56 |
| Uroš Gruber |
Re: need help to speed up map-reduce |
Tue, 07 Nov, 06:41 |
| Doğacan Güney (JIRA) |
[jira] Created: (NUTCH-397) porting clustering-carrot2 plugin to carrot2 v2.0 |
Tue, 07 Nov, 16:20 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-397) porting clustering-carrot2 plugin to carrot2 v2.0 |
Tue, 07 Nov, 16:22 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-92) DistributedSearch incorrectly scores results |
Mon, 20 Nov, 17:00 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-331) Fetcher incorrectly reports task progress to tasktracker resulting in skipped URLs |
Thu, 23 Nov, 10:27 |
| Doğacan Güney (JIRA) |
[jira] Created: (NUTCH-406) Metadata tries to write null values |
Thu, 23 Nov, 13:27 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-406) Metadata tries to write null values |
Thu, 23 Nov, 13:29 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-406) Metadata tries to write null values |
Thu, 23 Nov, 16:18 |
| Dogacan Güney (JIRA) |
[jira] Commented: (NUTCH-92) DistributedSearch incorrectly scores results |
Mon, 27 Nov, 19:24 |
| Dogacan Güney (JIRA) |
[jira] Updated: (NUTCH-92) DistributedSearch incorrectly scores results |
Mon, 27 Nov, 19:24 |
| Dogacan Güney (JIRA) |
[jira] Created: (NUTCH-411) Parse ignores meta refresh redirection |
Thu, 30 Nov, 14:36 |
| Dogacan Güney (JIRA) |
[jira] Commented: (NUTCH-411) Parse ignores meta refresh redirection |
Thu, 30 Nov, 14:53 |
| Dogacan Güney (JIRA) |
[jira] Updated: (NUTCH-411) Parse ignores meta refresh redirection |
Thu, 30 Nov, 14:55 |
| AJ Chen |
need help to speed up map-reduce |
Mon, 06 Nov, 21:34 |
| AJ Chen |
Re: [jira] Resolved: (NUTCH-395) Increase fetching speed |
Tue, 14 Nov, 06:54 |
| AJ Chen |
Re: [jira] Commented: (NUTCH-395) Increase fetching speed |
Wed, 22 Nov, 17:09 |
| AJ Chen |
Re: [jira] Commented: (NUTCH-395) Increase fetching speed |
Wed, 22 Nov, 23:14 |
| AJ Chen (JIRA) |
[jira] Created: (NUTCH-398) map-reduce very slow when crawling on single server |
Wed, 08 Nov, 00:28 |
| Aisha |
Fetcher freezes |
Fri, 03 Nov, 14:53 |
| Aisha |
Re: Fetcher freezes |
Fri, 03 Nov, 15:14 |
| Aisha |
Re: Fetcher freezes |
Mon, 06 Nov, 14:42 |
| Aisha |
Re: Fetcher freezes |
Tue, 07 Nov, 11:14 |
| Alan Tanaman (JIRA) |
[jira] Commented: (NUTCH-407) Make Nutch crawling parent directories for file protocol configurable |
Tue, 28 Nov, 14:37 |
| Andrzej Bialecki |
Re: Nutch and Lucene |
Fri, 10 Nov, 10:07 |
| Andrzej Bialecki |
Re: [jira] Commented: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content |
Sun, 12 Nov, 21:11 |
| Andrzej Bialecki |
Nutch requires now Java 1.5 |
Mon, 13 Nov, 20:25 |
| Andrzej Bialecki |
Welcome Chris Mattmann as Nutch committer |
Thu, 23 Nov, 12:10 |
| Andrzej Bialecki |
Re: [jira] Closed: (NUTCH-406) Metadata tries to write null values |
Fri, 24 Nov, 07:54 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-387) host normalization in Generator$Selector |
Fri, 03 Nov, 11:56 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content |
Sun, 12 Nov, 19:38 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-395) Increase fetching speed |
Mon, 13 Nov, 09:59 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-401) Hardcoded /tmp directory in SegmentReader |
Tue, 14 Nov, 12:26 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-378) MetaWrapper decorator |
Tue, 14 Nov, 19:48 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-403) Make URL filtering optional in Generator |
Sun, 19 Nov, 08:33 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-331) Fetcher incorrectly reports task progress to tasktracker resulting in skipped URLs |
Thu, 23 Nov, 10:56 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-331) Fetcher incorrectly reports task progress to tasktracker resulting in skipped URLs |
Thu, 23 Nov, 10:58 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-406) Metadata tries to write null values |
Thu, 23 Nov, 15:59 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-406) Metadata tries to write null values |
Thu, 23 Nov, 16:44 |
| Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-339) Refactor nutch to allow fetcher improvements |
Fri, 24 Nov, 18:55 |
| Andrzej Bialecki (JIRA) |
[jira] Assigned: (NUTCH-339) Refactor nutch to allow fetcher improvements |
Fri, 24 Nov, 19:06 |
| Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-339) Refactor nutch to allow fetcher improvements |
Sat, 25 Nov, 09:42 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-407) Make Nutch crawling parent directories for file protocol configurable |
Mon, 27 Nov, 08:42 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-407) Make Nutch crawling parent directories for file protocol configurable |
Mon, 27 Nov, 09:40 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-339) Refactor nutch to allow fetcher improvements |
Tue, 28 Nov, 08:27 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-339) Refactor nutch to allow fetcher improvements |
Tue, 28 Nov, 16:16 |
| Armel Nene (JIRA) |
[jira] Commented: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content |
Sun, 12 Nov, 11:46 |
| Armel Nene (JIRA) |
[jira] Commented: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content |
Mon, 13 Nov, 11:53 |
| Armel Nene (JIRA) |
[jira] Commented: (NUTCH-185) XMLParser is configurable xml parser plugin. |
Sat, 25 Nov, 13:51 |
| Armel T. Nene |
RE: [jira] Commented: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content |
Sun, 12 Nov, 19:59 |
| Armel T. Nene |
RE: What's the status of Nutch-GUI? |
Mon, 20 Nov, 21:44 |
| Armel T. Nene |
RE: [jira] Commented: (NUTCH-185) XMLParser is configurable xml parser plugin. |
Mon, 20 Nov, 22:26 |
| Armel T. Nene |
RE: What's the status of Nutch-GUI? |
Tue, 21 Nov, 00:04 |
| Armel T. Nene |
Nutch folder configuration |
Tue, 21 Nov, 21:55 |
| Armel T. Nene |
RE: Nutch folder configuration |
Tue, 21 Nov, 22:45 |
| Armel T. Nene |
Nutch - Hadoop error |
Wed, 22 Nov, 17:49 |
| Armel T. Nene |
RE: [jira] Created: (NUTCH-408) Plugin development documentation |
Sat, 25 Nov, 14:32 |
| Armel T. Nene |
Indexing and Re-crawling site |
Tue, 28 Nov, 20:20 |
| Arun Kaundal |
Re: implement thai lanaguage analyzer in nutch |
Wed, 08 Nov, 04:12 |
| Arun Kaundal |
Re: implement thai lanaguage analyzer in nutch |
Wed, 08 Nov, 10:36 |
| Arun Kaundal |
Re: implement thai lanaguage analyzer in nutch |
Wed, 08 Nov, 13:27 |
| Arun Kumar Sharma (JIRA) |
[jira] Created: (NUTCH-402) Incrementalcrawling and indexing |
Thu, 16 Nov, 04:53 |
| Chris A. Mattmann (JIRA) |
[jira] Updated: (NUTCH-406) Metadata tries to write null values |
Thu, 23 Nov, 15:45 |
| Chris A. Mattmann (JIRA) |
[jira] Work started: (NUTCH-406) Metadata tries to write null values |
Thu, 23 Nov, 15:45 |
| Chris A. Mattmann (JIRA) |
[jira] Commented: (NUTCH-406) Metadata tries to write null values |
Thu, 23 Nov, 16:26 |
| Chris A. Mattmann (JIRA) |
[jira] Commented: (NUTCH-406) Metadata tries to write null values |
Thu, 23 Nov, 16:48 |
| Chris A. Mattmann (JIRA) |
[jira] Commented: (NUTCH-406) Metadata tries to write null values |
Thu, 23 Nov, 16:50 |
| Chris A. Mattmann (JIRA) |
[jira] Resolved: (NUTCH-406) Metadata tries to write null values |
Thu, 23 Nov, 17:18 |
| Chris A. Mattmann (JIRA) |
[jira] Closed: (NUTCH-406) Metadata tries to write null values |
Thu, 23 Nov, 17:20 |
| Chris A. Mattmann (JIRA) |
[jira] Assigned: (NUTCH-390) Javadoc warnings |
Fri, 24 Nov, 18:28 |
| Chris A. Mattmann (JIRA) |
[jira] Assigned: (NUTCH-185) XMLParser is configurable xml parser plugin. |
Fri, 24 Nov, 18:30 |
| Chris A. Mattmann (JIRA) |
[jira] Commented: (NUTCH-407) Make Nutch crawling parent directories for file protocol configurable |
Tue, 28 Nov, 14:44 |
| Chris Mattmann |
Re: What's the status of Nutch-GUI? |
Mon, 20 Nov, 18:39 |
| Chris Mattmann |
Re: What's the status of Nutch-GUI? |
Mon, 20 Nov, 23:29 |
| Chris Mattmann |
Re: [jira] Closed: (NUTCH-406) Metadata tries to write null values |
Thu, 23 Nov, 18:08 |
| Chris Mattmann |
Re: Welcome Chris Mattmann as Nutch committer |
Thu, 23 Nov, 19:28 |
| Chris Schneider (JIRA) |
[jira] Commented: (NUTCH-351) Protocol forward proxy |
Thu, 02 Nov, 01:45 |
| DS jha |
updating index without refetching |
Tue, 28 Nov, 14:11 |
| DS jha |
Re: updating index without refitting |
Tue, 28 Nov, 15:47 |
| Dawid Weiss (JIRA) |
[jira] Commented: (NUTCH-397) porting clustering-carrot2 plugin to carrot2 v2.0 |
Wed, 15 Nov, 18:44 |
| Doug Cook |
Re: need help to speed up map-reduce |
Tue, 07 Nov, 01:26 |
| Doug Cook |
More fetcher speed increases |
Thu, 16 Nov, 16:30 |
| Doug Cook |
Re: More fetcher speed increases |
Sun, 26 Nov, 00:20 |
| Doug Cook |
Re: Should URL normalization iterate? |
Wed, 29 Nov, 19:47 |
| Doug Cook (JIRA) |
[jira] Created: (NUTCH-396) mergesegs sorts URLs, making segments useless for subsequent fetch |
Fri, 03 Nov, 23:07 |
| Doug Cook (JIRA) |
[jira] Created: (NUTCH-409) Add "short circuit" notion to filters to speedup mixed site/subsite crawling |
Sun, 26 Nov, 00:18 |
| Doug Cook (JIRA) |
[jira] Updated: (NUTCH-409) Add "short circuit" notion to filters to speedup mixed site/subsite crawling |
Sun, 26 Nov, 00:20 |
| Doug Cook (JIRA) |
[jira] Commented: (NUTCH-409) Add "short circuit" notion to filters to speedup mixed site/subsite crawling |
Sun, 26 Nov, 01:03 |
| Doug Cook (JIRA) |
[jira] Created: (NUTCH-410) Faster RegexNormalize with more features |
Wed, 29 Nov, 19:44 |
| Doug Cook (JIRA) |
[jira] Updated: (NUTCH-410) Faster RegexNormalize with more features |
Wed, 29 Nov, 19:46 |
| Eelco Lempsink (JIRA) |
[jira] Commented: (NUTCH-393) Indexer doesn't handle null documents returned by filters |
Tue, 07 Nov, 22:14 |
| Enis Soztutar |
Re: Modifiying Nutch Indexer |
Tue, 07 Nov, 13:01 |
| Enis Soztutar |
Re: What's the status of Nutch-GUI? |
Tue, 21 Nov, 12:17 |
| Enis Soztutar (JIRA) |
[jira] Updated: (NUTCH-389) a url tokenizer implementation for tokenizing index fields : url and host |
Tue, 07 Nov, 13:16 |
| Enis Soztutar (JIRA) |
[jira] Commented: (NUTCH-393) Indexer doesn't handle null documents returned by filters |
Tue, 07 Nov, 13:34 |
| Enis Soztutar (JIRA) |
[jira] Updated: (NUTCH-289) CrawlDatum should store IP address |
Thu, 16 Nov, 08:44 |
| Enis Soztutar (JIRA) |
[jira] Updated: (NUTCH-251) Administration GUI |
Thu, 23 Nov, 14:35 |
| Gal Nitzan |
RE: updating index without refitting |
Tue, 28 Nov, 14:24 |
| Gavino Marras |
Nutch HTTPS & Sessions |
Tue, 21 Nov, 08:24 |