| Ian Holsman (JIRA) |
[jira] Commented: (NUTCH-524) Generate Problem with Single Node |
Tue, 24 Jul, 17:35 |
| Doug Cook (JIRA) |
[jira] Updated: (NUTCH-25) needs 'character encoding' detector |
Tue, 24 Jul, 17:37 |
| Doug Cook (JIRA) |
[jira] Updated: (NUTCH-25) needs 'character encoding' detector |
Tue, 24 Jul, 18:31 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-526) Use a combiner in LinDbMerger to improve the performance as in LinkDb |
Wed, 25 Jul, 02:24 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-526) Use a combiner in LinDbMerger to improve the performance as in LinkDb |
Wed, 25 Jul, 02:24 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-526) Use a combiner in LinDbMerger to improve the performance as in LinkDb |
Wed, 25 Jul, 06:32 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-25) needs 'character encoding' detector |
Wed, 25 Jul, 07:27 |
| Rob Young (JIRA) |
[jira] Created: (NUTCH-527) MapWritable doesn't support all hadoops writable types |
Wed, 25 Jul, 11:03 |
| Rob Young (JIRA) |
[jira] Updated: (NUTCH-527) MapWritable doesn't support all hadoops writable types |
Wed, 25 Jul, 11:07 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-524) Generate Problem with Single Node |
Wed, 25 Jul, 11:16 |
| Rob Young (JIRA) |
[jira] Updated: (NUTCH-527) MapWritable doesn't support all hadoops writable types |
Wed, 25 Jul, 11:45 |
| Emmanuel |
CrawlDbReader TopN |
Wed, 25 Jul, 11:50 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-527) MapWritable doesn't support all hadoops writable types |
Wed, 25 Jul, 12:39 |
| Doug Cook (JIRA) |
[jira] Commented: (NUTCH-25) needs 'character encoding' detector |
Wed, 25 Jul, 16:39 |
| Robert Young |
Re: [jira] Commented: (NUTCH-527) MapWritable doesn't support all hadoops writable types |
Wed, 25 Jul, 17:39 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-25) needs 'character encoding' detector |
Wed, 25 Jul, 17:57 |
| Doğacan Güney |
Re: [jira] Commented: (NUTCH-527) MapWritable doesn't support all hadoops writable types |
Wed, 25 Jul, 18:05 |
| Doug Cook (JIRA) |
[jira] Commented: (NUTCH-25) needs 'character encoding' detector |
Thu, 26 Jul, 00:39 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format |
Thu, 26 Jul, 07:55 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format |
Thu, 26 Jul, 07:57 |
| Doğacan Güney (JIRA) |
[jira] Resolved: (NUTCH-516) Next fetch time is not set when it is a CrawlDatum.STATUS_FETCH_GONE |
Thu, 26 Jul, 08:37 |
| Doğacan Güney (JIRA) |
[jira] Resolved: (NUTCH-525) DeleteDuplicates generates ArrayIndexOutOfBoundsException when trying to rerun dedup on a segment |
Thu, 26 Jul, 08:54 |
| Robert Young |
Re: [jira] Commented: (NUTCH-527) MapWritable doesn't support all hadoops writable types |
Thu, 26 Jul, 10:51 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-516) Next fetch time is not set when it is a CrawlDatum.STATUS_FETCH_GONE |
Thu, 26 Jul, 12:55 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-525) DeleteDuplicates generates ArrayIndexOutOfBoundsException when trying to rerun dedup on a segment |
Thu, 26 Jul, 12:55 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-439) Top Level Domains Indexing / Scoring |
Thu, 26 Jul, 12:58 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-439) Top Level Domains Indexing / Scoring |
Thu, 26 Jul, 18:03 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child. |
Fri, 27 Jul, 03:30 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child. |
Fri, 27 Jul, 03:30 |
| Hudson (JIRA) |
[jira] Commented: (NUTCH-516) Next fetch time is not set when it is a CrawlDatum.STATUS_FETCH_GONE |
Fri, 27 Jul, 04:25 |
| Hudson (JIRA) |
[jira] Commented: (NUTCH-525) DeleteDuplicates generates ArrayIndexOutOfBoundsException when trying to rerun dedup on a segment |
Fri, 27 Jul, 04:25 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-522) Use URLValidator in the Injector |
Fri, 27 Jul, 06:53 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-522) Use URLValidator in the Injector |
Fri, 27 Jul, 06:55 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-522) Use URLValidator in the Injector |
Fri, 27 Jul, 06:55 |
| Enis Soztutar (JIRA) |
[jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring |
Fri, 27 Jul, 08:06 |
| Enis Soztutar (JIRA) |
[jira] Commented: (NUTCH-439) Top Level Domains Indexing / Scoring |
Fri, 27 Jul, 08:12 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-522) Use URLValidator in the Injector |
Fri, 27 Jul, 08:32 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-522) Use URLValidator in the Injector |
Fri, 27 Jul, 08:39 |
| Doğacan Güney (JIRA) |
[jira] Issue Comment Edited: (NUTCH-522) Use URLValidator in the Injector |
Fri, 27 Jul, 13:10 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-522) Use URLValidator in the Injector |
Sat, 28 Jul, 05:04 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-530) Add a combiner to improve performance on updatedb |
Sun, 29 Jul, 08:32 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-530) Add a combiner to improve performance on updatedb |
Sun, 29 Jul, 08:35 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-526) Use a combiner in LinDbMerger to improve the performance as in LinkDb |
Sun, 29 Jul, 08:37 |
| Le Quoc Anh |
Error indexer |
Sun, 29 Jul, 09:13 |
| Carl Cerecke (JIRA) |
[jira] Created: (NUTCH-531) Pages with no ContentType cause a Null Pointer exception |
Sun, 29 Jul, 20:55 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 08:59 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 09:01 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 09:01 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 09:01 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list |
Mon, 30 Jul, 09:53 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list |
Mon, 30 Jul, 09:53 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-526) Use a combiner in LinDbMerger to improve the performance as in LinkDb |
Mon, 30 Jul, 10:41 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Mon, 30 Jul, 10:48 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-526) Use a combiner in LinDbMerger to improve the performance as in LinkDb |
Mon, 30 Jul, 10:50 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-531) Pages with no ContentType cause a Null Pointer exception |
Mon, 30 Jul, 10:54 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list |
Mon, 30 Jul, 10:56 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 11:00 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS |
Mon, 30 Jul, 11:02 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format |
Mon, 30 Jul, 11:06 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child. |
Mon, 30 Jul, 11:08 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list |
Mon, 30 Jul, 14:52 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS |
Mon, 30 Jul, 16:20 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child. |
Mon, 30 Jul, 18:59 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list |
Mon, 30 Jul, 18:59 |
| Doğacan Güney (JIRA) |
[jira] Resolved: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS |
Mon, 30 Jul, 19:03 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS |
Mon, 30 Jul, 19:03 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Tue, 31 Jul, 02:35 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list |
Tue, 31 Jul, 02:41 |
| Hudson (JIRA) |
[jira] Commented: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS |
Tue, 31 Jul, 04:19 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Tue, 31 Jul, 05:20 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Tue, 31 Jul, 06:06 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Tue, 31 Jul, 06:15 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-534) SegmentMerger: add -normalize option |
Tue, 31 Jul, 10:35 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-534) SegmentMerger: add -normalize option |
Tue, 31 Jul, 10:35 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Tue, 31 Jul, 11:00 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Tue, 31 Jul, 11:13 |
| Blaž Smolnikar |
Pages in UTF-16 |
Tue, 31 Jul, 11:23 |
| Doğacan Güney (JIRA) |
[jira] Resolved: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list |
Tue, 31 Jul, 12:07 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list |
Tue, 31 Jul, 12:12 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch |
Tue, 31 Jul, 13:19 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-520) A common infrastructure for different index backends |
Tue, 31 Jul, 13:21 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch |
Tue, 31 Jul, 14:01 |
| Hudson (JIRA) |
[jira] Commented: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list |
Wed, 01 Aug, 05:23 |