| Doğacan Güney (JIRA) |
[jira] Resolved: (NUTCH-525) DeleteDuplicates generates ArrayIndexOutOfBoundsException when trying to rerun dedup on a segment |
Thu, 26 Jul, 08:54 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-516) Next fetch time is not set when it is a CrawlDatum.STATUS_FETCH_GONE |
Thu, 26 Jul, 12:55 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-525) DeleteDuplicates generates ArrayIndexOutOfBoundsException when trying to rerun dedup on a segment |
Thu, 26 Jul, 12:55 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child. |
Fri, 27 Jul, 03:30 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child. |
Fri, 27 Jul, 03:30 |
| Doğacan Güney (JIRA) |
[jira] Issue Comment Edited: (NUTCH-522) Use URLValidator in the Injector |
Fri, 27 Jul, 13:10 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-530) Add a combiner to improve performance on updatedb |
Sun, 29 Jul, 08:32 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-530) Add a combiner to improve performance on updatedb |
Sun, 29 Jul, 08:35 |
| Carl Cerecke (JIRA) |
[jira] Created: (NUTCH-531) Pages with no ContentType cause a Null Pointer exception |
Sun, 29 Jul, 20:55 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 08:59 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 09:01 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 09:01 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 09:01 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list |
Mon, 30 Jul, 09:53 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list |
Mon, 30 Jul, 09:53 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Mon, 30 Jul, 10:48 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Tue, 31 Jul, 02:35 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Tue, 31 Jul, 06:06 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Tue, 31 Jul, 11:00 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Tue, 31 Jul, 11:13 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-531) Pages with no ContentType cause a Null Pointer exception |
Mon, 30 Jul, 10:54 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list |
Mon, 30 Jul, 10:56 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list |
Mon, 30 Jul, 14:52 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 11:00 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Tue, 31 Jul, 05:20 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Tue, 31 Jul, 06:15 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS |
Mon, 30 Jul, 11:02 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS |
Mon, 30 Jul, 16:20 |
| Hudson (JIRA) |
[jira] Commented: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS |
Tue, 31 Jul, 04:19 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format |
Mon, 30 Jul, 11:06 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child. |
Mon, 30 Jul, 11:08 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-529) NodeWalker.skipChildren doesn't work for more than 1 child. |
Mon, 30 Jul, 18:59 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list |
Mon, 30 Jul, 18:59 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list |
Tue, 31 Jul, 02:41 |
| Doğacan Güney (JIRA) |
[jira] Resolved: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS |
Mon, 30 Jul, 19:03 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS |
Mon, 30 Jul, 19:03 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-534) SegmentMerger: add -normalize option |
Tue, 31 Jul, 10:35 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-534) SegmentMerger: add -normalize option |
Tue, 31 Jul, 10:35 |
| Blaž Smolnikar |
Pages in UTF-16 |
Tue, 31 Jul, 11:23 |
| Doğacan Güney (JIRA) |
[jira] Resolved: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list |
Tue, 31 Jul, 12:07 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list |
Tue, 31 Jul, 12:12 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch |
Tue, 31 Jul, 13:19 |
| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch |
Tue, 31 Jul, 14:01 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-520) A common infrastructure for different index backends |
Tue, 31 Jul, 13:21 |
| Hudson (JIRA) |
[jira] Commented: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list |
Wed, 01 Aug, 05:23 |