| Doğacan Güney (JIRA) |
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch |
Tue, 31 Jul, 14:01 |
| Andrzej Bialecki |
Re: Plans on releasing another bug fix release? |
Tue, 03 Jul, 19:53 |
| Andrzej Bialecki |
Re: Plans on releasing another bug fix release? |
Wed, 04 Jul, 06:56 |
| Andrzej Bialecki |
Re: Plans on releasing another bug fix release? |
Wed, 04 Jul, 09:35 |
| Andrzej Bialecki |
Re: OPIC scoring differences |
Mon, 09 Jul, 12:28 |
| Andrzej Bialecki |
Re: Not renewing CrawlDatum on Inject |
Mon, 09 Jul, 19:17 |
| Andrzej Bialecki |
Re: Fwd: [Collex] application#index (ActionController::RoutingError) "no route found to match \"/nines/ escape(document.title) u,\" with {:method=>:get}" |
Tue, 10 Jul, 13:36 |
| Andrzej Bialecki |
Re: OPIC scoring differences |
Wed, 11 Jul, 18:14 |
| Andrzej Bialecki |
Re: Looking to fix relative path issue in linkdb |
Thu, 19 Jul, 10:24 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-439) Top Level Domains Indexing / Scoring |
Tue, 10 Jul, 09:24 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-505) Outlink urls should be validated |
Tue, 10 Jul, 13:51 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-505) Outlink urls should be validated |
Thu, 12 Jul, 15:18 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-511) Recrawling |
Thu, 12 Jul, 15:25 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-512) Search on date range |
Thu, 12 Jul, 15:33 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-515) Next fetch time is set incorrectly |
Mon, 16 Jul, 20:34 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-516) Next fetch time is not set when it is a CrawlDatum.STATUS_FETCH_GONE |
Tue, 17 Jul, 14:43 |
| Andrzej Bialecki (JIRA) |
[jira] Reopened: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining |
Wed, 18 Jul, 18:32 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining |
Wed, 18 Jul, 20:03 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining |
Thu, 19 Jul, 08:47 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-525) DeleteDuplicates generates ArrayIndexOutOfBoundsException when trying to rerun dedup on a segment |
Tue, 24 Jul, 08:36 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-439) Top Level Domains Indexing / Scoring |
Thu, 26 Jul, 18:03 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list |
Mon, 30 Jul, 14:52 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-514) Indexer should only index pages with fetch status SUCCESS |
Mon, 30 Jul, 16:20 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Tue, 31 Jul, 11:00 |
| Briggs |
Plans on releasing another bug fix release? |
Tue, 03 Jul, 14:12 |
| Briggs |
Re: Plans on releasing another bug fix release? |
Wed, 04 Jul, 19:04 |
| Briggs |
Re: Plans on releasing another bug fix release? |
Fri, 06 Jul, 16:45 |
| Briggs |
Re: Looking to fix relative path issue in linkdb |
Thu, 19 Jul, 13:19 |
| Briggs |
Re: Looking to fix relative path issue in linkdb |
Thu, 19 Jul, 17:31 |
| Carl Cerecke |
OPIC scoring differences |
Sun, 08 Jul, 22:38 |
| Carl Cerecke (JIRA) |
[jira] Created: (NUTCH-531) Pages with no ContentType cause a Null Pointer exception |
Sun, 29 Jul, 20:55 |
| Chris Hane (JIRA) |
[jira] Created: (NUTCH-519) prased incorrectly |
Wed, 18 Jul, 21:54 |
| Cuongnhc |
how can i fetch a site manual |
Thu, 12 Jul, 06:56 |
| Daniel Clark (JIRA) |
[jira] Created: (NUTCH-524) Generate Problem with Single Node |
Mon, 23 Jul, 21:27 |
| Daniel Clark (JIRA) |
[jira] Updated: (NUTCH-524) Generate Problem with Single Node |
Mon, 23 Jul, 21:29 |
| David Fuhry |
Patch to skip hidden plugin directories |
Tue, 03 Jul, 17:33 |
| Dennis Kubes (JIRA) |
[jira] Reopened: (NUTCH-471) Fix synchronization in NutchBean creation |
Fri, 13 Jul, 20:58 |
| Dennis Kubes (JIRA) |
[jira] Commented: (NUTCH-471) Fix synchronization in NutchBean creation |
Sat, 14 Jul, 13:03 |
| Dennis Kubes (JIRA) |
[jira] Closed: (NUTCH-471) Fix synchronization in NutchBean creation |
Sat, 14 Jul, 13:05 |
| Doug Cook (JIRA) |
[jira] Commented: (NUTCH-25) needs 'character encoding' detector |
Fri, 20 Jul, 23:59 |
| Doug Cook (JIRA) |
[jira] Commented: (NUTCH-25) needs 'character encoding' detector |
Sat, 21 Jul, 00:09 |
| Doug Cook (JIRA) |
[jira] Commented: (NUTCH-25) needs 'character encoding' detector |
Sat, 21 Jul, 02:11 |
| Doug Cook (JIRA) |
[jira] Commented: (NUTCH-25) needs 'character encoding' detector |
Sat, 21 Jul, 17:00 |
| Doug Cook (JIRA) |
[jira] Commented: (NUTCH-25) needs 'character encoding' detector |
Sat, 21 Jul, 20:03 |
| Doug Cook (JIRA) |
[jira] Commented: (NUTCH-25) needs 'character encoding' detector |
Tue, 24 Jul, 16:52 |
| Doug Cook (JIRA) |
[jira] Updated: (NUTCH-25) needs 'character encoding' detector |
Tue, 24 Jul, 17:04 |
| Doug Cook (JIRA) |
[jira] Updated: (NUTCH-25) needs 'character encoding' detector |
Tue, 24 Jul, 17:37 |
| Doug Cook (JIRA) |
[jira] Updated: (NUTCH-25) needs 'character encoding' detector |
Tue, 24 Jul, 18:31 |
| Doug Cook (JIRA) |
[jira] Commented: (NUTCH-25) needs 'character encoding' detector |
Wed, 25 Jul, 16:39 |
| Doug Cook (JIRA) |
[jira] Commented: (NUTCH-25) needs 'character encoding' detector |
Thu, 26 Jul, 00:39 |
| Doug Cutting |
Re: Plans on releasing another bug fix release? |
Tue, 03 Jul, 23:29 |
| Emmanuel |
CrawlDbReader TopN |
Wed, 25 Jul, 11:50 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-507) lib-lucene-analyzers jar defintion is wrong in plugin.xml |
Sat, 07 Jul, 17:18 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-508) ${hadoop.log.dir} and ${hadoop.log.file} are not propagated to the tasktracker |
Sat, 07 Jul, 17:28 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-509) Update Crawldb: avoid to start a job if there is no valid segment |
Sun, 08 Jul, 08:04 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-509) Update Crawldb: avoid to start a job if there is no valid segment |
Sun, 08 Jul, 08:04 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-509) Update Crawldb: avoid to start a job if there is no valid segment |
Mon, 09 Jul, 06:16 |
| Emmanuel Joke (JIRA) |
[jira] Closed: (NUTCH-509) Update Crawldb: avoid to start a job if there is no valid segment |
Mon, 09 Jul, 06:18 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-516) Next fetch time is not set when it is a CrawlDatum.STATUS_FETCH_GONE |
Tue, 17 Jul, 12:08 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-516) Next fetch time is not set when it is a CrawlDatum.STATUS_FETCH_GONE |
Wed, 18 Jul, 06:35 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-522) Use URLValidator in the Injector |
Thu, 19 Jul, 11:45 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-522) Use URLValidator in the Injector |
Thu, 19 Jul, 11:45 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-522) Use URLValidator in the Injector |
Fri, 20 Jul, 02:45 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-522) Use URLValidator in the Injector |
Fri, 20 Jul, 08:43 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-526) Use a combiner in LinDbMerger to improve the performance as in LinkDb |
Wed, 25 Jul, 02:24 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-526) Use a combiner in LinDbMerger to improve the performance as in LinkDb |
Wed, 25 Jul, 02:24 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format |
Thu, 26 Jul, 07:55 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format |
Thu, 26 Jul, 07:57 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child. |
Fri, 27 Jul, 03:30 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-529) NodeWalker.skipChildren don't wrok for more than 1 child. |
Fri, 27 Jul, 03:30 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-522) Use URLValidator in the Injector |
Fri, 27 Jul, 06:53 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-522) Use URLValidator in the Injector |
Fri, 27 Jul, 06:55 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-522) Use URLValidator in the Injector |
Fri, 27 Jul, 06:55 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-522) Use URLValidator in the Injector |
Sat, 28 Jul, 05:04 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-530) Add a combiner to improve performance on updatedb |
Sun, 29 Jul, 08:32 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-530) Add a combiner to improve performance on updatedb |
Sun, 29 Jul, 08:35 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-526) Use a combiner in LinDbMerger to improve the performance as in LinkDb |
Sun, 29 Jul, 08:37 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 08:59 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 09:01 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 09:01 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Mon, 30 Jul, 09:01 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list |
Mon, 30 Jul, 09:53 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-533) LinkDbMerger: url normlaized is not updated in the key and inlinks list |
Mon, 30 Jul, 09:53 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-526) Use a combiner in LinDbMerger to improve the performance as in LinkDb |
Mon, 30 Jul, 10:50 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Tue, 31 Jul, 02:35 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-533) LinkDbMerger: url normalized is not updated in the key and inlinks list |
Tue, 31 Jul, 02:41 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-532) CrawlDbMerger: wrong computation of last fetch time |
Tue, 31 Jul, 05:20 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-534) SegmentMerger: add -normalize option |
Tue, 31 Jul, 10:35 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-534) SegmentMerger: add -normalize option |
Tue, 31 Jul, 10:35 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Tue, 31 Jul, 11:13 |
| Enis Soztutar (JIRA) |
[jira] Created: (NUTCH-510) IndexMerger delete working dir |
Mon, 09 Jul, 06:37 |
| Enis Soztutar (JIRA) |
[jira] Updated: (NUTCH-510) IndexMerger delete working dir |
Mon, 09 Jul, 06:52 |
| Enis Soztutar (JIRA) |
[jira] Issue Comment Edited: (NUTCH-510) IndexMerger delete working dir |
Mon, 09 Jul, 12:34 |
| Enis Soztutar (JIRA) |
[jira] Commented: (NUTCH-508) ${hadoop.log.dir} and ${hadoop.log.file} are not propagated to the tasktracker |
Mon, 09 Jul, 13:48 |
| Enis Soztutar (JIRA) |
[jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring |
Tue, 10 Jul, 07:51 |
| Enis Soztutar (JIRA) |
[jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring |
Tue, 10 Jul, 14:57 |
| Enis Soztutar (JIRA) |
[jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring |
Wed, 11 Jul, 05:57 |
| Enis Soztutar (JIRA) |
[jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring |
Wed, 11 Jul, 05:59 |
| Enis Soztutar (JIRA) |
[jira] Created: (NUTCH-517) build encoding should be UTF-8 |
Wed, 18 Jul, 08:09 |
| Enis Soztutar (JIRA) |
[jira] Updated: (NUTCH-517) build encoding should be UTF-8 |
Wed, 18 Jul, 08:11 |