| jqq | Searching multiple indexes with Nutch-2 servers,0 segments | Sat, 02 May, 05:59 |
| ron (JIRA) | [jira] Created: (NUTCH-734) option to filter "a" tag text | Sat, 02 May, 11:49 |
| Xalan | Similarity with few keywords | Sat, 02 May, 17:57 |
| Andrzej Bialecki | Re: Searching multiple indexes with Nutch-2 servers,0 segments | Mon, 04 May, 08:21 |
| jqq | Re: Searching multiple indexes with Nutch-2 servers,0 segments | Mon, 04 May, 09:54 |
| Andrzej Bialecki | Re: Searching multiple indexes with Nutch-2 servers,0 segments | Mon, 04 May, 13:02 |
| Ilguiz Latypov (JIRA) | [jira] Commented: (NUTCH-733) plain text view of cached files ignores HTML encoding | Mon, 04 May, 23:12 |
| MyD | Filtering URLs | Tue, 05 May, 14:49 |
| Ken Krugler | Re: Filtering URLs | Tue, 05 May, 16:24 |
| Gaurang Patel | Nutch crawled results for Clustering with Carrot2 | Wed, 06 May, 13:18 |
| Dawid Weiss | Re: Nutch crawled results for Clustering with Carrot2 | Thu, 07 May, 09:11 |
| jqq | Re: Searching multiple indexes with Nutch-2 servers,0 segments | Fri, 08 May, 01:47 |
| Susam Pal (JIRA) | [jira] Created: (NUTCH-735) crawl-tool.xml must be read before nutch-site.xml when invoked using crawl command | Sat, 09 May, 06:56 |
| Susam Pal (JIRA) | [jira] Updated: (NUTCH-735) crawl-tool.xml must be read before nutch-site.xml when invoked using crawl command | Sat, 09 May, 07:01 |
| Susam Pal | Re: crawl-tool.xml mentions nutch-site.xml for overriding but it is not possible | Sat, 09 May, 08:37 |
| Gaurang Patel | Source code of web pages crawled by Nutch | Mon, 11 May, 21:15 |
| Gaurang Patel | Content(source code) of web pages crawled by nutch | Tue, 12 May, 03:20 |
| Rodrigo Reyes C. | Is there any working Nutch Administration interface in Nutch 1.0? | Tue, 12 May, 16:16 |
| Marko Bauhardt | Re: Is there any working Nutch Administration interface in Nutch 1.0? | Tue, 12 May, 17:13 |
| Rodrigo Reyes C. | Re: Is there any working Nutch Administration interface in Nutch 1.0? | Tue, 12 May, 17:37 |
| Siddhartha Reddy | Nutch/Solr: storing the page cache in Solr | Wed, 13 May, 13:36 |
| malli j | Regarding Solr1.3 and Nutch 0.9 Integration | Wed, 13 May, 15:39 |
| Filipe Antunes (JIRA) | [jira] Created: (NUTCH-736) how long it takes nutch 1.0 to fetch | Thu, 14 May, 09:39 |
| Andrzej Bialecki | Re: Nutch/Solr: storing the page cache in Solr | Thu, 14 May, 11:59 |
| Filipe Antunes (JIRA) | [jira] Updated: (NUTCH-736) how long it takes nutch 1.0 to fetch | Thu, 14 May, 13:28 |
| Andrzej Bialecki | The Future of Nutch, reactivated | Thu, 14 May, 13:59 |
| Apache Wiki | [Nutch Wiki] Trivial Update of "HttpAuthenticationSchemes" by susam | Thu, 14 May, 14:30 |
| Apache Wiki | [Nutch Wiki] Update of "RunningNutchAndSolr" by amitkumar | Thu, 14 May, 17:07 |
| Apache Wiki | [Nutch Wiki] Update of "RunningNutchAndSolr" by amitkumar | Thu, 14 May, 17:09 |
| Apache Wiki | [Nutch Wiki] Update of "RunningNutchAndSolr" by amitkumar | Thu, 14 May, 17:16 |
| Apache Wiki | [Nutch Wiki] Update of "RunningNutchAndSolr" by amitkumar | Thu, 14 May, 17:22 |
| Apache Wiki | [Nutch Wiki] Update of "RunningNutchAndSolr" by amitkumar | Thu, 14 May, 17:28 |
| Mattmann, Chris A | Re: The Future of Nutch, reactivated | Thu, 14 May, 20:43 |
| Kirby Bohling | The Future of Nutch, reactivated | Fri, 15 May, 00:23 |
| Siddhartha Reddy | Re: Nutch/Solr: storing the page cache in Solr | Fri, 15 May, 09:06 |
| martin lopez (JIRA) | [jira] Commented: (NUTCH-386) Plugin to index categories by url rules | Sat, 16 May, 01:07 |
| martin lopez (JIRA) | [jira] Issue Comment Edited: (NUTCH-386) Plugin to index categories by url rules | Sat, 16 May, 01:09 |
| atencorps | Ranking Algorithms | Sun, 17 May, 14:55 |
| Aaron Binns | Re: The Future of Nutch, reactivated | Mon, 18 May, 19:01 |
| Dennis Kubes | Re: Ranking Algorithms | Mon, 18 May, 20:20 |
| Andrzej Bialecki | Re: The Future of Nutch, reactivated | Tue, 19 May, 08:01 |
| Aaron Binns | Re: The Future of Nutch, reactivated | Tue, 19 May, 19:23 |
| Mark Olson | Re: The Future of Nutch, reactivated | Tue, 19 May, 21:24 |
| Mark Olson | Re: The Future of Nutch, reactivated | Tue, 19 May, 21:26 |
| Bradford Stephens | Re: The Future of Nutch, reactivated | Tue, 19 May, 23:10 |
| Ken Krugler | Performance issues with queue-based fetching | Wed, 20 May, 00:27 |
| Frank McCown | Re: Support for Sitemap Protocol and Canonical URLs | Wed, 20 May, 21:05 |
| Andrzej Bialecki | Re: Support for Sitemap Protocol and Canonical URLs | Thu, 21 May, 08:38 |
| Frank McCown | Re: Support for Sitemap Protocol and Canonical URLs | Thu, 21 May, 13:02 |
| Donghyeok Kang | A link that begins with the question mark(?) can't be crawled. | Thu, 21 May, 13:51 |
| Andrzej Bialecki | Re: Support for Sitemap Protocol and Canonical URLs | Thu, 21 May, 14:23 |
| Dmitry Lihachev (JIRA) | [jira] Updated: (NUTCH-716) Make subcollection index filed multivalued | Fri, 22 May, 04:06 |
| Otis Gospodnetic (JIRA) | [jira] Resolved: (NUTCH-736) how long it takes nutch 1.0 to fetch | Sun, 24 May, 02:42 |
| Otis Gospodnetic (JIRA) | [jira] Commented: (NUTCH-731) Redirection of robots.txt in RobotRulesParser | Sun, 24 May, 03:20 |
| Otis Gospodnetic (JIRA) | [jira] Commented: (NUTCH-721) Fetcher2 Slow | Sun, 24 May, 03:51 |
| Otis Gospodnetic (JIRA) | [jira] Commented: (NUTCH-721) Fetcher2 Slow | Sun, 24 May, 04:02 |
| Otis Gospodnetic | Re: The Future of Nutch, reactivated | Sun, 24 May, 04:30 |
| Roger Dunk (JIRA) | [jira] Commented: (NUTCH-721) Fetcher2 Slow | Sun, 24 May, 05:20 |
| Dmitry Lihachev (JIRA) | [jira] Created: (NUTCH-737) urlnormalizer-unalias plugin | Tue, 26 May, 04:16 |
| Dmitry Lihachev (JIRA) | [jira] Updated: (NUTCH-737) urlnormalizer-unalias plugin | Tue, 26 May, 04:18 |
| Dmitry Lihachev (JIRA) | [jira] Updated: (NUTCH-737) urlnormalizer-unalias plugin | Tue, 26 May, 04:18 |
| Martina Koch (JIRA) | [jira] Created: (NUTCH-738) Close SegmentUpdater when FetchedSegments is closed | Tue, 26 May, 06:40 |
| Martina Koch (JIRA) | [jira] Updated: (NUTCH-738) Close SegmentUpdater when FetchedSegments is closed | Tue, 26 May, 06:42 |
| Martina Koch (JIRA) | [jira] Updated: (NUTCH-738) Close SegmentUpdater when FetchedSegments is closed | Tue, 26 May, 06:42 |
| Julien Nioche (JIRA) | [jira] Commented: (NUTCH-731) Redirection of robots.txt in RobotRulesParser | Tue, 26 May, 08:34 |
| Dmitry Lihachev (JIRA) | [jira] Updated: (NUTCH-737) urlnormalizer-unalias plugin | Tue, 26 May, 09:27 |
| Dmitry Lihachev (JIRA) | [jira] Updated: (NUTCH-737) urlnormalizer-unalias plugin | Tue, 26 May, 09:27 |
| Frank McCown | Re: Support for Sitemap Protocol and Canonical URLs | Tue, 26 May, 16:51 |
| Dmitry Lihachev (JIRA) | [jira] Commented: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum | Wed, 27 May, 02:46 |
| Julien Nioche (JIRA) | [jira] Updated: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum | Wed, 27 May, 09:10 |
| Marcin Okraszewski (JIRA) | [jira] Updated: (NUTCH-677) Segment merge filering based on segment content | Wed, 27 May, 21:05 |
| Marcin Okraszewski (JIRA) | [jira] Updated: (NUTCH-490) Extension point with filters for Neko HTML parser (with patch) | Wed, 27 May, 21:28 |
| Otis Gospodnetic (JIRA) | [jira] Assigned: (NUTCH-693) Add configurable option for treating nofollow behaviour. | Thu, 28 May, 04:16 |
| Otis Gospodnetic (JIRA) | [jira] Commented: (NUTCH-693) Add configurable option for treating nofollow behaviour. | Thu, 28 May, 04:18 |
| Otis Gospodnetic (JIRA) | [jira] Commented: (NUTCH-650) Hbase Integration | Thu, 28 May, 04:24 |
| Dmitry Lihachev (JIRA) | [jira] Created: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Thu, 28 May, 04:34 |
| Dmitry Lihachev (JIRA) | [jira] Updated: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Thu, 28 May, 04:36 |
| Dmitry Lihachev (JIRA) | [jira] Updated: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Thu, 28 May, 08:36 |
| Otis Gospodnetic (JIRA) | [jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Thu, 28 May, 17:51 |
| Otis Gospodnetic (JIRA) | [jira] Commented: (NUTCH-677) Segment merge filering based on segment content | Thu, 28 May, 17:59 |
| Kirby Bohling | Remove duplicate nutch conf files from .job file | Thu, 28 May, 18:30 |
| Marcin Okraszewski (JIRA) | [jira] Created: (NUTCH-740) Configuration option to override default language for fetched pages. | Thu, 28 May, 21:13 |
| Marcin Okraszewski (JIRA) | [jira] Updated: (NUTCH-740) Configuration option to override default language for fetched pages. | Thu, 28 May, 21:15 |
| Otis Gospodnetic | Re: Remove duplicate nutch conf files from .job file | Thu, 28 May, 21:35 |
| Otis Gospodnetic (JIRA) | [jira] Updated: (NUTCH-740) Configuration option to override default language for fetched pages. | Thu, 28 May, 21:39 |
| Dmitry Lihachev (JIRA) | [jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Fri, 29 May, 01:52 |
| Ken Krugler (JIRA) | [jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Fri, 29 May, 02:42 |
| Otis Gospodnetic (JIRA) | [jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Fri, 29 May, 03:38 |
| Dmitry Lihachev (JIRA) | [jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Fri, 29 May, 03:46 |
| Dmitry Lihachev (JIRA) | [jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Fri, 29 May, 03:48 |
| Dmitry Lihachev (JIRA) | [jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Fri, 29 May, 03:50 |
| Doğacan Güney (JIRA) | [jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Fri, 29 May, 06:52 |
| Dmitry Lihachev (JIRA) | [jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Fri, 29 May, 07:54 |
| Dmitry Lihachev (JIRA) | [jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Fri, 29 May, 08:00 |
| Dmitry Lihachev (JIRA) | [jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Fri, 29 May, 08:02 |
| Georg Kirschner | Eclipse Nutch1.0 IOException | Fri, 29 May, 13:34 |
| Marko Bauhardt | Re: Eclipse Nutch1.0 IOException | Fri, 29 May, 13:51 |
| Frank McCown | Re: Eclipse Nutch1.0 IOException | Fri, 29 May, 15:14 |
| Georg Kirschner | Re: Eclipse Nutch1.0 IOException | Fri, 29 May, 15:21 |
| Otis Gospodnetic (JIRA) | [jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop | Fri, 29 May, 18:08 |