| Jérôme Charron | Re: End-Of-Life status for 0.7.x? | Fri, 18 Jan, 08:25 |
| Doğacan Güney | Re: End-Of-Life status for 0.7.x? | Fri, 18 Jan, 09:17 |
| Doğacan Güney (JIRA) | [jira] Resolved: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server | Fri, 04 Jan, 19:51 |
| Doğacan Güney (JIRA) | [jira] Closed: (NUTCH-481) http.content.limit is broken in the protocol-httpclient plugin | Fri, 04 Jan, 19:51 |
| Doğacan Güney (JIRA) | [jira] Closed: (NUTCH-539) HttpClient plugin does not work with BasicAuthentication | Fri, 04 Jan, 19:53 |
| Doğacan Güney (JIRA) | [jira] Closed: (NUTCH-561) HttpClient plugin does not work with NTLM authentication | Fri, 04 Jan, 19:53 |
| Doğacan Güney (JIRA) | [jira] Closed: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server | Fri, 04 Jan, 19:53 |
| Doğacan Güney (JIRA) | [jira] Closed: (NUTCH-560) protocol-httpclient reading more bytes than http.content.limit | Fri, 04 Jan, 19:53 |
| Doğacan Güney (JIRA) | [jira] Commented: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation | Fri, 04 Jan, 19:59 |
| Doğacan Güney (JIRA) | [jira] Commented: (NUTCH-567) Proper (?) handling of URIs in TagSoup. | Sat, 05 Jan, 23:02 |
| Doğacan Güney (JIRA) | [jira] Closed: (NUTCH-599) nutch crawl and index problem | Tue, 08 Jan, 07:44 |
| Doğacan Güney (JIRA) | [jira] Closed: (NUTCH-600) Nutch index problem | Fri, 11 Jan, 18:03 |
| Doğacan Güney (JIRA) | [jira] Commented: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation | Sat, 12 Jan, 08:44 |
| Doğacan Güney (JIRA) | [jira] Commented: (NUTCH-584) urls missing from fetchlist | Wed, 16 Jan, 08:43 |
| Ahmad Dahlan | New Developer | Fri, 18 Jan, 01:53 |
| Andrzej Bialecki | Re: setting number of reduce outputs problem | Sat, 12 Jan, 13:15 |
| Andrzej Bialecki | Serious bug in Generator / FreeGenerator | Wed, 16 Jan, 01:15 |
| Andrzej Bialecki | Re: Need pointers regarding accessing crawled data/customizing policy for crawl. | Thu, 17 Jan, 09:35 |
| Andrzej Bialecki | End-Of-Life status for 0.7.x? | Thu, 17 Jan, 20:38 |
| Andrzej Bialecki | NOTICE: End Of Life status for Nutch 0.7.x | Fri, 18 Jan, 09:52 |
| Andrzej Bialecki | Re: Reg: Nutch Admin GUI | Wed, 30 Jan, 09:35 |
| Andrzej Bialecki | Re: read crawldb. | Thu, 31 Jan, 07:52 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-534) SegmentMerger: add -normalize option | Tue, 15 Jan, 17:55 |
| Andrzej Bialecki (JIRA) | [jira] Resolved: (NUTCH-534) SegmentMerger: add -normalize option | Tue, 15 Jan, 17:55 |
| Andrzej Bialecki (JIRA) | [jira] Resolved: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format | Tue, 15 Jan, 22:03 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format | Tue, 15 Jan, 22:05 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS | Tue, 15 Jan, 22:27 |
| Andrzej Bialecki (JIRA) | [jira] Resolved: (NUTCH-597) Fetcher2 - java.lang.NullPointerException when host does not exist and fetcher.threads.per.host.by.ip is set to true causes threads to finish. | Tue, 15 Jan, 22:39 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-597) Fetcher2 - java.lang.NullPointerException when host does not exist and fetcher.threads.per.host.by.ip is set to true causes threads to finish. | Tue, 15 Jan, 22:41 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-594) Serve Nutch search results in XML and JSON | Tue, 15 Jan, 22:49 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-592) Fetcher2 : NPE for page with status ProtocolStatus.TEMP_MOVED | Tue, 15 Jan, 23:01 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-590) Index multiple docs per call using IndexingFilter extension point | Tue, 15 Jan, 23:11 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-584) urls missing from fetchlist | Wed, 16 Jan, 01:09 |
| Andrzej Bialecki (JIRA) | [jira] Updated: (NUTCH-584) urls missing from fetchlist | Wed, 16 Jan, 01:11 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-584) urls missing from fetchlist | Wed, 16 Jan, 16:54 |
| Andrzej Bialecki (JIRA) | [jira] Resolved: (NUTCH-584) urls missing from fetchlist | Wed, 16 Jan, 16:54 |
| Andrzej Bialecki (JIRA) | [jira] Updated: (NUTCH-570) Improvement of URL Ordering in Generator.java | Thu, 17 Jan, 20:20 |
| Andrzej Bialecki (JIRA) | [jira] Resolved: (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml | Thu, 17 Jan, 20:28 |
| Andrzej Bialecki (JIRA) | [jira] Resolved: (NUTCH-152) TaskRunner io pipes are not setDaemon(true), cleanup and exception errors are incomplete, max heap too small | Thu, 17 Jan, 20:28 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml | Thu, 17 Jan, 20:28 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-152) TaskRunner io pipes are not setDaemon(true), cleanup and exception errors are incomplete, max heap too small | Thu, 17 Jan, 20:28 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-159) Specify temp/working directory for crawl | Thu, 17 Jan, 20:30 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-159) Specify temp/working directory for crawl | Thu, 17 Jan, 20:30 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-95) DeleteDuplicates depends on the order of input segments | Thu, 17 Jan, 20:32 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-95) DeleteDuplicates depends on the order of input segments | Thu, 17 Jan, 20:32 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-12) WebDBReader options to print incoming links | Tue, 22 Jan, 14:10 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-12) WebDBReader options to print incoming links | Tue, 22 Jan, 14:10 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-175) No input directories specified in: while crawing in nightly build from the 14.1.2006: sh ./nutch crawl urllist.txt -dir tmpdir | Tue, 22 Jan, 14:12 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-226) CrawlDb Filter tool | Tue, 22 Jan, 14:14 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-226) CrawlDb Filter tool | Tue, 22 Jan, 14:14 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-59) meta data support in webdb | Tue, 22 Jan, 14:18 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-59) meta data support in webdb | Tue, 22 Jan, 14:18 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-128) second configuration nodes overwrites first node | Tue, 22 Jan, 14:22 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-128) second configuration nodes overwrites first node | Tue, 22 Jan, 14:22 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-115) jobtracker.jsp shows too much information | Tue, 22 Jan, 14:22 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-115) jobtracker.jsp shows too much information | Tue, 22 Jan, 14:22 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-163) LogFormatter design | Tue, 22 Jan, 14:26 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-163) LogFormatter design | Tue, 22 Jan, 14:26 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-252) Launching a segread/readdb command kills any running nutch commands | Tue, 22 Jan, 14:30 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-252) Launching a segread/readdb command kills any running nutch commands | Tue, 22 Jan, 14:31 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-438) Add -noAdditions to updatedb | Tue, 22 Jan, 14:38 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-440) Command line utilities should exit with an error message when given wrong arguments | Tue, 22 Jan, 14:38 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-438) Add -noAdditions to updatedb | Tue, 22 Jan, 14:38 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-440) Command line utilities should exit with an error message when given wrong arguments | Tue, 22 Jan, 14:38 |
| Andrzej Bialecki (JIRA) | [jira] Closed: (NUTCH-368) Message queueing system | Tue, 22 Jan, 14:50 |
| Andrzej Bialecki (JIRA) | [jira] Commented: (NUTCH-368) Message queueing system | Tue, 22 Jan, 14:50 |
| Apache Hudson Server | Build failed in Hudson: Nutch-trunk #340 | Sat, 26 Jan, 10:03 |
| Apache Hudson Server | Hudson build is back to normal: Nutch-trunk #341 | Sat, 26 Jan, 20:14 |
| Apache Hudson Server | Build failed in Hudson: Nutch-trunk #343 | Mon, 28 Jan, 06:30 |
| Apache Hudson Server | Build failed in Hudson: Nutch-trunk #344 | Mon, 28 Jan, 08:10 |
| Apache Hudson Server | Hudson build is back to normal: Nutch-trunk #345 | Tue, 29 Jan, 04:35 |
| Bryan Bishop | Plugins? | Sat, 12 Jan, 01:37 |
| Bryan Bishop | Re: Plugins? | Sat, 12 Jan, 01:48 |
| Carl Cerecke (JIRA) | [jira] Commented: (NUTCH-531) Pages with no ContentType cause a Null Pointer exception | Tue, 08 Jan, 05:58 |
| Chris Chiappone (JIRA) | [jira] Commented: (NUTCH-368) Message queueing system | Tue, 15 Jan, 21:09 |
| Chris Mattmann | Re: Student contributions | Thu, 03 Jan, 01:43 |
| Chris Mattmann | Tika 0.1-incubating released | Mon, 07 Jan, 18:00 |
| Chris Mattmann | Re: End-Of-Life status for 0.7.x? | Fri, 18 Jan, 00:29 |
| Cuong Le Manh | Re: End-Of-Life status for 0.7.x? | Fri, 18 Jan, 08:24 |
| Dawid Weiss (JIRA) | [jira] Commented: (NUTCH-567) Proper (?) handling of URIs in TagSoup. | Sat, 05 Jan, 17:38 |
| Dennis Kubes | Re: nutch and future | Thu, 10 Jan, 17:08 |
| Dennis Kubes | Re: End-Of-Life status for 0.7.x? | Thu, 17 Jan, 20:49 |
| Dennis Kubes | Re: Crawl taking too much time | Mon, 21 Jan, 14:35 |
| Dennis Kubes | Re: read crawldb. | Tue, 29 Jan, 16:50 |
| Dennis Kubes (JIRA) | [jira] Updated: (NUTCH-587) Upgrade Nutch to use Hadoop 0.15.3 release | Sat, 26 Jan, 22:08 |
| Dennis Kubes (JIRA) | [jira] Updated: (NUTCH-587) Upgrade Nutch to use Hadoop 0.15.3 release | Sat, 26 Jan, 22:18 |
| Dennis Kubes (JIRA) | [jira] Resolved: (NUTCH-587) Upgrade Nutch to use Hadoop 0.15.3 release | Mon, 28 Jan, 22:39 |
| Emmanuel Joke (JIRA) | [jira] Created: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation | Wed, 02 Jan, 08:54 |
| Emmanuel Joke (JIRA) | [jira] Updated: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation | Wed, 02 Jan, 08:58 |
| Emmanuel Joke (JIRA) | [jira] Commented: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server | Fri, 04 Jan, 07:31 |
| Emmanuel Joke (JIRA) | [jira] Commented: (NUTCH-580) Remove deprecated hadoop api calls (FS) | Fri, 04 Jan, 07:37 |
| Emmanuel Joke (JIRA) | [jira] Commented: (NUTCH-531) Pages with no ContentType cause a Null Pointer exception | Fri, 04 Jan, 07:57 |
| Emmanuel Joke (JIRA) | [jira] Issue Comment Edited: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server | Fri, 04 Jan, 09:55 |
| Emmanuel Joke (JIRA) | [jira] Commented: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS | Fri, 04 Jan, 11:04 |
| Emmanuel Joke (JIRA) | [jira] Updated: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation | Mon, 07 Jan, 07:37 |
| Emmanuel Joke (JIRA) | [jira] Commented: (NUTCH-534) SegmentMerger: add -normalize option | Fri, 11 Jan, 12:02 |
| Emmanuel Joke (JIRA) | [jira] Commented: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format | Fri, 11 Jan, 12:04 |
| Emmanuel Joke (JIRA) | [jira] Commented: (NUTCH-363) Fetcher normalizes everything at least twice | Wed, 16 Jan, 07:27 |
| Frank McCown | Student contributions | Wed, 02 Jan, 22:44 |
| Frank McCown | Re: Student contributions | Thu, 03 Jan, 15:29 |