| NIDHI MALIK |
nutch internet crawling help |
Fri, 28 Dec, 11:28 |
|
Build failed in Hudson: Nutch-Nightly #312 |
|
| hud...@lucene.zones.apache.org |
Build failed in Hudson: Nutch-Nightly #312 |
Tue, 01 Jan, 04:19 |
| hud...@lucene.zones.apache.org |
Build failed in Hudson: Nutch-Nightly #313 |
Wed, 02 Jan, 04:24 |
| hud...@lucene.zones.apache.org |
Build failed in Hudson: Nutch-Nightly #314 |
Wed, 02 Jan, 16:05 |
| hud...@lucene.zones.apache.org |
Hudson build is back to normal: Nutch-Nightly #315 |
Wed, 02 Jan, 19:38 |
| Emmanuel Joke (JIRA) |
[jira] Created: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation |
Wed, 02 Jan, 08:54 |
|
[jira] Updated: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation |
|
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation |
Wed, 02 Jan, 08:58 |
| Emmanuel Joke (JIRA) |
[jira] Updated: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation |
Mon, 07 Jan, 07:37 |
| Frank McCown |
Student contributions |
Wed, 02 Jan, 22:44 |
| jian chen |
Re: Student contributions |
Wed, 02 Jan, 22:49 |
| Chris Mattmann |
Re: Student contributions |
Thu, 03 Jan, 01:43 |
| Frank McCown |
Re: Student contributions |
Thu, 03 Jan, 15:29 |
| hud...@lucene.zones.apache.org |
Build failed in Hudson: Nutch-Nightly #316 |
Thu, 03 Jan, 04:42 |
| hud...@lucene.zones.apache.org |
Hudson build is back to normal: Nutch-Nightly #317 |
Fri, 04 Jan, 05:44 |
|
[jira] Commented: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server |
|
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server |
Fri, 04 Jan, 07:31 |
| Hudson (JIRA) |
[jira] Commented: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server |
Sat, 05 Jan, 05:47 |
|
[jira] Commented: (NUTCH-580) Remove deprecated hadoop api calls (FS) |
|
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-580) Remove deprecated hadoop api calls (FS) |
Fri, 04 Jan, 07:37 |
| Hudson (JIRA) |
[jira] Commented: (NUTCH-580) Remove deprecated hadoop api calls (FS) |
Sat, 19 Jan, 09:10 |
| Hudson (JIRA) |
[jira] Commented: (NUTCH-580) Remove deprecated hadoop api calls (FS) |
Sun, 20 Jan, 11:10 |
|
[jira] Commented: (NUTCH-531) Pages with no ContentType cause a Null Pointer exception |
|
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-531) Pages with no ContentType cause a Null Pointer exception |
Fri, 04 Jan, 07:57 |
| Carl Cerecke (JIRA) |
[jira] Commented: (NUTCH-531) Pages with no ContentType cause a Null Pointer exception |
Tue, 08 Jan, 05:58 |
| Emmanuel Joke (JIRA) |
[jira] Issue Comment Edited: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server |
Fri, 04 Jan, 09:55 |
|
[jira] Commented: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS |
|
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS |
Fri, 04 Jan, 11:04 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS |
Tue, 15 Jan, 22:27 |
| Doğacan Güney (JIRA) |
[jira] Resolved: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server |
Fri, 04 Jan, 19:51 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-481) http.content.limit is broken in the protocol-httpclient plugin |
Fri, 04 Jan, 19:51 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-561) HttpClient plugin does not work with NTLM authentication |
Fri, 04 Jan, 19:53 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-539) HttpClient plugin does not work with BasicAuthentication |
Fri, 04 Jan, 19:53 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-560) protocol-httpclient reading more bytes than http.content.limit |
Fri, 04 Jan, 19:53 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server |
Fri, 04 Jan, 19:53 |
|
[jira] Commented: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation |
|
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation |
Fri, 04 Jan, 19:59 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation |
Sat, 12 Jan, 08:44 |
|
[jira] Commented: (NUTCH-567) Proper (?) handling of URIs in TagSoup. |
|
| Dawid Weiss (JIRA) |
[jira] Commented: (NUTCH-567) Proper (?) handling of URIs in TagSoup. |
Sat, 05 Jan, 17:38 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-567) Proper (?) handling of URIs in TagSoup. |
Sat, 05 Jan, 23:02 |
| hud...@lucene.zones.apache.org |
Build failed in Hudson: Nutch-Nightly #319 |
Sun, 06 Jan, 04:27 |
| hud...@lucene.zones.apache.org |
Build failed in Hudson: Nutch-Nightly #320 |
Mon, 07 Jan, 04:29 |
| hud...@lucene.zones.apache.org |
Build failed in Hudson: Nutch-Nightly #321 |
Tue, 08 Jan, 04:46 |
| hud...@lucene.zones.apache.org |
Build failed in Hudson: Nutch-Nightly #322 |
Wed, 09 Jan, 05:31 |
| hud...@lucene.zones.apache.org |
Hudson build is back to normal: Nutch-Nightly #323 |
Wed, 09 Jan, 20:40 |
| Chris Mattmann |
Tika 0.1-incubating released |
Mon, 07 Jan, 18:00 |
| sudarat (JIRA) |
[jira] Created: (NUTCH-599) nutch crawl and index problem |
Tue, 08 Jan, 01:46 |
| Susam Pal |
Re: [jira] Created: (NUTCH-599) nutch crawl and index problem |
Tue, 08 Jan, 04:51 |
| Susam Pal |
Re: [jira] Created: (NUTCH-599) nutch crawl and index problem |
Tue, 08 Jan, 04:57 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-599) nutch crawl and index problem |
Tue, 08 Jan, 07:44 |
| Jesiel Trevisan |
Problems with Hadoop Log4J on Nutch 0.8.1 |
Tue, 08 Jan, 18:01 |
| sudarat (JIRA) |
[jira] Created: (NUTCH-600) Nutch index problem |
Wed, 09 Jan, 04:54 |
| tigger . |
nutch and future |
Thu, 10 Jan, 16:34 |
| Dennis Kubes |
Re: nutch and future |
Thu, 10 Jan, 17:08 |
|
[jira] Commented: (NUTCH-534) SegmentMerger: add -normalize option |
|
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-534) SegmentMerger: add -normalize option |
Fri, 11 Jan, 12:02 |
| Hudson (JIRA) |
[jira] Commented: (NUTCH-534) SegmentMerger: add -normalize option |
Wed, 16 Jan, 08:15 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format |
Fri, 11 Jan, 12:04 |
| Doğacan Güney (JIRA) |
[jira] Closed: (NUTCH-600) Nutch index problem |
Fri, 11 Jan, 18:03 |
| viz |
setting number of reduce outputs problem |
Sat, 12 Jan, 00:05 |
| Andrzej Bialecki |
Re: setting number of reduce outputs problem |
Sat, 12 Jan, 13:15 |
| Bryan Bishop |
Plugins? |
Sat, 12 Jan, 01:37 |
| Bryan Bishop |
Re: Plugins? |
Sat, 12 Jan, 01:48 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-534) SegmentMerger: add -normalize option |
Tue, 15 Jan, 17:55 |
| Andrzej Bialecki (JIRA) |
[jira] Resolved: (NUTCH-534) SegmentMerger: add -normalize option |
Tue, 15 Jan, 17:55 |
|
[jira] Commented: (NUTCH-368) Message queueing system |
|
| Chris Chiappone (JIRA) |
[jira] Commented: (NUTCH-368) Message queueing system |
Tue, 15 Jan, 21:09 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-368) Message queueing system |
Tue, 22 Jan, 14:50 |
| Andrzej Bialecki (JIRA) |
[jira] Resolved: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format |
Tue, 15 Jan, 22:03 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-528) CrawlDbReader: add some new stats + dump into a csv format |
Tue, 15 Jan, 22:05 |
| Andrzej Bialecki (JIRA) |
[jira] Resolved: (NUTCH-597) Fetcher2 - java.lang.NullPointerException when host does not exist and fetcher.threads.per.host.by.ip is set to true causes threads to finish. |
Tue, 15 Jan, 22:39 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-597) Fetcher2 - java.lang.NullPointerException when host does not exist and fetcher.threads.per.host.by.ip is set to true causes threads to finish. |
Tue, 15 Jan, 22:41 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-594) Serve Nutch search results in XML and JSON |
Tue, 15 Jan, 22:49 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-592) Fetcher2 : NPE for page with status ProtocolStatus.TEMP_MOVED |
Tue, 15 Jan, 23:01 |