| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-590) Index multiple docs per call using IndexingFilter extension point |
Tue, 15 Jan, 23:11 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-584) urls missing from fetchlist |
Wed, 16 Jan, 01:09 |
| Doğacan Güney (JIRA) |
[jira] Commented: (NUTCH-584) urls missing from fetchlist |
Wed, 16 Jan, 08:43 |
| Ruslan Ermilov (JIRA) |
[jira] Commented: (NUTCH-584) urls missing from fetchlist |
Wed, 16 Jan, 16:40 |
| Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-584) urls missing from fetchlist |
Wed, 16 Jan, 01:11 |
| Andrzej Bialecki |
Serious bug in Generator / FreeGenerator |
Wed, 16 Jan, 01:15 |
| iwan cornelius (JIRA) |
[jira] Commented: (NUTCH-363) Fetcher normalizes everything at least twice |
Wed, 16 Jan, 06:57 |
| Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-363) Fetcher normalizes everything at least twice |
Wed, 16 Jan, 07:27 |
| Hudson (JIRA) |
[jira] Commented: (NUTCH-597) Fetcher2 - java.lang.NullPointerException when host does not exist and fetcher.threads.per.host.by.ip is set to true causes threads to finish. |
Wed, 16 Jan, 08:15 |
| Andrzej Bialecki (JIRA) |
[jira] Resolved: (NUTCH-584) urls missing from fetchlist |
Wed, 16 Jan, 16:54 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-584) urls missing from fetchlist |
Wed, 16 Jan, 16:54 |
| Krishnamohan Meduri |
Help: parsing pdf files |
Wed, 16 Jan, 20:31 |
| Martin Kuen |
Re: Help: parsing pdf files |
Thu, 17 Jan, 00:07 |
| Manoj Bist |
Need pointers regarding accessing crawled data/customizing policy for crawl. |
Thu, 17 Jan, 07:32 |
| Andrzej Bialecki |
Re: Need pointers regarding accessing crawled data/customizing policy for crawl. |
Thu, 17 Jan, 09:35 |
| hud...@lucene.zones.apache.org |
Build failed in Hudson: Nutch-Nightly #331 |
Thu, 17 Jan, 16:34 |
| hud...@lucene.zones.apache.org |
Hudson build is back to normal: Nutch-Nightly #332 |
Fri, 18 Jan, 06:00 |
| Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-570) Improvement of URL Ordering in Generator.java |
Thu, 17 Jan, 20:20 |
| Andrzej Bialecki (JIRA) |
[jira] Resolved: (NUTCH-152) TaskRunner io pipes are not setDaemon(true), cleanup and exception errors are incomplete, max heap too small |
Thu, 17 Jan, 20:28 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml |
Thu, 17 Jan, 20:28 |
| Andrzej Bialecki (JIRA) |
[jira] Resolved: (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml |
Thu, 17 Jan, 20:28 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-152) TaskRunner io pipes are not setDaemon(true), cleanup and exception errors are incomplete, max heap too small |
Thu, 17 Jan, 20:28 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-159) Specify temp/working directory for crawl |
Thu, 17 Jan, 20:30 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-159) Specify temp/working directory for crawl |
Thu, 17 Jan, 20:30 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-95) DeleteDuplicates depends on the order of input segments |
Thu, 17 Jan, 20:32 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-95) DeleteDuplicates depends on the order of input segments |
Thu, 17 Jan, 20:32 |
| Andrzej Bialecki |
End-Of-Life status for 0.7.x? |
Thu, 17 Jan, 20:38 |
| Dennis Kubes |
Re: End-Of-Life status for 0.7.x? |
Thu, 17 Jan, 20:49 |
| Yousef Ourabi |
Re: End-Of-Life status for 0.7.x? |
Thu, 17 Jan, 21:18 |
| Chris Mattmann |
Re: End-Of-Life status for 0.7.x? |
Fri, 18 Jan, 00:29 |
| Sami Siren |
Re: End-Of-Life status for 0.7.x? |
Fri, 18 Jan, 04:22 |
| Jérôme Charron |
Re: End-Of-Life status for 0.7.x? |
Fri, 18 Jan, 08:25 |
| Doğacan Güney |
Re: End-Of-Life status for 0.7.x? |
Fri, 18 Jan, 09:17 |
| Cuong Le Manh |
Re: End-Of-Life status for 0.7.x? |
Fri, 18 Jan, 08:24 |
| Ahmad Dahlan |
New Developer |
Fri, 18 Jan, 01:53 |
| Andrzej Bialecki |
NOTICE: End Of Life status for Nutch 0.7.x |
Fri, 18 Jan, 09:52 |
| Sami Siren (JIRA) |
[jira] Resolved: (NUTCH-580) Remove deprecated hadoop api calls (FS) |
Sat, 19 Jan, 09:01 |
| Hudson Apache Zone |
Build failed in Hudson: Nutch-trunk #333 |
Sat, 19 Jan, 09:08 |
| Hudson Apache Zone |
Build failed in Hudson: Nutch-trunk #334 |
Sun, 20 Jan, 09:08 |
| Hudson Apache Zone |
Hudson build is back to normal: Nutch-trunk #335 |
Mon, 21 Jan, 09:15 |
| armand rayman (JIRA) |
[jira] Commented: (NUTCH-595) "Target file:/.... already exists" |
Sun, 20 Jan, 05:14 |
| kishore.krish...@wipro.com |
Crawl taking too much time |
Mon, 21 Jan, 05:57 |
| Dennis Kubes |
Re: Crawl taking too much time |
Mon, 21 Jan, 14:35 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-12) WebDBReader options to print incoming links |
Tue, 22 Jan, 14:10 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-12) WebDBReader options to print incoming links |
Tue, 22 Jan, 14:10 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-175) No input directories specified in: while crawing in nightly build from the 14.1.2006: sh ./nutch crawl urllist.txt -dir tmpdir |
Tue, 22 Jan, 14:12 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-226) CrawlDb Filter tool |
Tue, 22 Jan, 14:14 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-226) CrawlDb Filter tool |
Tue, 22 Jan, 14:14 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-59) meta data support in webdb |
Tue, 22 Jan, 14:18 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-59) meta data support in webdb |
Tue, 22 Jan, 14:18 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-128) second configuration nodes overwrites first node |
Tue, 22 Jan, 14:22 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-128) second configuration nodes overwrites first node |
Tue, 22 Jan, 14:22 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-115) jobtracker.jsp shows too much information |
Tue, 22 Jan, 14:22 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-115) jobtracker.jsp shows too much information |
Tue, 22 Jan, 14:22 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-163) LogFormatter design |
Tue, 22 Jan, 14:26 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-163) LogFormatter design |
Tue, 22 Jan, 14:26 |