| Doğacan Güney |
linkdb bug |
Thu, 28 Dec, 16:15 |
| Doğacan Güney |
Re: linkdb bug |
Fri, 29 Dec, 10:01 |
| 吴志敏 |
hi all: |
Sat, 09 Dec, 07:59 |
| Dogacan Güney (JIRA) |
[jira] Commented: (NUTCH-413) Fetcher ignores -noParsing command line option |
Fri, 08 Dec, 14:00 |
| Dogacan Güney (JIRA) |
[jira] Commented: (NUTCH-413) Fetcher ignores -noParsing command line option |
Fri, 08 Dec, 20:16 |
| Dogacan Güney (JIRA) |
[jira] Created: (NUTCH-417) After upgrade to hadoop-0.9.1, parsing and indexing doesn't work. |
Fri, 15 Dec, 13:27 |
| Dogacan Güney (JIRA) |
[jira] Commented: (NUTCH-417) After upgrade to hadoop-0.9.1, parsing and indexing doesn't work. |
Fri, 15 Dec, 13:31 |
| Dogacan Güney (JIRA) |
[jira] Updated: (NUTCH-417) After upgrade to hadoop-0.9.1, parsing and indexing doesn't work. |
Fri, 15 Dec, 13:31 |
| Dogacan Güney (JIRA) |
[jira] Commented: (NUTCH-417) After upgrade to hadoop-0.9.1, parsing and indexing doesn't work. |
Fri, 15 Dec, 14:47 |
| Dogacan Güney (JIRA) |
[jira] Created: (NUTCH-420) DeleteDuplicates.HashPartitioner depends on the order of IndexDocs |
Tue, 26 Dec, 11:30 |
| Dogacan Güney (JIRA) |
[jira] Updated: (NUTCH-420) DeleteDuplicates.HashPartitioner depends on the order of IndexDocs |
Tue, 26 Dec, 11:32 |
| Alan Tanaman |
RE: Issue with Boosting Fields |
Thu, 28 Dec, 13:39 |
| Alan Tanaman (JIRA) |
[jira] Created: (NUTCH-421) Allow predeterminate running order of index filters |
Wed, 27 Dec, 13:57 |
| Alan Tanaman (JIRA) |
[jira] Updated: (NUTCH-421) Allow predeterminate running order of index filters |
Wed, 27 Dec, 14:01 |
| Alan Tanaman (JIRA) |
[jira] Updated: (NUTCH-421) Allow predeterminate running order of index filters |
Wed, 27 Dec, 15:11 |
| Alan Tanaman (JIRA) |
[jira] Updated: (NUTCH-421) Allow predeterminate running order of index filters |
Wed, 27 Dec, 15:11 |
| Alan Tanaman (JIRA) |
[jira] Created: (NUTCH-422) index-extra plugin creates additional fields in the index, based on configurable logic |
Thu, 28 Dec, 19:23 |
| Alan Tanaman (JIRA) |
[jira] Updated: (NUTCH-422) index-extra plugin creates additional fields in the index, based on configurable logic |
Thu, 28 Dec, 19:25 |
| Andrzej Bialecki |
Warning: set speculative execution to false |
Fri, 15 Dec, 15:05 |
| Andrzej Bialecki |
Re: linkdb bug |
Thu, 28 Dec, 19:04 |
| Andrzej Bialecki |
Re: linkdb bug |
Sat, 30 Dec, 19:19 |
| Andrzej Bialecki (JIRA) |
[jira] Created: (NUTCH-415) Generate should mark selected records in crawlDB |
Fri, 15 Dec, 12:32 |
| Andrzej Bialecki (JIRA) |
[jira] Created: (NUTCH-416) CrawlDatum status and CrawlDbReducer refactoring |
Fri, 15 Dec, 12:47 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-417) After upgrade to hadoop-0.9.1, parsing and indexing doesn't work. |
Fri, 15 Dec, 14:09 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-415) Generate should mark selected records in crawlDB |
Fri, 15 Dec, 15:09 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-417) After upgrade to hadoop-0.9.1, parsing and indexing doesn't work. |
Sat, 16 Dec, 20:40 |
| Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-416) CrawlDatum status and CrawlDbReducer refactoring |
Wed, 20 Dec, 23:18 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-415) Generate should mark selected records in crawlDB |
Thu, 28 Dec, 00:10 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-416) CrawlDatum status and CrawlDbReducer refactoring |
Thu, 28 Dec, 00:14 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-322) Fetcher discards ProtocolStatus, doesn't store redirected pages |
Thu, 28 Dec, 00:18 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-273) When a page is redirected, the original url is NOT updated. |
Thu, 28 Dec, 00:18 |
| Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-274) Empty row in/at end of URL-list results in error |
Thu, 28 Dec, 00:22 |
| Armel T. Nene |
RE: Indexing and Re-crawling site |
Tue, 05 Dec, 09:18 |
| Armel T. Nene |
RE: Indexing and Re-crawling site |
Tue, 05 Dec, 11:10 |
| Armel T. Nene |
Nutch Re-crawl same file over and over again |
Wed, 06 Dec, 23:43 |
| Armel T. Nene |
Nutch site crawling |
Thu, 07 Dec, 10:47 |
| Armel T. Nene |
Fetching problem and FileProtocol bug in Nutch 0.8.1 |
Sun, 10 Dec, 21:16 |
| Brian Whitman |
parse-mp3 plugin concatenating previous tags for text field |
Mon, 11 Dec, 13:32 |
| Brian Whitman (JIRA) |
[jira] Created: (NUTCH-414) parse-mp3 plugin concatenating previous tags for text field |
Tue, 12 Dec, 15:29 |
| Briggs |
Re: Porn sites' link at the download page |
Sun, 10 Dec, 15:59 |
| Briggs |
Changing NutchConf params at Runtime. |
Mon, 11 Dec, 15:39 |
| Carsten Lehmann (JIRA) |
[jira] Created: (NUTCH-419) unavailable robots.txt kills fetch |
Sun, 24 Dec, 12:45 |
| Carsten Lehmann (JIRA) |
[jira] Updated: (NUTCH-419) unavailable robots.txt kills fetch |
Sun, 24 Dec, 13:01 |
| Carsten Lehmann (JIRA) |
[jira] Updated: (NUTCH-419) unavailable robots.txt kills fetch |
Sun, 24 Dec, 13:10 |
| Carsten Lehmann (JIRA) |
[jira] Updated: (NUTCH-419) unavailable robots.txt kills fetch |
Sun, 24 Dec, 13:10 |
| Carsten Lehmann (JIRA) |
[jira] Commented: (NUTCH-419) unavailable robots.txt kills fetch |
Sun, 24 Dec, 13:26 |
| Chris Mattmann |
Re: svn commit: r485076 - in /lucene/nutch/trunk/src: java/org/apache/nutch/metadata/SpellCheckedMetadata.java test/org/apache/nutch/metadata/TestSpellCheckedMetadata.java |
Sat, 09 Dec, 22:56 |
| Chris Mattmann |
Re: svn commit: r485076 - in /lucene/nutch/trunk/src: java/org/apache/nutch/metadata/SpellCheckedMetadata.java test/org/apache/nutch/metadata/TestSpellCheckedMetadata.java |
Sun, 10 Dec, 00:05 |
| Chris Mattmann |
Re: svn commit: r485076 - in /lucene/nutch/trunk/src: java/org/apache/nutch/metadata/SpellCheckedMetadata.java test/org/apache/nutch/metadata/TestSpellCheckedMetadata.java |
Sun, 10 Dec, 16:01 |
| Doug Cook (JIRA) |
[jira] Commented: (NUTCH-416) CrawlDatum status and CrawlDbReducer refactoring |
Wed, 20 Dec, 22:40 |
| Doug Cutting |
Re: Brochure for Nutch |
Fri, 08 Dec, 20:26 |
| Doug Cutting |
Re: What's the status of Nutch-GUI? |
Fri, 08 Dec, 20:35 |
| Eelco Lempsink (JIRA) |
[jira] Updated: (NUTCH-273) When a page is redirected, the original url is NOT updated. |
Fri, 22 Dec, 09:39 |
| Francois.McN...@bnc.ca |
NUTCH 0.8.1: Difficulties with Analyzers |
Wed, 13 Dec, 16:21 |
| Gavino Marras |
Protocol.secure |
Fri, 01 Dec, 14:32 |
| Jonathan Amir (JIRA) |
[jira] Created: (NUTCH-413) Fetcher ignores -noParsing command line option |
Thu, 07 Dec, 23:11 |
| Jonathan Amir (JIRA) |
[jira] Commented: (NUTCH-413) Fetcher ignores -noParsing command line option |
Fri, 08 Dec, 15:12 |
| Lukas Vlcek |
Re: Indexing and Re-crawling site |
Mon, 04 Dec, 22:11 |
| Lukas Vlcek |
Re: Indexing and Re-crawling site |
Wed, 13 Dec, 06:37 |
| Michael Stack |
Re: [Archive-access-discuss] Full List of Metadata Fields |
Wed, 06 Dec, 16:03 |
| Michael Wechner |
Extracting title from XHTML pages |
Wed, 20 Dec, 13:42 |
| Michael Wechner |
Re: Extracting title from XHTML pages |
Wed, 20 Dec, 16:45 |
| Michael Wechner |
difference between intranet and internet crawling |
Wed, 20 Dec, 16:47 |
| Michael Wechner |
Re: Extracting title from XHTML pages |
Thu, 21 Dec, 13:00 |
| Michael Wechner |
Re: Extracting title from XHTML pages |
Thu, 21 Dec, 13:01 |
| Michael Wechner (JIRA) |
[jira] Created: (NUTCH-418) Fixes parsing of XHTML (e.g. title) |
Thu, 21 Dec, 12:58 |
| Michael Wechner (JIRA) |
[jira] Updated: (NUTCH-418) Fixes parsing of XHTML (e.g. title) |
Thu, 21 Dec, 13:00 |
| Renaud Richardet (JIRA) |
[jira] Created: (NUTCH-412) plugin to parse the feed-url (rss/atom) of a blog |
Sun, 03 Dec, 07:21 |
| Renaud Richardet (JIRA) |
[jira] Updated: (NUTCH-412) plugin to parse the feed-url (rss/atom) of a blog |
Sun, 03 Dec, 07:23 |
| Renaud Richardet (JIRA) |
[jira] Updated: (NUTCH-412) plugin to parse the feed-url (rss/atom) of a blog |
Sun, 03 Dec, 08:00 |
| Rida Benjelloun |
Phrase query analysis-fr |
Sat, 02 Dec, 22:45 |
| Sami Siren |
Re: hi all: |
Sat, 09 Dec, 13:38 |
| Sami Siren |
Re: svn commit: r485076 - in /lucene/nutch/trunk/src: java/org/apache/nutch/metadata/SpellCheckedMetadata.java test/org/apache/nutch/metadata/TestSpellCheckedMetadata.java |
Sat, 09 Dec, 23:53 |
| Sami Siren |
Re: svn commit: r485076 - in /lucene/nutch/trunk/src: java/org/apache/nutch/metadata/SpellCheckedMetadata.java test/org/apache/nutch/metadata/TestSpellCheckedMetadata.java |
Sun, 10 Dec, 09:52 |
| Sami Siren |
include hadoop native libs to nutch? |
Mon, 11 Dec, 16:26 |
| Sami Siren |
Re: parse-mp3 plugin concatenating previous tags for text field |
Tue, 12 Dec, 15:13 |
| Sami Siren |
Re: Fetching problem and FileProtocol bug in Nutch 0.8.1 |
Tue, 12 Dec, 16:08 |
| Sami Siren |
Re: Extracting title from XHTML pages |
Wed, 20 Dec, 14:40 |
| Sami Siren (JIRA) |
[jira] Commented: (NUTCH-248) add support for internationalized domain names |
Mon, 11 Dec, 18:54 |
| Sami Siren (JIRA) |
[jira] Commented: (NUTCH-415) Generate should mark selected records in crawlDB |
Fri, 15 Dec, 14:59 |
| Sami Siren (JIRA) |
[jira] Updated: (NUTCH-272) Max. pages to crawl/fetch per site (emergency limit) |
Thu, 21 Dec, 05:10 |
| Sami Siren (JIRA) |
[jira] Commented: (NUTCH-418) Fixes parsing of XHTML (e.g. title) |
Thu, 21 Dec, 14:49 |
| Sean Dean (JIRA) |
[jira] Commented: (NUTCH-224) Nutch doesn't handle Korean text at all |
Sat, 02 Dec, 01:41 |
| Sean Dean (JIRA) |
[jira] Commented: (NUTCH-224) Nutch doesn't handle Korean text at all |
Sat, 02 Dec, 01:47 |
| Sean Dean (JIRA) |
[jira] Commented: (NUTCH-417) After upgrade to hadoop-0.9.1, parsing and indexing doesn't work. |
Sat, 16 Dec, 20:20 |
| Shay Lawless |
Full List of Metadata Fields |
Wed, 06 Dec, 15:31 |
| Shtykh Roman |
Re: implement thai language indexing and search |
Tue, 12 Dec, 12:25 |
| Stefan Groschupf |
Re: What's the status of Nutch-GUI? |
Sat, 02 Dec, 08:04 |
| Thorsten Scherler |
Re: implement thai language indexing and search |
Thu, 21 Dec, 08:30 |
| Zaheed Haque |
Re: Porn sites' link at the download page |
Sun, 10 Dec, 19:45 |
| bruce |
lucene/nutch investigation |
Tue, 05 Dec, 17:43 |
| howard chen |
Want some idea abt distributed searching behind Nutch |
Fri, 08 Dec, 16:46 |
| howard chen |
Porn sites' link at the download page |
Sun, 10 Dec, 09:21 |
| hyrogen |
crawl null pointer |
Thu, 21 Dec, 10:22 |
| kauu |
Re: hi all: |
Sun, 10 Dec, 05:28 |
| kauu |
hi all: |
Thu, 14 Dec, 14:28 |
| lukai |
Re: [jira] Updated: (NUTCH-273) When a page is redirected, the original url is NOT updated. |
Sun, 24 Dec, 07:34 |
| sanjeev |
Re: implement thai language indexing and search |
Mon, 04 Dec, 07:21 |
| sanjeev |
Re: implement thai language indexing and search |
Tue, 12 Dec, 05:39 |
| sanjeev |
Re: implement thai language indexing and search |
Fri, 15 Dec, 04:37 |