nutch-dev mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring
Date Thu, 08 Feb 2007 07:20:06 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar updated NUTCH-439:
--------------------------------

    Attachment: tld_plugin_v1.1.patch

I accidentally forgot to unset http.agent.name in v1.0. This version is the same
except that the agent name is not set. This patch obsoletes v1.0.


> Top Level Domains Indexing / Scoring
> ------------------------------------
>
>                 Key: NUTCH-439
>                 URL: https://issues.apache.org/jira/browse/NUTCH-439
>             Project: Nutch
>          Issue Type: New Feature
>          Components: indexer
>    Affects Versions: 0.9.0
>            Reporter: Enis Soztutar
>         Attachments: tld_plugin_v1.0.patch, tld_plugin_v1.1.patch
>
>
> Top Level Domains (TLDs) are the last part(s) of the host name in the DNS. TLDs
> are managed by the Internet Assigned Numbers Authority (IANA). IANA divides TLDs into
> three groups: infrastructure, generic (such as "com", "edu"), and country code TLDs
> (such as "de", "tr"). Indexing the top level domain, and optionally boosting on it,
> is needed for improving the search results and enhancing locality.
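
To make the description above concrete, here is a minimal sketch of extracting a TLD from a host name, placing it in one of the three groups, and looking up a boost factor. It is not taken from the attached patch: the class TldSketch, its methods, the sample TLD list, and the boost values are all hypothetical, and it looks only at the last label, so multi-part suffixes such as "co.uk" are not handled.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration; not part of the patch or of Nutch. */
final class TldSketch {

    enum Kind { INFRASTRUCTURE, GENERIC, COUNTRY_CODE, UNKNOWN }

    // Small illustrative sample; a real list would be loaded from IANA's data.
    private static final Set<String> GENERIC = Set.of("com", "edu", "org", "net");

    // Made-up boost factors keyed by TLD; anything not listed gets 1.0.
    private static final Map<String, Float> BOOSTS = Map.of("edu", 1.2f, "tr", 1.5f);

    /** Returns the last dot-separated label of a host name, lower-cased. */
    static String tld(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        if (h.endsWith(".")) {
            h = h.substring(0, h.length() - 1); // drop a trailing root dot
        }
        // lastIndexOf returns -1 when there is no dot, so the whole name is returned.
        return h.substring(h.lastIndexOf('.') + 1);
    }

    /** Places a lower-cased TLD in one of the three groups named above. */
    static Kind classify(String tld) {
        if (tld.equals("arpa")) {
            return Kind.INFRASTRUCTURE;
        }
        if (GENERIC.contains(tld)) {
            return Kind.GENERIC;
        }
        // Assumption: any other two-letter ASCII label is treated as a country code.
        if (tld.matches("[a-z]{2}")) {
            return Kind.COUNTRY_CODE;
        }
        return Kind.UNKNOWN;
    }

    /** Looks up the illustrative boost factor for a TLD, defaulting to 1.0. */
    static float boost(String tld) {
        return BOOSTS.getOrDefault(tld, 1.0f);
    }

    public static void main(String[] args) {
        for (String host : new String[] {"lucene.apache.org", "www.example.tr"}) {
            String t = tld(host);
            System.out.println(host + " -> " + t + " " + classify(t) + " boost=" + boost(t));
        }
    }
}
```

In an indexing setting, the value returned by tld would be stored as a field on the document and the boost factor applied at indexing or scoring time; how the patch actually wires this into Nutch is not shown here.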

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

