lucene-dev mailing list archives

From "Tom Burton-West (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2393) Utility to output total term frequency and df from a lucene index
Date Wed, 14 Apr 2010 17:24:49 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856967#action_12856967 ]

Tom Burton-West commented on LUCENE-2393:
-----------------------------------------

For an example of how this utility can be used, please see: http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-1

> Utility to output total term frequency and df from a lucene index
> -----------------------------------------------------------------
>
>                 Key: LUCENE-2393
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2393
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/*
>            Reporter: Tom Burton-West
>            Priority: Trivial
>         Attachments: LUCENE-2393.patch
>
>
> This is a command-line utility that takes a field name, a term, and an index directory, and
> outputs the term's document frequency (df) and its total number of occurrences
> in the index (i.e., the sum of the term's tf over all documents). It is useful for estimating
> the size of the term's entry in the *.prx files and the consequent disk I/O demands.
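
To make the two statistics concrete, here is a minimal sketch in plain Java. It is not the code from LUCENE-2393.patch and does not use the Lucene API; the class `TermStats`, its methods, and the in-memory "index" (each document modeled as a list of tokens for one field) are hypothetical names chosen for illustration only.

```java
import java.util.List;

/** Toy stand-in for one field of an index: each document is a list of tokens. */
final class TermStats {

    /** Document frequency: number of documents that contain the term at least once. */
    static int docFreq(List<List<String>> docs, String term) {
        int df = 0;
        for (List<String> doc : docs) {
            if (doc.contains(term)) {
                df++;
            }
        }
        return df;
    }

    /** Total term frequency: sum over all documents of the term's within-document tf. */
    static long totalTermFreq(List<List<String>> docs, String term) {
        long total = 0;
        for (List<String> doc : docs) {
            for (String token : doc) {
                if (token.equals(term)) {
                    total++;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Three hypothetical documents; "the" appears in two of them, three times overall.
        List<List<String>> docs = List.of(
            List.of("the", "cat", "and", "the", "hat"),
            List.of("a", "dog"),
            List.of("the", "end"));
        System.out.println("df=" + docFreq(docs, "the"));             // prints df=2
        System.out.println("total tf=" + totalTermFreq(docs, "the")); // prints total tf=3
    }
}
```

The gap between the two numbers is the point: df counts documents, while the total counts every occurrence, and since a position is recorded per occurrence, the total is the better rough indicator of how much position data a query on that term may have to read.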

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

