lucene-dev mailing list archives

From "Steven Parkes (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-971) Create enwiki indexable data as line-per-article rather than file-per-article
Date Wed, 01 Aug 2007 15:29:52 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516997 ]

Steven Parkes commented on LUCENE-971:
--------------------------------------

I can look at what it would take to avoid the line file ... but ... what about the overhead
of the XML parser? I don't tend to think of XML parsers as "light". Would bundling that into
the test be a concern?

I guess it's not an issue if you're just using this to create an index and then doing
your performance measurements on queries against that index. But for measuring indexing performance,
I would be cautious about bundling in the XML processing until its cost is proven insignificant.
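
One way to check whether the XML cost matters is to time the parse on its own, with no indexing attached. The following is only a rough sketch under stated assumptions: the class name XmlOverheadSketch, the synthetic <mediawiki>/<page> layout, and the page count are all made up for illustration and are not taken from the patch or the real enwiki dump. It uses the JDK's built-in SAX parser and does nothing but count pages.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Lucene or the LUCENE-971 patch.
public class XmlOverheadSketch {

    /** Counts <page> elements and the characters the parser delivers. */
    static final class PageCounter extends DefaultHandler {
        int pages = 0;
        long chars = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("page".equals(qName)) {
                pages++;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            chars += length;
        }
    }

    /** Builds a small synthetic dump; the element layout is an assumption. */
    public static byte[] syntheticDump(int pageCount) {
        StringBuilder sb = new StringBuilder("<mediawiki>");
        for (int i = 0; i < pageCount; i++) {
            sb.append("<page><title>Article ").append(i).append("</title>")
              .append("<text>body text for article ").append(i).append("</text></page>");
        }
        sb.append("</mediawiki>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Parses the dump once with SAX and returns the number of pages seen. */
    public static int parseAndCount(byte[] xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        PageCounter counter = new PageCounter();
        parser.parse(new ByteArrayInputStream(xml), counter);
        return counter.pages;
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = syntheticDump(10_000);
        parseAndCount(xml); // warm-up pass so JIT compilation is less of a factor
        long start = System.nanoTime();
        int pages = parseAndCount(xml);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(pages + " pages parsed in " + elapsedMs + " ms");
    }
}
```

Comparing that number against the time for a full indexing run over the same data would give a first estimate of how much of the total is XML processing. A synthetic dump is far simpler than real wiki markup, so the result is at best a lower bound.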

> Create enwiki indexable data as line-per-article rather than file-per-article
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-971
>                 URL: https://issues.apache.org/jira/browse/LUCENE-971
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Steven Parkes
>         Attachments: LUCENE-971.patch.txt
>
>
> Create a line per article rather than a file. Consume with indexLineFile task.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

