lucene-dev mailing list archives

From "Steven Bower (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SOLR-6259) Performance issue with large number of fields and values when using copyFields
Date Sat, 19 Jul 2014 03:44:38 GMT
Steven Bower created SOLR-6259:
----------------------------------

             Summary: Performance issue with large number of fields and values when using copyFields
                 Key: SOLR-6259
                 URL: https://issues.apache.org/jira/browse/SOLR-6259
             Project: Solr
          Issue Type: Bug
    Affects Versions: 4.8.1
            Reporter: Steven Bower
            Priority: Critical


When you have a schema with a large number of fields (around 250 in my case) and you use
copyFields to populate a few fields (only 3-4 in my case), you see a severe degradation in
ingestion performance.

Tracking this down with a profiler, I found that Lucene's Document.getField() was using
87% of all CPU time. It turns out that getField() iterates over the list of fields in the
Document and returns the field whose name matches. With copyFields and lots of values,
getField() gets called a lot, so the cost of that linear scan adds up quickly.
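
To illustrate why a per-lookup scan over the field list gets expensive, here is a minimal
sketch. It does not use Lucene at all: SimpleField, ListDocument, IndexedDocument and
countComparisons are hypothetical stand-ins written for this example, and the map-based
variant is only one possible way to avoid the scan, not a description of how the issue was
or should be fixed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FieldLookupSketch {

    /** Hypothetical stand-in for a document field: just a name and a value. */
    static final class SimpleField {
        final String name;
        final String value;

        SimpleField(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Simplified document that finds a field by scanning its list, as described above. */
    static final class ListDocument {
        private final List<SimpleField> fields = new ArrayList<>();
        long comparisons = 0; // counts name comparisons so the cost is visible

        void add(SimpleField field) {
            fields.add(field);
        }

        SimpleField getField(String name) {
            for (SimpleField field : fields) {
                comparisons++;
                if (field.name.equals(name)) {
                    return field;
                }
            }
            return null;
        }
    }

    /** Assumed alternative: remember the first field per name in a map. */
    static final class IndexedDocument {
        private final List<SimpleField> fields = new ArrayList<>();
        private final Map<String, SimpleField> firstByName = new HashMap<>();

        void add(SimpleField field) {
            fields.add(field);
            firstByName.putIfAbsent(field.name, field);
        }

        SimpleField getField(String name) {
            return firstByName.get(name); // one hash lookup instead of a scan
        }
    }

    /** Builds a document with fieldCount fields and looks up the last one repeatedly. */
    static long countComparisons(int fieldCount, int lookups) {
        ListDocument doc = new ListDocument();
        for (int i = 0; i < fieldCount; i++) {
            doc.add(new SimpleField("field" + i, "v" + i));
        }
        String lastName = "field" + (fieldCount - 1);
        for (int i = 0; i < lookups; i++) {
            doc.getField(lastName);
        }
        return doc.comparisons;
    }

    public static void main(String[] args) {
        // 250 fields, 1000 lookups of the last field: 250 * 1000 comparisons.
        System.out.println(countComparisons(250, 1000)); // prints 250000
    }
}
```

In this toy model the number of name comparisons grows with the number of fields times the
number of lookups, which matches the shape of the slowdown described above. The actual
numbers in Solr will of course depend on the schema and the data.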



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

