lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-7963) Useless getAttribute() in DefaultIndexingChain causes performance drop
Date Fri, 08 Sep 2017 13:00:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-7963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16158588#comment-16158588
] 

Uwe Schindler edited comment on LUCENE-7963 at 9/8/17 12:59 PM:
----------------------------------------------------------------

bq. If it writes to memory it's not dead code?

The write here is not the write to the termAtt local variable. This one is already removed
(it ignores the return value). The write comes from something inside getAttribute, as it needs
to cast the attribute so it does a lot of work. In fact it is also not dead code - although
the return value is ignroed, because the HashMap#get or the Class#cast may have side effects
(like throwing exception,...). It might have no side effects, but that's hard to figure out
by the optimizer. As far as I remeber: Hotspot was never able to remove hashmap gets easily!


was (Author: thetaphi):
bq. If it writes to memory it's not dead code?

The write here is not the write to the termAtt local variable. This one is already removed
(it ignores the return value). The write comes from something inside getAttribute, as it needs
to cast the attribute so it does a lot of work. In fact it is also not dead code - although
the return value is ignroed, because the HashMap#get may have side effects (like throwing
exception,...). It might have no side effects, but that's hard to figure out by the optimizer.
As far as I remeber: Hotspot was never able to remove hashmap gets easily!

> Useless getAttribute() in DefaultIndexingChain causes performance drop
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-7963
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7963
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 7.0
>         Environment: *Hardware*
> i3.16xlarge (Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, 64 cores, two sockets)
> *Software*
> * Ubuntu 16.04 (kernel 4.4.0-1022-aws)
> * Java 1.8.0_144
>            Reporter: Daniel Mitterdorfer
>            Assignee: Uwe Schindler
>            Priority: Blocker
>             Fix For: 7.0
>
>         Attachments: 0001-Remove-unused-variable-in-DefaultIndexingChain.patch
>
>
> Some background: We have spotted a significant drop in indexing throughput on machines
with a large number of cores (see https://github.com/elastic/elasticsearch/issues/26339 for
details).
> We finally managed to isolate the behavior in a JMH "microbenchmark". Here is the output
of running that benchmark with 32 threads with JMH's perfasm profiler (also Oracle Development
Studio finds the same hotspot):
> {code}
> ....[Hottest Region 1]..............................................................................
> C2, level 4, org.apache.lucene.index.DefaultIndexingChain$PerField::invert, version 1231
(792 bytes)
>                           0x00007f52d56b026a: xor    %r10d,%r10d        ;*invokevirtual
isAssignableFrom
>                                                                         ; - org.apache.lucene.util.AttributeSource::addAttribute@28
(line 207)
>                                                                         ; - org.apache.lucene.document.Field$BinaryTokenStream::&lt;init&gt;@8
(line 512)
>                                                                         ; - org.apache.lucene.document.Field::tokenStream@82
(line 491)
>                                                                         ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@99
(line 729)
>                           0x00007f52d56b026d: mov    $0x8,%r11d         ;*invokeinterface
iterator
>                                                                         ; - org.apache.lucene.util.AttributeSource::getCurrentState@46
(line 254)
>                                                                         ; - org.apache.lucene.util.AttributeSource::clearAttributes@1
(line 269)
>                                                                         ; - org.apache.lucene.document.Field$BinaryTokenStream::incrementToken@10
(line 532)
>                                                                         ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@153
(line 736)
>   0.00%    0.02%       ↗  0x00007f52d56b0273: test   %r10,%r10
>                   ╭    │  0x00007f52d56b0276: je     0x00007f52d56b0292  ;*getfield
fieldsData
>                   │    │                                                ; - org.apache.lucene.document.Field::binaryValue@1
(line 441)
>                   │    │                                                ; - org.apache.lucene.document.Field::tokenStream@65
(line 487)
>                   │    │                                                ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@99
(line 729)
>   0.00%    0.00%  │    │  0x00007f52d56b0278: mov    (%r11),%rsi        ;*getfield
next
>                   │    │                                                ; - java.util.HashMap::getNode@137
(line 580)
>                   │    │                                                ; - java.util.LinkedHashMap::get@6
(line 440)
>                   │    │                                                ; - org.apache.lucene.util.AttributeSource::getAttribute@6
(line 245)
>                   │    │                                                ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@143
(line 734)
>   0.09%    0.51%  │    │  0x00007f52d56b027b: mov    0x18(%rsi),%r8
>  23.70%    3.54%  │    │  0x00007f52d56b027f: mov    $0x7f4926a81d88,%rcx  ;   {metadata(&apos;org/apache/lucene/analysis/tokenattributes/CharTermAttribute&apos;)}
>   0.00%    0.00%  │    │  0x00007f52d56b0289: cmp    %rcx,%r8
>                   │    │  0x00007f52d56b028c: jne    0x00007f52d56b0949  ;*instanceof
>                   │    │                                                ; - org.apache.lucene.document.Field::binaryValue@4
(line 441)
>                   │    │                                                ; - org.apache.lucene.document.Field::tokenStream@65
(line 487)
>                   │    │                                                ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@99
(line 729)
>   0.01%    0.00%  ↘    │  0x00007f52d56b0292: test   %r10,%r10
>                        │  0x00007f52d56b0295: je     0x00007f52d56b1e0d  ;*invokevirtual
addAttributeImpl
>                        │                                                ; - org.apache.lucene.util.AttributeSource::addAttribute@80
(line 213)
>                        │                                                ; - org.apache.lucene.document.Field$StringTokenStream::&lt;init&gt;@8
(line 550)
>                        │                                                ; - org.apache.lucene.document.Field::tokenStream@47
(line 483)
>                        │                                                ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@99
(line 729)
>   0.01%    0.00%       │  0x00007f52d56b029b: mov    (%r11),%rsi        ;*invokevirtual
hashCode
>                        │                                                ; - java.util.HashMap::hash@9
(line 338)
>                        │                                                ; - java.util.LinkedHashMap::get@2
(line 440)
>                        │                                                ; - org.apache.lucene.util.AttributeSource::getAttribute@6
(line 245)
>                        │                                                ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@143
(line 734)
> {code}
> You can see that a significant amount of time is spent in {{mov $0x7f4926a81d88,%rcx}}.
It turns out that this maps to the following line in Java ({{DefaultIndexingChain$PerField::invert}}):
> {code:java}
> CharTermAttribute termAtt = tokenStream.getAttribute(CharTermAttribute.class);
> {code}
> Which is - luckily - unused (and removed by the attached patch).
> We have verified the impact of the change with an Elasticsearch macrobenchmark which
indexes 165,346,692 documents (~ 74GB) with 24 clients concurrently. The reported numbers
are median indexing throughput during that benchmark against a single Elasticsearch node:
> * Without the patch: 125418 documents/s
> * With the patch: 221237 documents/s
> Details about the benchmark setup etc. are mentioned in the [Elasticsearch issue #26339|https://github.com/elastic/elasticsearch/issues/26339]
if you're interested.
> Unfortunately, it is beyond me why writing this register to main memory takes such a
long time and why C2 did not eliminate this line as dead code to begin with.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message