lucene-solr-user mailing list archives

From Mahmoud Almokadem <prog.mahmoud@gmail.com>
Subject Re: Unbalanced CPU on SolrCloud
Date Mon, 16 Oct 2017 11:35:05 GMT
The shift in load happened after I restarted the bulk insert process.

The size of the index on each server is about 500GB.

There are about 8 warnings on each server about a segments file that was not
found, like this one:

Error getting file length for [segments_2s4]

java.nio.file.NoSuchFileException: /media/ssd_losedata/solr-home/data/documents_online_shard16_replica_n1/data/index/segments_2s4
    at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
    at java.base/sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
    at java.base/sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:145)
    at java.base/sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
    at java.base/java.nio.file.Files.readAttributes(Files.java:1755)
    at java.base/java.nio.file.Files.size(Files.java:2369)
    at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:243)
    at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:128)
    at org.apache.solr.handler.admin.LukeRequestHandler.getFileLength(LukeRequestHandler.java:611)
    at org.apache.solr.handler.admin.LukeRequestHandler.getIndexInfo(LukeRequestHandler.java:584)
    at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:136)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2474)
    at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:720)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:526)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:378)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:322)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.server.Server.handle(Server.java:534)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
    at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
    at java.base/java.lang.Thread.run(Thread.java:844)
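
From the stack trace, this looks like the Luke request handler (which backs
the admin UI's index info) asking for the size of a segments_N file that a
concurrent commit had already deleted, so it may be a harmless race rather
than corruption. A minimal sketch of that kind of defensive length lookup
(a hypothetical helper, not Solr's actual code):

    // Hypothetical sketch: report a file's length, treating a segments_N
    // file deleted by a concurrent commit as "unknown" (-1) instead of failing.
    static long fileLengthOrUnknown(java.nio.file.Path path) {
        try {
            return java.nio.file.Files.size(path); // the same call that throws in the trace above
        } catch (java.nio.file.NoSuchFileException e) {
            return -1; // file vanished between the directory listing and the stat
        } catch (java.io.IOException e) {
            throw new java.io.UncheckedIOException(e);
        }
    }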

On Mon, Oct 16, 2017 at 1:08 PM, Emir Arnautović <emir.arnautovic@sematext.com> wrote:

> I did not look at the graph details - now I see that it covers a 3h time
> span. It seems that there was load on the other server before this one, and
> it ended with a 14GB read spike and a 10GB write spike just before the load
> started on this server. Do you see any errors or suspicious log lines?
> How big is your index?
>
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> On 16 Oct 2017, at 12:39, Mahmoud Almokadem <prog.mahmoud@gmail.com> wrote:
> >
> > Yes, it has been constant since I started this bulk indexing process.
> > As you can see, the write operations on the loaded server are 3x those of
> > the normal server, even though the disk writes are not 3x.
> >
> > Mahmoud
> >
> >
> > On Mon, Oct 16, 2017 at 12:32 PM, Emir Arnautović <emir.arnautovic@sematext.com> wrote:
> >
> >> Hi Mahmoud,
> >> Is this something that you see constantly? The network charts suggest that
> >> your servers are loaded equally, which is expected since, as you said, you
> >> are not using routing. Disk read/write and CPU are not equal, but that is
> >> also expected during heavy indexing, because indexing triggers segment
> >> merges, which consume those resources. Even if two servers host the same
> >> documents (e.g. leader and replica), their merges are not likely to happen
> >> at the same time, so you can expect to see such cases.
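> >>
> >> If the merge load itself turns out to be the issue, merge behaviour can be
> >> tuned in solrconfig.xml. A minimal sketch, assuming the default
> >> TieredMergePolicyFactory of Solr 7 - the values are purely illustrative,
> >> not a recommendation:
> >>
> >> <indexConfig>
> >>   <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
> >>     <!-- max segments merged in one merge; larger = fewer, bigger merges -->
> >>     <int name="maxMergeAtOnce">10</int>
> >>     <!-- allowed segments per tier before a merge is triggered -->
> >>     <int name="segmentsPerTier">10</int>
> >>   </mergePolicyFactory>
> >> </indexConfig>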
> >>
> >> Thanks,
> >> Emir
> >> --
> >> Monitoring - Log Management - Alerting - Anomaly Detection
> >> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >>
> >>
> >>
> >> On 16 Oct 2017, at 11:58, Mahmoud Almokadem <prog.mahmoud@gmail.com> wrote:
> >>>
> >>> Here are the screen shots for the two server metrics on Amazon
> >>>
> >>> https://ibb.co/kxBQam
> >>> https://ibb.co/fn0Jvm
> >>> https://ibb.co/kUpYT6
> >>>
> >>>
> >>>
> >>> On Mon, Oct 16, 2017 at 11:37 AM, Mahmoud Almokadem <prog.mahmoud@gmail.com> wrote:
> >>>
> >>>> Hi Emir,
> >>>>
> >>>> We don't use routing.
> >>>>
> >>>> The servers are already balanced, and the number of documents on each
> >>>> shard is approximately the same.
> >>>>
> >>>> Nothing running on the servers except Solr and ZooKeeper.
> >>>>
> >>>> I initialized the client as:
> >>>>
> >>>> String zkHost = "192.168.1.89:2181,192.168.1.99:2181";
> >>>>
> >>>> CloudSolrClient solrCloud = new CloudSolrClient.Builder()
> >>>>         .withZkHost(zkHost)
> >>>>         .build();
> >>>>
> >>>> solrCloud.setIdField("document_id");
> >>>> solrCloud.setDefaultCollection(collection);
> >>>> solrCloud.setRequestWriter(new BinaryRequestWriter());
> >>>>
> >>>>
> >>>> And the documents are approximately the same size.
> >>>>
> >>>> I used 10 threads with 10 SolrClients to send data to Solr, and every
> >>>> thread sends a batch of 1000 documents at a time.
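> >>>>
> >>>> Roughly, each thread does something like the following (a simplified
> >>>> sketch of my loop; exception handling is omitted and docsForThisThread
> >>>> is a placeholder for the thread's share of the data):
> >>>>
> >>>> List<SolrInputDocument> batch = new ArrayList<>(1000);
> >>>> for (SolrInputDocument doc : docsForThisThread) {
> >>>>     batch.add(doc);
> >>>>     if (batch.size() == 1000) {
> >>>>         solrCloud.add(batch); // one update request per 1000-doc batch
> >>>>         batch.clear();
> >>>>     }
> >>>> }
> >>>> if (!batch.isEmpty()) {
> >>>>     solrCloud.add(batch); // flush the final partial batch
> >>>> }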
> >>>>
> >>>> Thanks,
> >>>> Mahmoud
> >>>>
> >>>>
> >>>>
> >>>> On Mon, Oct 16, 2017 at 11:01 AM, Emir Arnautović <emir.arnautovic@sematext.com> wrote:
> >>>>
> >>>>> Hi Mahmoud,
> >>>>> Do you use routing? Are your servers equally balanced - do you end up
> >>>>> having approximately the same number of documents hosted on both servers
> >>>>> (counting all shards)?
> >>>>> Do you have anything else running on those servers?
> >>>>> How do you initialise your SolrJ client?
> >>>>> Are documents of similar size?
> >>>>>
> >>>>> Thanks,
> >>>>> Emir
> >>>>> --
> >>>>> Monitoring - Log Management - Alerting - Anomaly Detection
> >>>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 16 Oct 2017, at 10:46, Mahmoud Almokadem <prog.mahmoud@gmail.com> wrote:
> >>>>>>
> >>>>>> We've installed SolrCloud 7.0.1 with two nodes and 8 shards per node.
> >>>>>>
> >>>>>> The configurations and the specs of the two servers are identical.
> >>>>>>
> >>>>>> When running bulk indexing using SolrJ, we see that one of the servers
> >>>>>> is fully loaded, as shown in the images, while the other is normal.
> >>>>>>
> >>>>>> Images URLs:
> >>>>>>
> >>>>>> https://ibb.co/jkE6gR
> >>>>>> https://ibb.co/hyzvam
> >>>>>> https://ibb.co/mUpvam
> >>>>>> https://ibb.co/e4bxo6
> >>>>>>
> >>>>>> How can I figure out this issue?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Mahmoud
> >>>>>
> >>>>>
> >>>>
> >>
> >>
>
>
