storm-user mailing list archives

From Otis Gospodnetic <otis.gospodne...@gmail.com>
Subject Re: Supervisor CPU usage 100%, supervisors restarting in production
Date Tue, 04 Feb 2014 04:14:27 GMT
Hi,

Could it be JVM GC?  Check your GC counts and timings and correlate them
with your other Storm metrics.  If you use something like SPM for Storm you
can send graphs of your Storm, JVM, and system metrics to the Storm mailing
list directly from SPM.  This may make it easier for others to help you.
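
If you don't have a monitoring tool wired up yet, a rough sketch of reading
GC counts and timings from the standard java.lang.management API is below.
The class name GcSnapshot and the output format are only illustrative. Note
that it reports on the JVM it runs in, so to see a Storm worker's GC you
would have to call it from inside the worker (or read the same MXBeans over
remote JMX, which is not shown here).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Storm: dumps cumulative GC stats for this JVM.
public class GcSnapshot {

    // Returns one line per collector: timestamp, collector name,
    // cumulative collection count, cumulative collection time in ms.
    static List<String> snapshot() {
        List<String> lines = new ArrayList<String>();
        long now = System.currentTimeMillis();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("ts=%d gc=%s count=%d timeMs=%d",
                    now, gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Log this periodically and diff successive values; a sharp rise in
        // timeMs that lines up with the CPU spikes would point at GC.
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```

The values are cumulative since JVM start, so it is the deltas between two
snapshots that you want to line up against your CPU graphs.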

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Mon, Feb 3, 2014 at 5:23 AM, Chitra Raveendran <
chitra.raveendran@flutura.com> wrote:

> Hi
>
> I have a storm cluster in production.
>
>     Recently, CPU usage on the supervisor machines has been hitting 100%
> during weekends. This is odd, as our website has the least traffic on
> weekends. The system hangs and the supervisord daemon keeps trying to
> restart the storm daemons. Since all the supervisors are affected, the
> topology hangs as well.
>     Whenever this happens, we lose ssh access to the servers and have to
> reboot so that the memory gets cleaned up.
>
> There are 4 supervisor machines (VMs), each with 8GB RAM & 4 cores,
> and a separate Nimbus machine (8GB RAM, 4 cores).
> There are 12 workers in each node; we currently have around 15 unused
> slots.
>
> Generally, CPU usage on these systems is around 50-60 percent, and only
> 3-4 GB of the 8GB of RAM is used.
>
> What could be happening?
>
> --
>
> Regards,
>
> *Chitra Raveendran*
> *Data Scientist*
> Mobile: +91 819753660│*Email:* chitra.raveendran@flutura.com
> *Flutura Business Solutions Private Limited – “A Decision Sciences &
> Analytics Company”*│ #693, 2nd Floor, Geetanjali, 15th Cross, J.P Nagar
> 2nd Phase, Bangalore – 560078│
>
>
>
