aurora-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mehrdad Nurolahzade <mehr...@apache.org>
Subject Re: Review Request 58979: Expose stats on log storage lock wait time per caller
Date Thu, 04 May 2017 15:34:19 GMT


> On May 4, 2017, 6:32 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java
> > Lines 210 (patched)
> > <https://reviews.apache.org/r/58979/diff/1/?file=1708018#file1708018line210>
> >
> >     Here you call it "nanos", in the stats above it is called "ns". Should be consistent
I suppose.

I did consider that. However after looking at usage of `SlidingStats` constructor I realized
`log_storage_write_lock_wait` is the only metric that uses `ns` and all others use `nanos`.


> On May 4, 2017, 6:32 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java
> > Lines 559 (patched)
> > <https://reviews.apache.org/r/58979/diff/1/?file=1708018#file1708018line559>
> >
> >     I am wondering why you get the frame here at the top, rather than moving `getCallerFrame()`
into an `if (enableLogStorageLockWaitTracking)` where needed below. Did you fear a performance
penalty doing this while holding the lock?

It ended up looking like that after a bit of refactoring I did. 
Now that you mention it I can see that it looks weird. Going to fix.


- Mehrdad


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58979/#review173897
-----------------------------------------------------------


On May 3, 2017, 5:09 p.m., Mehrdad Nurolahzade wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58979/
> -----------------------------------------------------------
> 
> (Updated May 3, 2017, 5:09 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> `LogStorage` exposes aggregated stats on its write lock wait time (`log_storage_write_lock_wait`).
However, this does not help much with performance studies where a breakdown of wait time by
caller is required.
> 
> This patch exposes stats on the writer processes trying to acquire `LogStorage` write
lock. Metric collection has been put behind a feature flag (`enable_log_storage_lock_wait_tracking`)
to ensure it does not depreciate scheduler performance (although the induced overhead should
be minimal). It is disabled by default but can be enabled for performance studies or in test
clusters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b

>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorageModule.java 835f1604c0c5d913a87d570ee01d053bbbf92ecb

>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 0eb54fdaddfbc2af76fd83ffee18ce4c6b61cc48

> 
> 
> Diff: https://reviews.apache.org/r/58979/diff/1/
> 
> 
> Testing
> -------
> 
> - Manually under Vagrant
> - end to end test script
> 
> ```
> $ curl -s localhost:8081/vars | egrep -e 'log_storage_write_lock_wait_for_.*_nanos_total
'
> 
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.BatchWorker.processBatch_BatchWorker.java_207__nanos_total
81407978
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.SchedulerLifecycle_4.accept_SchedulerLifecycle.java_226__nanos_total
2337
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.mesos.MesosCallbackHandler_MesosCallbackHandlerImpl.handleRegistration_MesosCallbackHandler.java_182__nanos_total
3257
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.mesos.MesosCallbackHandler_MesosCallbackHandlerImpl.lambda_handleOffers_3_MesosCallbackHandler.java_206__nanos_total
3628
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.Storage_MutateWork_NoResult.apply_Storage.java_152__nanos_total
2101
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.db.RowGarbageCollector.runOneIteration_RowGarbageCollector.java_83__nanos_total
79698582
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.log.LogStorage.replay_LogStorage.java_469__nanos_total
172993
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.log.LogStorage__EnhancerByGuice__924dd57b.CGLIB_start_6__generated___nanos_total
3323
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.updater.JobUpdateControllerImpl.systemResume_JobUpdateControllerImpl.java_302__nanos_total
1928
> 
> curl -s localhost:8081/vars | egrep -e 'log_storage_write_lock_wait_for_.*_events '
> 
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.BatchWorker.processBatch_BatchWorker.java_207__events
1
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.SchedulerLifecycle_4.accept_SchedulerLifecycle.java_226__events
1
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.mesos.MesosCallbackHandler_MesosCallbackHandlerImpl.handleRegistration_MesosCallbackHandler.java_182__events
1
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.mesos.MesosCallbackHandler_MesosCallbackHandlerImpl.lambda_handleOffers_3_MesosCallbackHandler.java_206__events
1
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.Storage_MutateWork_NoResult.apply_Storage.java_152__events
1
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.db.RowGarbageCollector.runOneIteration_RowGarbageCollector.java_83__events
2
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.log.LogStorage.replay_LogStorage.java_469__events
35
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.log.LogStorage__EnhancerByGuice__924dd57b.CGLIB_start_6__generated___events
1
> log_storage_write_lock_wait_for_org.apache.aurora.scheduler.updater.JobUpdateControllerImpl.systemResume_JobUpdateControllerImpl.java_302__events
1
> ```
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message