aurora-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bill Farner <wfar...@apache.org>
Subject Re: Review Request 62869: Exclusively use Map-based in-memory stores for primary storage
Date Wed, 18 Oct 2017 18:11:32 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62869/
-----------------------------------------------------------

(Updated Oct. 18, 2017, 11:11 a.m.)


Review request for Aurora, Jordan Ly and Stephan Erb.


Changes
-------

Addressed first batch of review comments.


Repository: aurora


Description
-------

This patch introduces map-based volatile stores, most of which were revived from git history
with minimal changes.  The DB storage system is now only used in a temporary storage when
replaying a snapshot containing the `dbScript` field.

Please take special care to double-check my work in `SnapshotStoreImpl`, as it must function
for backwards compatibility with `Snapshot`s in the wild.  Another area of interest is the
implementation of `MemJobUpdateStore`, which must have nuanced behavior for compatibility
with `DbJobUpdateStore`.  Most of this stems from behind-the-scenes interaction with `LockStore`
and use of cascading deletes.

Note that this change removes the transactional nature of in-memory storage operations as
well as the `READ COMMITTED` transaction isolation previously available to some stores (proven
in necessary changes to `StorageTransactionTest`).  This means some stores will permit dirty
reads when they previously did not.  `TaskStore` has always had this non-transactional behavior
by default, as the DB task store was never deemed suitable for production.  Nonetheless, this
non-transactional behavior should be considered safe as the scheduler fails over on a storage
operation failure, and relies on the persistent log storage for transaction atomicity.

See the [mailing list discussion](https://lists.apache.org/thread.html/0bb74c78da7fa12954e2438e8011b3bff73fe3bfdafeaaa2ff00a98e@%3Cdev.aurora.apache.org%3E)
for context.


Diffs (updated)
-----

  RELEASE-NOTES.md c58e68080beb1740d92639601ef7cc29c63be37e 
  api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9ee9eff6c3d657decd93458dc3e6792a30614a60

  docs/reference/scheduler-configuration.md 4e3f90713c307e3b9e9f84c29343af7f014f0165 
  examples/vagrant/upstart/aurora-scheduler-kerberos.conf 04e3d311c6845689cd3bd813aa8dcf9df55f1199

  examples/vagrant/upstart/aurora-scheduler.conf 63fcc87be653835cb3c3f25dae4164f1d7c8d4da

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 7d3766829161457b1b3ba50bce128047d10b2c58

  src/jmh/java/org/apache/aurora/benchmark/SnapshotBenchmarks.java ae59f3def75d0e1d3866c8d2f85063456643e9a6

  src/jmh/java/org/apache/aurora/benchmark/StateManagerBenchmarks.java c293a9fd7c435e17dfc09f9ccc180154be8c7b2d

  src/jmh/java/org/apache/aurora/benchmark/StatusUpdateBenchmark.java 45c2ab96d8fa65ab12ba710954128da987f0544b

  src/jmh/java/org/apache/aurora/benchmark/TaskStoreBenchmarks.java f5e44dfd6d2c81e90859eaf24cf904a9662c1c72

  src/jmh/java/org/apache/aurora/benchmark/ThriftApiBenchmarks.java 7b4050637433ab656fc09958bac1a050fec02bed

  src/jmh/java/org/apache/aurora/benchmark/UpdateStoreBenchmarks.java cac02a55d26b2934099a2b03457c703600296292

  src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java dd0e480c628f6bb44de0ce758c8af2bb54db1bc6

  src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 64733e5b38f238e64424127a53591e6546a3e9a8

  src/main/java/org/apache/aurora/scheduler/http/JettyServerModule.java 93cd20adc28ed700719e472bb2331137a93d1d9d

  src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java 52c3c6618a3cf1009435ca8a9cece36365913e55

  src/main/java/org/apache/aurora/scheduler/storage/Storage.java 6c676693e531541ef9aec7f7a130c119ebf35df3

  src/main/java/org/apache/aurora/scheduler/storage/Util.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/backup/TemporaryStorage.java bad05f5bf1976bb590e8dd7af9b43d5d30e892a9

  src/main/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStore.java cbe5a0deff1ebc38a9618e7d89ab073dfaf78d36

  src/main/java/org/apache/aurora/scheduler/storage/db/DbModule.java fb3dadb4b4c7036112e423b2bc1b2540d793b4d6

  src/main/java/org/apache/aurora/scheduler/storage/db/DbStorage.java 7904e3880d214aac1013ce18265fed924ef7897e

  src/main/java/org/apache/aurora/scheduler/storage/db/DbUtil.java 942d180fe1a8b7a583e812f9106ee0e8db1bea55

  src/main/java/org/apache/aurora/scheduler/storage/log/LogStorageModule.java 6fdf31d6ed430f3dc0a0117442fc8b532aa40230

  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java fdf9c33f80be3eff03f50732b8c4735cb9ff3b91

  src/main/java/org/apache/aurora/scheduler/storage/log/WriteAheadStorage.java d145259653dd4df90e3877fc3e744e24c7a15d13

  src/main/java/org/apache/aurora/scheduler/storage/mem/InMemStoresModule.java 24296b6c61e62a09c30f01dfb84448f73a5f1e44

  src/main/java/org/apache/aurora/scheduler/storage/mem/MemAttributeStore.java PRE-CREATION

  src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java PRE-CREATION

  src/main/java/org/apache/aurora/scheduler/storage/mem/MemLockStore.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemQuotaStore.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemSchedulerStore.java PRE-CREATION

  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java PRE-CREATION

  src/main/java/org/apache/aurora/scheduler/storage/mem/Util.java c28fb65010af5e3db925487929d4e0e12b4101a4

  src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java 54202359382bfe39e7cbaec0bf4c7d65d10ca13b

  src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java a363e70176881518653d0774e9d0c4be0f7f6d78

  src/test/java/org/apache/aurora/scheduler/app/local/LocalSchedulerMain.java 3dee55a38752faea52c0c6ace90db8352ef9b2ea

  src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 9b4f2ad15ab5b61d4cccfad38ba48f17e7853425

  src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 2b817076d6beb09586d4711bc10bceb31f3b74e1

  src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 459d6bebcd7d6341dac2aead7e3dd8ce87bc9ed6

  src/test/java/org/apache/aurora/scheduler/cron/quartz/CronJobManagerImplTest.java 81440f5689f9538a4c7a9e6700bf03ca89c4ba85

  src/test/java/org/apache/aurora/scheduler/http/MaintenanceTest.java f94b58b77b7c6ce824914af7e1147e73ad5a7eed

  src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 09f443bca90f154b547d28ca5fd5be08177fdf99

  src/test/java/org/apache/aurora/scheduler/state/LockManagerImplTest.java 19f9de312596395eff81bbfd073a5f617d2ef84c

  src/test/java/org/apache/aurora/scheduler/state/StateManagerImplTest.java 24176fe089703386e8246f6add1965c111c136ed

  src/test/java/org/apache/aurora/scheduler/stats/ResourceCounterTest.java d73656dc8fb8764c7a66eed3ea789554ce0ecc52

  src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 04ec7715ba9a9d47c99d4ccb92939f3e561e1a60

  src/test/java/org/apache/aurora/scheduler/storage/db/AttributeStoreTest.java PRE-CREATION

  src/test/java/org/apache/aurora/scheduler/storage/db/DbAttributeStoreTest.java f47f4a8a492fb43bacd909dc520256ed028531dd

  src/test/java/org/apache/aurora/scheduler/storage/db/DbCronJobStoreTest.java b68298ab50a3621da8596947619f2d1c34d2d194

  src/test/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStoreTest.java 8fca54becde34a0d10d60e05d3809fc1c41233bf

  src/test/java/org/apache/aurora/scheduler/storage/db/DbLockStoreTest.java 8ed58e01eea09ab9f1aa4d269a5e59ce9c9c2191

  src/test/java/org/apache/aurora/scheduler/storage/db/DbQuotaStoreTest.java 275d0fd93819169550ce39c8ddfa6c4fa0e4d920

  src/test/java/org/apache/aurora/scheduler/storage/db/DbSchedulerStoreTest.java a6320bf687aa41c698bdc6a2c0b5208624cff4a3

  src/test/java/org/apache/aurora/scheduler/storage/db/DbTaskStoreTest.java bcf5be43ab69ef565bb68f9235b4c43d385898aa

  src/test/java/org/apache/aurora/scheduler/storage/db/JobUpdateStoreTest.java PRE-CREATION

  src/test/java/org/apache/aurora/scheduler/storage/db/LockStoreTest.java PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/storage/db/QuotaStoreTest.java PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/storage/db/SchedulerStoreTest.java PRE-CREATION

  src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplIT.java b2c333eabbd3e3ae8f119170266ce6054b53fcae

  src/test/java/org/apache/aurora/scheduler/storage/mem/InMemTaskStoreTest.java 84fa3ff48a0957760f2002ed026e72b882171109

  src/test/java/org/apache/aurora/scheduler/storage/mem/MemAttributeStoreTest.java PRE-CREATION

  src/test/java/org/apache/aurora/scheduler/storage/mem/MemCronJobStoreTest.java 91e591f3c84489386a7c20d33278ba691b3ae7ba

  src/test/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStoreTest.java PRE-CREATION

  src/test/java/org/apache/aurora/scheduler/storage/mem/MemLockStoreTest.java PRE-CREATION

  src/test/java/org/apache/aurora/scheduler/storage/mem/MemQuotaStoreTest.java PRE-CREATION

  src/test/java/org/apache/aurora/scheduler/storage/mem/MemStorageTest.java PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/storage/mem/StorageTransactionTest.java 25f34e2bc26c6d4754c1591fad7f2165dd465d32

  src/test/java/org/apache/aurora/scheduler/thrift/ThriftIT.java 2cd19d572d2945e28630b9bce5ba58eb9753630a

  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 453366ef4e4b6db45643b1927887771dbd795bf9



Diff: https://reviews.apache.org/r/62869/diff/5/

Changes: https://reviews.apache.org/r/62869/diff/4-5/


Testing
-------

Still need to complete (but should not block review):

* Run `./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh` from scratch at this patch.
* Run `./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh` on master, then this patch,
then on master.  This verifies some level of replay compatibility between versions.

*Most relevant benchmark*

On master:
```
Benchmark                                             (instanceOverrides)  (instances)  (metadata)
  Mode  Cnt    Score    Error  Units
UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A         1000   
     N/A  thrpt    5  696.186 ± 67.296  ops/s
UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A         5000   
     N/A  thrpt    5  160.567 ± 22.365  ops/s
UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A        10000   
     N/A  thrpt    5   82.897 ±  5.384  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                      1          N/A   
     N/A  thrpt    5  166.567 ± 20.929  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                     10          N/A   
     N/A  thrpt    5  161.149 ±  6.016  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                    100          N/A   
     N/A  thrpt    5  141.993 ±  7.119  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                   1000          N/A   
     N/A  thrpt    5   65.481 ±  0.861  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A   
      10  thrpt    5  167.387 ± 10.009  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A   
     100  thrpt    5  153.032 ± 19.067  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A   
    1000  thrpt    5  140.656 ± 16.893  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A   
   10000  thrpt    5   45.039 ±  4.840  ops/s
```

With this patch:
```
Benchmark                                             (instanceOverrides)  (instances)  (metadata)
  Mode  Cnt         Score          Error  Units
UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A         1000   
     N/A  thrpt    5  43700548.771 ±  4491654.631  ops/s
UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A         5000   
     N/A  thrpt    5  44234835.787 ±  6078260.236  ops/s
UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A        10000   
     N/A  thrpt    5  43362266.895 ±  7546660.236  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                      1          N/A   
     N/A  thrpt    5  45796640.959 ±  3698698.216  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                     10          N/A   
     N/A  thrpt    5  46402021.197 ±  4800377.641  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                    100          N/A   
     N/A  thrpt    5  44416822.563 ±  9591398.027  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                   1000          N/A   
     N/A  thrpt    5  44004609.998 ± 20332063.977  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A   
      10  thrpt    5  47403264.761 ±  4787575.678  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A   
     100  thrpt    5  45629630.921 ±  7654875.770  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A   
    1000  thrpt    5  44150253.103 ±  9376206.202  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A   
   10000  thrpt    5  44215602.085 ±  8326237.269  ops/s
```


Thanks,

Bill Farner


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message