aurora-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bill Farner <wfar...@apache.org>
Subject Re: Review Request 62869: Exclusively use Map-based in-memory stores for primary storage
Date Wed, 18 Oct 2017 18:00:54 GMT


> On Oct. 17, 2017, 11:21 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/db/DbModule.java
> > Lines 79-103 (original), 72-96 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852500#file1852500line79>
> >
> >     Should these be removed? The release notes indicate these have been removed?
Are we keeping this for the deprecation cycle?

This was an attempt to minimize the diff and make easier to prove that i did not change the
Db implementations.  If you look closely, you'll see that this `Options` class doesn't make
its way to the CLI code, so it's rendered useless.  However, to make this less confusing i
will remove the `@Parameter` labels from all of the fields.


> On Oct. 17, 2017, 11:21 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemAttributeStore.java
> > Lines 34 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852507#file1852507line34>
> >
> >     public?
> >     
> >     s/Mutable/AttributeStore.Mutable/

> public?

Store implementations are deliberately non-public where practical, as they should only be
used through the store interface.

> s/Mutable/AttributeStore.Mutable/

Done.


> On Oct. 17, 2017, 11:21 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemAttributeStore.java
> > Lines 43-52 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852507#file1852507line43>
> >
> >     I guess you mean `replace` on `ConcurrentMap`?
> >     
> >     +1 (Sounds more robust)

Unfortunately not, we'd be looking for something much closer to `merge()`, but per the preivous
comment, `merge()` falls short.


> On Oct. 17, 2017, 11:21 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java
> > Lines 156-159 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852508#file1852508line156>
> >
> >     Should we insert the refreshed entry into the map as soon as possible? Inserting
all at once towards the end seems like it can cause races.

See previous comments about write concurrency and let me know if you still have concerns here.


> On Oct. 17, 2017, 11:21 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java
> > Lines 236 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852508#file1852508line236>
> >
> >     `LockStore` can change after `tokenFromLockStore` returns. Should we protect
against this race?

See previous comments about write concurrency and let me know if you still have concerns here.


> On Oct. 17, 2017, 11:21 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java
> > Lines 298 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852508#file1852508line298>
> >
> >     Same here- wonder if we can move to using `Stream`s?

Done by popular demand :-)


> On Oct. 17, 2017, 11:21 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java
> > Lines 307 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852508#file1852508line307>
> >
> >     Is this correct? Should it be - 
> >     
> >     `System.currentTimeMillis() - s.getState().getCreatedTimestampMs() > historyPruneThresholdMs`

> >     
> >     ?

The caller code in `JobUpdateHistoryPruner` should alleviate this concern:

```
          Set<IJobUpdateKey> prunedUpdates = storeProvider.getJobUpdateStore().pruneHistory(
              settings.maxUpdatesPerJob,
              clock.nowMillis() - settings.maxHistorySize.as(Time.MILLISECONDS));
```

i.e. `pruneHistory()` accepts the point-in-time _threshold_ rather than the interval.


> On Oct. 17, 2017, 11:21 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java
> > Lines 345-360 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852508#file1852508line345>
> >
> >     +1

Stream conversion done.

Aside - reviewboard makes a +1 difficult to stitch together when in a separate review, as
it only threads in a _comment_ on a review, rather than a comment on the same line in a diff.


> On Oct. 17, 2017, 11:21 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemLockStore.java
> > Lines 33 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852509#file1852509line33>
> >
> >     public

-1, see previous comment for rationale


> On Oct. 17, 2017, 11:21 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemQuotaStore.java
> > Lines 27-28 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852510#file1852510line27>
> >
> >     public

-1, see previous comment for rationale


> On Oct. 17, 2017, 11:21 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemSchedulerStore.java
> > Lines 26 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852511#file1852511line26>
> >
> >     public
> >     
> >     (Not sure which one is the convention that you are going for. Make all `Store`s
public or package-private.)

-1, see previous comment for rationale


> On Oct. 17, 2017, 11:21 a.m., Santhosh Kumar Shanmugham wrote:
> > src/test/java/org/apache/aurora/scheduler/storage/db/AttributeStoreTest.java
> > Lines 19 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852528#file1852528line19>
> >
> >     Is this going to be all the test or do you intend to add more later? (Here and
for all the `MemStore`s)

This was intentional, as a mechanism to prove compatibility between the DB and mem stores.
 Are there implementation-specific test cases you have in mind?


- Bill


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62869/#review188327
-----------------------------------------------------------


On Oct. 10, 2017, 12:35 p.m., Bill Farner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62869/
> -----------------------------------------------------------
> 
> (Updated Oct. 10, 2017, 12:35 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This patch introduces map-based volatile stores, most of which were revived from git
history with minimal changes.  The DB storage system is now only used in a temporary storage
when replaying a snapshot containing the `dbScript` field.
> 
> Please take special care to double-check my work in `SnapshotStoreImpl`, as it must function
for backwards compatibility with `Snapshot`s in the wild.  Another area of interest is the
implementation of `MemJobUpdateStore`, which must have nuanced behavior for compatibility
with `DbJobUpdateStore`.  Most of this stems from behind-the-scenes interaction with `LockStore`
and use of cascading deletes.
> 
> Note that this change removes the transactional nature of in-memory storage operations
as well as the `READ COMMITTED` transaction isolation previously available to some stores
(proven in necessary changes to `StorageTransactionTest`).  This means some stores will permit
dirty reads when they previously did not.  `TaskStore` has always had this non-transactional
behavior by default, as the DB task store was never deemed suitable for production.  Nonetheless,
this non-transactional behavior should be considered safe as the scheduler fails over on a
storage operation failure, and relies on the persistent log storage for transaction atomicity.
> 
> See the [mailing list discussion](https://lists.apache.org/thread.html/0bb74c78da7fa12954e2438e8011b3bff73fe3bfdafeaaa2ff00a98e@%3Cdev.aurora.apache.org%3E)
for context.
> 
> 
> Diffs
> -----
> 
>   RELEASE-NOTES.md c58e68080beb1740d92639601ef7cc29c63be37e 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9ee9eff6c3d657decd93458dc3e6792a30614a60

>   docs/reference/scheduler-configuration.md 4e3f90713c307e3b9e9f84c29343af7f014f0165

>   examples/vagrant/upstart/aurora-scheduler-kerberos.conf 04e3d311c6845689cd3bd813aa8dcf9df55f1199

>   examples/vagrant/upstart/aurora-scheduler.conf 63fcc87be653835cb3c3f25dae4164f1d7c8d4da

>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 7d3766829161457b1b3ba50bce128047d10b2c58

>   src/jmh/java/org/apache/aurora/benchmark/SnapshotBenchmarks.java ae59f3def75d0e1d3866c8d2f85063456643e9a6

>   src/jmh/java/org/apache/aurora/benchmark/StateManagerBenchmarks.java c293a9fd7c435e17dfc09f9ccc180154be8c7b2d

>   src/jmh/java/org/apache/aurora/benchmark/StatusUpdateBenchmark.java 45c2ab96d8fa65ab12ba710954128da987f0544b

>   src/jmh/java/org/apache/aurora/benchmark/TaskStoreBenchmarks.java f5e44dfd6d2c81e90859eaf24cf904a9662c1c72

>   src/jmh/java/org/apache/aurora/benchmark/ThriftApiBenchmarks.java 7b4050637433ab656fc09958bac1a050fec02bed

>   src/jmh/java/org/apache/aurora/benchmark/UpdateStoreBenchmarks.java cac02a55d26b2934099a2b03457c703600296292

>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java dd0e480c628f6bb44de0ce758c8af2bb54db1bc6

>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 64733e5b38f238e64424127a53591e6546a3e9a8

>   src/main/java/org/apache/aurora/scheduler/http/JettyServerModule.java 93cd20adc28ed700719e472bb2331137a93d1d9d

>   src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java 52c3c6618a3cf1009435ca8a9cece36365913e55

>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 6c676693e531541ef9aec7f7a130c119ebf35df3

>   src/main/java/org/apache/aurora/scheduler/storage/Util.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/backup/TemporaryStorage.java bad05f5bf1976bb590e8dd7af9b43d5d30e892a9

>   src/main/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStore.java cbe5a0deff1ebc38a9618e7d89ab073dfaf78d36

>   src/main/java/org/apache/aurora/scheduler/storage/db/DbModule.java fb3dadb4b4c7036112e423b2bc1b2540d793b4d6

>   src/main/java/org/apache/aurora/scheduler/storage/db/DbStorage.java 7904e3880d214aac1013ce18265fed924ef7897e

>   src/main/java/org/apache/aurora/scheduler/storage/db/DbUtil.java 942d180fe1a8b7a583e812f9106ee0e8db1bea55

>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorageModule.java 6fdf31d6ed430f3dc0a0117442fc8b532aa40230

>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java fdf9c33f80be3eff03f50732b8c4735cb9ff3b91

>   src/main/java/org/apache/aurora/scheduler/storage/log/WriteAheadStorage.java d145259653dd4df90e3877fc3e744e24c7a15d13

>   src/main/java/org/apache/aurora/scheduler/storage/mem/InMemStoresModule.java 24296b6c61e62a09c30f01dfb84448f73a5f1e44

>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemAttributeStore.java PRE-CREATION

>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java PRE-CREATION

>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemLockStore.java PRE-CREATION

>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemQuotaStore.java PRE-CREATION

>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemSchedulerStore.java PRE-CREATION

>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java PRE-CREATION

>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java PRE-CREATION

>   src/main/java/org/apache/aurora/scheduler/storage/mem/Util.java c28fb65010af5e3db925487929d4e0e12b4101a4

>   src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java 54202359382bfe39e7cbaec0bf4c7d65d10ca13b

>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java a363e70176881518653d0774e9d0c4be0f7f6d78

>   src/test/java/org/apache/aurora/scheduler/app/local/LocalSchedulerMain.java 3dee55a38752faea52c0c6ace90db8352ef9b2ea

>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 9b4f2ad15ab5b61d4cccfad38ba48f17e7853425

>   src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 2b817076d6beb09586d4711bc10bceb31f3b74e1

>   src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 459d6bebcd7d6341dac2aead7e3dd8ce87bc9ed6

>   src/test/java/org/apache/aurora/scheduler/cron/quartz/CronJobManagerImplTest.java 81440f5689f9538a4c7a9e6700bf03ca89c4ba85

>   src/test/java/org/apache/aurora/scheduler/http/MaintenanceTest.java f94b58b77b7c6ce824914af7e1147e73ad5a7eed

>   src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 09f443bca90f154b547d28ca5fd5be08177fdf99

>   src/test/java/org/apache/aurora/scheduler/state/LockManagerImplTest.java 19f9de312596395eff81bbfd073a5f617d2ef84c

>   src/test/java/org/apache/aurora/scheduler/state/StateManagerImplTest.java 24176fe089703386e8246f6add1965c111c136ed

>   src/test/java/org/apache/aurora/scheduler/stats/ResourceCounterTest.java d73656dc8fb8764c7a66eed3ea789554ce0ecc52

>   src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 04ec7715ba9a9d47c99d4ccb92939f3e561e1a60

>   src/test/java/org/apache/aurora/scheduler/storage/db/AttributeStoreTest.java PRE-CREATION

>   src/test/java/org/apache/aurora/scheduler/storage/db/DbAttributeStoreTest.java f47f4a8a492fb43bacd909dc520256ed028531dd

>   src/test/java/org/apache/aurora/scheduler/storage/db/DbCronJobStoreTest.java b68298ab50a3621da8596947619f2d1c34d2d194

>   src/test/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStoreTest.java 8fca54becde34a0d10d60e05d3809fc1c41233bf

>   src/test/java/org/apache/aurora/scheduler/storage/db/DbLockStoreTest.java 8ed58e01eea09ab9f1aa4d269a5e59ce9c9c2191

>   src/test/java/org/apache/aurora/scheduler/storage/db/DbQuotaStoreTest.java 275d0fd93819169550ce39c8ddfa6c4fa0e4d920

>   src/test/java/org/apache/aurora/scheduler/storage/db/DbSchedulerStoreTest.java a6320bf687aa41c698bdc6a2c0b5208624cff4a3

>   src/test/java/org/apache/aurora/scheduler/storage/db/DbTaskStoreTest.java bcf5be43ab69ef565bb68f9235b4c43d385898aa

>   src/test/java/org/apache/aurora/scheduler/storage/db/JobUpdateStoreTest.java PRE-CREATION

>   src/test/java/org/apache/aurora/scheduler/storage/db/LockStoreTest.java PRE-CREATION

>   src/test/java/org/apache/aurora/scheduler/storage/db/QuotaStoreTest.java PRE-CREATION

>   src/test/java/org/apache/aurora/scheduler/storage/db/SchedulerStoreTest.java PRE-CREATION

>   src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplIT.java b2c333eabbd3e3ae8f119170266ce6054b53fcae

>   src/test/java/org/apache/aurora/scheduler/storage/mem/InMemTaskStoreTest.java 84fa3ff48a0957760f2002ed026e72b882171109

>   src/test/java/org/apache/aurora/scheduler/storage/mem/MemAttributeStoreTest.java PRE-CREATION

>   src/test/java/org/apache/aurora/scheduler/storage/mem/MemCronJobStoreTest.java 91e591f3c84489386a7c20d33278ba691b3ae7ba

>   src/test/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStoreTest.java PRE-CREATION

>   src/test/java/org/apache/aurora/scheduler/storage/mem/MemLockStoreTest.java PRE-CREATION

>   src/test/java/org/apache/aurora/scheduler/storage/mem/MemQuotaStoreTest.java PRE-CREATION

>   src/test/java/org/apache/aurora/scheduler/storage/mem/MemStorageTest.java PRE-CREATION

>   src/test/java/org/apache/aurora/scheduler/storage/mem/StorageTransactionTest.java 25f34e2bc26c6d4754c1591fad7f2165dd465d32

>   src/test/java/org/apache/aurora/scheduler/thrift/ThriftIT.java 2cd19d572d2945e28630b9bce5ba58eb9753630a

>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 453366ef4e4b6db45643b1927887771dbd795bf9

> 
> 
> Diff: https://reviews.apache.org/r/62869/diff/4/
> 
> 
> Testing
> -------
> 
> Still need to complete (but should not block review):
> 
> * Run `./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh` from scratch at this patch.
> * Run `./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh` on master, then this patch,
then on master.  This verifies some level of replay compatibility between versions.
> 
> *Most relevant benchmark*
> 
> On master:
> ```
> Benchmark                                             (instanceOverrides)  (instances)
 (metadata)   Mode  Cnt    Score    Error  Units
> UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A         1000
        N/A  thrpt    5  696.186 ± 67.296  ops/s
> UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A         5000
        N/A  thrpt    5  160.567 ± 22.365  ops/s
> UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A        10000
        N/A  thrpt    5   82.897 ±  5.384  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                      1          N/A
        N/A  thrpt    5  166.567 ± 20.929  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                     10          N/A
        N/A  thrpt    5  161.149 ±  6.016  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                    100          N/A
        N/A  thrpt    5  141.993 ±  7.119  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                   1000          N/A
        N/A  thrpt    5   65.481 ±  0.861  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A
         10  thrpt    5  167.387 ± 10.009  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A
        100  thrpt    5  153.032 ± 19.067  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A
       1000  thrpt    5  140.656 ± 16.893  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A
      10000  thrpt    5   45.039 ±  4.840  ops/s
> ```
> 
> With this patch:
> ```
> Benchmark                                             (instanceOverrides)  (instances)
 (metadata)   Mode  Cnt         Score          Error  Units
> UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A         1000
        N/A  thrpt    5  43700548.771 ±  4491654.631  ops/s
> UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A         5000
        N/A  thrpt    5  44234835.787 ±  6078260.236  ops/s
> UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A        10000
        N/A  thrpt    5  43362266.895 ±  7546660.236  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                      1          N/A
        N/A  thrpt    5  45796640.959 ±  3698698.216  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                     10          N/A
        N/A  thrpt    5  46402021.197 ±  4800377.641  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                    100          N/A
        N/A  thrpt    5  44416822.563 ±  9591398.027  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                   1000          N/A
        N/A  thrpt    5  44004609.998 ± 20332063.977  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A
         10  thrpt    5  47403264.761 ±  4787575.678  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A
        100  thrpt    5  45629630.921 ±  7654875.770  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A
       1000  thrpt    5  44150253.103 ±  9376206.202  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A
      10000  thrpt    5  44215602.085 ±  8326237.269  ops/s
> ```
> 
> 
> Thanks,
> 
> Bill Farner
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message