cassandra-dev mailing list archives

From Aleksey Yeschenko
Subject Re: March 2015 QA retrospective
Date Mon, 13 Apr 2015 21:50:13 GMT
CASSANDRA-8285 | Aleksey Yeschenko | Move all hints related tasks to hints private executor | Pierre's reproducer represents something we weren't doing, but that users are. Is that now being tested?

That particular issue will not happen again. That class of issues can only
be tested by a sufficiently long-running stress test, with plenty of
chaos-monkeying thrown in. That's essentially how it got caught - by driver
duration tests with chaos-monkeying thrown in. It's still being exercised by
the driver tests, and we do now run them prior to releasing stuff, so I'd
say yes - it's being tested.

Once we have our own framework, we'll migrate it there, from driver tests.
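
For illustration only, the idea of a long-running workload with chaos-monkeying can be sketched on a toy in-memory model. Nothing below is a Cassandra or driver API: `ToyNode`, `ToyCluster` and `run_chaos_test` are hypothetical names, and the hint handling is a deliberately simplified stand-in for the real thing.

```python
import random


class ToyNode:
    """Hypothetical stand-in for a node; not a Cassandra API."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class ToyCluster:
    """In-memory model: writes go to live nodes, hints are kept for down ones."""

    def __init__(self, size):
        self.nodes = [ToyNode(f"node{i}") for i in range(size)]
        self.hints = {node.name: [] for node in self.nodes}

    def write(self, key, value):
        for node in self.nodes:
            if node.up:
                node.data[key] = value
            else:
                self.hints[node.name].append((key, value))

    def stop(self, node):
        node.up = False

    def start(self, node):
        node.up = True
        # Replay hints in arrival order so the latest write wins.
        for key, value in self.hints[node.name]:
            node.data[key] = value
        self.hints[node.name].clear()


def run_chaos_test(iterations, seed=0):
    """Interleave random node stops/starts with writes, then check convergence."""
    rng = random.Random(seed)
    cluster = ToyCluster(3)
    expected = {}
    for i in range(iterations):
        # Chaos step: flip one random node, but always keep one node up.
        victim = rng.choice(cluster.nodes)
        live = [node for node in cluster.nodes if node.up]
        if victim.up and len(live) > 1:
            cluster.stop(victim)
        elif not victim.up:
            cluster.start(victim)
        # Workload step: overwrite one of a small set of keys.
        key, value = f"k{rng.randrange(50)}", i
        cluster.write(key, value)
        expected[key] = value
    # Heal the cluster, then verify every replica holds the expected data.
    for node in cluster.nodes:
        if not node.up:
            cluster.start(node)
    return all(node.data == expected for node in cluster.nodes)
```

A real duration test would replace the toy model with an actual cluster and client, and would run for hours rather than a fixed iteration count; the loop structure (inject failure, apply load, heal, verify) is the part this sketch is meant to show.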

CASSANDRA-8462 | Aleksey Yeschenko | Upgrading a 2.0 to 2.1 breaks CFMetaData on 2.0 nodes | Have additional dtest coverage, need to do this in kitchen sink tests

dtest coverage would not help here, not really. It's a tricky race
condition that can only be triggered during upgrade. We would need some kind
of test that repeatedly upgrades the nodes in the cluster, again and again,
to have this caught, or else it'd be flaky at best.

The actual proper fix for that issue is the new, versioned, schema update
exchange protocol - is
the ticket. That one will come with a metric ton of tests.
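
As a rough sketch of what "upgrade again and again" could look like as a harness, the following toy model repeats a rolling upgrade many times and records which rounds hit a simulated race. It is an assumption-laden illustration: `ToyUpgradeCluster`, `race_probability` and `repeated_upgrade_failures` are hypothetical names, and the "race" is just a seeded random draw made while node versions are mixed.

```python
import random


class ToyUpgradeCluster:
    """Hypothetical model of a mixed-version cluster; not a real test API."""

    def __init__(self, size, race_probability, seed=0):
        self.size = size
        self.race_probability = race_probability
        self.rng = random.Random(seed)
        self.reset_to_old_version()

    def reset_to_old_version(self):
        self.versions = ["2.0"] * self.size
        self.schema_ok = True

    def rolling_upgrade(self):
        for i in range(self.size):
            self.versions[i] = "2.1"
            mixed = len(set(self.versions)) > 1
            # Simulated race: it can only fire while versions are mixed.
            if mixed and self.rng.random() < self.race_probability:
                self.schema_ok = False


def repeated_upgrade_failures(cluster, rounds):
    """Run `rounds` full rolling upgrades and return the rounds that failed."""
    failures = []
    for round_no in range(rounds):
        cluster.reset_to_old_version()
        cluster.rolling_upgrade()
        if not cluster.schema_ok:
            failures.append(round_no)
    return failures
```

With a small `race_probability`, a single upgrade will usually pass, which is why a one-shot test would be flaky at best; only many repetitions make the failure likely to show up.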

On Thu, Apr 9, 2015 at 11:45 AM, Ariel Weisberg wrote:

> Repeated with sort
>
> *Key* | *Assignee* | *Summary* | *Revisit reason*
> CASSANDRA-8285 | Aleksey Yeschenko | Move all hints related tasks to hints private executor | Pierre's reproducer represents something we weren't doing, but that users are. Is that now being tested?
> CASSANDRA-8462 | Aleksey Yeschenko | Upgrading a 2.0 to 2.1 breaks CFMetaData on 2.0 nodes | Have additional dtest coverage, need to do this in kitchen sink tests
> CASSANDRA-8640 | Anthony Cozzie | Paxos requires all nodes for CAS | If PAXOS is not supposed to require all nodes for CAS we should be able to fail nodes or a certain number of nodes and still continue to CAS (test availability of CAS under failure conditions). No regression test.
> CASSANDRA-8677 | Ariel Weisberg | rpc_interface and listen_interface generate NPE on startup when specified interface doesn't exist | Missing unit tests checking error messages for DatabaseDescriptor
> CASSANDRA-8577 | Artem Aliev | Values of set types not loading correctly into Pig | Full set of interactions with PIG not validated
> CASSANDRA-7704 | Benedict | FileNotFoundException during STREAM-OUT triggers 100% CPU usage | Streaming testing didn't reproduce this before release
> CASSANDRA-8383 | Benedict | Memtable flush may expire records from the commit log that are in a later memtable | No regression test, no follow up ticket. Could/should this have been reproducible as an actual bug?
> CASSANDRA-8429 | Benedict | Some keys unreadable during compaction | Running stress in CI would have caught this, and we're going to do that
> CASSANDRA-8459 | Benedict | "autocompaction" on reads can prevent memtable space reclaimation | What would have reproduced this before release?
> CASSANDRA-8499 | Benedict | Ensure SSTableWriter cleans up properly after failure | Testing error paths? Any way to test things in a loop to detect leaks?
> CASSANDRA-8513 | Benedict | SSTableScanner may not acquire reference, but will still release it when closed | This had a user visible component, what test could have caught it before release?
> CASSANDRA-8619 | Benedict | using CQLSSTableWriter gives ConcurrentModificationException | What kind of test would have caught this before release?
> CASSANDRA-8632 | Benedict | cassandra-stress only generating a single unique row | We rely on stress for performance testing, that might mean it needs real testing that demonstrates it generates load that looks like the load it is supposed to be generating.
> CASSANDRA-8668 | Benedict | We don't enforce offheap memory constraints; regression introduced by 7882 | Memory constraints was a supported feature/UI, but not completely tested before release. Could this have been found most effectively by a unit test or a blackbox test?
> CASSANDRA-8719 | Benedict | Using thrift HSHA with offheap_objects appears to corrupt data | Untested configuration before release, this would be straightforward if we ran with it?
> CASSANDRA-8726 | Benedict | throw OOM in Memory if we fail to allocate | OOM test Cassandra? Try and validate that it fails cleanly and can be restarted on OOM? Same for disk full.
> CASSANDRA-8018 | Benjamin Lerer | Cassandra seems to insert twice in custom PerColumnSecondaryIndex | Custom secondary indexes not tested before release?
> CASSANDRA-8231 | Benjamin Lerer | Wrong size of cached prepared statements | Expected cache capacity not validated with actual cache capacity, no regression test
> CASSANDRA-8365 | Benjamin Lerer | CamelCase name is used as index name instead of lowercase | How can we establish UI consistency?
> CASSANDRA-8421 | Benjamin Lerer | Cassandra 2.1.1 & Cassandra 2.1.2 UDT not returning value for LIST type as UDT | Is there a test that could have found this condition before release?
> CASSANDRA-8514 | Benjamin Lerer | ArrayIndexOutOfBoundsException in nodetool cfhistograms | Not released, but not caught by automated tests either
> CASSANDRA-8243 | Björn Hegerfors | can leave time-overlaps, limiting ability to expire entire SSTables | Performance improving fast path not tested in a representative way
> CASSANDRA-8448 | Brandon Williams | "Comparison method violates its general contract" in AbstractEndpointSnitch | This just happens periodically? Was the snitch not tested under load and the log output checked for errors?
> CASSANDRA-8028 | Carl Yeksigian | Unable to compute when histogram overflowed | Histogram output not tested with representative data sets, no regression test
> CASSANDRA-8122 | Carl Yeksigian | Undeclare throwable exception while executing 'nodetool netstats localhost' | nodetool not tested against cluster throughout lifecycle, no regression test
> CASSANDRA-8695 | Chris Lockfort | thrift column definition list sometimes immutable | What user visible activities reproduced this, could we have done that before release?
> CASSANDRA-8588 | Dave Brosius | Fix DropTypeStatements isusedBy for maps (typo ignored values) | Not released, but was it detected before release by an automated test?
> CASSANDRA-8652 | Edward Ribeiro | DROP TABLE should also drop BATCH prepared statements associated to it | Not sure if this is an optimization or fixes a user visible issue, but could this have been detected by exercising the functionality better before release?
> CASSANDRA-8694 | Jeff Jirsa | Repair of empty keyspace hangs rather than ignoring the request | Missing boundary condition test, requesting operation on empty, non-existent, or not applicable entity.
> CASSANDRA-8687 | Jeremiah Jordan | Keyspace should also check Config.isClientMode | Is there a way to test for missing Config.isClientMode checks?
> CASSANDRA-8579 | Jimmy Mårdell | sstablemetadata can't load | Running C* from source tree not representative of behavior of deployed builds
> CASSANDRA-8401 | Jonathan Ellis | dropping a CF doesn't remove the latency-sampling task | Another argument for a schema change stress test, maybe tracking for constant memory utilization
> CASSANDRA-8292 | Joshua McKenzie | From Pig: org.apache.cassandra.exceptions.ConfigurationException: Expecting URI in variable: [cassandra.config]. Please prefix the file with file:/// for local files or file://<server>/ for remote files. | PIG not tested
> CASSANDRA-8211 | Marcus Eriksson | Overlapping sstables in L1+ | Noted hard to reproduce, but still is there a way we could have, no regression test
> CASSANDRA-8316 | Marcus Eriksson | "Did not get positive replies from all endpoints" error on incremental repair | What were users doing differently, is there a reproducer for this running now?
> CASSANDRA-8320 | Marcus Eriksson | 2.1.2: NullPointerException in SSTableWriter | What were users doing that caused this, are we doing that?
> CASSANDRA-8386 | Marcus Eriksson | Make sure we release references to sstables after incremental repair | Is there a higher level test that could have observed this failure?
> CASSANDRA-8432 | Marcus Eriksson | Standalone Scrubber broken for LCS | Standalone scrubber not tested, no regression test
> CASSANDRA-8458 | Marcus Eriksson | Don't give out positions in an sstable beyond its first/last tokens | Streaming not done in realistic scenario with validation of logging
> CASSANDRA-8463 | Marcus Eriksson | Constant compaction under LCS | What would have reproduced this before release?
> CASSANDRA-8510 | Marcus Eriksson | CompactionManager.submitMaximal may leak resources | Not a user visible problem, so difficult to catch in test, but is there a way
> CASSANDRA-8525 | Marcus Eriksson | Bloom Filter truePositive counter not updated on key cache hit | User visible metric not accurate, but only in one config. Possible to guess correct FP ratio and validate while exploring config space?
> CASSANDRA-8532 | Marcus Eriksson | Fix calculation of expected write size during compaction | Did this manifest as a user visible issue, could we have tested for that?
> CASSANDRA-8537 | Marcus Eriksson | ConcurrentModificationException while executing 'nodetool cleanup' | Nodetool cleanup not tested before release
> CASSANDRA-8562 | Marcus Eriksson | Fix checking available disk space before compaction starts | Is there a user visible negative impact, could it have been tested for?
> CASSANDRA-8580 | Marcus Eriksson | AssertionErrors after activating unchecked_tombstone_compaction with leveled compaction | How could this have been reproduced before release? No regression test
> CASSANDRA-8623 | Marcus Eriksson | sstablesplit fails *randomly* with Data component is missing | Feature not tested before release? No regression test
> CASSANDRA-8635 | Marcus Eriksson | cold sstable omission does not handle overwrites without reads | If this workload is a challenge for certain kinds of optimizations we should test it if we think it could happen again.
> CASSANDRA-7538 | Sam Tunnicliffe | Truncate of a CF should also delete Paxos CF | Truncate not tested with PAXOS, what else?
> CASSANDRA-8280 | Sam Tunnicliffe | Cassandra crashing on inserting data over 64K into indexed strings | Added tests are good example, could focusing on testing all access paths and boundary conditions per access path have prevented this
> CASSANDRA-8370 | Sam Tunnicliffe | cqlsh doesn't handle LIST statements correctly | cqlsh untested functionality, no regression test?
> CASSANDRA-7801 | Sylvain Lebresne | A successful INSERT with CAS does not always store data in the DB after a DELETE | Multiple access paths for data not tested together
> CASSANDRA-8558 | Sylvain Lebresne | deleted row still can be selected out | Validate that deleted data stays deleted under * conditions (big matrix of interactions here with different configurations, streaming, repair, cleanup, scrub). Deleted data coming back shows up a lot.
> CASSANDRA-8332 | T Jake Luciani | Null pointer after droping keyspace | Add/drop keyspace not tested under load, with server logs checked for errors
> CASSANDRA-7910 | Tyler Hobbs | wildcard prepared statements are incorrect after a column is added to the table | Alter table not tested concurrently with ?
> CASSANDRA-8264 | Tyler Hobbs | Problems with multicolumn relations and COMPACT STORAGE | How can we catch interactions like compact storage not being covered by the test
> CASSANDRA-8286 | Tyler Hobbs | Regression in ORDER BY | There were tests that failed in some versions, but not all? Did this not ship?
> CASSANDRA-8288 | Tyler Hobbs | cqlsh describe needs to show 'sstable_compression': '' | Roundtrip test for describe schema?
> CASSANDRA-8302 | Tyler Hobbs | Filtering for CONTAINS (KEY) on frozen collection clustering columns within a partition does not work | More untested combinations, could we have spotted that there was an interaction and tested it? Or did this not ship?
> CASSANDRA-8408 | Tyler Hobbs | limit appears to replace page size under certain conditions | No test that validates that paging returns the expected number of results? Another of the genre of queries we support but don't test all the combinations
> CASSANDRA-8410 | Tyler Hobbs | Select with many IN values on clustering columns can result in a StackOverflowError | Another missing boundary conditions test, test maximum size in clause against *
> CASSANDRA-8451 | Tyler Hobbs | NPE when writetime() or ttl() are nested inside function call | Is this testable? Can we check that functions compose correctly or validate that they are inherently composable. No regression test.
> CASSANDRA-8490 | Tyler Hobbs | queries with LIMITs or paging are incorrect when partitions are deleted | Untested query forms, no regression test
> CASSANDRA-8512 | Tyler Hobbs | cqlsh unusable after encountering schema mismatch | cqlsh not tested with other functionality active
> CASSANDRA-8550 | Tyler Hobbs | Internal pagination in CQL3 index queries creating substantial overhead | Pagination not performance tested with representative data models
> CASSANDRA-8563 | Tyler Hobbs | cqlsh broken for some thrift created tables. | Validate mixed CQL thrift interactions? Possibly abstract everything to be done either by CQL or Thrift and then permute? Seems low value, but necessary if both are claimed to be supported.
> CASSANDRA-8733 | Tyler Hobbs | List prepend reverses item order | There was a test so sometimes this just happens.
> CASSANDRA-8641 | *Unassigned* | Repair causes a large number of tiny SSTables | User says something doesn't work for them? Could we have anticipated that vnodes would not work as formulated for this case.
> CASSANDRA-8675 | *Unassigned* | COPY TO/FROM broken for newline characters | COPY TO/FROM not tested with representative data
> CASSANDRA-8691 | *Unassigned* | SSTableReader.getPosition() does not correctly filter out queries that exceed its bounds | Is there a scenario where this is user visible, should we test for that?
> CASSANDRA-8688 | Yuki Morishita | Standalone sstableupgrade tool throws exception | Tool not tested before release, no regression test
