Date: Sun, 9 Oct 2016 17:22:20 +0000 (UTC)
From: "Benedict (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Subject: [jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress

    [ https://issues.apache.org/jira/browse/CASSANDRA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15560318#comment-15560318 ]

Benedict commented on CASSANDRA-12490:
--------------------------------------

I'm afraid I think this was a terrible idea, and it should
probably be rolled back.

The example yaml permits its use as a column value seed generator, which means the contents of a partition no longer depend on the partition's seed, but on the order of visitation. For partition and clustering columns (as in the example) this breaks behaviour for queries: stress no longer knows what records exist (it will generate different values to query than it originally wrote). It also completely breaks any possibility of data validation, which is currently supported for thrift and was always intended to be extended to CQL to improve testing.

As already mentioned, the -pop seq=1..N mode can be provided on the command line for sequentially visiting partitions. For generating *values* that can step forwards with this, the most sensible design (and what had been on the cards) is to accept a functional specification that depends on the seed of the partition, the simplest being to return 1 when the partition's seed is 1.

> Add sequence distribution type to cassandra stress
> --------------------------------------------------
>
>                 Key: CASSANDRA-12490
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12490
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Ben Slater
>            Assignee: Ben Slater
>            Priority: Minor
>             Fix For: 3.10
>
>         Attachments: 12490-trunk.patch, 12490.yaml, cqlstress-seq-example.yaml
>
>
> When using the write command, cassandra stress sequentially generates seeds. This ensures generated values don't overlap (unless the sequence wraps), providing a more predictable number of inserted records (and generating a base set of data without wasted writes).
> When using a yaml stress spec there is no sequenced distribution available. I think it would be useful to have this for doing an initial load of data for testing.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
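
The order-dependence problem described in the comment can be illustrated with a small, self-contained sketch. This is not cassandra-stress code: `SequenceGenerator`, `SeededGenerator`, and `write_then_query` are hypothetical names invented for the illustration, and the "seeded" variant uses the simplest functional specification mentioned above (return 1 when the partition's seed is 1, i.e. the identity on the seed). A fresh generator is created for the write phase and for the query phase, roughly as two separate stress runs would do.

```python
import itertools


class SequenceGenerator:
    """Hypothetical order-dependent generator: returns 1, 2, 3, ... per call."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)

    def value_for(self, partition_seed):
        # The seed is ignored; the result depends only on how many
        # values were requested before this one.
        return next(self._counter)


class SeededGenerator:
    """Hypothetical seed-dependent generator: the value is a pure function of the seed."""

    def value_for(self, partition_seed):
        # Simplest functional specification: seed 1 -> value 1, seed 2 -> value 2, ...
        return partition_seed


def write_then_query(make_generator, write_order, query_order):
    """Return the seeds whose queried value differs from the value written."""
    writer = make_generator()
    written = {seed: writer.value_for(seed) for seed in write_order}
    reader = make_generator()  # separate phase, so a fresh generator
    queried = {seed: reader.value_for(seed) for seed in query_order}
    return [seed for seed in query_order if written[seed] != queried[seed]]


write_order = [1, 2, 3, 4]   # partitions visited sequentially on write
query_order = [3, 1, 4, 2]   # partitions visited in a different order on read

seq_mismatches = write_then_query(SequenceGenerator, write_order, query_order)
seeded_mismatches = write_then_query(SeededGenerator, write_order, query_order)

print("sequence generator mismatches:", seq_mismatches)
print("seeded generator mismatches:", seeded_mismatches)
```

With the sequence generator, every partition visited in a different position on read gets a value other than the one written, which is why queries and validation cannot work; with the seed-dependent function the two phases always agree, whatever the visitation order.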