From: Jonathan Guberman
Subject: Cassandra as a key/object store for many small (10-60k) files
Date: Fri, 5 May 2017 15:19:44 -0400
To: user@cassandra.apache.org

Hello,

We’re currently testing Cassandra for use as a pure key-object store for data blobs around 10kB - 60kB each. Our use case is storing on the order of 10 billion objects with about 5-20 million new writes per day. A written object will never be updated or deleted. Objects will be read at least once, some time within 10 days of being written. This will generally happen as a batch; that is, all of the images written on a particular day will be read together at the same time. This batch read will only happen one time; future reads will happen on individual objects, with no grouping, and they will follow a long-tail distribution, with popular objects read thousands of times per year but most read never or virtually never.

I’ve set up a small four-node test cluster and have written test scripts to benchmark writing and reading our data. The table I’ve set up is very simple: an ascii primary key column with the object ID and a blob column for the data. All other settings were left at their defaults.

I’ve found write speeds to be very fast most of the time. However, periodically, writes will slow to a crawl for anywhere from half an hour to two hours, after which speeds recover to their previous levels.
I assume this is some sort of data compaction or flushing to disk, but I haven’t been able to figure out the exact cause.

Read speeds have been more disappointing. Cached reads are very fast, but random read speed averages about 2 MB/sec, which is too slow when we need to read out a batch of several million objects. I don’t think it’s reasonable to assume that these rows will all still be cached by the time we need to read them for that first large batch read.

My general question is whether anyone has any suggestions for how to improve performance for our use case. More specifically:

- Is there a way to mitigate or eliminate the huge slowdowns I see when writing millions of rows?
- Are there settings I should be using in order to maximize read speeds for random reads?
- Is there a way to design our tables to improve the read speeds for the initial large batched reads? I was thinking of using a batch ID column that could be used to retrieve the data for the initial block. However, future reads would need to be done by the object ID, not the batch ID, so it seems to me I’d need to duplicate the data, once in an “objects by batch” table, and once in a simple “objects” table. Is there a better approach than this?

Thank you!
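
For a rough sense of scale, the figures quoted in the message can be turned into back-of-envelope numbers. The sketch below is only arithmetic on the stated ranges. It assumes decimal units (1 kB = 1000 bytes), ignores replication, compression and any per-row storage overhead, and treats 2 MB/sec as the sustained rate for an entire batch read; none of those assumptions come from the post, and the function names are made up for illustration.

```python
# Back-of-envelope sizing from the figures in the message above.
# Assumptions (NOT from the original post): decimal units, no replication,
# no compression, no per-row overhead, 2 MB/sec sustained for a whole batch.

KB = 1000
MB = 1000 * KB
GB = 1000 * MB
TB = 1000 * GB


def total_bytes(n_objects, object_bytes):
    """Raw payload size for n_objects blobs of object_bytes each."""
    return n_objects * object_bytes


def read_hours(n_objects, object_bytes, bytes_per_sec):
    """Hours needed to read the payload at a fixed throughput."""
    return total_bytes(n_objects, object_bytes) / bytes_per_sec / 3600


if __name__ == "__main__":
    # Full data set: about 10 billion objects of 10-60 kB each.
    low_total = total_bytes(10_000_000_000, 10 * KB)
    high_total = total_bytes(10_000_000_000, 60 * KB)
    print(f"total payload: {low_total // TB} to {high_total // TB} TB")

    # One day's batch: 5-20 million objects, read back at about 2 MB/sec.
    low_hours = read_hours(5_000_000, 10 * KB, 2 * MB)
    high_hours = read_hours(20_000_000, 60 * KB, 2 * MB)
    print(f"daily batch read: {low_hours:.1f} to {high_hours:.1f} hours")
```

Under these assumptions the raw payload is on the order of 100 to 600 TB, and reading back a single day's batch at 2 MB/sec takes roughly 7 hours at the low end and roughly 7 days at the high end, which is consistent with the post's judgment that the measured random read rate is too slow for the initial batch read.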