From: kocolosk@apache.org
To: commits@couchdb.apache.org
Subject: git commit: Fix retrieval of headers larger than 4k
Message-Id: <20111030182833.C86C45488E@tyr.zones.apache.org>
Date: Sun, 30 Oct 2011 18:28:33 +0000 (UTC)

Updated Branches:
  refs/heads/master 5f906a39b -> 1dc866b3c


Fix retrieval of headers larger than 4k

Our headers start with a <<1>> and then four bytes indicating the length
of the header and its checksum. When the header is larger than 4090 bytes
it will be split across multiple blocks in the file and will need to be
reassembled on read.

The reassembly consists of stripping out <<0>> from the beginning of each
subsequent block in the remove_block_prefixes/2 function. The bug here is
that we tell remove_block_prefixes that we're starting 1 byte into the
current block instead of 5, so it ends up removing one good byte from the
header and injecting one or more random <<0>>s.

Headers larger than 4k are very rare and generally require a view group
with a huge number of indexes or indexes with fairly large reductions,
which explains why this bug has gone undetected until now.

Closes COUCHDB-1319.


Project: http://git-wip-us.apache.org/repos/asf/couchdb/repo
Commit: http://git-wip-us.apache.org/repos/asf/couchdb/commit/1dc866b3
Tree: http://git-wip-us.apache.org/repos/asf/couchdb/tree/1dc866b3
Diff: http://git-wip-us.apache.org/repos/asf/couchdb/diff/1dc866b3

Branch: refs/heads/master
Commit: 1dc866b3c795d7d7515f732f444cd54e656390bc
Parents: 5f906a3
Author: Adam Kocoloski
Authored: Wed Oct 26 14:04:54 2011 -0400
Committer: Adam Kocoloski
Committed: Sun Oct 30 14:27:39 2011 -0400

----------------------------------------------------------------------
 src/couchdb/couch_file.erl   | 2 +-
 test/etap/011-file-headers.t | 9 ++++++++-
 2 files changed, 9 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/couchdb/blob/1dc866b3/src/couchdb/couch_file.erl
----------------------------------------------------------------------
diff --git a/src/couchdb/couch_file.erl b/src/couchdb/couch_file.erl
index f1d6cc4..2c2f11a 100644
--- a/src/couchdb/couch_file.erl
+++ b/src/couchdb/couch_file.erl
@@ -433,7 +433,7 @@ load_header(Fd, Block) ->
         RawBin = <>
     end,
     <> =
-        iolist_to_binary(remove_block_prefixes(1, RawBin)),
+        iolist_to_binary(remove_block_prefixes(5, RawBin)),
     Md5Sig = couch_util:md5(HeaderBin),
     {ok, HeaderBin}.
 

http://git-wip-us.apache.org/repos/asf/couchdb/blob/1dc866b3/test/etap/011-file-headers.t
----------------------------------------------------------------------
diff --git a/test/etap/011-file-headers.t b/test/etap/011-file-headers.t
index 81ffdb2..a26b032 100755
--- a/test/etap/011-file-headers.t
+++ b/test/etap/011-file-headers.t
@@ -22,7 +22,7 @@ main(_) ->
     {S1, S2, S3} = now(),
     random:seed(S1, S2, S3),
 
-    etap:plan(17),
+    etap:plan(18),
     case (catch test()) of
         ok ->
             etap:end_tests();
@@ -68,6 +68,13 @@ test() ->
     etap:is({ok, Size2}, couch_file:bytes(Fd),
         "Rewriting the same second header returns the same second size."),
 
+    couch_file:write_header(Fd, erlang:make_tuple(5000, <<"CouchDB">>)),
+    etap:is(
+        couch_file:read_header(Fd),
+        {ok, erlang:make_tuple(5000, <<"CouchDB">>)},
+        "Headers larger than the block size can be saved (COUCHDB-1319)"
+    ),
+
     ok = couch_file:close(Fd),
 
     % Now for the fun stuff. Try corrupting the second header and see
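
The offset mistake described in the commit message can be reproduced in isolation. The following is a minimal sketch, not the couch_file implementation: the module name block_prefix_sketch and the functions add_prefixes/2, strip_prefixes/2 and demo/0 are hypothetical, and the 4096-byte block size is an assumption (the commit message only says "4k"). It lays a payload out starting 5 bytes into a block, inserting a <<0>> at the start of each later block, then strips the markers once with the right offset and once with the wrong one.

```erlang
-module(block_prefix_sketch).
-export([add_prefixes/2, strip_prefixes/2, demo/0]).

%% Assumed block size; the commit message only says "4k".
-define(BLOCK, 4096).

%% Write side: lay Payload out starting Offset bytes into a block and
%% insert a <<0>> marker at the start of every subsequent block.
add_prefixes(_Offset, <<>>) ->
    <<>>;
add_prefixes(0, Payload) ->
    Rest = add_prefixes(1, Payload),
    <<0, Rest/binary>>;
add_prefixes(Offset, Payload) ->
    Room = ?BLOCK - Offset,
    case Payload of
        <<Chunk:Room/binary, Tail/binary>> when Tail =/= <<>> ->
            Rest = add_prefixes(0, Tail),
            <<Chunk/binary, Rest/binary>>;
        _ ->
            %% Everything fits in the current block.
            Payload
    end.

%% Read side: drop the first byte of every subsequent block, given that
%% Raw starts Offset bytes into its first block.
strip_prefixes(_Offset, <<>>) ->
    <<>>;
strip_prefixes(0, <<_Prefix, Rest/binary>>) ->
    strip_prefixes(1, Rest);
strip_prefixes(Offset, Raw) ->
    Room = ?BLOCK - Offset,
    case Raw of
        <<Chunk:Room/binary, Tail/binary>> ->
            Rest = strip_prefixes(0, Tail),
            <<Chunk/binary, Rest/binary>>;
        _ ->
            Raw
    end.

demo() ->
    Payload = binary:copy(<<"CouchDB">>, 1000),   %% 7000 bytes, spans two blocks
    Raw = add_prefixes(5, Payload),
    %% The correct offset round-trips (this match fails otherwise).
    Payload = strip_prefixes(5, Raw),
    %% The wrong offset keeps the <<0>> marker and drops a payload byte.
    Wrong = strip_prefixes(1, Raw),
    {byte_size(Raw), byte_size(Wrong), binary:at(Wrong, 4091)}.
```

With the wrong offset the result has the same length as the payload, so a length check alone would not catch it: the marker byte survives at position 4091 and one payload byte is dropped further on. Only a content check, such as the MD5 comparison in load_header, would notice.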