couchdb-dev mailing list archives

From "Chris Anderson (JIRA)" <j...@apache.org>
Subject [jira] Created: (COUCHDB-285) tail append headers
Date Sat, 07 Mar 2009 23:30:56 GMT
tail append headers
-------------------

                 Key: COUCHDB-285
                 URL: https://issues.apache.org/jira/browse/COUCHDB-285
             Project: CouchDB
          Issue Type: Improvement
          Components: Database Core
            Reporter: Chris Anderson
             Fix For: 1.0


this will make .couch files resilient even when truncated (data-loss but still usable). also
cuts down on the # of disk seeks.
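For illustration (a sketch based on the IRC log below, not text from the original ticket): instead of
rewriting a header at a fixed offset at the front of the file, every committed header is appended at
the tail,

    [data][header 1][data][data][header 2][data][header 3]

so on open the newest intact header found by scanning backwards from end-of-file wins, a truncated
file simply falls back to an earlier header, and the header write lands next to the data write
instead of requiring a seek back to the front of the file.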


[3:02pm] <jchris>
damienkatz: the offset in a header would correspond to the number of bytes from the front of
the file?
[3:03pm]
andysky joined the chat room.
[3:03pm] <damienkatz>
jchris: yes
[3:03pm] <jchris>
because my offset seems to suggest that just over a MB of the file is missing
[3:03pm] » jchris 
blames not couchdb
[3:03pm] <jchris>
but the streaming writes you've talked about would make this more resilient, eh?
[3:03pm] <jchris>
where the header is also appended each time
[3:04pm] <jchris>
there could be data lost but the db would still be usable
[3:04pm] <damienkatz>
yes, a file truncation just gives you an earlier version of the file
[3:05pm] <jchris>
now's not a good time for me to work on that, but after Amsterdam I may want to pick it up
[3:05pm] <damienkatz>
the hardest part is finding the header again
[3:06pm] <jan____>
huh? isn't the header the first 4k?
[3:06pm] <jchris>
it would only really change couch_file:read_header and write_header I think
[3:06pm] <jchris>
jan____: we're talking about moving it to the end
[3:06pm] <jchris>
so it never gets overwritten
[3:06pm] <damienkatz>
jan____: this is for tail append headers
[3:06pm] <jan____>
duh
[3:06pm] <jan____>
futuretalk
[3:06pm] <jan____>
n/m me
[3:07pm] <damienkatz>
jchris: so one way is to sign the header regions, but you need to make it unforgeable.
[3:08pm] <jchris>
basically a boundary problem...
[3:08pm] <damienkatz>
because if a client wrote a binary that looked like it had a header, they could do bad things.
[3:08pm] <jchris>
like for instance an attachment that's a .couch file :)
[3:08pm] <damienkatz>
right
[3:09pm] <damienkatz>
so you can salt the db file on creation with a key in the header. And use that key to sign
and verify headers.
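(A minimal Erlang sketch of that salt-and-sign idea; the function names and the modern OTP crypto
calls are assumptions for illustration, not the actual couch_file API:)

    -define(DB_KEY_SIZE, 16).

    %% generated once at db file creation and stored in the first header;
    %% it must never be exposed to clients
    new_db_key() ->
        crypto:strong_rand_bytes(?DB_KEY_SIZE).

    %% prepend an HMAC so a header candidate found in the file can be verified
    sign_header(Key, HeaderBin) when is_binary(HeaderBin) ->
        Sig = crypto:mac(hmac, sha256, Key, HeaderBin),
        <<Sig/binary, HeaderBin/binary>>.

    %% HMAC-SHA256 is 32 bytes; a binary smuggled in by a client won't verify
    verify_header(Key, <<Sig:32/binary, HeaderBin/binary>>) ->
        case crypto:mac(hmac, sha256, Key, HeaderBin) of
            Sig -> {ok, HeaderBin};
            _   -> invalid
        end.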
[3:09pm]
tlrobinson joined the chat room.
[3:09pm] <jchris>
doesn't sound too tough
[3:10pm] <jan____>
damienkatz: I looked into adding conflict-inducing bulk docs in rep_security. would this work:
POST /db/_bulk_docs?allow_conflicts=true could do a regular bulk save but grab the "error"
responses and do an update_docs() call with the replicated_changes option for all "errors"
from the first bulk save while assigning new _revs for "new" docs?
[3:10pm] <damienkatz>
the key is crypto-random, and must stay hidden from clients.
[3:10pm] <jchris>
if you have the file, you could forge headers...
[3:10pm] <jchris>
but under normal operation, it sounds like not a big deal
[3:10pm]
Qaexl joined the chat room.
[3:11pm] <jchris>
so we just give the db an internal secret uuid
[3:11pm]
mmalone left the chat room. (Connection reset by peer)
[3:11pm]
peritus_ joined the chat room.
[3:11pm] <damienkatz>
I'm not sure I like this approach.
[3:11pm] <jchris>
damienkatz: drawbacks?
[3:11pm] <damienkatz>
if a client can see a file share with the db, they can attack it.
[3:12pm]
mmalone joined the chat room.
[3:12pm]
mmalone left the chat room. (Read error: 104 (Connection reset by peer))
[3:12pm] <damienkatz>
how about this approach. every 4k, we write a NULL byte.
[3:13pm] <damienkatz>
we always write headers at the 4k boundary
[3:13pm]
mmalone joined the chat room.
[3:13pm] <damienkatz>
and make that byte 1
[3:13pm] <jan____>
grr
[3:13pm] <jan____>
did my bulk-docs proposal get through?
[3:13pm] <jchris>
the attacker could still get lucky
[3:13pm] <jan____>
(or got it shot down? :)
[3:13pm] <damienkatz>
jan____: sorry.
[3:13pm] <jan____>
damienkatz: I couldn't read the backlog
[3:13pm] <damienkatz>
Let me think about the conflict stuff a little bit.
[3:13pm] <jan____>
sure
[3:13pm] <jan____>
no backlog then
[3:14pm] <jan____>
+k
[3:14pm] <jchris>
jan____: your paragraph is dense there -
[3:14pm] <damienkatz>
jchris: no, this is immune from attack
[3:14pm] <jchris>
because you'd write an attachment marker after the null byte for attachments?
[3:14pm] <damienkatz>
every 4k, we just write a 0 byte, we skip that byte.
[3:15pm] <jan____>
jchris: yeah, sorry, will let you finish the file stuff
[3:15pm] <damienkatz>
no matter what, we never write anything into that byte.
[3:15pm] <jan____>
wasting all these 0 bytes
[3:15pm] <damienkatz>
a big file write will write all the surrounding bytes, but not that byte.
[3:16pm] <damienkatz>
only 1 every 4k jan. I think we'll manage ;)
[3:16pm] <jan____>
:)
[3:16pm] <damienkatz>
when that byte is a 1 though, that means it's the start of a header.
[3:16pm] <jchris>
oh, gotcha
[3:17pm] <damienkatz>
so headers always get written on a 4k boundary.
[3:17pm] <jchris>
and nothing else does
[3:17pm] <damienkatz>
that means the cost of a write, before we even write the data of the new header, is
2k on average.
[3:17pm] <jchris>
because we have to jump at least that far for the new header
[3:18pm] <damienkatz>
yes
[3:18pm] <jchris>
compaction could fill it in though
[3:18pm] <damienkatz>
we can change the boundary to be 1k,2k etc.
[3:19pm] <damienkatz>
I think couch_file would handle the skipping of the byte, make it transparent.
[3:19pm] <Monty>
kind of mohican or bleach, but someone else?
[3:19pm] <damienkatz>
both on reading and writing. and it would handle writing the header.
[3:19pm] <damienkatz>
and reading the header on start.
[3:19pm] <jchris>
handle_call({append_bin can do the 4k stuff
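(A rough Erlang sketch of that write-side chunking, with hypothetical helper names rather than the
real append_bin code: split a binary so the reserved byte at every 4k boundary is never overwritten.)

    -define(BLOCK, 4096).

    %% Plan the writes for Bin starting at file position Pos, leaving the
    %% single marker byte at every 4k boundary untouched. Returns a list of
    %% {Offset, Chunk} pairs suitable for file:pwrite/2.
    plan_writes(Pos, Bin) ->
        plan_writes(Pos, Bin, []).

    plan_writes(_Pos, <<>>, Acc) ->
        lists:reverse(Acc);
    plan_writes(Pos, Bin, Acc) when Pos rem ?BLOCK =:= 0 ->
        %% sitting on a marker byte: step over it, never write into it here
        plan_writes(Pos + 1, Bin, Acc);
    plan_writes(Pos, Bin, Acc) ->
        Room = ?BLOCK - (Pos rem ?BLOCK),
        case byte_size(Bin) =< Room of
            true ->
                lists:reverse([{Pos, Bin} | Acc]);
            false ->
                <<Chunk:Room/binary, Rest/binary>> = Bin,
                plan_writes(Pos + Room, Rest, [{Pos, Chunk} | Acc])
        end.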
[3:20pm] <davisp>
damienkatz: how does the null byte protect against a malicious attachment?
[3:20pm] <damienkatz>
and of course, you can link to the previous headers, chaining the different db states as a
history.
[3:20pm] <damienkatz>
davisp: because nothing can fill that byte ever.
[3:21pm] <jchris>
the guaranteed 4k offset also makes finding a header on cold start faster
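(What that cold-start scan could look like as an Erlang sketch, again with made-up names: walk the
4k boundaries backwards from end-of-file and stop at the first marker byte that reads as 1.)

    -define(BLOCK, 4096).

    find_last_header(Fd) ->
        {ok, Eof} = file:position(Fd, eof),
        scan(Fd, (Eof div ?BLOCK) * ?BLOCK).

    scan(_Fd, Pos) when Pos < 0 ->
        no_valid_header;
    scan(Fd, Pos) ->
        case file:pread(Fd, Pos, 1) of
            {ok, <<1>>} ->
                %% marker is 1: the header bytes start right after it
                {ok, Pos + 1};
            _ ->
                scan(Fd, Pos - ?BLOCK)
        end.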
[3:21pm] <davisp>
damienkatz: But couldn't I design a malicious binary that looked like a header?
[3:21pm] <damienkatz>
yes, but we wouldn't see it.
[3:22pm] <davisp>
I must be missing something
[3:22pm] <damienkatz>
when we write a header, we set the byte to 1.
[3:22pm] <davisp>
The 4k offset thing would be harder to overcome, but you could brute force it with 4k*4k max
[3:22pm] <davisp>
I have no idea how big that is
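(For scale, that works out to 4096 x 4096 = 16,777,216 bytes, i.e. roughly 16 MB of candidate data.)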
[3:22pm] <jan____>
jchris: less dense: v
[3:22pm] <jan____>
http://friendpaste.com/4rid4TsUNBYjlUse9wm8qP
[3:22pm] <damienkatz>
attachments are written around the byte
[3:22pm] <jan____>
*lets dance*
[3:23pm] <damienkatz>
davisp: it's not possible to attack this scheme.
[3:23pm] <davisp>
I must be missing something
[3:23pm] <davisp>
Then again, my knowledge of the attachment writing is slim to none
[3:23pm] <damienkatz>
when the db engine writes attachments, it leaves a 0 byte every 4k.
[3:24pm] <damienkatz>
when it reads the attachment back, it skips that byte again.
[3:24pm] <davisp>
light bulb!
[3:24pm] <jan____>
same here :D
[3:24pm] <jan____>
davisp: *5*
[3:24pm] <jchris>
is there a ticket for this yet?
[3:24pm] <jchris>
if not I'll attach this transcript to one

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

