accumulo-user mailing list archives

From Christopher Tubbs <ctubb...@gmail.com>
Subject Re: Security and data design advice on structuring data on accumulo
Date Sat, 11 Aug 2012 01:05:02 GMT
I think an important take-away here (so far) is that you can't just
use "doctor" as a role... because that doesn't encapsulate all the
security considerations. Doctor X doesn't get to see patient Y's data,
unless X is Y's doctor, or Y has signed a release for him/her to see
it. So, "doctorOf<Y>" is an essential consideration. If this were all
that had to be encapsulated, then the labels would grow roughly linearly
with the number of patients, not the number of "users" (if patients
happen to be users, that's simply a coincidence).

Since patient privacy is primarily what is being protected, I'd make
the roles relative to the patient:
doctorOfPatientX
familyMemberOfPatientX
isPatientX
lawyerOfPatientX
insurerOfPatientX
nurseOfPatientX
etc...

So, the number of roles would scale as n*m, where n is the number of
patients and m is the size of a roughly fixed set of roles relative to
each patient (m should be pretty small).
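
As a rough illustration of that n*m growth, here is a minimal Python sketch
that builds the per-patient labels from the list above. The relation names
come from that list; the function names and the patient ids are made up for
the example, and none of this is Accumulo API.

```python
# Relation prefixes taken from the role list above; everything else here
# (function names, patient ids) is hypothetical and only for illustration.
RELATIONS = ["doctorOf", "familyMemberOf", "is", "lawyerOf", "insurerOf", "nurseOf"]


def labels_for(patient_id):
    """Return the m role labels that are relative to one patient."""
    return [f"{relation}Patient{patient_id}" for relation in RELATIONS]


def all_labels(patient_ids):
    """Return every label for n patients: n * m labels in total."""
    return [label for pid in patient_ids for label in labels_for(pid)]


patients = ["X", "Y", "Z"]
labels = all_labels(patients)
print(labels[:3])   # first few labels for patient X
print(len(labels))  # n * m = 3 * 6 = 18
```

Adding a patient adds m labels; adding a new kind of relation adds one label
per patient. Users never appear in the count.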

You could put the patient in the row, but then you're relying on an
external system to filter the data (constrain the query) based on
roles *that* system understands. The built-in Accumulo roles would
simply constrain that external query system.
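
To make the difference concrete, here is a toy in-memory sketch in Python.
It is not the Accumulo client API: the Cell type, the sample data, and both
function names are assumptions for illustration, and each cell carries a
single label where a real column visibility would be a boolean expression.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    row: str         # patient id used as the row
    column: str
    visibility: str  # single label (simplification of a visibility expression)
    value: str


# Hypothetical sample data.
CELLS = [
    Cell("patientX", "diagnosis", "doctorOfPatientX", "flu"),
    Cell("patientY", "diagnosis", "doctorOfPatientY", "sprain"),
]


def scan_with_labels(cells, auths):
    """Per-patient labels: the store drops cells whose label the caller lacks."""
    return [c for c in cells if c.visibility in auths]


def scan_with_row_filter(cells, allowed_rows):
    """Patient in the row: an external layer must choose which rows to read."""
    return [c for c in cells if c.row in allowed_rows]


# With labels, X's doctor only ever gets X's cells back.
print(scan_with_labels(CELLS, {"doctorOfPatientX"}))

# With row filtering, the result is only as safe as the allowed_rows set
# that the external system computes.
print(scan_with_row_filter(CELLS, {"patientY"}))
```

In the second function, nothing in the stored data prevents a caller from
passing a row it should not see, which is the reliance on the external
system described above.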

On Fri, Aug 10, 2012 at 1:00 PM, Marc Parisi <marc@accumulo.net> wrote:
> My suggestion of roles was to have a finite number of roles, with a finite
> number of actions. You would only store auths for those roles and actions.
> Another lookup mechanism, in my system, will determine which user to use (
> as I recall. I don't have the code in front of me ). I did mention something
> about putting an id ( a key id perhaps ) in the CV; however, this could be
> moved elsewhere.
>
> Doctor is a role. Dr. Parisi is not a role, it's a lookup to see if Parisi
> is a doctor, if so use that user ( role ). The doctor user would have the CV
> to see the user visibility. With the cryptographic hash in the CV, the goal
> was to limit which patients a doctor could see, but I can just as easily put
> that in the row to enforce that limitation.
>
> Hopefully that makes sense.
>
> On Fri, Aug 10, 2012 at 12:33 PM, Edmon Begoli <ebegoli@gmail.com> wrote:
>>
>> > But that's not really n*m, since it only specifies me by name. This
>> > should be roughly linear with users, no?
>>
>> Correct.
>>
>> On Fri, Aug 10, 2012 at 12:05 PM, Adam Fuchs <afuchs@apache.org> wrote:
>> > But that's not really n*m, since it only specifies me by name. This
>> > should be roughly linear with users, no?
>> >
>> > There is definitely a reliance on some external service managing the
>> > roles that docs are in, but this should be tractable.
>> >
>> > Adam
>> >
>> > On Aug 10, 2012 11:56 AM, "Josh Elser" <josh.elser@gmail.com> wrote:
>> >>
>> >> That's what I meant, user*doctors.
>> >>
>> >> It's not enough to say "healthteam", you have to qualify it by user
>> >> too: "adamhealthteam".
>> >>
>> >> On 8/10/12 9:02 AM, Adam Fuchs wrote:
>> >>
>> >> I guess I should have specified that the access time labels should be
>> >> used in conjunction with the role labels, like
>> >>
>> >> "(adamsHealthTeam&(regularCheckup|illnessEvaluation))|(massStateResearcher&populationStudy)".
>> >>
>> >> Adam
>> >>
>> >> On Aug 10, 2012 8:56 AM, "Benson Margulies" <bimargulies@gmail.com> wrote:
>> >>>
>> >>> On Fri, Aug 10, 2012 at 8:52 AM, Adam Fuchs <afuchs@apache.org> wrote:
>> >>> > Not sure I understand why this gets into n*m roles. Can you
>> >>> > elaborate?
>> >>> >
>> >>> > The question of when your physician should have access seems like it
>> >>> > could be represented by just a few labels, like "regularCheckup",
>> >>> > "illnessEvaluation", and "populationStudy". Those labels could then
>> >>> > be tied to an auditing system that could verify appropriateness of
>> >>> > access over time.
>> >>>
>> >>> And if you change doctors? Maybe that's a job for some sort of
>> >>> role/group model.
>> >>>
>> >>>
>> >>> >
>> >>> > Adam
>> >>> >
>> >>> > On Aug 9, 2012 10:19 PM, "Josh Elser" <josh.elser@gmail.com> wrote:
>> >>> >>
>> >>> >> I've thought quite a bit about the approach you've outlined
>> >>> >> previously.
>> >>> >>
>> >>> >> The main caveat I've always struggled to overcome is how to
>> >>> >> encapsulate *when* a physician should have access to your records.
>> >>> >> This expands the problem into n*m roles, which becomes difficult to
>> >>> >> manage inside Accumulo, especially as time elapses.
>> >>> >>
>> >>> >> On 8/8/2012 6:29 PM, Marc Parisi wrote:
>> >>> >>>
>> >>> >>> Just some ideas and thoughts....
>> >>> >>>
>> >>> >>> With a system I'm building I have code to take care of user roles.
>> >>> >>> Roles will define visibilities, how analysis is performed,
>> >>> >>> information sharing, etc. I have a particular role for sharing. I
>> >>> >>> also have an area of interest, usually assigned to a physician role,
>> >>> >>> therefore only a physician's office can see certain data from it.
>> >>> >>> The data corresponding to a given person can be accessed by that
>> >>> >>> person ( if they have app access ), the physician that created it,
>> >>> >>> and other physicians ( with a different area of interest ) with whom
>> >>> >>> the user wants to share their data. Each area of interest will be
>> >>> >>> cryptographically secured. Our approach will utilize multiple crypto
>> >>> >>> technologies. I would suggest making crypto your last stop. Focus on
>> >>> >>> getting the visibility hierarchy designed. HIPAA requirements can
>> >>> >>> come later.
>> >>> >>>
>> >>> >>> In my approach, there is no elevation of fields per se. Instead,
>> >>> >>> there are visibilities for all assigned parties, so in my case it is
>> >>> >>> a matter of labeling. The data can have hierarchies, and each
>> >>> >>> hierarchy has different labels to control access.
>> >>> >>>
>> >>> >>> " Patient demographic fields are PHI (personal health information)
>> >>> >>> and these should not be visible to all who want to perform analysis,
>> >>> >>> but only to main administrators, patient and maybe physician. I
>> >>> >>> assume these would have to have separate authorization label. "
>> >>> >>>
>> >>> >>> Yes. I think this is where roles will help. Assign roles and
>> >>> >>> visibilities to those roles. As of right now, I'm putting ephemeral
>> >>> >>> data in my visibilities ( user ID for a physician, among other
>> >>> >>> things ). I will probably move this to the qualifier and take a
>> >>> >>> simpler approach to visibilities.
>> >>> >>>
>> >>> >>> Each role has different actions. Right now I have four actions:
>> >>> >>> syncing, querying, deleting, and sharing. You don't have to capture
>> >>> >>> actions, but you might want to limit how the roles of users vary,
>> >>> >>> and I think modeling the security actions within each role is an
>> >>> >>> excellent way to do so.
>> >>> >>>
>> >>> >>>
>> >>> >>> On Wed, Aug 8, 2012 at 4:08 PM, Edmon Begoli <ebegoli@gmail.com> wrote:
>> >>> >>>
>> >>> >>>     I am trying to model the healthcare claim on accumulo and I want
>> >>> >>>     to lay it out so that it:
>> >>> >>>
>> >>> >>>     A. Accurately reflects the structure of the claim
>> >>> >>>
>> >>> >>>     B. I could have controls finely applied to different sections of
>> >>> >>>     the document
>> >>> >>>
>> >>> >>>     I am simplifying matter but claim contains claim document
>> >>> >>>     identifiers, demographics of the patient, and line items for the
>> >>> >>>     procedures performed:
>> >>> >>>
>> >>> >>>     claim identifier, data submitted, data processed, state of origin, ...
>> >>> >>>     patient name, dob, location, other identifiers
>> >>> >>>     procedure 1 code, procedure 1 provider, procedure 1 cost, ...
>> >>> >>>     ...
>> >>> >>>     procedure n code, procedure n provider, procedure n cost, ...
>> >>> >>>
>> >>> >>>
>> >>> >>>     Patient demographic fields are PHI (personal health information)
>> >>> >>>     and these should not be visible to all who want to perform
>> >>> >>>     analysis, but only to main administrators, patient and maybe
>> >>> >>>     physician. I assume these would have to have separate
>> >>> >>>     authorization label.
>> >>> >>>
>> >>> >>>     Other fields may be visible to different groups of people - i.e.
>> >>> >>>     federal claim administrators can see all, but regional offices
>> >>> >>>     can only see their states.
>> >>> >>>     Separate, more permissive labels.
>> >>> >>>
>> >>> >>>     Finally, it might make sense to "elevate" some fields for easy
>> >>> >>>     access and analysis - i.e. diagnostic codes, zip code, cost.
>> >>> >>>     This would not be a matter of labels, but data design.
>> >>> >>>
>> >>> >>>
>> >>> >>>     With all this in mind, I would welcome if anyone has any
>> >>> >>>     security and data design suggestions.
>> >>> >>>
>> >>> >>>
>> >>> >
>> >>
>> >>
>> >
>
>
