Date: Wed, 23 Mar 2016 12:43:25 +0000 (UTC)
From: "John Omernik (JIRA)"
To: issues@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Commented] (DRILL-3820) Nested Directories : Metadata Cache in a directory stores information from sub-directories as well creating security issues

    [ https://issues.apache.org/jira/browse/DRILL-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15208350#comment-15208350 ]

John Omernik commented on DRILL-3820:
-------------------------------------

This is likely related to my issue here: https://issues.apache.org/jira/browse/DRILL-4143

First thought: Is there any sensitive data in the metadata cache that can be leaked to a user who doesn't have access to the directories?
If so, we need to validate that the user running the query has access to the data before providing that information from the cache. Then I would agree with Rahul: read and write with the drillbit process user.

However, what happens in a situation where the drillbit process user doesn't have access to the directories but the impersonated user does? Is it a requirement that, with impersonation, the drillbit process user has access to the data? If it doesn't, how would it write the files? If this is a requirement (I don't think it is), then the answer here is simple: metadata reads and writes run as the drillbit process user, and can be issued by anyone.

Don't queries that notice that the metadata is out of date or missing try to create it by default? Should metadata operations be privileged? This is a tricky subject in that, as I have mentioned earlier, I think that if Drill sees out-of-date or missing metadata, it tries to refresh on its own. That indicates to me that metadata operations should not be privileged. Hmm... tricky issue :)

> Nested Directories : Metadata Cache in a directory stores information from sub-directories as well creating security issues
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-3820
>                 URL: https://issues.apache.org/jira/browse/DRILL-3820
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Metadata
>            Reporter: Rahul Challapalli
>            Assignee: Parth Chandra
>            Priority: Critical
>             Fix For: 1.7.0
>
>
> git.commit.id.abbrev=3c89b30
> User A has access to lineitem folder and its subfolders
> User B had access to lineitem folder but not its sub-folders.
> Now when User A runs the "refresh table metadata lineitem" command, the cache file gets created under lineitem folder. This file contains information from the underlying sub-directories as well.
> Now User B can download this file and get access to information which he should not be seeing in the first place.
> This can be very easily reproducible if impersonation is enabled on the cluster.
> Let me know if you need more information to reproduce this issue



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
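
For illustration only, the access check suggested in the comment (validate the querying user's access before serving anything from the cache) could look roughly like the sketch below. It is not Drill's implementation. `CacheEntry`, `visible_entries`, the `1994` sub-directory, and the row counts are hypothetical names and values chosen for the example, and the set of directories a user can read is assumed to come from the file system's own permission checks.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class CacheEntry:
    """One file's entry in a hypothetical directory metadata cache."""
    path: str       # file path, relative to the workspace root
    row_count: int  # example of metadata that could leak


def visible_entries(entries, readable_dirs):
    """Return only the entries whose parent directory the user may read.

    readable_dirs is the set of directories the querying user can read;
    obtaining it (for example from file system ACLs) is out of scope here.
    """
    return [
        e for e in entries
        if str(PurePosixPath(e.path).parent) in readable_dirs
    ]


# A cache file written under lineitem/ that also covers a sub-directory.
cache = [
    CacheEntry("lineitem/part-0.parquet", 100),
    CacheEntry("lineitem/1994/part-0.parquet", 250),
]

user_a = {"lineitem", "lineitem/1994"}  # can read the folder and its sub-folder
user_b = {"lineitem"}                   # can read the top-level folder only

print([e.path for e in visible_entries(cache, user_a)])
print([e.path for e in visible_entries(cache, user_b)])
```

With this kind of filter, the cache can still be written once by a single process user, while each reader only sees entries for directories it could have listed itself. Whether that check belongs at read time or the cache should instead be split per directory is the open design question in the issue.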