airflow-commits mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-855) Security - Airflow SQLAlchemy PickleType Allows for Code Execution
Date Tue, 15 Aug 2017 19:25:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127747#comment-16127747
] 

ASF subversion and git services commented on AIRFLOW-855:
---------------------------------------------------------

Commit 4cf904cf5a7a070bbeaf3a0e985ed2b840276015 in incubator-airflow's branch refs/heads/master
from [~aoen]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=4cf904c ]

[AIRFLOW-855] Replace PickleType with LargeBinary in XCom

PickleType in XCom allows remote code execution. In order to
deprecate it without changing the MySQL table schema, change
PickleType to LargeBinary, because they both map to the blob type
in MySQL. Add "enable_pickling" to the function signature to control
whether pickle or JSON is used. "enable_pickling" should also be
added to the core section of airflow.cfg.

Picked up where https://github.com/apache/incubator-airflow/pull/2132
left off. Took this PR, fixed merge conflicts, added
documentation/tests, fixed broken tests/operators, and fixed the
python3 issues.

Closes #2518 from aoen/disable-pickle-type
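
A minimal sketch of the idea in the commit message, not the code from the
commit itself: a single flag selects between pickle and JSON when a value is
turned into the bytes stored in a LargeBinary column. The helper names
serialize_value and deserialize_value are hypothetical; only the
"enable_pickling" name comes from the commit message, and the default of
False is an assumption made here for illustration.

```python
import json
import pickle


def serialize_value(value, enable_pickling=False):
    """Hypothetical helper: turn a value into bytes for a binary column."""
    if enable_pickling:
        # Legacy behaviour: arbitrary Python objects, unsafe to load later.
        return pickle.dumps(value)
    # JSON only carries plain data (dict, list, str, numbers, bool, None).
    return json.dumps(value).encode("utf-8")


def deserialize_value(blob, enable_pickling=False):
    """Hypothetical helper: inverse of serialize_value."""
    if enable_pickling:
        # Only acceptable when every writer to the table is trusted.
        return pickle.loads(blob)
    return json.loads(blob.decode("utf-8"))


stored = serialize_value({"rows": 3, "ok": True})
restored = deserialize_value(stored)
```

Because both branches produce bytes, the same column type can hold either
encoding, which is consistent with the commit's goal of leaving the table
schema unchanged.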


> Security - Airflow SQLAlchemy PickleType Allows for Code Execution
> ------------------------------------------------------------------
>
>                 Key: AIRFLOW-855
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-855
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: Rui Wang
>            Assignee: Rui Wang
>         Attachments: test_dag.txt
>
>
> Impact: Anyone able to modify the application's underlying database, or a computer where
certain DAG tasks are executed, may execute arbitrary code on the Airflow host.
> Location: The XCom class in /airflow-internal-master/airflow/models.py
> Description: Airflow uses the SQLAlchemy object-relational mapping (ORM) to allow for
database-agnostic, object-oriented manipulation of application data. You express database
tables and values as Python classes, and the ORM transparently manipulates the underlying
database when you programmatically access these structures.
> Airflow defines the following class as the ORM model for an XCom:
> {code}
> class XCom(Base): 
>   """
>   Base class for XCom objects. 
>   """
>   __tablename__ = "xcom"
>   id = Column(Integer, primary_key=True) 
>   key = Column(String(512))
>   value = Column(PickleType(pickler=dill)) 
>   timestamp = Column(
>     DateTime, default=func.now(), nullable=False) 
>   execution_date = Column(DateTime, nullable=False)
> {code}
> XComs are used for inter-task communication. Their values are either defined in a
DAG or are the return value of the python_callable() function or the task's execute() method,
executed on a remote host. According to this model, XCom values are of the PickleType, meaning
that objects assigned to the value column are transparently serialized (when written)
and deserialized (when read). The deserialization of user-controlled pickle
objects allows for the execution of arbitrary code. This means that "slaves" (where DAG code
is executed) can compromise "masters" (where DAGs are defined in code) by returning an object
that, when serialized and subsequently deserialized, causes remote code execution. This
can also be triggered by anyone who has write access to this portion of the database.
> Note: NCC Group plans to meet with developers in the coming days to discuss this finding,
and it will be updated to reflect any additional insight provided by this meeting.
> Reproduction Steps:
> 1. Configure a local instance of Airflow.
> 2. Insert the attached DAG into your AIRFLOW_HOME/dags directory.
> This example models a slave returning a malicious object to a task's python_callable
by creating a portable object (with __reduce__) containing a reverse shell and pushing it as an
XCom's value. This value is serialized upon xcom_push and deserialized upon xcom_pull.
> In an actual exploit scenario, this value would be a DAG function's return value, as assigned
by code within the function, executing on a malicious remote machine.
> 3. Start a netcat listener on port 4444 of your machine.
> 4. Execute this task from the command line with airflow run push 2016-11-17. Note that
your netcat listener has received a shell connect-back.
> Remediation: Consider the use of a custom SQLAlchemy data type that performs this transparent
serialization and deserialization, but with JSON (a text-based exchange format), rather than
pickles (which may contain code).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
