mesos-issues mailing list archives

From "Jie Yu (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (MESOS-2319) Unable to set --work_dir to a non /tmp device
Date Wed, 04 Feb 2015 18:46:34 GMT

     [ https://issues.apache.org/jira/browse/MESOS-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jie Yu reassigned MESOS-2319:
-----------------------------

    Assignee: Jie Yu

> Unable to set --work_dir to a non /tmp device
> ---------------------------------------------
>
>                 Key: MESOS-2319
>                 URL: https://issues.apache.org/jira/browse/MESOS-2319
>             Project: Mesos
>          Issue Type: Bug
>          Components: slave
>    Affects Versions: 0.22.0
>         Environment: mesos 0.22.0 git SHA b22d7addbc03dfe4a5aa63a05e4f805b1c15631d 
> relevant filesystem mounts:
> {code}
> /dev/xvda9 on / type ext4 (rw,relatime,data=ordered)
> tmpfs on /tmp type tmpfs (rw)
> {code}
>            Reporter: Jeremy Lingmann
>            Assignee: Jie Yu
>              Labels: mesosphere, recovery
>
> Starting mesos-slave with --work_dir set to a directory that is not on the same device as /tmp causes mesos-slave to abort with a core dump:
> {code}
> mesos # GLOG_v=1 sbin/mesos-slave --master=zk://10.171.59.83:2181/mesos --work_dir=/var/lib/mesos/
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> I0204 18:24:49.274619 22922 process.cpp:958] libprocess is initialized on 10.169.146.67:5051 for 8 cpus
> I0204 18:24:49.274978 22922 logging.cpp:177] Logging to STDERR
> I0204 18:24:49.275111 22922 main.cpp:152] Build: 2015-02-03 22:59:30 by 
> I0204 18:24:49.275233 22922 main.cpp:154] Version: 0.22.0
> I0204 18:24:49.275485 22922 containerizer.cpp:103] Using isolation: posix/cpu,posix/mem
> 2015-02-04 18:24:49,275:22922(0x7ffdd4d5c700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
> 2015-02-04 18:24:49,275:22922(0x7ffdd4d5c700):ZOO_INFO@log_env@716: Client environment:host.name=ip-10-169-146-67.ec2.internal
> 2015-02-04 18:24:49,276:22922(0x7ffdd4d5c700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
> 2015-02-04 18:24:49,276:22922(0x7ffdd4d5c700):ZOO_INFO@log_env@724: Client environment:os.arch=3.18.2
> 2015-02-04 18:24:49,276:22922(0x7ffdd4d5c700):ZOO_INFO@log_env@725: Client environment:os.version=#2 SMP Tue Jan 27 23:34:36 UTC 2015
> 2015-02-04 18:24:49,276:22922(0x7ffdd4d5c700):ZOO_INFO@log_env@733: Client environment:user.name=core
> 2015-02-04 18:24:49,276:22922(0x7ffdd4d5c700):ZOO_INFO@log_env@741: Client environment:user.home=/root
> 2015-02-04 18:24:49,276:22922(0x7ffdd4d5c700):ZOO_INFO@log_env@753: Client environment:user.dir=/opt/mesosphere/dcos/0.0.1-0.1.20150203225612/mesos
> 2015-02-04 18:24:49,276:22922(0x7ffdd4d5c700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=10.171.59.83:2181 sessionTimeout=10000 watcher=0x7ffdd97bccf0 sessionId=0 sessionPasswd=<null> context=0x7ffdc8000ba0 flags=0
> I0204 18:24:49.276793 22922 main.cpp:180] Starting Mesos slave
> 2015-02-04 18:24:49,307:22922(0x7ffdd151f700):ZOO_INFO@check_events@1703: initiated connection to server [10.171.59.83:2181]
> I0204 18:24:49.307548 22922 slave.cpp:173] Slave started on 1)@10.169.146.67:5051
> I0204 18:24:49.307955 22922 slave.cpp:300] Slave resources: cpus(*):1; mem(*):2728; disk(*):24736; ports(*):[31000-32000]
> I0204 18:24:49.308404 22922 slave.cpp:329] Slave hostname: ip-10-169-146-67.ec2.internal
> I0204 18:24:49.308459 22922 slave.cpp:330] Slave checkpoint: true
> I0204 18:24:49.310431 22924 state.cpp:33] Recovering state from '/var/lib/mesos/meta'
> I0204 18:24:49.310583 22924 state.cpp:668] Failed to find resources file '/var/lib/mesos/meta/resources/resources.info'
> I0204 18:24:49.310670 22924 state.cpp:74] Failed to find the latest slave from '/var/lib/mesos/meta'
> I0204 18:24:49.310803 22924 status_update_manager.cpp:197] Recovering status update manager
> I0204 18:24:49.310916 22924 containerizer.cpp:300] Recovering containerizer
> I0204 18:24:49.311110 22924 slave.cpp:3527] Finished recovery
> F0204 18:24:49.311312 22924 slave.cpp:3537] CHECK_SOME(state::checkpoint(path, bootId.get())): Failed to rename '/tmp/PSHLqV' to '/var/lib/mesos/meta/boot_id': Invalid cross-device link

> 2015-02-04 18:24:49,310:22922(0x7ffdd151f700):ZOO_INFO@check_events@1750: session establishment complete on server [10.171.59.83:2181], sessionId=0x14b51bc8506039a, negotiated timeout=10000
> *** Check failure stack trace: ***
>     @     0x7ffdd9a6596d  google::LogMessage::Fail()
> I0204 18:24:49.313356 22930 group.cpp:313] Group process (group(1)@10.169.146.67:5051) connected to ZooKeeper
>     @     0x7ffdd9a677ad  google::LogMessage::SendToLog()
> I0204 18:24:49.313786 22930 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
> I0204 18:24:49.314487 22930 group.cpp:385] Trying to create path '/mesos' in ZooKeeper
> I0204 18:24:49.323668 22930 group.cpp:717] Found non-sequence node 'log_replicas' at '/mesos' in ZooKeeper
> I0204 18:24:49.323806 22930 detector.cpp:138] Detected a new leader: (id='1')
> I0204 18:24:49.323958 22930 group.cpp:659] Trying to get '/mesos/info_0000000001' in ZooKeeper
> I0204 18:24:49.324595 22930 detector.cpp:433] A new leading master (UPID=master@10.171.59.83:5050) is detected
>     @     0x7ffdd9a6555c  google::LogMessage::Flush()
>     @     0x7ffdd9a680a9  google::LogMessageFatal::~LogMessageFatal()
>     @     0x7ffdd94b7179  _CheckFatal::~_CheckFatal()
>     @     0x7ffdd96718e2  mesos::internal::slave::Slave::__recover()
>     @     0x7ffdd9a1524a  process::ProcessManager::resume()
>     @     0x7ffdd9a1550c  process::schedule()
>     @     0x7ffdd83832ad  (unknown)
>     @     0x7ffdd80b834d  (unknown)
> Aborted (core dumped)
> {code}
> Removing the --work_dir option results in the slave starting successfully.
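
The fatal log line above shows a rename from '/tmp/PSHLqV' to '/var/lib/mesos/meta/boot_id' failing with "Invalid cross-device link", which is the message for EXDEV: rename(2) cannot move a file between two mounted filesystems. A common way to avoid this class of failure is to create the temporary file in the same directory as the target, so that both paths are on one filesystem. The sketch below illustrates that pattern in Python; it is not Mesos code and not necessarily the fix adopted for this issue. The names `checkpoint` and `same_device` are hypothetical, chosen only for this illustration.

```python
import os
import tempfile


def same_device(a: str, b: str) -> bool:
    """Return True if both existing paths live on the same device (hypothetical helper)."""
    return os.stat(a).st_dev == os.stat(b).st_dev


def checkpoint(path: str, data: str) -> None:
    """Atomically write `data` to `path` (hypothetical helper, not the Mesos API)."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)

    # Create the temp file next to the target rather than in /tmp, so the
    # final rename stays within one filesystem and cannot fail with EXDEV.
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # os.replace maps to rename(2) on POSIX and overwrites atomically.
        os.replace(temp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind if anything went wrong.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

For example, `checkpoint("/var/lib/mesos/meta/boot_id", boot_id)` would stage the write inside `/var/lib/mesos/meta`, and `same_device("/tmp", "/var/lib/mesos")` can be used to confirm whether a given host has the split-device layout described in the Environment field.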



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
