mesos-user mailing list archives

From Stefano Bianchi <jazzist...@gmail.com>
Subject Re: how to change mesos resources
Date Fri, 08 Apr 2016 22:28:03 GMT
Yes, I imagined that. I will not allocate that full amount of RAM, but at
least a number higher than 920 MB.
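
For reference, the 920 MB figure is consistent with the rule Arjun quotes further down (the slave holds back 1 GB or 50% of detected memory, whichever is smaller) applied to the 1840 MB that `free -m` reports. Below is a minimal sketch of that arithmetic. The function name `default_offer` is made up for illustration, and the 10224 MB disk size is an assumption chosen so that half of it equals the 5112 MB seen in the log; it is not a measured value.

```shell
#!/bin/sh
# Sketch of the default-resource rule quoted in this thread:
# the slave keeps min(cap, 50%) of what it detects and offers the rest.
# All values are in MB. This is an illustration, not Mesos code.

# default_offer DETECTED_MB RESERVE_CAP_MB
# Prints what would be left after the slave's own reservation.
default_offer() {
    detected=$1
    cap=$2
    half=$((detected / 2))
    if [ "$half" -lt "$cap" ]; then
        reserve=$half      # 50% is the smaller amount
    else
        reserve=$cap       # the fixed cap is the smaller amount
    fi
    echo $((detected - reserve))
}

default_offer 1840 1024    # memory total from `free -m`; prints 920
default_offer 10224 5120   # assumed disk size; prints 5112
```

Under this rule an explicit --resources value is what lifts the offer above 920 MB, but it still should not exceed what the VM really has (1840 MB here, not 2048 MB).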
OK, so summarizing:
1) I stop the mesos slave with: service mesos-slave stop
2) Then, as June suggests, I run:
sudo sh -c "echo MESOS_WORK_DIR=/scratch.local/mesos >> /etc/default/mesos-slave"
3) Then, as Arjun suggests:

rm -f /tmp/mesos/meta/slaves/latest

mesos-slave --master=MASTER_ADDRESS:5050 --hostname=slave_public_IP_i_set \
  --resources='cpu(*):1;mem(*):1000;disk(*):8000'

Is this the correct procedure?
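
Two details seem worth checking before running the steps above. First, the startup log quoted below lists both `cpu(*):1` and `cpus(*):1`, which suggests that `cpu` was accepted as a separate custom resource and that the standard name is `cpus`. Second, the checkpoint that has to be removed lives under the slave's work directory, so if the work directory moves from /tmp/mesos to /scratch.local/mesos, the path to clean changes with it. The following is a dry-run sketch only: it prints the commands instead of executing them, and the helper names `check_resource_names` and `print_plan` are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch of the restart procedure summarized above.
# It only prints commands; nothing is stopped, removed, or started.

# check_resource_names 'name(role):value;...'
# Succeeds only if every name is one of cpus, mem, disk, ports.
# The body runs in a subshell so the IFS and globbing changes stay local.
check_resource_names() (
    IFS=';'
    set -f                      # keep "(*)" and "[a-b]" from globbing
    rc=0
    for entry in $1; do
        name=${entry%%:*}       # drop ":value"
        name=${name%%\(*}       # drop "(role)" if present
        case "$name" in
            cpus|mem|disk|ports) ;;
            *) echo "unexpected resource name: $name" >&2; rc=1 ;;
        esac
    done
    exit "$rc"
)

# print_plan WORK_DIR RESOURCES
# Prints the three steps, using the placeholders from this thread.
print_plan() {
    check_resource_names "$2" || return 1
    echo "service mesos-slave stop"
    echo "rm -f $1/meta/slaves/latest"
    echo "mesos-slave --master=MASTER_ADDRESS:5050 --hostname=slave_public_IP_i_set --work_dir=$1 --resources='$2'"
}

print_plan /tmp/mesos 'cpus(*):1;mem(*):1000;disk(*):8000'
```

Calling `print_plan` with `'cpu(*):1;mem(*):1000'` prints a warning on stderr and no plan, which is the kind of typo the name check is meant to catch.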

2016-04-08 23:57 GMT+02:00 Stefano Bianchi <jazzista88@gmail.com>:

> I tried the command: free -m
> and obtained this output:
>
>               total        used        free      shared  buff/cache   available
> Mem:           1840         120        1407          40         312        1507
> Swap:             0           0           0
>
> So there is not 2048 MB of RAM? I'm sure that OpenStack tells me that this
> is a machine with 2048 MB of RAM...
>
> 2016-04-08 23:44 GMT+02:00 Arkal Arjun Rao <aarao@ucsc.edu>:
>
>> You set it up with 2048 MB, but you probably don't really get all of it
>> (try `free -m` on the slave). The same goes for disk (look at the output
>> of `df`). From the book "Building Applications on Mesos":
>> "The slave will reserve 1 GB or 50% of detected memory, whichever is
>> smaller, in order to run itself and other operating system services.
>> Likewise, it will reserve 5 GB or 50% of detected disk, whichever is
>> smaller."
>>
>> If you want to explicitly reserve a value, first ensure you have the
>> resources you want per slave then run this
>> <kill the mesos slave process>
>> rm -f /tmp/mesos/meta/slaves/latest
>> mesos-slave --master=MASTER_ADDRESS:5050 --hostname=slave_public_IP_i_set \
>>   --resources='cpu(*):1;mem(*):2000;disk(*):9000'
>>
>> Arjun
>>
>> On Fri, Apr 8, 2016 at 2:23 PM, Stefano Bianchi <jazzista88@gmail.com>
>> wrote:
>>
>>> To be clear, I'm running virtual machines on OpenStack, so I am not on
>>> bare metal.
>>> All the VMs are OpenStack images, and each slave has been built with
>>> 2048 MB of RAM. With 3 slaves I should see something close to 6144 MB
>>> in Mesos, but Mesos shows only 2.7 GB.
>>> If you look at the command output I posted in previous messages, the
>>> current Mesos resource configuration allows 920 MB of RAM and 5112 MB of
>>> disk space for each slave. I would like Mesos to see, for instance, 2000 MB
>>> of RAM and 9000 MB of disk, and for this reason I ran: mesos-slave
>>> --master=MASTER_ADDRESS:5050 --resources='cpu:1;mem:2000;disk:9000'
>>>
>>> June Taylor, I need to understand:
>>> 1) What does the command you suggest do?
>>> 2) Should I stop mesos-slave first, and then run your command?
>>>
>>> Thanks in advance.
>>>
>>> 2016-04-08 21:28 GMT+02:00 June Taylor <june@umn.edu>:
>>>
>>>> How much actual RAM do your slaves contain? You can only make available
>>>> up to that amount, minus the bit that the slave reserves.
>>>>
>>>>
>>>> Thanks,
>>>> June Taylor
>>>> System Administrator, Minnesota Population Center
>>>> University of Minnesota
>>>>
>>>> On Fri, Apr 8, 2016 at 1:29 PM, Stefano Bianchi <jazzista88@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi, I would like to join this mailing list.
>>>>> I'm currently doing my Master's thesis on Mesos and Calico.
>>>>> I'm working at INFN, the institute of nuclear physics. The goal of the
>>>>> thesis is to build a PaaS where Mesos is the scheduler and Calico
>>>>> provides the interconnection between multiple datacenters linked to
>>>>> CERN.
>>>>>
>>>>> I'm using an IaaS based on OpenStack, where I have created 6
>>>>> virtual machines: 3 masters and 3 slaves. Mesos-DNS is running on one
>>>>> slave, launched from Marathon.
>>>>> Everything works. Since I am on another network, I changed the
>>>>> hostnames so that they are resolvable by Mesos, and from Marathon I ran
>>>>> a simple HTTP server that scales across all my machines.
>>>>>
>>>>> The only thing I don't like is that each of the 3 slaves has 1 CPU, 10
>>>>> GB of disk, and 2 GB of RAM, but Mesos currently shows only 5 GB of
>>>>> disk and 900 MB of RAM for each one.
>>>>> Checking the documentation, I found the command to manage the
>>>>> resources.
>>>>> I stopped Slave1, for instance, and ran this command:
>>>>>
>>>>> mesos-slave --master=MASTER_ADDRESS:5050 \
>>>>>   --resources='cpu:1;mem:2000;disk:9000'
>>>>>
>>>>> where I want to set 2000 MB of RAM and 9000 MB of disk.
>>>>> The output is the following:
>>>>>
>>>>> I0408 15:11:00.915324  7892 main.cpp:215] Build: 2016-03-10 20:32:58 by root
>>>>>
>>>>> I0408 15:11:00.915436  7892 main.cpp:217] Version: 0.27.2
>>>>>
>>>>> I0408 15:11:00.915448  7892 main.cpp:220] Git tag: 0.27.2
>>>>>
>>>>> I0408 15:11:00.915459  7892 main.cpp:224] Git SHA: 3c9ec4a0f34420b7803848af597de00fedefe0e2
>>>>>
>>>>> I0408 15:11:00.923334  7892 systemd.cpp:236] systemd version `219` detected
>>>>>
>>>>> I0408 15:11:00.923384  7892 main.cpp:232] Inializing systemd state
>>>>>
>>>>> I0408 15:11:00.950050  7892 systemd.cpp:324] Started systemd slice `mesos_executors.slice`
>>>>>
>>>>> I0408 15:11:00.951529  7892 containerizer.cpp:143] Using isolation: posix/cpu,posix/mem,filesystem/posix
>>>>>
>>>>> I0408 15:11:00.963232  7892 linux_launcher.cpp:101] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
>>>>>
>>>>> I0408 15:11:00.965541  7892 main.cpp:320] Starting Mesos slave
>>>>>
>>>>> I0408 15:11:00.966008  7892 slave.cpp:192] Slave started on 1)@192.168.100.56:5051
>>>>>
>>>>> I0408 15:11:00.966023  7892 slave.cpp:193] Flags at startup: --appc_store_dir="/tmp/mesos/store/appc"
>>>>> --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false"
>>>>> --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs"
>>>>> --containerizers="mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker"
>>>>> --docker_auth_server="https://auth.docker.io" --docker_kill_orphans="true" --docker_puller_timeout="60"
>>>>> --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock"
>>>>> --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --enforce_container_disk_quota="false"
>>>>> --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/mesos/fetch"
>>>>> --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1"
>>>>> --hadoop_home="" --help="false" --hostname_lookup="true" --image_provisioner_backend="copy"
>>>>> --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher_dir="/usr/libexec/mesos"
>>>>> --logbufsecs="0" --logging_level="INFO" --master="192.168.100.55:5050" --oversubscribed_resources_interval="15secs"
>>>>> --perf_duration="10secs" --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns"
>>>>> --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs"
>>>>> --resources="cpu:1;mem:2000;disk:9000" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox"
>>>>> --strict="true" --switch_user="true" --systemd_enable_support="true" --systemd_runtime_directory="/run/systemd/system"
>>>>> --version="false" --work_dir="/tmp/mesos"
>>>>>
>>>>> I0408 15:11:00.967485  7892 slave.cpp:463] Slave resources: cpu(*):1; mem(*):2000; disk(*):9000; cpus(*):1; ports(*):[31000-32000]
>>>>>
>>>>> I0408 15:11:00.967547  7892 slave.cpp:471] Slave attributes: [  ]
>>>>>
>>>>> I0408 15:11:00.967560  7892 slave.cpp:476] Slave hostname: slave1.openstacklocal
>>>>>
>>>>> I0408 15:11:00.971304  7893 state.cpp:58] Recovering state from '/tmp/mesos/meta'
>>>>>
>>>>> *Failed to perform recovery: Incompatible slave info detected*.
>>>>>
>>>>> ------------------------------------------------------------
>>>>>
>>>>> Old slave info:
>>>>>
>>>>> hostname: "*slave_public_IP_i_set*"
>>>>>
>>>>> resources {
>>>>>
>>>>>   name: "cpus"
>>>>>
>>>>>   type: SCALAR
>>>>>
>>>>>   scalar {
>>>>>
>>>>>     value: 1
>>>>>
>>>>>   }
>>>>>
>>>>>   role: "*"
>>>>>
>>>>> }
>>>>>
>>>>> resources {
>>>>>
>>>>>   name: "mem"
>>>>>
>>>>>   type: SCALAR
>>>>>
>>>>>   scalar {
>>>>>
>>>>>     value: 920
>>>>>
>>>>>   }
>>>>>
>>>>>   role: "*"
>>>>>
>>>>> }
>>>>>
>>>>> resources {
>>>>>
>>>>>   name: "disk"
>>>>>
>>>>>   type: SCALAR
>>>>>
>>>>>   scalar {
>>>>>
>>>>>     value: 5112
>>>>>
>>>>>   }
>>>>>
>>>>>   role: "*"
>>>>>
>>>>> }
>>>>>
>>>>> resources {
>>>>>
>>>>>   name: "ports"
>>>>>
>>>>>   type: RANGES
>>>>>
>>>>>   ranges {
>>>>>
>>>>>     range {
>>>>>
>>>>>       begin: 31000
>>>>>
>>>>>       end: 32000
>>>>>
>>>>>     }
>>>>>
>>>>>   }
>>>>>
>>>>>   role: "*"
>>>>>
>>>>> }
>>>>>
>>>>> id {
>>>>>
>>>>>   value: "ad490064-1a6e-415c-8536-daef0d8e3572-S7"
>>>>>
>>>>> }
>>>>>
>>>>> checkpoint: true
>>>>>
>>>>> port: 5051
>>>>>
>>>>> ------------------------------------------------------------
>>>>>
>>>>> New slave info:
>>>>>
>>>>> hostname: "slave1.openstacklocal"
>>>>>
>>>>> resources {
>>>>>
>>>>>   name: "cpu"
>>>>>
>>>>>   type: SCALAR
>>>>>
>>>>>   scalar {
>>>>>
>>>>>     value: 1
>>>>>
>>>>>   }
>>>>>
>>>>>   role: "*"
>>>>>
>>>>> }
>>>>>
>>>>> resources {
>>>>>
>>>>>   name: "mem"
>>>>>
>>>>>   type: SCALAR
>>>>>
>>>>>   scalar {
>>>>>
>>>>>     value: 2000
>>>>>
>>>>>   }
>>>>>
>>>>>   role: "*"
>>>>>
>>>>> }
>>>>>
>>>>> resources {
>>>>>
>>>>>   name: "disk"
>>>>>
>>>>>   type: SCALAR
>>>>>
>>>>>   scalar {
>>>>>
>>>>>     value: 9000
>>>>>
>>>>>   }
>>>>>
>>>>>   role: "*"
>>>>>
>>>>> }
>>>>>
>>>>> resources {
>>>>>
>>>>>   name: "cpus"
>>>>>
>>>>>   type: SCALAR
>>>>>
>>>>>   scalar {
>>>>>
>>>>>     value: 1
>>>>>
>>>>>   }
>>>>>
>>>>>   role: "*"
>>>>>
>>>>> }
>>>>>
>>>>> resources {
>>>>>
>>>>>   name: "ports"
>>>>>
>>>>>   type: RANGES
>>>>>
>>>>>   ranges {
>>>>>
>>>>>     range {
>>>>>
>>>>>       begin: 31000
>>>>>
>>>>>       end: 32000
>>>>>
>>>>>     }
>>>>>
>>>>>   }
>>>>>
>>>>>   role: "*"
>>>>>
>>>>> }
>>>>>
>>>>> id {
>>>>>
>>>>>   value: "ad490064-1a6e-415c-8536-daef0d8e3572-S7"
>>>>>
>>>>> }
>>>>>
>>>>> checkpoint: true
>>>>>
>>>>> port: 5051
>>>>>
>>>>> ------------------------------------------------------------
>>>>>
>>>>> To remedy this do as follows:
>>>>>
>>>>> Step 1: rm -f /tmp/mesos/meta/slaves/latest
>>>>>
>>>>>         This ensures slave doesn't recover old live executors.
>>>>>
>>>>> Step 2: Restart the slave.
>>>>>
>>>>>
>>>>>
>>>>> I can notice two things:
>>>>>
>>>>>
>>>>> 1) the failure message;
>>>>>
>>>>> 2) the hostname has changed; the right one is a public IP I set in
>>>>> order to make the hostname resolvable by Mesos.
>>>>>
>>>>> As a consequence, when I start the slave, the resources are exactly the
>>>>> same; nothing has changed.
>>>>>
>>>>> Can you please help me?
>>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> Arjun Arkal Rao
>>
>> PhD Student,
>> Haussler Lab,
>> UC Santa Cruz,
>> USA
>>
>> aarao@ucsc.edu
>>
>>
>
