cloudstack-dev mailing list archives

From Rajesh Battala <rajesh.batt...@citrix.com>
Subject RE: system vm disk space issue in ACS 4.3
Date Wed, 19 Mar 2014 14:33:02 GMT
Can you please file a bug and send your fix for review?

Thanks
Rajesh Battala

-----Original Message-----
From: Saurav Lahiri [mailto:saurav.lahiri@sungard.com] 
Sent: Wednesday, March 19, 2014 7:20 PM
To: dev@cloudstack.apache.org
Subject: Re: system vm disk space issue in ACS 4.3

The problem appears to be the start function in the /etc/init.d/cloud service for the console
proxy.
More specifically, the following line also writes to /var/log/cloud/cloud.out:

----------------------------------------------------------------------------------------------------------------------------------
(cd $CLOUD_COM_HOME/systemvm; nohup ./run.sh > /var/log/cloud/cloud.out 2>&1 & )
----------------------------------------------------------------------------------------------------------------------------------

Since run.sh calls _run.sh and both have "set -x" enabled, in certain situations they can keep
logging messages to cloud.out without being aware of the settings in log4j-cloud.xml.
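
One plausible reading of the lsof output quoted further down is that the shell redirection keeps its file descriptor open across a rename, so shell output follows the rotated file instead of the new cloud.out. The sketch below reproduces that behaviour in a temporary directory. The file names only mimic the ones in this thread, and the "rotation" is simulated with mv rather than performed by log4j.

```shell
#!/bin/sh
# Demo only: runs in a scratch directory, not in /var/log/cloud.
demo_dir=$(mktemp -d)
log="$demo_dir/cloud.out"

# Open the log once, the way "> cloud.out 2>&1" does for run.sh.
exec 3>"$log"
echo "before rotation" >&3

# Simulate a rolling appender: rename the current file, start a fresh one.
mv "$log" "$log.1"
: > "$log"

# Descriptor 3 still refers to the renamed file, so this line lands in
# cloud.out.1 and the fresh cloud.out stays empty.
echo "after rotation" >&3
exec 3>&-

echo "lines in rotated file: $(grep -c rotation "$log.1")"
echo "bytes in fresh file:   $(wc -c < "$log")"
```

If this is indeed what happens on the console proxy, no size limit in log4j-cloud.xml can cap the file that the shell processes are writing to.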


One way to fix that would be for run.sh and _run.sh to log to cloud.out only if a debug
flag is set to true; otherwise only the java process would write to cloud.out, and log4j would
respect the settings in log4j-cloud.xml.
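
A minimal sketch of that idea, assuming a hypothetical CLOUD_DEBUG environment variable (not an existing CloudStack setting) and a hypothetical helper name. It is not the actual patch, only an illustration of gating "set -x" behind a flag:

```shell
#!/bin/sh
# CLOUD_DEBUG is an assumed variable name used for illustration only.
DEBUG="${CLOUD_DEBUG:-false}"

# Enable command tracing only when the caller explicitly asks for it.
enable_trace_if_debug() {
    if [ "$1" = "true" ]; then
        set -x
    fi
}

enable_trace_if_debug "$DEBUG"

# The rest of run.sh (or _run.sh) would follow here; with tracing off,
# the wrapper scripts themselves add nothing to cloud.out.
echo "starting systemvm wrapper"
```

With the flag unset, the scripts stay quiet; starting the service with CLOUD_DEBUG=true would restore the current tracing behaviour for troubleshooting.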


Thanks
Saurav



On Mon, Mar 17, 2014 at 8:47 PM, Saurav Lahiri <saurav.lahiri@sungard.com> wrote:

> Could it have something to do with the RollingFileAppender that is
> being used?
> The following link appears to be a bit outdated, but it describes more
> or less the same problem that we are seeing:
> <http://apache-logging.6191.n7.nabble.com/RollingFileAppender-not-working-consistently-td8582.html>
>
>
> In our environment that is what we have been seeing for some time on the
> console proxy. The root filesystem fills up, with the cloud.out.* files
> occupying all the space. This happens pretty frequently and we have to
> regularly recycle the console proxy to resolve this issue.
>
>
> As seen below, cloud.out.2 should not have exceeded 10MB but it stands 
> at 217MB now.
>
> drwxr-xr-x 2 root root 4.0K Mar 17 14:57 .
> drwxr-xr-x 8 root root 4.0K Mar 17 15:01 ..
> -rw-r--r-- 1 root root    0 Mar 12 18:18 api-server.log
> -rw-r--r-- 1 root root 357K Mar 17 15:06 cloud.out
> -rw-r--r-- 1 root root 2.1M Mar 17 14:56 cloud.out.1
> -rw-r--r-- 1 root root 217M Mar 17 15:06 cloud.out.2
>
> root@v-zzzz-VM:/var/log/cloud# lsof | grep cloud.out
> sleep       649 root    1w      REG      202,1 226122291     181737 /var/log/cloud/cloud.out.2
> sleep       649 root    2w      REG      202,1 226122291     181737 /var/log/cloud/cloud.out.2
> bash       2312 root    1w      REG      202,1 226122291     181737 /var/log/cloud/cloud.out.2
> bash       2312 root    2w      REG      202,1 226122291     181737 /var/log/cloud/cloud.out.2
> bash       2339 root    1w      REG      202,1 226122291     181737 /var/log/cloud/cloud.out.2
> bash       2339 root    2w      REG      202,1 226122291     181737 /var/log/cloud/cloud.out.2
> bash       2786 root    1w      REG      202,1 226122291     181737 /var/log/cloud/cloud.out.2
> bash       2786 root    2w      REG      202,1 226122291     181737 /var/log/cloud/cloud.out.2
> java       2805 root    1w      REG      202,1 226122291     181737 /var/log/cloud/cloud.out.2
> java       2805 root    2w      REG      202,1 226122291     181737 /var/log/cloud/cloud.out.2
> java       2805 root  116w      REG      202,1    319382     181769 /var/log/cloud/cloud.out
> root@v-zzzz-VM:/var/log/cloud# ls -alh
>
> Thanks
> Saurav
>
>
> On Tue, Mar 11, 2014 at 7:58 AM, Chiradeep Vittal < 
> Chiradeep.Vittal@citrix.com> wrote:
>
>> Yes, it was deliberate. I can't find the discussion, but it revolved 
>> around a security best practice of having separate partitions for /, 
>> /swap, home directories
>>
>>
>> On 3/10/14, 11:35 AM, "Marcus" <shadowsor@gmail.com> wrote:
>>
>> >There have been several raised, actually regarding /var/log.  As for 
>> >the system vm partitioning, it was explicitly changed from single to 
>> >multiple partitions last year. I have no idea why, but I generally 
>> >don't file bugs without community discussion on things that seem 
>> >deliberate.
>> >
>> >On Sat, Mar 8, 2014 at 11:32 AM, Marcus <shadowsor@gmail.com> wrote:
>> >> Yeah, I've just seen on busy systems where, even with log rotation
>> >> working properly, the little space left in /var after OS files is
>> >> barely enough, for example with the conntrackd log on a busy VPC. We
>> >> actually ended up rolling our own system vm; the existing image
>> >> has plenty of space, it's just locked up in other partitions.
>> >>
>> >> On Mar 8, 2014 8:58 AM, "Rajesh Battala" 
>> >><rajesh.battala@citrix.com>
>> >>wrote:
>> >>>
>> >>> Yes, only 435MB is available for /var. We can increase the space also.
>> >>> But we need to find the root cause: which services are causing
>> >>> /var to fill up.
>> >>> Can you please find out and post which log files are taking up
>> >>> the most space in /var?
>> >>>
>> >>> Thanks
>> >>> Rajesh Battala
>> >>>
>> >>> -----Original Message-----
>> >>> From: Marcus [mailto:shadowsor@gmail.com]
>> >>> Sent: Saturday, March 8, 2014 8:19 PM
>> >>> To: dev@cloudstack.apache.org
>> >>> Subject: RE: system vm disk space issue in ACS 4.3
>> >>>
>> >>> Perhaps there's a new service. I know in the past we've seen
>> >>> issues with this, specifically the conntrackd log. I think the
>> >>> cloud logs weren't getting rolled either, but I thought it was
>> >>> all fixed.
>> >>>
>> >>> There's also simply not a ton of space on /var. I wish we would
>> >>> go back to just having one partition, because the current layout
>> >>> orphans lots of free space in other filesystems.
>> >>> On Mar 8, 2014 12:37 AM, "Rajesh Battala" 
>> >>><rajesh.battala@citrix.com>
>> >>> wrote:
>> >>>
>> >>> > AFAIK, log rotation is enabled in the systemvm.
>> >>> > Can you check whether the logs are getting zipped?
>> >>> >
>> >>> > -----Original Message-----
>> >>> > From: Anirban Chakraborty [mailto:abchak@juniper.net]
>> >>> > Sent: Saturday, March 8, 2014 12:46 PM
>> >>> > To: dev@cloudstack.apache.org
>> >>> > Subject: system vm disk space issue in ACS 4.3
>> >>> >
>> >>> > Hi All,
>> >>> >
>> >>> > I am seeing that the system vm disk has no space left after running
>> >>> > for a few days.
>> >>> > Cloudstack UI shows the agent in v-2-VM in alert state, while
>> >>> > agent state of s-1-VM shows blank (hyphen in the UI).
>> >>> > Both the system vms are running and ssh-able from the host. The
>> >>> > log in s-1-VM shows the following errors:
>> >>> >
>> >>> > root@s-1-VM:~# grep 'Exception' /var/log/cloud/*.*
>> >>> > /var/log/cloud/cloud.out.2:java.io.IOException: No space left 
>> >>> > on device
>> >>> > /var/log/cloud/cloud.out.2:java.io.IOException: No space left 
>> >>> > on device
>> >>> >
>> >>> > whereas the logs in v-1-VM show
>> >>> > /var/log/cloud/cloud.out.3:java.io.IOException: No space left 
>> >>> > on device
>> >>> > /var/log/cloud/cloud.out.3:java.io.IOException: No space left 
>> >>> > on device
>> >>> > /var/log/cloud/cloud.out.3:07:18:00,547  INFO CSExceptionErrorCode:87
>> >>> > - Could not find exception:
>> >>> > com.cloud.exception.AgentControlChannelException
>> >>> > in error code list for exceptions
>> >>> >
>> >>> > /var/log/cloud/cloud.out.3:com.cloud.exception.AgentControlChannelException:
>> >>> > Unable to post agent control request as link is not available
>> >>> >
>> >>> > Looks like cloud agent is filling up the log, which is leading
>> >>> > to the disk full state.
>> >>> >
>> >>> > Is this a known issue? Thanks.
>> >>> >
>> >>> > Anirban
>> >>> >
>>
>>
>>
>
