zookeeper-user mailing list archives

From Koen De Groote <koen.degro...@limecraft.com>
Subject Re: log files not being cleaned up despite purgeInterval
Date Tue, 23 Jul 2019 13:27:38 GMT
Hello again, Norbert,

I haven't been able to get it working yet, but did notice something else
concerning the zookeeper user getting that error for the directory not
being found.

After looking into it a bit more, these are my findings:

The zookeeper dockerfile can be found here:
https://github.com/31z4/zookeeper-docker/blob/master/3.4.14/Dockerfile

And the relevant part shows up at the very top:

ENV ZOO_CONF_DIR=/conf \
    ZOO_DATA_DIR=/data \
    ZOO_DATA_LOG_DIR=/datalog \
    ZOO_LOG_DIR=/logs \
    ZOO_TICK_TIME=2000 \
    ZOO_INIT_LIMIT=5 \
    ZOO_SYNC_LIMIT=2 \
    ZOO_AUTOPURGE_PURGEINTERVAL=0 \
    ZOO_AUTOPURGE_SNAPRETAINCOUNT=3 \
    ZOO_MAX_CLIENT_CNXNS=60

This sets these settings as environment variables inside the container.
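
The values the server actually uses can be read from the generated config file rather than from the environment. A minimal sketch, assuming the config file is /conf/zoo.cfg; the helper name show_autopurge is only illustrative and not part of the image:

```shell
# Print the autopurge settings found in a ZooKeeper config file.
# show_autopurge is a hypothetical helper; the default path is an assumption.
show_autopurge() {
  cfg="${1:-/conf/zoo.cfg}"
  if [ ! -r "$cfg" ]; then
    echo "cannot read $cfg" >&2
    return 1
  fi
  # Keep only autopurge.* lines, allowing leading whitespace.
  grep -E '^[[:space:]]*autopurge\.' "$cfg"
}
```

Inside the container this would be called as "show_autopurge /conf/zoo.cfg". Note that a value of 0 for autopurge.purgeInterval means automatic purging is disabled.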

First thing of note: These environment variables are only available to the
root user. The process does run as the zookeeper user, to which said
environment variables are not available.

As can be seen from the output of this command, it is indeed the zookeeper
user running the process:

bash-4.4# ps -a | grep zookeeper
    1 zookeepe  0:04 /usr/lib/jvm/java-1.8-openjdk/jre/bin/java
-Dzookeeper.log.dir=/logs -Dzookeeper.root.logger=INFO,CONSOLE -cp
/zookeeper-3.4.13/bin/../build/classes:/zookeeper-3.4.13/bin/../build/lib/*.jar:/zookeeper-3.4.13/bin/../lib/slf4j-log4j12-1.7.25.jar:/zookeeper-3.4.13/bin/../lib/slf4j-api-1.7.25.jar:/zookeeper-3.4.13/bin/../lib/netty-3.10.6.Final.jar:/zookeeper-3.4.13/bin/../lib/log4j-1.2.17.jar:/zookeeper-3.4.13/bin/../lib/jline-0.9.94.jar:/zookeeper-3.4.13/bin/../lib/audience-annotations-0.5.0.jar:/zookeeper-3.4.13/bin/../zookeeper-3.4.13.jar:/zookeeper-3.4.13/bin/../src/java/lib/*.jar:/conf:
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.local.only=false
org.apache.zookeeper.server.quorum.QuorumPeerMain /conf/zoo.cfg
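
Whether the running server process sees these variables can be checked directly instead of being inferred from an interactive shell. A minimal sketch, assuming a Linux container with /proc mounted and a caller that is allowed to read the target's environ file (root, or the same user as the process); zoo_env_from is an illustrative name:

```shell
# Print ZOO*-prefixed variables from a process environment file.
# zoo_env_from is a hypothetical helper; it defaults to PID 1, the java
# process in the ps output above.
zoo_env_from() {
  environ_file="${1:-/proc/1/environ}"
  # The file is NUL-separated; convert it to one variable per line.
  tr '\0' '\n' < "$environ_file" | grep '^ZOO'
}
```

Called as "zoo_env_from /proc/1/environ" inside the container, it lists both the ZOO_* variables and ZOOCFGDIR if the server process inherited them, and prints nothing if it did not.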

Second thing of note: The third step of the Dockerfile installs gosu, but the
user isn't actually switched to zookeeper at that point. That happens later
in docker-entrypoint.sh. The Dockerfile only verifies that the gosu install
works.

Third thing of note: At the end of the dockerfile, this happens:

ENV PATH=$PATH:/$DISTRO_NAME/bin \
   ZOOCFGDIR=$ZOO_CONF_DIR

But again, this environment variable is only available to the root user and
not the zookeeper user.

Then, zkServer.sh is executed to start the process. The thing of note here
is the docker-entrypoint file:
https://github.com/31z4/zookeeper-docker/blob/master/3.4.14/docker-entrypoint.sh

It does in fact change the user to zookeeper, but doesn't carry any
environment variables along with it.

The part where it all goes wrong is when zkCleanup.sh calls zkEnv.sh to get
environment variables. Since the zookeeper user is the one running that
process, it won't actually see what it needs to see.

This part:


if [ "x$ZOOCFGDIR" = "x" ]
then
  if [ -e "${ZOOKEEPER_PREFIX}/conf" ]; then
    ZOOCFGDIR="$ZOOBINDIR/../conf"
  else
    ZOOCFGDIR="$ZOOBINDIR/../etc/zookeeper"
  fi
fi


does not match the directory layout defined by the environment variables at
the start of the Dockerfile at all.
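
To see which directory that fallback picks without running the cleanup, the same test can be replayed in isolation. A minimal sketch, assuming the bin directory and prefix are passed in explicitly (the ZooKeeper scripts normally set ZOOBINDIR and ZOOKEEPER_PREFIX themselves); resolve_zoocfgdir is an illustrative name:

```shell
# Replay the ZOOCFGDIR fallback quoted above with explicit inputs.
# resolve_zoocfgdir is a hypothetical helper, not part of ZooKeeper.
# Args: $1 = bin dir, $2 = prefix dir, $3 = preset ZOOCFGDIR (may be empty)
resolve_zoocfgdir() {
  bindir="$1"
  prefix="$2"
  cfgdir="$3"
  if [ "x$cfgdir" = "x" ]; then
    # Nothing preset: derive the config dir from the bin dir.
    if [ -e "$prefix/conf" ]; then
      cfgdir="$bindir/../conf"
    else
      cfgdir="$bindir/../etc/zookeeper"
    fi
  fi
  echo "$cfgdir"
}
```

With a preset value, for example "resolve_zoocfgdir /zookeeper-3.4.13/bin /zookeeper-3.4.13/bin/.. /conf", the function simply echoes /conf. With an empty third argument the result depends on whether a conf directory exists under the prefix, and in neither branch can it ever point at /conf.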

The warning about the folder not being found is fixed if I perform this
first:

export ZOOCFGDIR="/conf"

But the cleanup still doesn't work. The script just finishes with no output
and the files are still there. The user is correct: zookeeper is the owner of
the files and the owner has write permissions.
There are no extended file attributes on the files either.

So I'm at my wit's end here.

For information: the command that the script generates and runs is this:

java -Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp
'/zookeeper-3.4.13/bin/../build/classes:/zookeeper-3.4.13/bin/../build/lib/*.jar:/zookeeper-3.4.13/bin/../lib/slf4j-log4j12-1.7.25.jar:/zookeeper-3.4.13/bin/../lib/slf4j-api-1.7.25.jar:/zookeeper-3.4.13/bin/../lib/netty-3.10.6.Final.jar:/zookeeper-3.4.13/bin/../lib/log4j-1.2.17.jar:/zookeeper-3.4.13/bin/../lib/jline-0.9.94.jar:/zookeeper-3.4.13/bin/../lib/audience-annotations-0.5.0.jar:/zookeeper-3.4.13/bin/../zookeeper-3.4.13.jar:/zookeeper-3.4.13/bin/../src/java/lib/*.jar:/conf:'
org.apache.zookeeper.server.PurgeTxnLog /data /data -n 3

Changing logger to TRACE offers no output either.
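
To confirm whether a purge run removed anything, the file counts can be compared before and after. A minimal sketch, assuming the usual ZooKeeper naming where transaction logs are called log.<zxid> and snapshots snapshot.<zxid> inside the version-2 directory; count_zk_files is an illustrative name, and the patterns need adjusting if the files are named differently:

```shell
# Count transaction logs and snapshots in a ZooKeeper data directory.
# count_zk_files is a hypothetical helper; the log.* / snapshot.* patterns
# are an assumption about the file naming.
count_zk_files() {
  dir="${1:-/data/version-2}"
  logs=$(find "$dir" -maxdepth 1 -type f -name 'log.*' | wc -l)
  snaps=$(find "$dir" -maxdepth 1 -type f -name 'snapshot.*' | wc -l)
  # Arithmetic expansion strips any padding that wc adds.
  echo "logs=$((logs)) snapshots=$((snaps))"
}
```

Running "count_zk_files /data/version-2" before and after the PurgeTxnLog command shows whether anything changed. One possibility worth ruling out, offered as an assumption rather than a confirmed cause: the -n count refers to how many snapshots to keep, so if the directory holds 3 or fewer snapshots the tool may have nothing to remove, regardless of how many log files exist.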




On Mon, Jul 22, 2019 at 10:13 AM Koen De Groote <koen.degroote@limecraft.com>
wrote:

> Performing "bash -ex ./zkCleanup.sh /data/version-2 -n 3" as root results
> in the creation of another (empty) version-2 folder inside the existing
> version-2 folder.
>
> As both root and zookeeper user I am able to create files in the
> /data/version-2 directory inside the container.
>
> The zookeeper user is indeed not the owner of anything in the zk/bin
> folder (/zookeeper-3.4.13/bin). Executing zkCli.sh works, but creating a
> file in there doesn't.
>
> Permission level for the folder seems to be 0755 on all files and the
> folder itself.
>
> Just ran into what I think is the problem: the relative path to the
> zoo.cfg file isn't correct.
>
> I tried running just plain "./zkCleanup.sh" as the zookeeper user from
> within the folder and it printed that it could not find the zoo.cfg file,
> but the path it printed was basically "current_dir/../expected_cfg_dir",
> which is one ".." too few.
>
> Will check if this is due to a setting of mine.
>
> On Fri, Jul 19, 2019 at 2:23 PM Norbert Kalmar
> <nkalmar@cloudera.com.invalid> wrote:
>
>> I would first check the permission on zkCleanup.sh and the bin folder.
>> Sounds like zookeeper user has no access to the /zk/bin directory.
>> That might also explain why it is not getting deleted by the zk instance.
>>
>> And I'm not sure about this one, but did you try giving the full path to
>> the txn log files, like "bash -ex ./zkCleanup.sh /data/version-2 -n 3" as
>> root? I think this script might be expecting the full path, including the
>> version-2 directory.
>>
>> Regards,
>> Norbert
>>
>> On Fri, Jul 19, 2019 at 2:00 PM Koen De Groote <
>> koen.degroote@limecraft.com>
>> wrote:
>>
>> > Hello Norbert,
>> >
>> > I've set up a new environment which then reached at least 4 *.log files.
>> > All snapshots and log files are kept in /data/version-2/ (the default
>> > for the image).
>> >
>> > I went into the zookeeper container and executed:
>> >
>> > bash -ex ./zkCleanup.sh /data -n 3
>> >
>> > As root, this changes nothing. There are still 4 *.log files
>> >
>> > Changing to the zookeeper user, I get the following output:
>> >
>> > Path '/zookeeper-3.4.13/bin' does not exist.
>> > Usage:
>> > PurgeTxnLog dataLogDir [snapDir] -n count
>> > dataLogDir -- path to the txn log directory
>> > snapDir -- path to the snapshot directory
>> > count -- the number of old snaps/logs you want to keep, value should be
>> > greater than or equal to 3
>> >
>> > And the 4 *.log files still exist.
>> > It also prints the usage, which indicates, to me at least, that something
>> > about the input is wrong, even though it is identical to the one used as
>> > root, which did not result in this output.
>> >
>> > No actual error messages seem to be printed or logged anywhere.
>> >
>> > Not sure what to do next.
>> >
>> >
>> >
>> > On Fri, Jul 19, 2019 at 11:01 AM Norbert Kalmar
>> > <nkalmar@cloudera.com.invalid> wrote:
>> >
>> > > Hi Koen,
>> > >
>> > > It should do just as you said. You can also set
>> > > autopurge.snapRetainCount; by default it is set to 3, so if you didn't
>> > > set anything, it is not a reason to keep old logs.
>> > >
>> > > As a plan B you could use zkCleanup.sh [snapshotDir] -n 3 to delete
>> > > all except the last 3 log files. You can add this to a cron job.
>> > >
>> > > As for why the old log files are not getting deleted, it could be
>> > > something related to the docker image, maybe a permission problem? Do
>> > > you see any errors in the server log?
>> > >
>> > > Regards,
>> > > Norbert
>> > >
>> > > On Thu, Jul 18, 2019 at 9:25 PM Koen De Groote <
>> > > koen.degroote@limecraft.com>
>> > > wrote:
>> > >
>> > > > Greetings,
>> > > >
>> > > > Working with Zookeeper version 3.4.13 in the official docker image.
>> > > >
>> > > > I was under the impression that the setting
>> > > > "autopurge.purgeInterval=1" meant that log files would be cleaned up
>> > > > every hour.
>> > > >
>> > > > Instead, I now find that months of these files are just sitting in
>> > > > their directory, untouched.
>> > > >
>> > > > So perhaps I'm wrong about that, but I'm not sure.
>> > > >
>> > > > What I wish to achieve is that these log files stop accumulating and
>> > > > that only the most recent are kept. Is there a way to achieve this? Or
>> > > > are they merely historical and can they be deleted freely?
>> > > >
>> > > > Kind regards,
>> > > > Koen De Groote
>> > > >
>> > >
>> >
>>
>
