www-infrastructure-dev mailing list archives

From Grzegorz Kossakowski <gkossakow...@apache.org>
Subject Making git mirror synchronized on every commit
Date Sun, 14 Sep 2008 20:27:33 GMT
Hello,

I was thinking about improving the git clones of Apache projects kindly served by Jukka on
his server[1].

The idea is to make them as up-to-date as possible, which means they would be synchronized
whenever a new commit arrives in the svn.eu.apache.org repository.

Obviously, running git-svn periodically would put too much unnecessary load on
svn.eu.apache.org, especially when we take into account that the svn protocol is rather
verbose and git-svn still seems to be far from perfect.

So we need to be notified about new commits. Actually, Apache already has such a notification
system in place: I'm talking about the mailing lists where commit notifications are sent.

I decided to go with that idea, but in order to keep things as simple as possible I decided
to exploit the mailing list indirectly, by accessing the RSS feed generated by a mail archive
site (gmane.org in my case).

As a result, I've prepared a little script that monitors the RSS feed provided by gmane.org
so that git-svn can be called whenever a new commit arrives.

The source code of the script is:
#!/bin/bash

# Path to storage where files containing timestamps of last synchronization are stored
TIMESTAMP_STORAGE=".";
PROJECT_NAME="cocoon";
PROJECT_GNAME_LIST="gmane.text.xml.cocoon.cvs";

#Check the lock (to avoid concurrent checks)
if [[ -e "${PROJECT_NAME}.lock" ]]; then
  exit 0;
fi

#Create a lock
touch "${PROJECT_NAME}.lock";

#Here we ask wget to print the headers returned by gmane.org (--server-response) using a HEAD
#request, so the contents of the RSS file itself are neither transferred nor written (--spider).
#Then we extract the Last-Modified header and the date stored there. The result is written to
#a temporary file.
wget --server-response --spider "http://rss.gmane.org/messages/excerpts/${PROJECT_GNAME_LIST}" 2>&1 \
  | grep Last-Modified | cut -d : -f 2- | cut -c 2- > "${PROJECT_NAME}.timestamp.new";

#cmp fails if the files differ or if no timestamp has been stored yet (first run)
if ! cmp "${PROJECT_NAME}.timestamp" "${PROJECT_NAME}.timestamp.new" &> /dev/null; then
  echo "There are new commits; the script calling git-svn should be executed here for project ${PROJECT_NAME}.";
  mv -f "${PROJECT_NAME}.timestamp.new" "${PROJECT_NAME}.timestamp";
else
  rm "${PROJECT_NAME}.timestamp.new";
fi

rm "${PROJECT_NAME}.lock";
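
The header-extraction pipeline used in the script can be tried in isolation, without
contacting gmane.org. The snippet below is only a sketch: the function name
extract_last_modified and the sample headers are made up for illustration and are not real
gmane.org output.

```shell
# Hypothetical helper: reads "wget --server-response" output on stdin and
# prints the date stored in the Last-Modified header (same pipeline as above).
extract_last_modified() {
  grep Last-Modified | cut -d : -f 2- | cut -c 2-
}

# Sample response headers, invented for illustration only.
sample='  HTTP/1.1 200 OK
  Last-Modified: Sun, 14 Sep 2008 20:00:00 GMT
  Content-Type: text/xml'

# The first cut drops the header name, the second drops the leading space.
printf '%s\n' "$sample" | extract_last_modified
# prints: Sun, 14 Sep 2008 20:00:00 GMT
```

Note that "cut -d : -f 2-" keeps everything after the first colon, so the colons inside the
time of day survive.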


As you can see, in this particular example I used the Cocoon project and its cvs mailing
list, but the whole script is rather generic. Such a script could be added as a cron job
executed every 1-5s, as the load put on gmane.org will be minimal. Apart from a few HTTP
headers, no data is transferred.
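
To illustrate how generic the logic is, here is a rough sketch of the same check wrapped in a
function that takes the project name and the gmane list as arguments. The names STATE_DIR,
fetch_last_modified and feed_changed are assumptions introduced for this example. It also
differs from the script above in two ways: it uses mkdir as the lock, because creating a
directory either succeeds or fails in one step, and it ignores an empty Last-Modified value
(for example when the request fails).

```shell
# Hypothetical directory holding timestamps and locks; defaults to the current one.
STATE_DIR="${STATE_DIR:-.}"

# Prints the feed's Last-Modified value; same wget pipeline as the original script.
fetch_last_modified() {
  wget --server-response --spider \
    "http://rss.gmane.org/messages/excerpts/$1" 2>&1 \
    | grep Last-Modified | cut -d : -f 2- | cut -c 2-
}

# Returns 0 if the feed changed since the last run, 1 otherwise
# (including when another check for the same project is still running).
feed_changed() {
  project="$1"
  list="$2"
  lock="${STATE_DIR}/${project}.lock.d"
  stamp="${STATE_DIR}/${project}.timestamp"

  # mkdir fails if the directory already exists, so it acts as an atomic lock.
  mkdir "$lock" 2>/dev/null || return 1

  new="$(fetch_last_modified "$list")"
  old="$(cat "$stamp" 2>/dev/null)"
  status=1
  if [ -n "$new" ] && [ "$new" != "$old" ]; then
    printf '%s\n' "$new" > "$stamp"
    status=0
  fi

  rmdir "$lock"
  return "$status"
}

# Usage (not executed here), one call per mirrored project:
# feed_changed cocoon gmane.text.xml.cocoon.cvs && echo "run git-svn for cocoon"
```

Standard cron schedules jobs at one-minute granularity, so an interval of a few seconds would
need a small loop with sleep around the call rather than a plain crontab entry.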

The final question goes to Jukka: are you fine with adding such a feature to your server? At
the beginning we could add this experimentally for just one project (Cocoon) and see how it
performs.
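
For such an experiment, the echo placeholder in the script would be replaced by the actual
synchronization step. A minimal sketch of what that step might look like, assuming the
project was already cloned with "git svn clone" and lives under a directory I call
MIRROR_ROOT here (both MIRROR_ROOT and sync_mirror are hypothetical names, and the real
layout on the server may differ):

```shell
# Hypothetical location of the git-svn clones on the mirror server.
MIRROR_ROOT="${MIRROR_ROOT:-/srv/git}"

# Fetches new Subversion revisions into an existing git-svn clone.
sync_mirror() {
  project="$1"
  repo="${MIRROR_ROOT}/${project}"
  if [ ! -d "$repo" ]; then
    echo "no clone found at ${repo}" >&2
    return 1
  fi
  # Run in a subshell so the caller's working directory stays unchanged.
  ( cd "$repo" && git svn fetch )
}

# Usage (not executed here): sync_mirror cocoon
```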

[1] http://jukka.zitting.name/git/
