subversion-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexey Neyman <sti...@att.net>
Subject Pool usage in Python bindings
Date Wed, 08 Oct 2014 00:27:09 GMT
Hi,

I have encountered a problem with Subversion's Python bindings. For example,
the following simple program does not work:

[[[
#!/usr/bin/python
from svn import repos, core
if __name__ == '__main__':
    stream = core.Stream(core.svn_stream_open_readonly("dump"))
    print stream.read()
    stream.close()
]]]

$ ./0.py
Traceback (most recent call last):
  File "./0.py", line 7, in <module>
    print stream.read()
  File "/usr/lib/python2.7/dist-packages/svn/core.py", line 212, in read
    data = svn_stream_read(self._stream, SVN_STREAM_CHUNK_SIZE)
  File "/usr/lib/python2.7/dist-packages/libsvn/core.py", line 5362, in svn_stream_read
    return _core.svn_stream_read(*args)
  File "/usr/lib/python2.7/dist-packages/libsvn/core.py", line 5585, in assert_valid
    assert self.__dict__["_is_valid"](), "Variable has already been deleted"
AssertionError: Variable has already been deleted

The problem is that the function to open a stream takes two pool arguments:
result_pool and scratch_pool. In the bindings, the following code is
generated to wrap svn_stream_open_readonly() [formatted for readability]:

[[[
  if (svn_swig_py_get_pool_arg(args, SWIGTYPE_p_apr_pool_t,
      &_global_py_pool, &_global_pool))
    SWIG_fail;
  arg3 = _global_pool;
  if (svn_swig_py_get_pool_arg(args, SWIGTYPE_p_apr_pool_t,
      &_global_py_pool, &_global_pool))
    SWIG_fail;
  arg4 = _global_pool;
...
  result = (svn_error_t *)svn_stream_open_readonly(arg1,
      (char const *)arg2,arg3,arg4);
...
  Py_XDECREF(_global_py_pool);
  Py_XDECREF(_global_py_pool);
]]]

In case the pool arguments are passed as default (None), each of the calls to
svn_swig_py_get_pool_arg() will create a subpool of the application pool and
assign it to _global_pool. Then, the 2nd subpool will be decremented twice
at the end of the function, while the 1st one will leak.

If the caller passes both pool arguments, it is even worse: both arg3 and
arg4 will be initialized by svn_swig_get_pool_arg() to point to the scratch
pool - it always checks the last argument of the function to be a valid pool
and if it is - uses that to initialize the _global_pool:

[[[
  int argnum = PyTuple_GET_SIZE(args) - 1;

  if (argnum >= 0)
    {
      PyObject *input = PyTuple_GET_ITEM(args, argnum);
      if (input != Py_None && PyObject_HasAttrString(input, markValid))
        {
          *pool = svn_swig_py_must_get_ptr(input, type, argnum+1);
]]]

Obviously, the caller would expect the result from the wrapped function
to be allocated in the result pool - it would be unexpected to see the
returned variables to disappear when the *scratch* pool is freed.

Even if the svn_swig_py_get_pool_arg had recovered both pool pointers
correctly, an attempt to pass two pools would fail in the validity check
in the wrapper function. Obviously, if one passes two distinct pools as
arguments, only one of them can be equal to the _global_pool:

[[[
  if (obj1) {
    /* Verify that the user supplied a valid pool */
    if (obj1 != Py_None && obj1 != _global_py_pool) {
      SWIG_Python_TypeError(..., obj1);
...
  if (obj2) {
    /* Verify that the user supplied a valid pool */
    if (obj2 != Py_None && obj2 != _global_py_pool) {
      SWIG_Python_TypeError(..., obj2);
]]]

Am I correct in assuming that the division of a single 'pool' argument
into distinct 'result_pool' and 'scratch_pool' is rather recent change
and the bindings were not updated for that?

Now, fixing this is somewhat complicated. The typemaps for 'apr_pool_t *'
use %typemap(default) to obtain the object passed in, not the %typemap(in).
There's a reason for this: the _global_pool may be then used to perform
the memory allocations needed to convert other arguments - which, according
to Subversion's coding style, always precede the pool arguments passed last.
For the %typemap(default), however, SWIG does not set the $input macro -
so the $svn_argnum 'hack' cannot be used to determine the argument number
from which the pool should be obtained.

The allocations from thus created _global_pool can be either temporary,
needed to convert a Python object into an appropriate C/APR type (e.g.
svn_swig_py_seq_to_array to convert a list to array) or more permanent
(such as svn_swig_py_setup_ra_callbacks - which should be kept for the
lifetime of the returned svn_ra_session_t - the svn_ra_local__open()
stashes the passed in 'callbacks' pointer for later use).

Given all that, I could think of two approaches to fixing this issue.

1. Hackish, but simple.

Change the call signature of the functions receiving one or two pool
arguments: make them always receive a single argument for any pools
it may have, which may be either None (meaning "use application pool"),
a pool object (which will be used for both the result and scratch
allocations) or a (result_pool, scratch_pool) tuple. 

This approach has two obvious drawbacks: first, it changes the
number of arguments on the existing interfaces (but this may not
be a bad thing, actually, given that it makes the breakage more
explicit - rather than subtly using the scratch pool as described above).
Second, it makes the argument lists for functions taking two pool
arguments not match their C counterparts. E.g., svn_stream_open_readonly
would be invoked as:

   // C
   res = svn_stream_open_readonly(&stream, "path",
        result_pool, scratch_pool);

   # Python
   stream = core.svn_stream_open_readonly("path", \
        (result_pool, scratch_pool))
Mime
View raw message