apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Gilbert" <2jx64e...@sneakemail.com>
Subject Re: [PATCH] Fix for Subversion issue #1789
Date Tue, 20 Feb 2007 06:05:50 GMT
At 11:36 AM 2/19/2007 -0600, William A. Rowe, Jr. wrote:
>Jonathan Gilbert wrote:
>> Consider that Windows itself, when copying a file from one place to another
>> in, say, Exploder uses 64 KB blocks for that copy operation. Are you
>> suggesting that Exploder's file copies are significantly slower than they
>> would be if they used, say, a 2 MB block for the copy operation? CMD.EXE
>> uses the same block size, and other utilities often use smaller blocks.
>
>Uhm, try xcopy /s/v/e src dst  ... then compare exploder ^c / ^v... orders
>of magnitude slower (for a host of reasons, I'm certain).  Want optimal?
>mmap source, write source to dest in a single operation :)

XCOPY is likely just more efficient at scanning directory subtrees. I
tested it; it also uses a 64KB buffer size. There may be something else it
does, but it certainly isn't getting massive speed from its buffer size.

>> Someone on the Subversion list suggested that perhaps Visual Studio needs
>> to be able to find the Python runtime in order to perform the build, [snip]
>
>That's quite possibly true of some .py wrapper stuff, although we've
>completely eliminated the need for awk / perl / python in our nmake of
>libapr / libaprutil.

Yeah. Well, APR built just fine for me right off the bat. I haven't tried
the Subversion fix yet, but I will shortly.

>>> What are your thoughts on that approach?
>> 
>> It sounds sensible, though it will require more structural change to the
>> Win32 apr_file_write function than the current patch. I'll take a look at
>> it tonight.
>
>One thought; once we identify that it will fails, we could write a segment
>some 32000 bytes, mark it for 'small io', and return (as Joe suggests).
>
>One beauty of apr_file_t (we wish it was true for all) - it's opaque; we
>can modify the struct to have a small_file_ops flag or something like that,
>and avert the error-case in the future by forcing one 32k write.

Ah, I hadn't realized apr_file_t was an opaque platform-specific thing that
could be modified almost at a whim. I suppose it would have occurred to me
if I'd thought about it much :-)

>A larger patch, I know, especially with initializing that flag value to
>zero in all the right places.

Actually, the patch contains significantly less new code. I discovered that
APR already implements buffered file I/O and, with the current
implementation, enforces the buffer size even on large writes through
apr_file_write. It's perfect for chopping the write up into smaller chunks :-)

The default buffer size is currently only 4 KB, and there was no way to
override it that I saw, so I had two choices: Use the existing buffer
initialization or implement buffer allocation separately for console
handles. Doing the former places an upper limit on the value of
APR_DEFAULT_FILE_BUFSIZE, since of course we can't depend on WriteFile
being able to send more than 32,000 bytes at once to a console handle, so I
opted for the second option, with its own macro APR_CONSOLE_BUFSIZE in
include/arch/win32/apr_arch_file_io.h. As noted in the comment block, I
chose a buffer size of 28,672 bytes because it is the largest multiple of
the x86 system page size below 32,000 bytes.

I also factored out the common console file handle initialization into a
new method open_console_handle() which is called by the three
apr_file_open_flags_* functions. The new method includes code to initialize
the console handle's output buffer using APR_CONSOLE_BUFSIZE.

The only catch with buffering is that by default, it will actually buffer
data and nothing will appear until an entire buffer-full is queued up. To
get around this, I added a platform-specific apr_file_open flag called
APR_AUTOFLUSH_BUFFER. When the flag is present, apr_file_flush() is called
after each apr_file_write() finishes chewing through the buffer. This
ensures that small writes, including the partial buffer at the end of a
large write, do get sent to the console.

I think this is a lot more elegant than the previous version of the patch
:-) Please find the patch itself inline and attached.

Jonathan Gilbert

Index: include/arch/win32/apr_arch_file_io.h
===================================================================
--- include/arch/win32/apr_arch_file_io.h	(revision 509010)
+++ include/arch/win32/apr_arch_file_io.h	(working copy)
@@ -86,6 +86,14 @@
 /* For backwards-compat */
 #define APR_FILE_BUFSIZE APR_FILE_DEFAULT_BUFSIZE
 
+/* The WriteFile API fails when used on a console handle if the data
+ * buffer is too large. On Windows XP SP2, the boundary seems to be
+ * around 60 KB, but on earlier versions of Windows, it may be around
+ * 32,000 bytes. This buffer size is the largest multiple of the Win32
+ * page size less than 32,000.
+ */
+#define APR_CONSOLE_BUFSIZE 28672
+
 /* obscure ommissions from msvc's sys/stat.h */
 #ifdef _MSC_VER
 #define S_IFIFO        _S_IFIFO /* pipe */
@@ -96,11 +104,12 @@
 #endif
 
 /* Internal Flags for apr_file_open */
-#define APR_OPENINFO     0x00100000 /* Open without READ or WRITE access */
-#define APR_OPENLINK     0x00200000 /* Open a link itself, if supported */
-#define APR_READCONTROL  0x00400000 /* Read the file's owner/perms */
-#define APR_WRITECONTROL 0x00800000 /* Modifythe file's owner/perms */
-#define APR_WRITEATTRS   0x01000000 /* Modify the file's attributes */
+#define APR_OPENINFO         0x00100000 /* Open without READ or WRITE
access */
+#define APR_OPENLINK         0x00200000 /* Open a link itself, if
supported */
+#define APR_READCONTROL      0x00400000 /* Read the file's owner/perms */
+#define APR_WRITECONTROL     0x00800000 /* Modify the file's owner/perms */
+#define APR_WRITEATTRS       0x01000000 /* Modify the file's attributes */
+#define APR_AUTOFLUSH_BUFFER 0x02000000 /* Flush the buffer after every
write */
 
 /* Entries missing from the MSVC 5.0 Win32 SDK:
  */
Index: file_io/win32/open.c
===================================================================
--- file_io/win32/open.c	(revision 509010)
+++ file_io/win32/open.c	(working copy)
@@ -534,7 +534,7 @@
         (*file)->bufsize = APR_FILE_DEFAULT_BUFSIZE;
     }
 
-    if ((*file)->append || (*file)->buffered) {
+    if ((*file)->append || (*file)->buffered || (flags &
APR_AUTOFLUSH_BUFFER)) {
         apr_status_t rv;
         rv = apr_thread_mutex_create(&(*file)->mutex, 
                                      APR_THREAD_MUTEX_DEFAULT, pool);
@@ -567,17 +567,19 @@
     return APR_SUCCESS;
 }   
 
-APR_DECLARE(apr_status_t) apr_file_open_flags_stderr(apr_file_t **thefile, 
-                                                     apr_int32_t flags, 
-                                                     apr_pool_t *pool)
+apr_status_t open_console_handle(apr_file_t **thefile, 
+                                 DWORD handle_id,
+                                 apr_int32_t flags, 
+                                 apr_pool_t *pool)
 {
 #ifdef _WIN32_WCE
     return APR_ENOTIMPL;
 #else
     apr_os_file_t file_handle;
+    apr_status_t rv;
 
     apr_set_os_error(APR_SUCCESS);
-    file_handle = GetStdHandle(STD_ERROR_HANDLE);
+    file_handle = GetStdHandle(handle_id);
     if (!file_handle || (file_handle == INVALID_HANDLE_VALUE)) {
         apr_status_t rv = apr_get_os_error();
         if (rv == APR_SUCCESS) {
@@ -586,54 +588,56 @@
         return rv;
     }
 
-    return apr_os_file_put(thefile, &file_handle, flags | APR_WRITE, pool);
+    /* Set the APR_AUTOFLUSH_BUFFER flag. The existing
+     * buffer functionality is used to handle large
+     * writes, but we don't want small writes (including
+     * the remainder that will typically be present from
+     * a large write) to sit around in the buffer. The
+     * APR_AUTOFLUSH_BUFFER flag triggers a call to
+     * apr_file_flush() at the end of each apr_file_write().
+     */
+    flags |= APR_AUTOFLUSH_BUFFER;
+
+    rv = apr_os_file_put(thefile, &file_handle, flags, pool);
+
+    /* If we're successful and the handle is to an output
+     * stream, set up a write buffer to ensure that large
+     * console write operations don't fail. The WriteFile
+     * API can't handle operations larger than a certain
+     * size. This is done separately here instead of
+     * setting flag APR_BUFFERED because the buffer size
+     * is different from the APR_FILE_DEFAULT_BUFSIZE of
+     * 4KB.
+     */
+    if ((rv == APR_SUCCESS) && (flags & APR_WRITE)) {
+        (*thefile)->buffered = 1;
+        (*thefile)->buffer = apr_palloc(pool, APR_CONSOLE_BUFSIZE);
+        (*thefile)->bufsize = APR_CONSOLE_BUFSIZE;
+    }
+
+    return rv;
 #endif
 }
 
+APR_DECLARE(apr_status_t) apr_file_open_flags_stderr(apr_file_t **thefile,
+                                                     apr_int32_t flags,
+                                                     apr_pool_t *pool)
+{
+    return open_console_handle(thefile, STD_ERROR_HANDLE, flags |
APR_WRITE, pool);
+}
+
 APR_DECLARE(apr_status_t) apr_file_open_flags_stdout(apr_file_t **thefile, 
                                                      apr_int32_t flags,
                                                      apr_pool_t *pool)
 {
-#ifdef _WIN32_WCE
-    return APR_ENOTIMPL;
-#else
-    apr_os_file_t file_handle;
-
-    apr_set_os_error(APR_SUCCESS);
-    file_handle = GetStdHandle(STD_OUTPUT_HANDLE);
-    if (!file_handle || (file_handle == INVALID_HANDLE_VALUE)) {
-        apr_status_t rv = apr_get_os_error();
-        if (rv == APR_SUCCESS) {
-            return APR_EINVAL;
-        }
-        return rv;
-    }
-
-    return apr_os_file_put(thefile, &file_handle, flags | APR_WRITE, pool);
-#endif
+    return open_console_handle(thefile, STD_OUTPUT_HANDLE, flags |
APR_WRITE, pool);
 }
 
 APR_DECLARE(apr_status_t) apr_file_open_flags_stdin(apr_file_t **thefile, 
                                                     apr_int32_t flags,
                                                     apr_pool_t *pool)
 {
-#ifdef _WIN32_WCE
-    return APR_ENOTIMPL;
-#else
-    apr_os_file_t file_handle;
-
-    apr_set_os_error(APR_SUCCESS);
-    file_handle = GetStdHandle(STD_INPUT_HANDLE);
-    if (!file_handle || (file_handle == INVALID_HANDLE_VALUE)) {
-        apr_status_t rv = apr_get_os_error();
-        if (rv == APR_SUCCESS) {
-            return APR_EINVAL;
-        }
-        return rv;
-    }
-
-    return apr_os_file_put(thefile, &file_handle, flags | APR_READ, pool);
-#endif
+    return open_console_handle(thefile, STD_INPUT_HANDLE, flags |
APR_READ, pool);
 }
 
 APR_DECLARE(apr_status_t) apr_file_open_stderr(apr_file_t **thefile,
apr_pool_t *pool)
Index: file_io/win32/readwrite.c
===================================================================
--- file_io/win32/readwrite.c	(revision 509010)
+++ file_io/win32/readwrite.c	(working copy)
@@ -282,6 +282,19 @@
             size -= blocksize;
         }
 
+        if ((rv == APR_SUCCESS) && (thefile->flags & APR_AUTOFLUSH_BUFFER))
{
+            /* The file handle is marked with the APR_AUTOFLUSH_BUFFER
+             * flag. This is primarily used for Win32 console output
+             * handles, for which the WriteFile API function can't
+             * handle single writes above a certain size. We use the
+             * existing buffer functionality because it chops the data
+             * into bite-sized pieces nicely, but we don't actually
+             * want the handle to be fully buffered, waiting for a
+             * full buffer before displaying a single line of text.
+             */
+            rv = apr_file_flush(thefile);
+        }
+
         apr_thread_mutex_unlock(thefile->mutex);
         return rv;
     } else {

Mime
View raw message