jackrabbit-users mailing list archives

From Stefan Guggisberg <stefan.guggisb...@gmail.com>
Subject Re: Problems storing/accessing very large files via WebDAV
Date Wed, 06 May 2009 11:37:42 GMT
On Wed, May 6, 2009 at 12:00 PM, Stefan Guggisberg
<stefan.guggisberg@gmail.com> wrote:
> hi nicholas,
>
> On Thu, Apr 30, 2009 at 1:22 AM, Laird, Nicholas J.
> <Nicholas.Laird@gd-ais.com> wrote:
>> I am having an issue when storing and retrieving very large files
>> (400MB to 2GB) in my Jackrabbit repository via WebDAV.  I am using the
>> FileDataStore to store resources of size greater than 100 bytes (i.e.,
>> the default configuration for FileDataStore on the Jackrabbit wiki).
>>
>> I need to support these large files and have the repository presented to
>> the end user as a mapped network drive in Windows Explorer.  I have
>> tried using both Windows Explorer's built-in WebDAV client (by mapping
>> my repository as a network drive) and a product called WebDrive, which
>> also does network drive mapping.  Performance with WebDrive is better
>> (Explorer seems to have known WebDAV issues, at least in Windows XP),
>> but for large enough files even it gets bogged down and confused.
>>
>> When uploading a file by drag/drop to the mapped network drive in
>> Explorer, the upload seems to proceed and finish normally from the
>> WebDAV client's perspective.  However, Jackrabbit seems to be performing
>> some sort of internal caching while the WebDAV session is still "open",
>> and the client believes that the transfer is still in progress.
>>
>> A snippet of the Jackrabbit log file (with trace logging enabled) during
>> the transfer is below.  Notice the timestamp difference between lines 3
>> and 4:
>>
>> 1) 29.04.2009 16:04:56 *DEBUG* ImportContextImpl: Starting IOHandler
>> (org.apache.jackrabbit.server.io.DefaultHandler)
>> (DefaultIOListener.java, line 43)
>> 2) 29.04.2009 16:04:56 *DEBUG* ItemManager: caching item
>> e7ab8f92-d6a5-4bbf-bb9c-fb7e0ab9042e (ItemManager.java, line 787)
>> 3) 29.04.2009 16:04:56 *DEBUG* ItemManager: caching item
>> e7ab8f92-d6a5-4bbf-bb9c-fb7e0ab9042e/{http://www.jcp.org/jcr/1.0}data
>> (ItemManager.java, line 787)
>> 4) 29.04.2009 16:05:12 *DEBUG* ItemManager: caching item
>> e7ab8f92-d6a5-4bbf-bb9c-fb7e0ab9042e/{http://www.jcp.org/jcr/1.0}mimeType
>> (ItemManager.java, line 787)
>> 5) 29.04.2009 16:05:12 *DEBUG* ItemManager: caching item
>> e7ab8f92-d6a5-4bbf-bb9c-fb7e0ab9042e/{http://www.jcp.org/jcr/1.0}encoding
>> (ItemManager.java, line 787)
>> 6) 29.04.2009 16:05:12 *DEBUG* ItemManager: destroyed item
>> e7ab8f92-d6a5-4bbf-bb9c-fb7e0ab9042e/{http://www.jcp.org/jcr/1.0}encoding
>> (ItemManager.java, line 884)
>> 7) 29.04.2009 16:05:12 *DEBUG* ItemManager: removing items
>> e7ab8f92-d6a5-4bbf-bb9c-fb7e0ab9042e/{http://www.jcp.org/jcr/1.0}encoding
>> from cache (ItemManager.java, line 801)
>> 8) 29.04.2009 16:05:12 *DEBUG* ItemManager: caching item
>> e7ab8f92-d6a5-4bbf-bb9c-fb7e0ab9042e/{http://www.jcp.org/jcr/1.0}lastModified
>> (ItemManager.java, line 787)
>> 9) 29.04.2009 16:05:12 *DEBUG* ImportContextImpl: Result for IOHandler
>> (org.apache.jackrabbit.server.io.DefaultHandler): OK
>> (DefaultIOListener.java, line 50)
>>
>> 16 seconds isn't an eternity, but the time increases as the size of the
>> file increases.  At a large enough file size, Explorer gives up on the
>> transfer with a "Write Delay Failed" error and WebDrive thinks the
>> server has taken too long to respond and times out.  WebDrive can be
>> configured to wait longer, but I have had to increase the time to 2
>> minutes to try to manage files nearing 2GB.
>>
>> During downloads, the same situation occurs, except "caching item" of
>> the jcr:data property occurs before the download can start.
>>
>> I am not sure exactly what Jackrabbit is doing or if there is a way to
>> speed up the process (or prevent the caching altogether, if that is
>> indeed what is happening).  It could be that some other operation is
>> occurring that is not being revealed by the logging (though the trace
>> logging seems pretty thorough and verbose).
>
> the trace msgs sent you down the wrong track. the "ItemManager: caching ..."
> msgs are irrelevant here. ItemManager caches implementations
> of the javax.jcr.Item interface. the real 'data' is managed separately.
>
> AFAIK the time spent on storing a large binary is mainly accounted for by
> building the hash (used by the datastore) and spooling the data to
> the datastore. depending on the type of binary and your configuration,
> text extractors might also be involved.

i ran a quick test on my machine (os-x 10.5, 2.8ghz core duo, 7200 rpm hdd,
256mb jvm heap, FileDataStore):

- storing a 700mb video file in a local rep: ~25s
- spooling a local 700mb file using a 4k buffer: ~16s

the difference (~9s) is probably the cost of computing the hash of the
file content.
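
A comparison like the one above can be reproduced with a minimal sketch such as the following. It is not the code used for the test in this message. The class name SpoolTiming and its methods are hypothetical, and the choice of SHA-1 is an assumption about the digest FileDataStore uses for its record identifiers. It copies a file once with a plain 4k buffer and once through a java.security.DigestOutputStream, and prints both timings.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Jackrabbit.
class SpoolTiming {

    // Copies the stream using a 4k buffer; returns the number of bytes copied.
    static long spool(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // Same copy, but every written byte also updates a SHA-1 digest
    // (assumed algorithm). Returns the digest as a lowercase hex string.
    static String spoolWithHash(InputStream in, OutputStream out)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        DigestOutputStream hashing = new DigestOutputStream(out, digest);
        spool(in, hashing);
        hashing.flush();
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // Usage: java SpoolTiming /path/to/large/file
    public static void main(String[] args) throws Exception {
        Path source = Paths.get(args[0]);
        Path target = Files.createTempFile("spool", ".bin");
        try {
            long t0 = System.nanoTime();
            try (InputStream in = Files.newInputStream(source);
                 OutputStream out = Files.newOutputStream(target)) {
                spool(in, out);
            }
            long t1 = System.nanoTime();
            String hex;
            try (InputStream in = Files.newInputStream(source);
                 OutputStream out = Files.newOutputStream(target)) {
                hex = spoolWithHash(in, out);
            }
            long t2 = System.nanoTime();
            System.out.println("plain copy:     " + (t1 - t0) / 1_000_000 + " ms");
            System.out.println("copy + SHA-1:   " + (t2 - t1) / 1_000_000 + " ms");
            System.out.println("digest:         " + hex);
        } finally {
            Files.deleteIfExists(target);
        }
    }
}
```

Note that the second pass may be served from the OS file cache, so for a fair comparison the source file should be larger than available memory, or the two passes should be run in separate invocations.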

cheers
stefan

>
> maybe thomas can provide some more input...
>
> cheers
> stefan
>
>>
>> Any suggestions on how to configure or optimize for this situation are
>> greatly appreciated.
>>
>> Sincerely,
>> Nicholas Laird
>>
>>
>
