directory-dev mailing list archives

From Alex Karasulu <akaras...@apache.org>
Subject Re: Question about the Log file
Date Fri, 16 Mar 2012 18:18:26 GMT
On Fri, Mar 16, 2012 at 8:07 PM, Selcuk AYA <ayaselcuk@gmail.com> wrote:

> On Fri, Mar 16, 2012 at 2:26 AM, Emmanuel Lécharny <elecharny@gmail.com> wrote:
> > Hi,
> >
> > AFAICS, the Log file contains a buffer which stores UserLogRecords until
> > we write them to disk. This allows the Log system to be fast, and also
> > allows a dedicated thread to flush the data to disk regularly.
> >
> > So far, so good.
> >
> > But I'm wondering if it wouldn't be simpler to use a MemoryMappedFile
> > instead, as the Log file size is fixed. We would then let the underlying
> > OS deal with flushes, with no need for a dedicated thread to handle
> > them. Also, MemoryMappedFiles are faster than RandomFile, as they work on
> > pages, which are managed by the OS (their size depends on the OS
> > tuning). Last but not least, we won't need to dedicate a 4 MB buffer to
> > store temporary userRecords, as MemoryMappedFiles don't use JVM memory.
>
> The current implementation is flexible enough to work with any underlying
> file system rather than being tied to a single implementation.
> Currently it is a random access file system, but a memory-mapped file
> implementation should work as well. I am also thinking of using HDFS
> files in the future. A couple of notes about some concerns:
>
> * I don't see 4 MB being a big concern. This size can be tuned. It can
> also be made zero if the underlying implementation handles writes
> well.
> * I don't think a dedicated background thread is a problem either. If we
> want, we can have user threads do the log sync as they log the
> records.
>
> Right now this is very low priority given the things we need to
> implement to get the txns to work.
>
>
I agree that this is lower priority than getting the branch working
correctly. This is an optimization IMO. I'd like to get things working and
then get some metrics on performance. Then we can start looking at
optimizations.

It's certainly worth mentioning, noting and discussing. Maybe we should put
these notes into JIRA so we can force ourselves to get back to asking these
questions. I'm always thinking about what performance gains we can get from
using memory mapped files but never had the chance to try it out. I'd love
to put it to the test when we get the chance after getting a full
implementation completed.
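
As a rough illustration of the memory-mapped idea discussed above, here is a minimal sketch of a fixed-size log backed by a mapped file. It is not the existing Log implementation: the class name MappedLogSketch, the length-prefixed record layout, and the append/sync/read methods are all hypothetical, chosen only to show where the OS-managed pages replace the in-heap buffer and the dedicated flush thread.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical fixed-size log backed by a memory-mapped file (illustration only). */
public class MappedLogSketch implements AutoCloseable {
    private final FileChannel channel;
    private final MappedByteBuffer buffer;

    public MappedLogSketch(Path file, int sizeInBytes) throws IOException {
        channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE);
        // The mapping lives outside the Java heap; the OS pages it in and out.
        buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, sizeInBytes);
    }

    /** Appends a length-prefixed record; returns its start offset, or -1 if it does not fit. */
    public synchronized int append(byte[] record) {
        if (buffer.remaining() < Integer.BYTES + record.length) {
            return -1;
        }
        int offset = buffer.position();
        buffer.putInt(record.length);
        buffer.put(record);
        return offset;
    }

    /** Explicit sync point: asks the OS to write dirty pages to the storage device. */
    public synchronized void sync() {
        buffer.force();
    }

    /** Reads back the record that starts at the given offset. */
    public synchronized byte[] read(int offset) {
        int length = buffer.getInt(offset);
        byte[] record = new byte[length];
        // Use an independent view so the append position is left untouched.
        ByteBuffer view = buffer.duplicate();
        view.position(offset + Integer.BYTES);
        view.get(record);
        return record;
    }

    @Override
    public synchronized void close() throws IOException {
        buffer.force();
        channel.close();
    }
}
```

Note that even with a mapped file, a transaction log would presumably still need the explicit sync() call at commit points, since the OS alone gives no guarantee about when dirty pages reach disk. That is one of the things worth measuring once we get to it.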

-- 
Best Regards,
-- Alex
