harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Regis <xu.re...@gmail.com>
Subject Re: [classlib] add cache for file's canonical path
Date Mon, 06 Jul 2009 05:42:02 GMT
Regis wrote:
> Tim Ellison wrote:
>> Regis wrote:
>>> Tim Ellison wrote:
>>>> Regis wrote:
>>>>> Recently I'm trying to optimize implementation of java.File,
>>>> Cool.
>>>>> and find File.getCanonicalPath is very time cost and is heavily used
>>>>> by checking file permission. But in the most of cases, the canonical
>>>>> path is never change, so I think it's better to cache them to avoid
>>>>> calculate every time. And I found RI also has cache:
>>>>> File.getCanonicalPath doesn't reflect change in file system
>>>>> real-time.
>>>> Hmm, sounds risky to me -- at least liable to create unpredictable
>>>> behavior.
>>> The file's canonical path changed rarely, on Linux, only symbol link in
>>> the path changed, can cause canonical path change; and on Windows, only
>>> when path contains short name, the canonical path can be changed.
>>> Maybe we can choose to not cache paths which are relatively "volatile",
>>> such as path contained short name on Windows?
>>>>> I have created a JIRA HARMONY-6200 for this, and attach a simple cache
>>>>> implementation for file's canonical path, someone interested in 
>>>>> this may
>>>>> can help to review it.
>>>>> The cache has fixed size, when it's full oldest element would be 
>>>>> remove,
>>>>> and elements would be expired in 10 minutes, that mean if the element
>>>>> stay in cache more than 10 minutes, it's not validate and will be
>>>>> removed.
>>>> Ten minutes seems like a very long time.  I would have thought that
>>>> having a cache that expires in a relatively short time (since that file
>>>> path was last accessed) would be sufficient.
>>> So how about one minute?
>>> And I prefer to define expiring time as time since that file path added
>>> to the cache, because there may be some hot paths accessed frequently,
>>> if expiring time is count from last accessed, they may stay in cache too
>>> long and have no chance to update from file system.
>>>>> It's a initial implementation, I believe there are still many 
>>>>> places can
>>>>> be improved, any comments and suggestions are welcome.
>>>> My initial thought is that it would be preferable to have the OS tell
>>>> you when a file path changes so you can invalidate the cache, but there
>>>> may not be enough info to be able to do that...
>> Now that M10 is out the way, it may be time to revisit this idea.
>> WDYT?
>> Tim
> Hi Tim,
> I see your comment on JIRA, it's a good suggestion, I'll take a look at 
> MapMaker  of Google-collections to find if there are anything we can 
> learn from it.
> A file's canonical path is changed rarely, so in the almost time, it's 
> safe to cache them. Now, the cache implementation used a map to cache 
> path and a extra LinkList to keep all cached element in time order. I'm 
> thinking are there any ways to do this in one data structure?

MapMaker used a j.u.Timer, when put element to map, schedule a new j.u.TimerTask
according to expiration time. For me, using timer can save the extra LinkedList, 
but a little complicated, since it need to start a timer thread and management 
of TimerTask is more complicated than LinkedList.

Best Regards,

View raw message