incubator-ooo-dev mailing list archives

From Rory O'Farrell <ofarr...@iol.ie>
Subject Re: Files replaced by hashes, let's face it
Date Tue, 13 Mar 2012 07:17:32 GMT
On Tue, 13 Mar 2012 05:38:51 +0100
eric b <eric.bachard@free.fr> wrote:

> 
> On 12 March 2012 at 22:47, Hagar Delest wrote:
> 
> > Hi All,
> >
> 
> Hi,
> 
> 
> > We have had a recent discussion in the forum (after another case)  
> > about the problem where files are replaced with only hashes,  
> > leading to serious data loss.
> > Is there any plan to handle it or at least to double check the save  
> > process?
> >
> 
> Currently, not that I know.
> 
> 
> > For the record: usually after a power loss, the open file is
> > wrecked and no data is recoverable. In very rare cases (I've seen
> > it twice IIRC), the user is able to recover the last version from the
> > temporary files.
> > The discussion:
> > http://user.services.openoffice.org/en/forum/viewtopic.php?f=6&t=17677
> > The post where I've listed more than 90 similar reports in forums:
> > http://user.services.openoffice.org/en/forum/viewtopic.php?f=6&t=17677#p81363
> > The issue I'd filed:
> > https://issues.apache.org/ooo/show_bug.cgi?id=107847
> >
> 
> 
> To be honest, I had a lot of similar issues with (old versions of)
> Windows + MS Office, in exactly the same way, a long time ago. I even
> remember losing a lot of work myself, and I never complained to
> Microsoft, who does not care (imho).
> 
> This issue looks like a real one, but one that is extremely difficult
> to reproduce. There are really a lot of possible reasons for
> something readable to turn into ####, like a simple 1-bit offset
> somewhere in the data, or an unneeded address increment in some loop,
> or bad interactions with the file system (somewhere in sal, or
> something more complicated). I think you understand that these things
> are awfully complicated to track down.
> 
> Until we find a lead, the most important thing is to collect as much
> data as possible. There is certainly one common denominator for a
> large part of those issues imho, but the area to investigate is
> enormous.
> 
> Of course, I don't have a solution, and only a joint effort could help.
> To make progress, we could define a strategy for when a new issue is
> detected, e.g.:
> 
> - create a meta-issue (I'll let other people propose a name)
> - propose a process to collect data, and say what to do if such an
> issue occurs (like not powering off the computer, or providing us
> with a previous version of the damaged document if possible, and so on)
> - explain to users that the problem is difficult to reproduce and that
> the analysis needs more information than other issues, so we need to
> collect a lot before we can identify a root cause and fix it
> - cross-check the issue against other OpenOffice.org derivatives: I'll
> ask on our lists whether the problem has occurred with OOo4Kids
> too.
> 
> - (please propose other ideas)
> 
> 
> 
> > NB: not sure if LibO has inherited this problem too but I guess so
> > according to a quick Google search:
> > http://www.mail-archive.com/libreoffice-bugs@lists.freedesktop.org/msg19017.html
> >
> 
> I think you should keep an eye on that side, but I bet it is
> affected too.
> 
> 
> 
> > Of course the bug is not reproducible; it happens on several OSes,
> > with different versions, but it clearly first appeared at the end of 2008.
> >
> 
> 
> The more precise the date, the more it will help: there is probably
> some history somewhere, and a list of the cws introduced in the
> meantime could help to isolate a good candidate for the (possible)
> bug or regression.
> 
> 
> 
> > Please remember that this bug is very detrimental to the product  
> > reputation, leading to a loss of confidence in the code.
> 
> 
> Yes, but we should not exaggerate either. It is true that data loss is
> possible. This is a serious but very rare issue: most of the time we
> can create and use files without losing anything. I'd even bet that
> people lose their data more often on Windows because of viruses,
> trojans, or whatever than with OOo.
> 
> Last but not least, the code is open, which is a really good thing in
> this case.
> 
> 
> > Especially for a very basic feature. Facing say a power loss is not  
> > usual but the original file should not be processed until the new  
> > file is correctly written (or its temporary version should at least  
> > be available for recovery).
> 
> 
> My 2 cts
> 
> Eric
> 

Eric remarks 
> This issue looks like a real one, but one that is extremely difficult
> to reproduce. There are really a lot of possible reasons for
> something readable to turn into ####, like a simple 1-bit offset
> somewhere in the data, or an unneeded address increment in some loop,
> or bad interactions with the file system (somewhere in sal, or
> something more complicated). I think you understand that these things
> are awfully complicated to track down.

I agree that the hash problem is difficult to track, but an immediate worry should be that
the original file is frequently destroyed in the event of a crash or power failure.  I think we
all understand that a file being edited is not secure if there is a crash, but the
original file ought not be zapped under almost any circumstances.

So I see two problems - 
one) the need to preserve the original file, which ought be easy as it is a change in the
logic of the Save process, 

two) the need to investigate what code shortcoming leads to files of hashes, which may involve
detailed analysis of low level code (my thoughts are that it may be caused by unmasked interrupts,
but I haven't coded at that level for 30 years, so am out of my experience).
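
To illustrate what I mean in point one, here is a minimal sketch in Python of a
save routine that never touches the original until the new file is completely
written. It is not how OOo's Save code is written (I have not checked that); the
function name safe_save and the temporary file naming are my own inventions for
illustration. It assumes the temporary file can live in the same directory as the
document, so that the final rename stays on one file system.

```python
import os
import tempfile


def safe_save(path, data):
    """Write data (bytes) to path without truncating the existing file in place.

    Hypothetical helper for illustration only; not taken from OOo code.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the rename below
    # does not have to cross file systems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".save-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Ask the OS to push the new content to the disk before we
            # make it the "real" file.
            os.fsync(tmp.fileno())
        # Swap the complete new file into place. Until this call the
        # original file has not been modified at all.
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed: the original is still intact, so just
        # discard the partial temporary file and report the error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the machine loses power during the write, the worst case with this ordering
should be a stray temporary file beside an untouched original. A fuller version
would probably also sync the containing directory after the rename; I have left
that out to keep the sketch short.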


-- 
Rory O'Farrell <ofarrwrk@iol.ie>
