lucene-dev mailing list archives

From "Michael McCandless" <luc...@mikemccandless.com>
Subject Re: [jira] Reopened: (LUCENE-1044) Behavior on hard power shutdown
Date Wed, 14 Nov 2007 10:54:45 GMT

"robert engels" <rengels@ix.netcom.com> wrote:
> You might be misreading the results for the Mac mini. If you compare
> the Mac mini with the sync'd Mac Pro, the results are what you would
> expect: the unsync'd run is roughly the same as the sync'd one.
> 
> It might be that Apple configures the driver to not allow lazy writes  
> for the internal drive? Maybe for reliability.

That seems likely, and it would explain why sync() vs no-sync makes
such a minor performance difference on the Mac Mini.
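
For anyone following along, here is a minimal Java sketch of what
"sync() before close()" means at the file level. It is not the
LUCENE-1044 patch: the class name SyncBeforeClose and the write()
helper are made up for illustration. It uses only the standard
FileOutputStream.getFD() and FileDescriptor.sync() calls.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical illustration only; not code from Lucene or the patch.
class SyncBeforeClose {

    /**
     * Writes data to file. When doSync is true, asks the OS to push its
     * buffers to the underlying device before the stream is closed.
     */
    static void write(File file, byte[] data, boolean doSync) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(data);
            if (doSync) {
                // Blocks until the OS reports the buffers as written to the device.
                out.getFD().sync();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("sync-sketch", ".bin");
        tmp.deleteOnExit();
        byte[] payload = new byte[1 << 20]; // 1 MB of zeros, arbitrary size

        // Time one write without sync and one with sync.
        for (boolean doSync : new boolean[] {false, true}) {
            long start = System.nanoTime();
            write(tmp, payload, doSync);
            long micros = (System.nanoTime() - start) / 1000;
            System.out.println("doSync=" + doSync + " took " + micros + " us");
        }
    }
}
```

With doSync=false the bytes may still sit in OS buffers after close()
returns, which is the cheaper "nosync" path in the benchmarks. How much
the sync() call costs depends on how the OS and drive handle write
caching, which would be consistent with the spread of numbers in this
thread.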

> Or it might be that the internal drive is severely fragmented - so  
> being able to coalesce blocks doesn't help much.
> 
> I have a mac mini as well, and find the writes to the external  
> firewire drive much faster.
> 
> XBench shows for my mac mini internal drive
> 
> Results	29.37	
> 	System Info		
> 		Xbench Version		1.3
> 		System Version		10.4.10 (8R2232)
> 		Physical RAM		2048 MB
> 		Model		Macmini1,1
> 		Drive Type		ST98823AS
> 	Disk Test	29.37	
> 		Sequential	43.69	
> 			Uncached Write	42.43	26.05 MB/sec [4K blocks]
> 			Uncached Write	43.26	24.48 MB/sec [256K blocks]
> 			Uncached Read	50.58	14.80 MB/sec [4K blocks]
> 			Uncached Read	39.83	20.02 MB/sec [256K blocks]
> 		Random	22.12	
> 			Uncached Write	7.52	0.80 MB/sec [4K blocks]
> 			Uncached Write	50.36	16.12 MB/sec [256K blocks]
> 			Uncached Read	67.14	0.48 MB/sec [4K blocks]
> 			Uncached Read	76.26	14.15 MB/sec [256K blocks]
> 
> For the external firewire
> 
> Results	44.36	
> 	System Info		
> 		Xbench Version		1.3
> 		System Version		10.4.10 (8R2232)
> 		Physical RAM		2048 MB
> 		Model		Macmini1,1
> 		Drive Type		ST350063 0A
> 	Disk Test	44.36	
> 		Sequential	53.50	
> 			Uncached Write	47.01	28.86 MB/sec [4K blocks]
> 			Uncached Write	56.23	31.82 MB/sec [256K blocks]
> 			Uncached Read	44.11	12.91 MB/sec [4K blocks]
> 			Uncached Read	76.72	38.56 MB/sec [256K blocks]
> 		Random	37.89	
> 			Uncached Write	13.94	1.48 MB/sec [4K blocks]
> 			Uncached Write	70.45	22.55 MB/sec [256K blocks]
> 			Uncached Read	92.09	0.65 MB/sec [4K blocks]
> 			Uncached Read	113.54	21.07 MB/sec [256K blocks]
> 
> 
> 
> 
> 
> On Nov 13, 2007, at 3:54 PM, Michael McCandless (JIRA) wrote:
> 
> >
> >      [ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> >
> > Michael McCandless reopened LUCENE-1044:
> > ----------------------------------------
> >
> >
> > OK I ran sync/nosync tests across various platforms/IO systems.  In
> > each case I ran the test once with doSync=true and once with
> > doSync=false, using this alg:
> >
> >   analyzer=org.apache.lucene.analysis.SimpleAnalyzer
> >   doc.maker=org.apache.lucene.benchmark.byTask.feeds.LineDocMaker
> >   docs.file=/lucene/wikifull.txt
> >
> >   doc.maker.forever=false
> >   ram.flush.mb = 8
> >   max.buffered = 0
> >   directory = FSDirectory
> >   max.field.length = 2147483647
> >   doc.term.vector=false
> >   doc.stored=false
> >   work.dir = /tmp/lucene
> >   fsdirectory.dosync = false
> >
> >   ResetSystemErase
> >   CreateIndex
> >   {AddDoc >: 150000
> >   CloseIndex
> >
> >   RepSumByName
> >
> > Ie, time to index the first 150K docs from Wikipedia.
> >
> >
> > Results for single hard drive:
> >
> >   Mac mini (10.5 Leopard) single 4200 RPM "notebook" (2.5") drive  
> > -- 2.3% slower:
> >
> >       sync - 296.80 sec
> >     nosync - 290.06 sec
> >
> >   Mac pro (10.4 Tiger), single external drive -- 35.5% slower:
> >
> >       sync - 259.61 sec
> >     nosync - 191.53 sec
> >
> >   Win XP Pro laptop, single drive -- 38.2% slower
> >
> >       sync - 536.00 sec
> >     nosync - 387.90 sec
> >
> >   Linux (2.6.22.1), ext3 single drive -- 23% slower
> >
> >       sync - 185.42 sec
> >     nosync - 150.56 sec
> >
> > Results for multiple hard drives (RAID arrays):
> >
> >   Linux (2.6.22.1), reiserfs 6 drive RAID5 array -- 49% slower (!!)
> >
> >       sync - 239.32 sec
> >     nosync - 160.56 sec
> >
> >   Mac Pro (10.4 Tiger), 4 drive RAID0 array -- 1% faster
> >
> >       sync - 157.26 sec
> >     nosync - 158.93 sec
> >
> >
> > So at this point I'm torn...
> >
> > The performance cost of the simplest approach (sync() before close())
> > is very high in many cases (not just laptop IO subsystems).  The
> > reiserfs test was rather shocking.  Then, it's oddly very low cost in
> > other cases: the Mac Mini test I find amazing.
> >
> > It's frustrating to lose such performance "out of the box" for the
> > presumably extremely rare event of OS/machine crash/power cut.
> >
> > Maybe we should leave the default as false for now?
> >
> >
> >> Behavior on hard power shutdown
> >> -------------------------------
> >>
> >>                 Key: LUCENE-1044
> >>                 URL: https://issues.apache.org/jira/browse/LUCENE-1044
> >>             Project: Lucene - Java
> >>          Issue Type: Bug
> >>          Components: Index
> >>         Environment: Windows Server 2003, Standard Edition, Sun  
> >> Hotspot Java 1.5
> >>            Reporter: venkat rangan
> >>            Assignee: Michael McCandless
> >>             Fix For: 2.3
> >>
> >>         Attachments: LUCENE-1044.patch, LUCENE-1044.take2.patch,  
> >> LUCENE-1044.take3.patch
> >>
> >>
> >> When indexing a large number of documents, upon a hard power
> >> failure (e.g. pull the power cord), the index seems to get
> >> corrupted. We start a Java application as a Windows Service, and
> >> feed it documents. In some cases (after an index size of 1.7GB,
> >> with 30-40 index segment .cfs files), the following is observed.
> >> The 'segments' file contains only zeros. Its size is 265 bytes -  
> >> all bytes are zeros.
> >> The 'deleted' file also contains only zeros. Its size is 85 bytes  
> >> - all bytes are zeros.
> >> Before corruption, the segments file and deleted file appear to be  
> >> correct. After this corruption, the index is corrupted and lost.
> >> This is a problem observed in Lucene 1.4.3. We are not able to  
> >> upgrade our customer deployments to 1.9 or later version, but  
> >> would be happy to back-port a patch, if the patch is small enough  
> >> and if this problem is already solved.
> >
> > -- 
> > This message is automatically generated by JIRA.
> > -
> > You can reply to this email to add a comment to the issue online.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-dev-help@lucene.apache.org
> >
> 
> 
> 
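
The "N% slower" figures in the quoted JIRA comment can be rechecked
from the raw timings as (sync - nosync) / nosync. A small sketch, with
the timings copied from the results above; the class and method names
(SyncOverhead, percentSlower) are made up for illustration:

```java
import java.util.Locale;

// Hypothetical helper for rechecking the percentages; not part of Lucene.
class SyncOverhead {

    /** Percent by which the sync run is slower than the nosync run (negative means faster). */
    static double percentSlower(double syncSec, double noSyncSec) {
        return (syncSec - noSyncSec) / noSyncSec * 100.0;
    }

    public static void main(String[] args) {
        String[] names = {
            "Mac mini, single 4200 RPM drive",
            "Mac Pro, single external drive",
            "Win XP Pro laptop, single drive",
            "Linux ext3, single drive",
            "Linux reiserfs, 6 drive RAID5",
            "Mac Pro, 4 drive RAID0",
        };
        // {sync seconds, nosync seconds}, as reported in the quoted comment.
        double[][] times = {
            {296.80, 290.06},
            {259.61, 191.53},
            {536.00, 387.90},
            {185.42, 150.56},
            {239.32, 160.56},
            {157.26, 158.93},
        };
        for (int i = 0; i < names.length; i++) {
            double pct = percentSlower(times[i][0], times[i][1]);
            System.out.printf(Locale.ROOT, "%-34s %6.1f%%%n", names[i], pct);
        }
    }
}
```

A negative value means the sync run finished faster than the nosync
run, as in the RAID0 case.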


