From: Shad Storhaug
To: "Van Den Berghe, Vincent"
CC: dev@lucenenet.apache.org
Subject: RE: Bug in Lucene static initialization with multiple threads.
Date: Tue, 21 Mar 2017 14:41:53 +0000

Hi Vincent,

Sure, you can email the files to me directly.

For a quick start on Git/GitHub, there is a fairly short book called Pragmatic Version Control Using Git (https://pragprog.com/book/tsgit/pragmatic-version-control-using-git) that gets you up and running quickly. I think you might be attempting to push to the main repo - and you won't have permission unless it is explicitly granted. What you need to do is to fork the repo to your own GitHub account; then you can read/write it as much as necessary.
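In case it helps, the basic command sequence looks roughly like this (a sketch only: <your-account> and the branch name are placeholders, and the fork itself is created through the GitHub web UI first):

# after clicking "Fork" on https://github.com/apache/lucenenet:
git clone https://github.com/<your-account>/lucenenet.git
cd lucenenet
git remote add upstream https://github.com/apache/lucenenet.git
git fetch upstream
git checkout -b my-weakidentitymap-fix upstream/api-work   # work from the lead branch
# ... commit your changes, then push to YOUR fork (origin), not to apache/lucenenet:
git push origin my-weakidentitymap-fix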
Once you get it to a point where you want to submit something, you can do a pull request (either through GitHub or just manually email a request) and someone else can then review and merge the changes.

Update: I found the source of the Lucene.Net.Tests.Index.TestIndexReaderWriter.TestDuringAddIndexes() problem - it always occurs when you call an overload of the IndexSearcher constructor that takes a TaskScheduler as a parameter and pass a non-null value. This is built into the test framework (https://github.com/apache/lucenenet/blob/api-work/src/Lucene.Net.TestFramework/Util/LuceneTestCase.cs#L1778) to happen rarely, which explains many of the random failures we are seeing. If you change the random code to never use a TaskScheduler, the test will always pass; change it to always use a TaskScheduler and it will always fail.

The implementation of TaskScheduler we are using for testing (https://github.com/apache/lucenenet/blob/api-work/src/Lucene.Net.Core/Support/LimitedConcurrencyLevelTaskScheduler.cs) was copied directly from MSDN, so I doubt that is the issue. In fact, there is a good chance that the issue is similar to the WeakIdentityMap issue in that there is an enumerator/call to an enumerator that is not thread-safe; see https://github.com/apache/lucenenet/blob/api-work/src/Lucene.Net.Core/Search/IndexSearcher.cs#L474-L500 and https://github.com/apache/lucenenet/blob/api-work/src/Lucene.Net.Core/Search/IndexSearcher.cs#L569-L590.

Anyway, I know of at least 3 tests that are failing as a result of this, so fixing it would be a big prize.

Thanks,
Shad Storhaug (NightOwl888)

-----Original Message-----
From: Van Den Berghe, Vincent [mailto:Vincent.VanDenBerghe@bvdinfo.com]
Sent: Tuesday, March 21, 2017 8:51 PM
To: Shad Storhaug
Cc: dev@lucenenet.apache.org
Subject: RE: Bug in Lucene static initialization with multiple threads.

Hello Shad,

I had a little time on my hands and looked into this WeakIdentityMap issue, more specifically TestConcurrentHashMap, which fails for me as well (in 100% of the cases). Maybe I have something to contribute: I have 3 observations.

First, notice that in TestConcurrentHashMap, 8 threads are created and then all joined by doing the following:

finally
{
    foreach (var w in workers)
    {
        w.Join(1000L);
    }
}

This gives the first thread 1 second to end, the second one at most 2 seconds (1 second + whatever time the first thread needed to end), and so on. Given the amount of work of each test thread, this is far too little time even on a fast machine: it takes 13 seconds for all threads to end here. The corresponding Java test has the following:

while (!exec.awaitTermination(1000L, TimeUnit.MILLISECONDS));

... which in effect just loops until the execution of each thread is finished, in units of 1 second. In TPL, threads would be tasks and we would just be able to call Task.WaitAll. Since we're dealing with "real" threads here, I would suggest just calling w.Join() and being done with it. This would align the test with the Java behavior.
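Concretely, the change would be something like this (a minimal sketch, assuming the finally block keeps its current shape; workers is the existing collection of worker threads in the test):

finally
{
    foreach (var w in workers)
    {
        // wait unconditionally for each worker, mirroring the Java
        // awaitTermination loop instead of giving each thread ~1 second
        w.Join();
    }
}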
Second, there are various weaknesses in the WeakIdentityMap:

1) The implementation of the Keys enumerator (IteratorAnonymousInnerClassHelper) relies on the order of the elements in the keys collection (outerInstance.backingStore.Keys.ElementAt(position)). This is bad for two reasons:

- it is extremely slow (there is no indexed access on outerInstance.backingStore.Keys in any current implementation, so ElementAt needs to skip "position" elements to get at the correct one);
- it relies on the assumption that removing (or adding) a key in a dictionary doesn't change the previous relative key order, which is incorrect in any current .NET implementation I am aware of (dictionaries are hash tables with collision resolution through chaining, and reuse of slots through a free list: it's just asking for trouble).

It turns out that you can use the backing store enumerator to implement the keys enumerator directly. The main loop simply becomes:

public bool MoveNext()
{
    while (enumerator.MoveNext())
    {
        next = enumerator.Current.Key.Target;
        if (next != null)
        {
            // unfold "null" special value:
            if (next == NULL)
                next = null;
            return true;
        }
    }
    return false;
}

This works in the non-concurrent case (because we don't touch the collection while the enumerator is running), and in the concurrent case as well (because the ConcurrentDictionary enumerator works by design and handles concurrent modifications without a problem).

2) Calling Reap() creates objects on the heap, even when there are no elements to be removed. Sadly, not all of these allocations can be eliminated, but you can delay the creation of the keysToRemove list until it's really needed:

List<IdentityWeakReference> keysToRemove = null;
foreach (IdentityWeakReference zombie in backingStore.Keys)
{
    if (!zombie.IsAlive)
    {
        // create the list of keys to remove only if there are keys to remove.
        // this reduces heap pressure
        if (keysToRemove == null)
            keysToRemove = new List<IdentityWeakReference>();
        keysToRemove.Add(zombie);
    }
}
if (keysToRemove != null)
{
    foreach (var key in keysToRemove)
    {
        backingStore.Remove(key);
    }
}

Note that I don't iterate the Keys collection, but use the dictionary enumerator. Believe it or not, this is slightly more efficient, for reasons I won't explain here since this e-mail is already long enough. It's sad but inevitable that a heap object is created for the dictionary enumerator, because we call it through an interface (IDictionary): if we had the actual object, no enumerator object would be created on the heap.

3) Equality of weak identity references can be done using only one cast (using "as" instead of "is"), which is more efficient.
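To illustrate the single-cast pattern (a sketch only - this is not the actual IdentityWeakReference code, and the exact equality semantics may differ):

public override bool Equals(object obj)
{
    // "is" followed by a cast performs the type check twice; "as" plus a
    // null check performs only one cast. (GetHashCode must stay consistent
    // with this and is left unchanged here.)
    var other = obj as IdentityWeakReference;
    return other != null && ReferenceEquals(other.Target, this.Target);
}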
Third, the test itself uses enumerators in a nonstandard manner. The two witness cases are:

IEnumerator iter = map.Keys.GetEnumerator();
Assert.IsTrue(iter.MoveNext());
Assert.IsNull(iter.Current);
Assert.IsFalse(iter.MoveNext());
Assert.IsFalse(iter.MoveNext());

and

for (IEnumerator iter = map.Keys.GetEnumerator(); iter.MoveNext();)
{
    //Assert.IsTrue(iter.hasNext()); // try again, should return same result!
    string k = iter.Current;
    ...
}

All the other instances are variants of these witnesses. The correct way of using IEnumerator is to call IEnumerator.Dispose() after you're finished with the instance. Note that in Lucene itself, foreach is used, which does it correctly (ByteBufferIndexInput.cs):

foreach (ByteBufferIndexInput clone in clones.Keys)
{
    clone.UnsetBuffers();
}

All usages of enumerators in TestWeakIdentityMap.cs must be rewritten accordingly. For example:

using (IEnumerator iter = map.Keys.GetEnumerator())
{
    Assert.IsTrue(iter.MoveNext());
    Assert.IsNull(iter.Current);
    Assert.IsFalse(iter.MoveNext());
    Assert.IsFalse(iter.MoveNext());
}

and

foreach (object k in map.Keys)
{
    ...
}

In case you are wondering why this is so important: you cannot guarantee that a future implementation of an enumerator (especially one on a concurrent collection) won't have cleanup to do to get rid of various synchronization objects. Right now this isn't the case, but you never know what the future will bring. And besides, nice guys Dispose() after their enumeration.

The test passes now. Every time.

I've made the changes to api-work in my local repository, but when I tried to "push" or "sync" them, I get:

Error encountered while pushing to the remote repository: Response status code does not indicate success: 403 (Forbidden).

I know next to nothing about GitHub. Can I e-mail the changed files to someone?

Vincent

-----Original Message-----
From: Shad Storhaug [mailto:shad@shadstorhaug.com]
Sent: Monday, March 20, 2017 9:19 PM
To: Van Den Berghe, Vincent
Cc: dev@lucenenet.apache.org
Subject: RE: Bug in Lucene static initialization with multiple threads.

Hi Vincent,

Thanks for reporting this. In fact, thank you for all of your assistance tracking down bugs.

This issue boils down to a failed attempt to replace Lucene's WeakIdentityMap with a new data structure called WeakDictionary. Since there are already tests to verify concurrency on WeakIdentityMap and it is used in a couple of other places in Lucene, it would be far better to get it working right than to try to fix this alternative version. I guess for the time being your workaround should suffice (though a fix rather than a hack would be preferred).

I have spent quite a bit of time on this, but the best I have been able to do is to get the Lucene.Net.Tests.Util.TestWeakIdentityMap.TestConcurrentHashMap() test to pass about 50% of the time (and I can't seem to even get it back into that state).

Here are a couple of attempts I have made:

https://github.com/NightOwl888/lucenenet/commits/api-work-weak-identity-map-1 - using a port of the original Java backing classes
https://github.com/NightOwl888/lucenenet/commits/api-work-weak-identity-map-2 - using the .NET WeakIdentity class

And here is the original Java version: https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.8.0/lucene/core/src/java/org/apache/lucene/util/WeakIdentityMap.java

The complicated part is getting it to "reap" the elements in a thread-safe way so the counts are right on several concurrent enumerators. Any assistance you could provide to make WeakIdentityMap thread-safe would be much appreciated. Do note that the lead branch is now at https://github.com/apache/lucenenet/tree/api-work, so please do any work from that branch.

Also note that there are currently a few other concurrency tests that are failing:

Lucene.Net.Tests.Index.TestIndexReaderWriter.TestDuringAddIndexes()
Lucene.Net.Tests.Search.TestControlledRealTimeReopenThread.TestControlledRealTimeReopenThread_Mem()
Lucene.Net.Tests.Search.TestControlledRealTimeReopenThread.TestCRTReopen()

I am sure that getting to the bottom of these issues will probably fix most of the issues you are seeing. If you have any spare time, your help would be appreciated on these as well.

Thanks,
Shad Storhaug (NightOwl888)

-----Original Message-----
From: Van Den Berghe, Vincent [mailto:Vincent.VanDenBerghe@bvdinfo.com]
Sent: Tuesday, February 7, 2017 6:00 PM
To: dev@lucenenet.apache.org
Subject: Bug in Lucene static initialization with multiple threads.
Hello,

Every once in a while, I get an error when using Lucene in a multithreaded scenario (meaning: using a single IndexWriter in multiple threads, or using a distinct IndexWriter in each thread: it doesn't matter). The exception chain thrown is:

Unhandled Exception: System.ArgumentException: Could not instantiate implementing class for Lucene.Net.Analysis.Tokenattributes.ICharTermAttribute
---> System.ArgumentException: Could not find implementing class for ICharTermAttribute
---> System.InvalidOperationException: Collection was modified; enumeration operation may not execute.

I could not understand what was going on, especially because it only occurred "sometimes". It took me a while to figure out, but I think it's a bug. Here's the stack trace of the exception when it occurs:

    [External Code]
>   Lucene.Net.dll!Lucene.Net.Support.HashMap.WeakKey, System.WeakReference>.GetEnumerator() Line 229   C#
    [External Code]
    Lucene.Net.dll!Lucene.Net.Support.WeakDictionary.Clean() Line 59   C#
    Lucene.Net.dll!Lucene.Net.Support.WeakDictionary.CleanIfNeeded() Line 71   C#
    Lucene.Net.dll!Lucene.Net.Support.WeakDictionary.Add(System.Type key, System.WeakReference value) Line 134   C#
    Lucene.Net.dll!Lucene.Net.Util.AttributeSource.AttributeFactory.DefaultAttributeFactory.GetClassForInterface() Line 90   C#
    Lucene.Net.dll!Lucene.Net.Util.AttributeSource.AttributeFactory.DefaultAttributeFactory.CreateAttributeInstance() Line 70   C#
    Lucene.Net.dll!Lucene.Net.Util.AttributeSource.AddAttribute() Line 350   C#
    Lucene.Net.dll!Lucene.Net.Documents.Field.StringTokenStream.InitializeInstanceFields() Line 658   C#
    Lucene.Net.dll!Lucene.Net.Documents.Field.StringTokenStream.StringTokenStream() Line 676   C#
    Lucene.Net.dll!Lucene.Net.Documents.Field.GetTokenStream(Lucene.Net.Analysis.Analyzer analyzer) Line 629   C#
    Lucene.Net.dll!Lucene.Net.Index.DocInverterPerField.ProcessFields(Lucene.Net.Index.IndexableField[] fields, int count) Line 105   C#
    Lucene.Net.dll!Lucene.Net.Index.DocFieldProcessor.ProcessDocument(Lucene.Net.Index.FieldInfos.Builder fieldInfos) Line 279   C#
    Lucene.Net.dll!Lucene.Net.Index.DocumentsWriterPerThread.UpdateDocument(System.Collections.Generic.IEnumerable doc, Lucene.Net.Analysis.Analyzer analyzer, Lucene.Net.Index.Term delTerm) Line 287   C#
    Lucene.Net.dll!Lucene.Net.Index.DocumentsWriter.UpdateDocument(System.Collections.Generic.IEnumerable doc, Lucene.Net.Analysis.Analyzer analyzer, Lucene.Net.Index.Term delTerm) Line 574   C#
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.UpdateDocument(Lucene.Net.Index.Term term, System.Collections.Generic.IEnumerable doc, Lucene.Net.Analysis.Analyzer analyzer) Line 1830   C#
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.AddDocument(System.Collections.Generic.IEnumerable doc, Lucene.Net.Analysis.Analyzer analyzer) Line 1455   C#
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.AddDocument(System.Collections.Generic.IEnumerable doc) Line 1436   C#
    ...
and to wit, here are the threads just rushing in to do the same:

Not Flagged       35428   17   Worker Thread   Lucene.Net.dll!Lucene.Net.Support.WeakDictionary.Clean   Normal
Not Flagged       35444   11   Worker Thread   Lucene.Net.dll!Lucene.Net.Support.WeakDictionary.Clean   Normal
Not Flagged       44124   12   Worker Thread   Lucene.Net.dll!Lucene.Net.Support.WeakDictionary.Clean   Normal
Not Flagged   >   44140   13   Worker Thread   Lucene.Net.dll!Lucene.Net.Support.WeakDictionary.Clean   Normal
Not Flagged       47700   14   Worker Thread   Lucene.Net.dll!Lucene.Net.Support.WeakDictionary.Clean   Normal
Not Flagged       28168   15   Worker Thread   Lucene.Net.dll!Lucene.Net.Support.WeakDictionary.Clean   Normal
Not Flagged       30988   16   Worker Thread   Lucene.Net.dll!Lucene.Net.Support.WeakDictionary.Clean   Normal
Not Flagged       21828    6   Worker Thread   Lucene.Net.dll!Lucene.Net.Support.WeakDictionary.Clean   Normal

The reason why it only reproduces "sometimes" is because of this little nugget of code:

private void CleanIfNeeded()
{
    int currentColCount = GC.CollectionCount(0);
    if (currentColCount > _gcCollections)
    {
        Clean();
        _gcCollections = currentColCount;
    }
}

If one thread does a Clean() operation in the middle of another Clean() operation on the same collection that replaces the object being enumerated on, you get the exception. Always.

To avoid the intermittence, create a bunch of threads like this and eliminate the test "if (currentColCount > _gcCollections)" so that the Clean() code is always executed. You'll get the exception every time.

I will not post the correction, but there's a simple workaround: just make sure the static initializers are performed in a single thread. I.e. before creating your threads, do something like this:

new global::Lucene.Net.Documents.TextField("dummy", "dummyvalue", global::Lucene.Net.Documents.Field.Store.NO).GetTokenStream(new (some Analyzer object));

Replace "(some Analyzer object)" with an instance of an Analyzer object; it doesn't matter which one. It's meaningless, but it has the side effect of initializing the static fields without problems.

Vincent
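For reference, a concrete form of the workaround above might look like this (a sketch only: the helper class and method names are made up, StandardAnalyzer is used purely as an example since any Analyzer will do, and the LuceneVersion constant assumes the 4.8 api-work API):

using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Util;

static class Warmup   // hypothetical helper, not part of Lucene.Net
{
    // Call once from the main thread, before spawning any indexing threads, so
    // the static attribute initialization runs without concurrent Clean() calls.
    public static void InitializeLuceneStatics()
    {
        var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48); // any Analyzer works
        new TextField("dummy", "dummyvalue", Field.Store.NO).GetTokenStream(analyzer);
    }
}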