lucene-java-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-java Wiki] Trivial Update of "LucenePapers" by jpountz
Date Sun, 24 Jun 2012 20:57:25 GMT

The "LucenePapers" page has been changed by jpountz:
http://wiki.apache.org/lucene-java/LucenePapers?action=diff&rev1=4&rev2=5

Comment:
wording

  = Lucene Papers =
  
- To understand the fundamental ideas behind Lucene, you should first get familiar with InformationRetrieval.
This page tries to collect links to resources that present more advanced ideas.
+ To understand the fundamental ideas behind Lucene, you should first get familiar with InformationRetrieval.
This page tries to collect links to resources that explain some advanced topics.
  
  == Storage ==
  
  === Postings list encoding ===
  
- In addition to VInt encoding, Lucene supports (or plans to support) other postings list
encoding formats (FOR, PFOR, Simple9 ...):
+ In addition to VInt encoding, Lucene supports (or plans to support) other postings list
encoding formats (FOR-delta, PFOR-delta, Simple9, ...).
  
   * [[http://www2008.org/papers/pdf/p387-zhangA.pdf|Performance of Compressed Inverted List
Caching in Search Engines]]. Jiangong Zhang, Xiaohui Long, Torsten Suel. (2008)
   * [[http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html|Lucene
performance with the PForDelta codec]]. Mike !McCandless, Changing bits, August 2nd, 2010.
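
The VInt encoding mentioned above can be illustrated with a minimal sketch. It assumes postings are sorted document IDs stored as gaps (deltas), with each gap written 7 bits per byte and the high bit marking that more bytes follow. The class and method names (VIntDeltaSketch, encode, decode) are hypothetical and only for illustration; this is not Lucene's actual code, and FOR/PFOR-style block encodings are not shown.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of Lucene.
class VIntDeltaSketch {

    // Encode sorted, non-negative doc IDs as gaps; each gap is written as a
    // variable-length integer: 7 payload bits per byte, low-order bits first,
    // high bit set when another byte follows.
    static byte[] encode(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int docId : sortedDocIds) {
            int gap = docId - previous;
            previous = docId;
            while ((gap & ~0x7F) != 0) {
                out.write((gap & 0x7F) | 0x80);
                gap >>>= 7;
            }
            out.write(gap);
        }
        return out.toByteArray();
    }

    // Decode the byte stream back into absolute doc IDs by summing the gaps.
    static int[] decode(byte[] bytes) {
        List<Integer> docIds = new ArrayList<>();
        int previous = 0;
        int i = 0;
        while (i < bytes.length) {
            int gap = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[i++];
                gap |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += gap;
            docIds.add(previous);
        }
        int[] result = new int[docIds.size()];
        for (int k = 0; k < result.length; k++) {
            result[k] = docIds.get(k);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] docIds = {3, 10, 300, 70000};
        byte[] encoded = encode(docIds);
        System.out.println("encoded bytes: " + encoded.length);
    }
}
```

Small gaps fit in a single byte, which is why dense postings lists compress well under this scheme; the block-based formats discussed in the links below trade this byte-at-a-time layout for faster bulk decoding.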
@@ -24, +24 @@

  
  Improved concurrency of index updates.
  
-  * [[http://www.searchworkings.org/blog/-/blogs/lucene-indexing-gains-concurrency|Lucene
indexing gains concurrency]]. Simon Willnauer, SearchWorkings blog (May 3rd, 2011),
+  * [[http://www.searchworkings.org/blog/-/blogs/lucene-indexing-gains-concurrency|Lucene
indexing gains concurrency]]. Simon Willnauer, !SearchWorkings blog (May 3rd, 2011),
-  * [[http://www.searchworkings.org/blog/-/blogs/gimme-all-resources-you-have-i-can-use-them%21|Exploiting
full IO and CPU concurrency when indexing with Apache Lucene]]. Simon Willnauer, SearchWorkings
blog (May 3rd, 2011).
+  * [[http://www.searchworkings.org/blog/-/blogs/gimme-all-resources-you-have-i-can-use-them%21|Exploiting
full IO and CPU concurrency when indexing with Apache Lucene]]. Simon Willnauer, !SearchWorkings
blog (April 1st, 2011).
  
  == Query execution ==
  
  === Terms dictionary ===
  
- Lucene has a new block tree terms dictionary, inspired of burst tries.
+ In addition to its binary-search based default terms dictionary, Lucene has a "block tree"
terms dictionary, inspired by burst tries.
  
   * [[https://issues.apache.org/jira/browse/LUCENE-3030|LUCENE-3030 Block tree terms dict
& index]],
   * [[http://www.lucidimagination.com/sites/default/files/file/LR2012/AutomatonInvasionLuceneRevolution2012.pdf|Automata
invasion]] Robert Muir, Michael !McCandless,
@@ -53, +53 @@

  
  === Scoring models ===
  
- In addition to its default TF-IDF scoring algorithm, Lucene supports other scoring models
such as Okapi BM25 and divergence from randomness.
+ In addition to its default TF-IDF scoring algorithm, Lucene supports other scoring models
such as Okapi BM25 and models based on language models.
  
   * [[http://blog.mikemccandless.com/2012/03/new-index-statistics-in-lucene-40.html|New index
statistics in Lucene 4.0]]. Mike !McCandless, Changing bits, March 14th, 2012,
   * [[http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.9922| Okapi at TREC-3]].
Stephen E. Robertson, Steve Walker, Susan Jones, Micheline Hancock-Beaulieu, and Mike Gatford,
In Proceedings of the Third Text REtrieval Conference (TREC 1994),
@@ -69, +69 @@

  
  === Twitter Earlybird ===
  
- Modifications that Twitted made to Lucene to support lock-free updates and efficient early
query termination for time-based relevance.
+ Modifications that Twitter made to Lucene to support lock-free updates and efficient early
query termination for time-based relevance.
  
   * [[http://www.umiacs.umd.edu/~jimmylin/publications/Busch_etal_ICDE2012.pdf|Earlybird:
Real-Time Search at Twitter]], Michael Busch, Krishna Gade, Brian Larson, Patrick Lok, Samuel
Luckenbill, and Jimmy Lin (2012).
   * [[http://vimeo.com/44299231|Earlybird - Realtime search @twitter]]. Talk by Michael Busch
at Berlin Buzzwords (2012).
