lucene-dev mailing list archives

From "Karl Wettin (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-550) InstantiatedIndex - faster but memory consuming index
Date Sun, 25 Feb 2007 23:38:06 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wettin updated LUCENE-550:
-------------------------------

    Attachment: trunk.diff.bz2

New Patch. Mainly updates in contrib/didyoumean. Merged some core conflicts.

TestGoalJuror now imports 200,000 real user queries from a log containing session id, query,
category, timestamp and number of hits, ordered by session id and time.

This means that the trainer and suggester do not know whether the user followed or ignored
a suggestion from the system, which results were inspected, whether the query contained a goal,
etc. So it does not behave as it would if trained from the start with the adaptive layer.
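
For illustration, a log with those five fields could be read and grouped into sessions roughly as below. This is only a sketch: the tab-separated layout, the class name QueryLogSketch and its methods are assumptions made for the example and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical reader for the query log; the tab-separated layout is an assumption. */
final class QueryLogSketch {

    /** One log line: session id, query, category, timestamp and number of hits. */
    static final class Entry {
        final String sessionId;
        final String query;
        final String category;
        final long timestamp;
        final int hits;

        Entry(String sessionId, String query, String category, long timestamp, int hits) {
            this.sessionId = sessionId;
            this.query = query;
            this.category = category;
            this.timestamp = timestamp;
            this.hits = hits;
        }
    }

    /** Parses one line, assuming tab-separated fields in the order listed above. */
    static Entry parse(String line) {
        String[] f = line.split("\t", -1);
        if (f.length != 5) {
            throw new IllegalArgumentException("expected 5 fields: " + line);
        }
        return new Entry(f[0], f[1], f[2], Long.parseLong(f[3]), Integer.parseInt(f[4]));
    }

    /** Groups entries by session; insertion order is kept, so a log sorted by session and time stays sorted. */
    static Map<String, List<Entry>> bySession(List<String> lines) {
        Map<String, List<Entry>> sessions = new LinkedHashMap<>();
        for (String line : lines) {
            Entry e = parse(line);
            sessions.computeIfAbsent(e.sessionId, k -> new ArrayList<>()).add(e);
        }
        return sessions;
    }
}
```

Note that nothing in such a record says whether a suggestion was shown or followed, which is the limitation described above.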

Still, the suggester navigates the dictionary fairly well: misspelled queries get the correct
suggestion, but many correctly spelled phrases get a silly recommendation.
Once one starts reporting user interaction to the suggester, any silly recommendations should go
away.

In essence, it can only adapt the suggestions positively, based on what the QueryGoalJuror says
is a goal. Negative adaptation happens only when a user doesn't take a suggestion. It could be
solved with bootstrapping. Will mess with that later.
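
The asymmetry between positive and negative adaptation can be pictured with a small score table, sketched below. It is not the contrib/didyoumean implementation: the class SuggestionScoresSketch, its method names and the +1/-1 weights are all hypothetical. With a log-only import nothing ever calls penalize(), so a silly suggestion that once received a positive score is never pushed back down.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical score table; names and weights are assumptions for illustration only. */
final class SuggestionScoresSketch {

    // query -> (suggestion -> accumulated score)
    private final Map<String, Map<String, Double>> scores = new HashMap<>();

    /** Positive adaptation: a goal was reached after moving from query to suggestion. */
    void reinforce(String query, String suggestion) {
        adjust(query, suggestion, 1.0);
    }

    /** Negative adaptation: the suggestion was shown but the user did not take it. */
    void penalize(String query, String suggestion) {
        adjust(query, suggestion, -1.0);
    }

    private void adjust(String query, String suggestion, double delta) {
        scores.computeIfAbsent(query, k -> new HashMap<>())
              .merge(suggestion, delta, Double::sum);
    }

    /** Returns the highest-scoring suggestion with a positive score, or null if there is none. */
    String best(String query) {
        Map<String, Double> forQuery = scores.get(query);
        if (forQuery == null) {
            return null;
        }
        String best = null;
        double bestScore = 0.0;
        for (Map.Entry<String, Double> e : forQuery.entrySet()) {
            if (e.getValue() > bestScore) {
                best = e.getKey();
                bestScore = e.getValue();
            }
        }
        return best;
    }
}
```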

> InstantiatedIndex - faster but memory consuming index
> -----------------------------------------------------
>
>                 Key: LUCENE-550
>                 URL: https://issues.apache.org/jira/browse/LUCENE-550
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Store
>    Affects Versions: 2.0.0
>            Reporter: Karl Wettin
>         Assigned To: Karl Wettin
>         Attachments: didyoumean.jpg, lucene-550.jpg, test-reports.zip, trunk.diff.bz2,
>                      trunk.diff.bz2, trunk.diff.bz2, trunk.diff.bz2, trunk.diff.bz2,
>                      trunk.diff.bz2, trunk.diff.bz2, trunk.diff.bz2, trunk.diff.bz2
>
>
> A non-file-centric, all-in-memory index. Consumes some 2x the memory of a RAMDirectory
> (in a term-saturated index) but is between 3x and 60x faster depending on application and how one
> counts. The average query is about 8x faster. IndexWriter and IndexModifier have been realized
> in InterfaceIndexWriter and InterfaceIndexModifier.
> InstantiatedIndex is wrapped in a new top-layer index facade (class Index) that comes
> with factory methods for writers, readers and searchers for unified index handling. There
> are decorators with notification handling that can be used for automatically synchronizing
> searchers on updates, etc.
> Index also comes with FS/RAMDirectory implementation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

