lucene-dev mailing list archives

From "Noble Paul നോബിള്‍ नोब्ळ्" <noble.p...@gmail.com>
Subject Re: Blob storage
Date Sat, 27 Dec 2008 05:42:29 GMT
On Fri, Dec 26, 2008 at 10:05 PM, Otis Gospodnetic
<otis_gospodnetic@yahoo.com> wrote:
> Similar thoughts here.  I don't have ML thread pointers nor JIRA issue pointers, but
> there has been discussion in this area before, and I believe the thinking was that what's
> needed is a general interface/abstraction/API for storing and loading field data to an external
> component, be that a BDB, an RDBMS, or something like Skwish.  I *think* that often came up
> in the context of Document updates (as opposed to delete+add).
This is an area of interest for me as well; see SOLR-828.
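
To make the "general interface/abstraction/API" idea concrete, here is a minimal sketch in Java. Everything in it is an assumption for illustration: `ExternalFieldStore` and `InMemoryFieldStore` are hypothetical names, not existing Lucene or Solr types, and the in-memory map merely stands in for a BDB, an RDBMS, or Skwish. The returned key is what would be kept in the Lucene field in place of the value itself.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical abstraction; not an existing Lucene or Solr interface. */
interface ExternalFieldStore {
    /** Stores the value externally and returns a key to keep in the index. */
    String store(byte[] value);

    /** Loads a previously stored value, or returns null if the key is unknown. */
    byte[] load(String key);

    /** Removes the value, e.g. when the owning document is deleted. */
    void delete(String key);
}

/** In-memory stand-in for a BDB-, RDBMS-, or Skwish-backed implementation. */
final class InMemoryFieldStore implements ExternalFieldStore {
    private final Map<String, byte[]> values = new ConcurrentHashMap<>();

    @Override
    public String store(byte[] value) {
        String key = UUID.randomUUID().toString();
        values.put(key, value.clone()); // defensive copy of the caller's array
        return key;
    }

    @Override
    public byte[] load(String key) {
        byte[] value = values.get(key);
        return value == null ? null : value.clone();
    }

    @Override
    public void delete(String key) {
        values.remove(key);
    }
}
```

A pluggable backend would implement the same three methods; transaction handling is deliberately left out of this sketch.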
>
>
> I didn't look at Skwish, but I think this is the direction to explore, Babak, esp. if
> we can come up with something that lets one plug in other types of storage, as well as deal
> with transaction type stuff that Ian mentioned.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> ----- Original Message ----
>> From: Ian Holsman <lists@holsman.net>
>> To: java-dev@lucene.apache.org
>> Sent: Friday, December 26, 2008 5:40:36 AM
>> Subject: Re: Blob storage
>>
>> Babak Farhang wrote:
>> > Most of all, I'm trying to communicate an *idea* which itself cannot
>> > be encumbered by any license, anyway. But if you want to incorporate
>> > some of this code into an asf project, I'd be happy to also release it
>> > under the apache license. Hope the license I chose for my project
>> > doesn't get in the way of this conversation..
>> >
>>
>> as an idea, let me offer some thoughts.
>> - there will be a trade-off where reading the info from a 2nd system
>> would be slower than just a single call which has all the results.
>> Especially if you have to fetch a couple of these things.
>>
>> - how is this different from BDB and a UUID? Couldn't you just store it
>> using that?
>>
>> - how are you going to deal with situations where the commit fails in
>> Lucene? Does the client have to recognize this and roll back Skwish?
>>
>> - there will need to be some kind of reconciliation process that will
>> need to deal with inconsistencies where someone forgets to delete the
>> Skwish object when they have deleted the Lucene record.
>>
>> on a positive note, it would shrink the index size and allow more
>> records to fit in memory.
>>
>> Regards
>> Ian
>> > On Fri, Dec 26, 2008 at 12:46 AM, Noble Paul നോബിള്‍ नोब्ळ्
>> > wrote:
>> >
>> >> The license is GPL. It cannot be used directly in any Apache projects.
>> >>
>> >> On Fri, Dec 26, 2008 at 12:47 PM, Babak Farhang wrote:
>> >>
>> >>>> I assume one could use Skwish instead of Lucene's normal stored fields to
>> >>>> store & retrieve document data?
>> >>>>
>> >>> Exactly: instead of storing the field's value directly in Lucene, you
>> >>> could store it in skwish and then store its skwish id in the Lucene
>> >>> field instead.  This works well for serving large streams (e.g.
>> >>> original document contents).
>> >>>
>> >>>
>> >>>> Have you run any threaded performance tests comparing the two?
>> >>>>
>> >>> No direct comps, yet.
>> >>>
>> >>> -b
>> >>>
>> >>>
>> >>> On Thu, Dec 25, 2008 at 5:22 AM, Michael McCandless
>> >>> wrote:
>> >>>
>> >>>> This looks interesting!
>> >>>> I assume one could use Skwish instead of Lucene's normal stored fields to
>> >>>> store & retrieve document data?
>> >>>> Have you run any threaded performance tests comparing the two?
>> >>>> Mike
>> >>>>
>> >>>> Babak Farhang wrote:
>> >>>>
>> >>>>> Hi everyone,
>> >>>>>
>> >>>>> I've been working on a library called Skwish to complement indexes
>> >>>>> like Lucene, for blob storage and retrieval. This is nothing more
>> >>>>> than a structured implementation of storing all the files in one file
>> >>>>> and managing their offsets in another.  The idea is to provide a fast,
>> >>>>> concurrent, lock-free way to serve lots of files to lots of users.
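
The "all the files in one file, offsets in another" layout can be illustrated with a small Java sketch. It is not Skwish's actual API or file format: `OffsetBlobStore` is a hypothetical name, the offsets file is assumed to hold one 8-byte end offset per blob, and the sketch uses plain `synchronized` methods rather than the lock-free design described above.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

/** Hypothetical illustration only; not Skwish's actual API or file format. */
final class OffsetBlobStore implements AutoCloseable {
    private final RandomAccessFile data;    // blob bytes, appended back to back
    private final RandomAccessFile offsets; // one 8-byte end offset per blob

    OffsetBlobStore(Path dataFile, Path offsetFile) throws IOException {
        data = new RandomAccessFile(dataFile.toFile(), "rw");
        offsets = new RandomAccessFile(offsetFile.toFile(), "rw");
    }

    /** Appends a blob and returns its id (its position in the offsets file). */
    synchronized long put(byte[] blob) throws IOException {
        long id = offsets.length() / Long.BYTES;
        data.seek(data.length());
        data.write(blob);
        offsets.seek(offsets.length());
        offsets.writeLong(data.length()); // record where this blob ends
        return id;
    }

    /** Reads the blob with the given id back from the data file. */
    synchronized byte[] get(long id) throws IOException {
        long start = 0; // the first blob starts at the beginning of the file
        if (id > 0) {
            offsets.seek((id - 1) * Long.BYTES);
            start = offsets.readLong(); // end of the previous blob
        }
        offsets.seek(id * Long.BYTES);
        long end = offsets.readLong();
        byte[] blob = new byte[(int) (end - start)];
        data.seek(start);
        data.readFully(blob);
        return blob;
    }

    @Override
    public synchronized void close() throws IOException {
        data.close();
        offsets.close();
    }
}
```

The id returned by `put` is what an index such as Lucene would store in a field in place of the blob itself.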
>> >>>>>
>> >>>>> Hope you find it useful or interesting.
>> >>>>>
>> >>>>> -Babak
>> >>>>> http://skwish.sourceforge.net/
>> >>>>>
>> >>>>> ---------------------------------------------------------------------
>> >>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> >>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>> >>>>>
>> >>>>>
>> >>>>
>> >>
>> >> --
>> >> --Noble Paul
>> >>
>>
>>
>
>
>
>



-- 
--Noble Paul