lucene-general mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Open Relevance Infrastructure Request
Date Tue, 26 May 2009 14:14:20 GMT

On May 26, 2009, at 9:50 AM, Mark Miller wrote:

> Grant Ingersoll wrote:
>>
>>> Even so, people with really big pipes may be interested in larger  
>>> collections.  Typically, when others have done this kind of thing,  
>>> they actually send out hard drives containing the data.  We are  
>>> not proposing that.
>>>
>>>
>>> Another option is to ask the board for funding for us to use  
>>> Amazon.  I don't particularly like this approach b/c it is not  
>>> obvious to me how one would cap the cost.
> You can cap the cost by limiting how much data you store, right? You
> can use Requester Pays buckets
> (http://docs.amazonwebservices.com/AmazonS3/latest/index.html?RequesterPaysBuckets.html)
> to move the cost onto the users who want the data. Per user, it
> would still be fairly cheap. You get the added bonus of other S3
> services, like being able to send a device back and forth to
> import/export on site. You would just pay for storage and for
> transferring the data in, both of which can be capped by limiting the
> amount of data you put in it.
>

One of the goals is to make the data available for free, so I don't  
think this would work.  Currently, one can get the TREC data for a  
nominal fee as well.


> Not a recommendation or anything (it's more convenient to not charge  
> the downloaders), but I think you could technically cap the costs  
> associated with putting it on S3.
>
> -- 
> - Mark
>
>


