incubator-cassandra-user mailing list archives

From Jake Luciani <jak...@gmail.com>
Subject Re: Thrift Perl API Timeout Issues
Date Thu, 15 Oct 2009 16:19:14 GMT
What happens if you set it to 100000?
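
A minimal sketch of that change against the connection code quoted below, assuming "it" means the receive timeout and that the value is in milliseconds, as with the other timeouts in this thread. The `use` lines are assumptions about how the Thrift Perl library and the generated Cassandra bindings are installed; the send timeout and buffer sizes are copied unchanged from the quoted script.

```perl
use strict;
use warnings;

use Thrift;
use Thrift::Socket;
use Thrift::BufferedTransport;
use Thrift::BinaryProtocol;
use Cassandra::Cassandra;    # assumption: bindings generated by `thrift --gen perl`

# Connect to the database
my $socket = Thrift::Socket->new('localhost', 9160);
$socket->setSendTimeout(2500);       # ms, unchanged from the quoted script
$socket->setRecvTimeout(100_000);    # ms, the value asked about above

my $transport = Thrift::BufferedTransport->new($socket, 2048, 2048);
my $protocol  = Thrift::BinaryProtocol->new($transport);
my $client    = Cassandra::CassandraClient->new($protocol);

$transport->open();
# ... run the get_slice loop here ...
$transport->close();
```

If the reads still time out with a 100-second limit, the cause is probably not a receive timeout that is merely too short.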



On Oct 15, 2009, at 11:48 AM, Eric Lubow <eric.lubow@gmail.com> wrote:

> My connection section of the script is here:
>  # Connect to the database
>  my $socket = new Thrift::Socket('localhost',9160);
>     $socket->setSendTimeout(2500);
>     $socket->setRecvTimeout(7500);
>  my $transport = new Thrift::BufferedTransport($socket,2048,2048);
>  my $protocol = new Thrift::BinaryProtocol($transport);
>  my $client = Cassandra::CassandraClient->new($protocol);
>
> I even tried it with combinations of 1024 as the size and 1000 as  
> the SendTimeout and 5000 as the RecvTimeout.
>
> -e
>
> On Thu, Oct 15, 2009 at 11:42 AM, Jake Luciani <jakers@gmail.com> wrote:
> I think it's 100ms. I need to increase it to match python I guess.
>
> Sent from my iPhone
>
>
> On Oct 15, 2009, at 11:40 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
>
> What is the default?
>
> On Thu, Oct 15, 2009 at 10:37 AM, Jake Luciani <jakers@gmail.com> wrote:
> You need to call
> $socket->setRecvTimeout()
> With a higher number in ms.
>
>
> On Oct 15, 2009, at 11:26 AM, Eric Lubow <eric.lubow@gmail.com> wrote:
>
> Using the Thrift Perl API into Cassandra, I am running into what is
> endearingly referred to as the 4 bytes of doom:
>  TSocket: timed out reading 4 bytes from localhost:9160
> The script I am using is fairly simple.  I have a text file that has about
> 3.6 million lines that are formatted like:  foo@bar.com  1234
> The Cassandra dataset is a single column family called Users in the Mailings
> keyspace with a data layout of:
> Users = {
>    'foo@example.com': {
>        email: 'foo@example.com',
>        person_id: '123456',
>        send_dates_2009-09-30: '2245',
>        send_dates_2009-10-01: '2247',
>    },
> }
> There are about 3.5 million rows in the Users column family and each row has
> no more than 4 columns (listed above).  Some only have 3 (one of the
> send_dates_YYYY-MM-DD isn't there).
> The script parses it and then connects to Cassandra and does a get_slice and
> counts the return values adding that to a hash:
>     my ($value) = $client->get_slice(
>         'Mailings',
>         $email,
>         Cassandra::ColumnParent->new({
>                 column_family => 'Users',
>             }),
>         Cassandra::SlicePredicate->new({
>                 slice_range => Cassandra::SliceRange->new({
>                         start => 'send_dates_2009-09-29',
>                         finish => 'send_dates_2009-10-30',
>                     }),
>             }),
>         Cassandra::ConsistencyLevel::ONE
>     );
>     $counter{($#{$value} + 1)}++;
> For the most part, this script times out after 1 minute or so.  Replacing the
> get_slice with a get_count, I can get it to about 2 million queries before I
> get the timeout.  Replacing the get_slice with a get, I make it to about 2.5
> million before I get the timeout.  The only way I could get it to run all
> the way through was to add a 1/100 of a second sleep during every iteration.
> I was able to get the script to complete when I shut down everything else
> on the machine (and it took 177m to complete).  But since this is a
> semi-production machine, I had to turn everything back on afterwards.
> So for poops and laughs (at the recommendation of jbellis), I rewrote the
> script in Python and it has since run (using get_slice) 3 times fully
> without timing out (approximately 130m in Python) with everything else
> running on the machine.
> My question is, having seen this same thing in the PHP API and it is my
> understanding that the Perl API was based on the PHP API,
> could http://issues.apache.org/jira/browse/THRIFT-347 apply to Perl here
> too?  Is anyone else seeing this issue?  If so, have you gotten around it?
> Thanks.
> -e
>
