cassandra-user mailing list archives

From Ali Akhtar <ali.rac...@gmail.com>
Subject Re: Timeout error in fetching million rows as results using clustering keys
Date Wed, 18 Mar 2015 08:33:29 GMT
Sorry, meant to say "that way when you have to render, you can just display
the latest cache."
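
A rough sketch of that idea, for illustration only (not code from this thread): a background task pulls the rows in small pages and swaps the finished list into a cache, so rendering only ever reads the cache. `PageSource`, `CachedResults`, and the offset-based `fetchPage` are hypothetical names; how a page is actually fetched from Cassandra (for example through the driver's fetch size) is deliberately left abstract here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical paged source: returns up to pageSize rows starting at offset. */
interface PageSource<T> {
    List<T> fetchPage(int offset, int pageSize);
}

/** Keeps the latest fully loaded result list available for rendering. */
class CachedResults<T> {
    private final PageSource<T> source;
    private final int pageSize;
    private final AtomicReference<List<T>> latest =
            new AtomicReference<>(Collections.<T>emptyList());

    CachedResults(PageSource<T> source, int pageSize) {
        this.source = source;
        this.pageSize = pageSize;
    }

    /** Loads all rows page by page, then replaces the cache in one step. */
    void refresh() {
        List<T> all = new ArrayList<>();
        int offset = 0;
        while (true) {
            List<T> page = source.fetchPage(offset, pageSize);
            all.addAll(page);
            if (page.size() < pageSize) {
                break; // a short (or empty) page means there is nothing left
            }
            offset += page.size();
        }
        latest.set(Collections.unmodifiableList(all));
    }

    /** Called by the rendering code; returns the last loaded list without touching the source. */
    List<T> get() {
        return latest.get();
    }

    /** Runs refresh() on a single background thread, waiting delaySeconds between runs. */
    ScheduledExecutorService startBackgroundRefresh(long delaySeconds) {
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
        exec.scheduleWithFixedDelay(() -> {
            try {
                refresh();
            } catch (RuntimeException e) {
                // Keep serving the previous cache if one refresh fails.
            }
        }, 0, delaySeconds, TimeUnit.SECONDS);
        return exec;
    }
}
```

With a page size of 2000, a refresh of a million rows would issue about 500 page requests, and the UI would keep showing the previous list until the new one is complete.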

On Wed, Mar 18, 2015 at 1:30 PM, Ali Akhtar <ali.rac200@gmail.com> wrote:

> I would probably do this in a background thread and cache the results,
> that way when you have to render, you can just cache the latest results.
>
> I don't know why Cassandra seems unable to fetch large batch
> sizes. I've also run into these timeouts, but reducing the batch size to 2k
> seemed to work for me.
>
> On Wed, Mar 18, 2015 at 1:24 PM, Mehak Mehta <memehta@cs.stonybrook.edu>
> wrote:
>
>> We have a UI that needs this data for rendering,
>> so the efficiency of pulling this data matters a lot. It should be fetched
>> within a minute.
>> Is there a way to achieve such efficiency?
>>
>>
>> On Wed, Mar 18, 2015 at 4:06 AM, Ali Akhtar <ali.rac200@gmail.com> wrote:
>>
>>> Perhaps just fetch them in batches of 1000 or 2000? For 1m rows, it
>>> seems like the difference would only be a few minutes. Do you have to do
>>> this all the time, or only once in a while?
>>>
>>> On Wed, Mar 18, 2015 at 12:34 PM, Mehak Mehta <memehta@cs.stonybrook.edu
>>> > wrote:
>>>
>>>> Yes, it works for 1000 but not for more than that.
>>>> How can I fetch all the rows efficiently this way?
>>>>
>>>> On Wed, Mar 18, 2015 at 3:29 AM, Ali Akhtar <ali.rac200@gmail.com>
>>>> wrote:
>>>>
>>>>> Have you tried a smaller fetch size, such as 2k to 5k?
>>>>>
>>>>> On Wed, Mar 18, 2015 at 12:22 PM, Mehak Mehta <
>>>>> memehta@cs.stonybrook.edu> wrote:
>>>>>
>>>>>> Hi Jens,
>>>>>>
>>>>>> I have tried a fetch size of 10000, but it still isn't returning any
>>>>>> results.
>>>>>> My expectation was that Cassandra can handle a million rows easily.
>>>>>>
>>>>>> Is there any mistake in the way I am defining the keys or querying
>>>>>> them?
>>>>>>
>>>>>> Thanks
>>>>>> Mehak
>>>>>>
>>>>>> On Wed, Mar 18, 2015 at 3:02 AM, Jens Rantil <jens.rantil@tink.se>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Try setting fetchsize before querying. Assuming you don't set it too
>>>>>>> high, and you don't have too many tombstones, that should do it.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Jens
>>>>>>>
>>>>>>> –
>>>>>>> Sent from Mailbox <https://www.dropbox.com/mailbox>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 18, 2015 at 2:58 AM, Mehak Mehta <
>>>>>>> memehta@cs.stonybrook.edu> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have a requirement to fetch a million rows as the result of my query,
>>>>>>>> which is giving timeout errors.
>>>>>>>> I am fetching results by selecting on clustering columns, so why are
>>>>>>>> the queries taking so long? I can change the timeout settings, but I need
>>>>>>>> the data to be fetched faster as per my requirement.
>>>>>>>>
>>>>>>>> My table definition is:
>>>>>>>> CREATE TABLE images.results (
>>>>>>>>   uuid uuid, analysis_execution_id varchar, analysis_execution_uuid uuid,
>>>>>>>>   x double, y double, loc varchar, w double, h double,
>>>>>>>>   normalized varchar, type varchar, filehost varchar, filename varchar,
>>>>>>>>   image_uuid uuid, image_uri varchar, image_caseid varchar,
>>>>>>>>   image_mpp_x double, image_mpp_y double, image_width double,
>>>>>>>>   image_height double, objective double, cancer_type varchar,
>>>>>>>>   Area float, submit_date timestamp, points list<double>,
>>>>>>>>   PRIMARY KEY ((image_caseid), Area, uuid));
>>>>>>>>
>>>>>>>> Here each row is uniquely identified by its uuid.
>>>>>>>> But since my data is generally queried by image_caseid, I
>>>>>>>> have made it the partition key.
>>>>>>>> I am currently using the DataStax Java API to fetch the results, but
>>>>>>>> the query is taking a lot of time, resulting in timeout errors:
>>>>>>>>
>>>>>>>> Exception in thread "main" com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1:9042 (com.datastax.driver.core.exceptions.DriverException: Timed out waiting for server response))
>>>>>>>>     at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
>>>>>>>>     at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:289)
>>>>>>>>     at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:205)
>>>>>>>>     at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
>>>>>>>>     at QueryDB.queryArea(TestQuery.java:59)
>>>>>>>>     at TestQuery.main(TestQuery.java:35)
>>>>>>>> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1:9042 (com.datastax.driver.core.exceptions.DriverException: Timed out waiting for server response))
>>>>>>>>     at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:108)
>>>>>>>>     at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:179)
>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>>     at java.lang.Thread.run(Thread.java:744)
>>>>>>>>
>>>>>>>> I also get an error when I try the same query on the console, even
>>>>>>>> with a limit of 2000 rows:
>>>>>>>>
>>>>>>>> cqlsh:images> select count(*) from results where image_caseid='TCGA-HN-A2NL-01Z-00-DX1' and Area<100 and Area>20 limit 2000;
>>>>>>>> errors={}, last_host=127.0.0.1
>>>>>>>>
>>>>>>>> Thanks and Regards,
>>>>>>>> Mehak
>>>>>>>>
