incubator-gora-dev mailing list archives

From "Henry Saputra (JIRA)" <j...@apache.org>
Subject [jira] Commented: (GORA-23) Limit result set in store reads
Date Wed, 26 Jan 2011 07:24:43 GMT

    [ https://issues.apache.org/jira/browse/GORA-23?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986872#action_12986872 ]

Henry Saputra commented on GORA-23:
-----------------------------------

Instead of throwing an exception, could we just enforce a minimum of 2 for recordsMax in
GoraRecordReader?
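
To make the suggestion concrete, here is a minimal sketch of the clamping idea. It is not the actual GoraRecordReader code: the class name RecordsMaxClamp, the constant MIN_RECORDS_MAX, and the method clampRecordsMax are hypothetical names used only for illustration.

```java
public class RecordsMaxClamp {
    // Assumed lower bound taken from the comment above; the name is hypothetical.
    static final int MIN_RECORDS_MAX = 2;

    // Returns the requested limit, raised to the minimum when it is too small,
    // instead of rejecting the value with an exception.
    static int clampRecordsMax(int requested) {
        return Math.max(MIN_RECORDS_MAX, requested);
    }

    public static void main(String[] args) {
        System.out.println(clampRecordsMax(0));
        System.out.println(clampRecordsMax(1000));
    }
}
```

With this approach a misconfigured value degrades to the smallest usable limit rather than failing the job, at the cost of silently ignoring what the caller asked for.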

> Limit result set in store reads
> -------------------------------
>
>                 Key: GORA-23
>                 URL: https://issues.apache.org/jira/browse/GORA-23
>             Project: Gora
>          Issue Type: Bug
>          Components: storage
>         Environment: MySQL
>            Reporter: Alexis
>         Attachments: gora.patch
>
>
> Once again, whatever the capacity of our system, we have a limited amount of RAM. Sooner
> or later, we will run out of memory.
> Please refer to http://techvineyard.blogspot.com/2010/12/build-nutch-20.html#Gora for
> the description of the issue:
> When using MySQL as the Gora backend, the parse command hangs and then crashes
> because it runs out of memory on this query:
> SELECT id,content,status,outlinks,baseUrl,typ,parseStatus,metadata,signature,markers
> FROM webpage;
> We are running into exactly the same issue as GORA-20, except that we are reading from
> the store rather than writing to it. Currently the code loads the entire webpage table into memory.
> We want to set a limit on the call that pulls data from the database.
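
The idea of limiting each read can be sketched as paging through the table a bounded number of rows at a time. This is only an illustration under stated assumptions, not the attached gora.patch and not Gora's API: PagedReadSketch, readInPages, and fakeWebpageTable are hypothetical names, and the in-memory list stands in for a query of the form "SELECT ... FROM webpage LIMIT ? OFFSET ?".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

public class PagedReadSketch {
    // Reads rows page by page so that at most pageSize rows are held at once.
    // fetchPage takes (offset, limit) and returns up to limit rows.
    // Returns the total number of rows handed to the handler.
    static <T> int readInPages(BiFunction<Integer, Integer, List<T>> fetchPage,
                               int pageSize, Consumer<T> handler) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be at least 1");
        }
        int offset = 0;
        while (true) {
            List<T> page = fetchPage.apply(offset, pageSize);
            for (T row : page) {
                handler.accept(row);
            }
            offset += page.size();
            // A short (or empty) page means the table is exhausted.
            if (page.size() < pageSize) {
                break;
            }
        }
        return offset;
    }

    // Hypothetical stand-in for a LIMIT/OFFSET query against the webpage table.
    static List<String> fakeWebpageTable(List<String> table, int offset, int limit) {
        int from = Math.min(offset, table.size());
        int to = Math.min(offset + limit, table.size());
        return new ArrayList<>(table.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> table = java.util.Arrays.asList("a", "b", "c", "d", "e");
        int total = readInPages((off, lim) -> fakeWebpageTable(table, off, lim),
                                2, row -> System.out.println(row));
        System.out.println("rows read: " + total);
    }
}
```

Memory use is then bounded by the page size rather than by the size of the table. Against a real database, OFFSET-based paging rescans skipped rows, so a key-range or streaming cursor may be a better fit for large tables.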

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

