beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-1018) getEstimatedSizeBytes fails with large MongoDB collection sizes
Date Mon, 21 Nov 2016 16:13:58 GMT

    [ https://issues.apache.org/jira/browse/BEAM-1018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15683974#comment-15683974 ]

ASF GitHub Bot commented on BEAM-1018:
--------------------------------------

GitHub user crcsmnky opened a pull request:

    https://github.com/apache/incubator-beam/pull/1394

    BEAM-1018: updated getEstimatedSizeBytes to use Number.longValue()

    Updated BoundedMongoDbSource.getEstimatedSizeBytes to read `size` as the more generic
`Number` class and then return its `longValue()`. For smaller collections, `size` is returned
as a Long, but for larger collections it can be returned in scientific notation.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/crcsmnky/incubator-beam BEAM-1018

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-beam/pull/1394.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1394
    
----
commit a9e3e2928bb05672d8af950237e9fe4d96acbbf5
Author: Sandeep Parikh <sandeep@clusterbeep.org>
Date:   2016-11-21T16:05:36Z

    BEAM-1018: updated getEstimatedSizeBytes to use Number.longValue()

----


> getEstimatedSizeBytes fails with large MongoDB collection sizes
> ---------------------------------------------------------------
>
>                 Key: BEAM-1018
>                 URL: https://issues.apache.org/jira/browse/BEAM-1018
>             Project: Beam
>          Issue Type: Bug
>    Affects Versions: 0.4.0-incubating
>            Reporter: Sandeep Parikh
>            Assignee: Jean-Baptiste Onofré
>
> When running against large collection sizes (20M+ documents), MongoDbIO fails to correctly
> parse the {{size}} element in the document returned by
> {code:javascript}
> db.runCommand({'collStats': 'collectionName'})
> {code}
> As collection sizes grow larger, the returned value is in scientific notation, which
> cannot be parsed as a Long.
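
The failure described above, and the `Number.longValue()` approach named in the pull request, can be sketched roughly as follows. This is an illustration only, not the code from the patch: the class `SizeEstimateSketch`, the method `estimatedSizeBytes`, and the use of a plain `Map` in place of the MongoDB stats document are all assumptions made so the example is self-contained. It also assumes that a large `size` arrives as a `Double`, whose string form uses scientific notation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the collStats handling; not the actual MongoDbIO code.
public class SizeEstimateSketch {

    // Reads "size" as a generic Number and converts it to long,
    // regardless of whether the value is an Integer, Long, or Double.
    static long estimatedSizeBytes(Map<String, ?> stats) {
        Object size = stats.get("size");
        if (!(size instanceof Number)) {
            throw new IllegalArgumentException("size is missing or not numeric: " + size);
        }
        return ((Number) size).longValue();
    }

    public static void main(String[] args) {
        // Assumed shape of the stats for a small and a large collection.
        Map<String, Object> small = new HashMap<>();
        small.put("size", 1024);
        Map<String, Object> large = new HashMap<>();
        large.put("size", 2.5e10);

        // Parsing the string form of the large value as a Long fails,
        // because Double.toString switches to scientific notation.
        String text = String.valueOf(large.get("size"));
        try {
            Long.parseLong(text);
        } catch (NumberFormatException e) {
            System.out.println("cannot parse as Long: " + text);
        }

        // Going through Number.longValue() handles both cases.
        System.out.println(estimatedSizeBytes(small));
        System.out.println(estimatedSizeBytes(large));
    }
}
```

Note that `longValue()` on a `Double` truncates any fractional part, which seems acceptable for a size estimate.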



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
