hadoop-common-user mailing list archives

From Anna Guan <aguan0...@gmail.com>
Subject How to increase buffer size so that hdfsRead() reads entire tif file into a buffer
Date Tue, 15 Mar 2016 17:34:25 GMT
I am new to Hadoop and am trying to run a legacy C++ program that processes GeoTIFF
files using Hadoop Streaming. The Hadoop version is 2.6.2.
I'd like to open an image in the HDFS file system, use hdfsRead() to read the
file into a memory buffer, and then use the GDAL library to create a virtual
memory file so that I can create a GDALDataset from it. I'd like to read the
whole file into the buffer, but hdfsRead() only reads 65536 bytes each time.
Is there any way to read the entire file into the buffer? I also set
dfs.image.transfer.chunksize in the config file, but it did not help. When I
run it I get ERROR 4: `/vsimem/l1' not recognised as a supported file format.
I think this is because I did not set up the buffer properly. Can anyone
kindly tell me whether this is possible?
Many thanks!
 Anna Guan

    // Open the HDFS file with a 128 MB (134217728-byte) I/O buffer.
    hdfsFile lfs = hdfsOpenFile(fs, "/input/L1.tif", O_RDONLY, 134217728, 0, 0);
    int size = hdfsAvailable(fs, lfs);
    char* data_buffer = (char*)CPLMalloc(size);
    tOffset offset = hdfsTell(fs, lfs);

    // This single call is where only 65536 bytes come back.
    int hasdata = hdfsRead(fs, lfs, data_buffer, size);
    hdfsSeek(fs, lfs, offset);

    // Wrap the buffer as an in-memory file and open it with GDAL.
    VSIFCloseL(VSIFileFromMemBuffer("/vsimem/l1", (GByte*)data_buffer, size, FALSE));
    GDALDataset* readDS = (GDALDataset*)GDALOpen("/vsimem/l1", GA_ReadOnly);
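
For reference, hdfsRead() seems to behave like POSIX read() and can return
fewer bytes than it was asked for, so I suspect the whole file can only be
gathered with a loop. Below is a rough, untested sketch of the loop I have in
mind (the helper name readWholeFile and the 1 MB per-call cap are my own
invention); it uses hdfsGetPathInfo() to get the file length instead of
hdfsAvailable():

    #include <fcntl.h>      // O_RDONLY
    #include <algorithm>    // std::min
    #include <cstdlib>      // malloc/free
    #include "hdfs.h"

    // Sketch: read an entire HDFS file into a single malloc'd buffer.
    // Returns NULL on failure; on success *out_size is the byte count.
    static char* readWholeFile(hdfsFS fs, const char* path, tOffset* out_size) {
        hdfsFileInfo* info = hdfsGetPathInfo(fs, path);   // file length up front
        if (info == NULL) return NULL;
        tOffset size = info->mSize;
        hdfsFreeFileInfo(info, 1);

        hdfsFile f = hdfsOpenFile(fs, path, O_RDONLY, 0, 0, 0);
        if (f == NULL) return NULL;

        char* buf = (char*)malloc(size);
        tOffset total = 0;
        while (total < size) {
            // hdfsRead() may return less than requested, so keep asking
            // for the remainder (capped at 1 MB per call).
            tSize want = (tSize)std::min<tOffset>(size - total, 1 << 20);
            tSize n = hdfsRead(fs, f, buf + total, want);
            if (n <= 0) { free(buf); hdfsCloseFile(fs, f); return NULL; }
            total += n;
        }
        hdfsCloseFile(fs, f);
        *out_size = total;
        return buf;
    }

The returned buffer could then be handed to VSIFileFromMemBuffer() exactly as
above (still with FALSE for bTakeOwnership, since it was not allocated with
CPLMalloc()).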
