manifoldcf-user mailing list archives

From Shigeki Kobayashi <shigeki.kobayas...@g.softbank.co.jp>
Subject Re: Google native documents are not crawled
Date Mon, 11 Aug 2014 10:06:42 GMT
Hi Karl,


The documents are saved as Google Spreadsheets in Google Docs, which are
also managed in Google Drive.

The MCF documentation says "native Google documents such as spreadsheets and
word documents are exported to PDF and then ingested", so those Google
Spreadsheets should be crawled and indexed.
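
A minimal sketch of the fallback this implies, not the connector's actual
code: when the file size is null (as it is for native Google documents), look
for a PDF export link instead of treating the file as unreadable. The class
and method names are hypothetical; the three parameters are assumed to hold
the values returned by getFileSize(), getDownloadUrl() and getExportLinks()
on the Drive v2 File object.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not part of ManifoldCF.
final class ExportFallbackSketch {
    static final String PDF_MIME = "application/pdf";

    /**
     * Picks the URL to fetch content from.
     * Returns null when neither a direct download nor a PDF export exists.
     */
    static String chooseContentUrl(Long fileSize,
                                   String downloadUrl,
                                   Map<String, String> exportLinks) {
        // Files with content stored in Drive report a size and a download URL.
        if (fileSize != null && downloadUrl != null) {
            return downloadUrl;
        }
        // Native Google documents report no size; fall back to a PDF export.
        if (exportLinks != null) {
            return exportLinks.get(PDF_MIME);
        }
        return null;
    }
}
```

With this shape, a spreadsheet whose size is null but which has an
"application/pdf" export link would still be fetched rather than skipped.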


Shigeki

2014-08-07 21:05 GMT+09:00 Karl Wright <daddywri@gmail.com>:

> Hi Shigeki,
>
> The javadoc says the following about this method:
>
> "The size of the file in bytes. This is only populated for files with
> content stored in Drive."
>
> Are these documents stored in Drive, or somewhere else?
>
> Karl
>
>
> On Thu, Aug 7, 2014 at 8:02 AM, Karl Wright <daddywri@gmail.com> wrote:
>
>> Hi Shigeki,
>>
>> The connector tries to get the length of the file, using the googledocs
>> API:
>>
>>             // Get the file length
>>             Long fileLength = googleFile.getFileSize();
>>             if (fileLength != null) {
>>
>> ... where googleFile is a com.google.api.services.drive.model.File object.
>>
>> But, the file length is coming back as null, which the connector assumes
>> means that the file is unreadable somehow.
>>
>> Can you open a ticket, so that we can look into this in more detail?
>>
>> Karl
>>
>
>
