hadoop-common-dev mailing list archives

From "Devaraj Das" <d...@yahoo-inc.com>
Subject RE: C API for Hadoop DFS
Date Wed, 03 May 2006 06:19:17 GMT
In our case, the components involved are the C API library, the JNI layer, and
the Java APIs. In all of these, we have control over errno. For example, if a
particular C API uses a third-party library function that might return an error
and hence set errno, we already know about it. Depending on the error, we
decide whether to proceed further in the API implementation code or to return
an error to the client invoking the API. This includes the functions in the
JNI library which the API implementation calls. In the Java world, we deal
with exceptions and don't bother about errno. So, for example, if a Java
method invoked through JNI from a C API throws an exception, the C API
implementation will get the exception object and, depending on it, set a
meaningful errno and return -1 or NULL to signify that an error occurred. As I
said earlier, this includes the case where the JNI function itself fails (for
some reason such as running out of memory).
As an aside, the JNI layer itself doesn't generate errnos.
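
A minimal sketch of the exception-to-errno mapping described above; the helper
name dfsCallExists and the particular class-to-errno mapping are illustrative
assumptions, not the actual implementation:

#include <jni.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: invoke a Java method through JNI and, if it threw,
 * translate the pending exception into a meaningful errno value plus a
 * C-style failure return, as described above. */
static int dfsCallExists(JNIEnv *env, jobject fs, jobject path,
                         jmethodID existsMethod)
{
    jboolean result = (*env)->CallBooleanMethod(env, fs, existsMethod, path);

    if ((*env)->ExceptionCheck(env)) {
        jthrowable exc = (*env)->ExceptionOccurred(env);
        (*env)->ExceptionClear(env);      /* don't leave it pending */

        /* Inspect the exception class and pick an errno accordingly
         * (this mapping is an assumption for illustration). */
        jclass fnf = (*env)->FindClass(env, "java/io/FileNotFoundException");
        if (fnf != NULL && (*env)->IsInstanceOf(env, exc, fnf))
            errno = ENOENT;
        else
            errno = EIO;                  /* generic fallback */

        (*env)->DeleteLocalRef(env, exc);
        return -1;                        /* signal failure to the caller */
    }
    return result == JNI_TRUE ? 1 : 0;
}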

-----Original Message-----
From: Konstantin Shvachko [mailto:shv@yahoo-inc.com] 
Sent: Wednesday, May 03, 2006 2:40 AM
To: hadoop-dev@lucene.apache.org
Subject: Re: C API for Hadoop DFS

I don't think errno is a particularly good idea, for several reasons.
It is not common for libraries to set errno codes.
If a system library function uses errno and we overwrite its value to return
something DFS-related, the library function's behavior becomes unpredictable.
This could be hard to debug.
We also have a JNI layer between our C library and Java, which might itself
generate errno values, overwriting the ones we were trying to bring back
from Java.

--Konstantin
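
A small sketch of the clobbering concern, assuming a hypothetical
dfsOpenFile() that reports failure through errno:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Pretend a hypothetical dfsOpenFile() just failed and set errno. */
    errno = ENOENT;

    /* Any intervening library call may overwrite errno ... */
    char *buf = malloc((size_t)-1);   /* almost certainly fails; on POSIX
                                         systems this sets errno = ENOMEM */

    /* ... so by the time the caller inspects errno, the DFS error is gone. */
    if (buf == NULL)
        perror("dfsOpenFile");        /* likely reports the malloc error */

    free(buf);                        /* free(NULL) is a no-op */
    return 0;
}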

Doug Cutting wrote:

> The spec says:
>
> /** All APIs set errno to meaningful values */
>
> So callers should always check errno after each call.  Whether this is 
> the best way to handle errors in C can be debated, but an error 
> mechanism was in fact specified.
>
> Doug
>
> Konstantin Shvachko wrote:
>
>> I think this is a very important issue raised by David.
>>
>> IMO __ALL__ functions should return an integer value indicating
>> success (=0) or failure (<0). Unless we want to use C-style
>> exceptions, we won't otherwise be able to identify what went wrong,
>> if anything.
>> NULL or bool is not enough in most cases, since we need to
>> distinguish e.g. between timeout (when we retry) and "file not
>> found" cases.
>> The actual return objects should be passed as output parameters.
>> E.g.
>> dfsFS dfsConnect(char *host, tPort port);
>> will become
>> tCompletionCode dfsConnect(char *host, tPort port, dfsFS *fileSystem);
>> where tCompletionCode could be an integer for now. Or we can define a
>> structure
>> { int errCode; char *errDescription; }
>> to return the actual error description along with the error code.
>>
>> --Konstantin
>>
>> Devaraj Das wrote:
>>
>>>> Do dfsConnect and dfsOpenFile return NULL on failure?
>>>>   
>>>
>>>
>>> Yes.
>>>
>>>  
>>>
>>>> Shouldn't dfsSeek, dfsRename, dfsCreateDirectory and 
>>>> dfsSetWorkingDirectory each have a return value to indicate success 
>>>> or failure?  Or are they assumed to never fail?
>>>>   
>>>
>>>
>>> Yes, these functions should have return values. I will update the
>>> API spec. Thanks for pointing this out.
>>>
>>> -----Original Message-----
>>> From: David Bowen [mailto:dbowen@yahoo-inc.com]
>>> Sent: Monday, May 01, 2006 8:13 AM
>>> To: hadoop-dev@lucene.apache.org
>>> Subject: Re: C API for Hadoop DFS
>>>
>>>
>>> I'm curious about error handling.
>>> Do dfsConnect and dfsOpenFile return NULL on failure?
>>>
>>> Shouldn't dfsSeek, dfsRename, dfsCreateDirectory and 
>>> dfsSetWorkingDirectory each have a return value to indicate success 
>>> or failure?  Or are they assumed to never fail?
>>>
>>> - David
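
For concreteness, a minimal sketch of the completion-code style Konstantin
proposes in the quoted thread above; the shapes of tCompletionCode and
dfsConnect follow his example, while the type definitions and the stub body
are assumptions for illustration:

#include <stdio.h>
#include <stddef.h>

/* Types following the quoted proposal; the definitions are assumed. */
typedef unsigned short tPort;
typedef struct dfsFS_internal *dfsFS;     /* opaque filesystem handle */

typedef struct {
    int   errCode;          /* 0 on success, <0 on failure */
    char *errDescription;   /* human-readable error text */
} tCompletionCode;

/* Proposed style: the handle comes back through an output parameter,
 * and the return value carries the error code and description. */
tCompletionCode dfsConnect(char *host, tPort port, dfsFS *fileSystem)
{
    tCompletionCode cc = { 0, "OK" };
    if (host == NULL || fileSystem == NULL) {
        cc.errCode = -1;
        cc.errDescription = "bad argument";
        return cc;
    }
    /* ... real connection logic would go here ... */
    *fileSystem = NULL;     /* stub: no real connection in this sketch */
    return cc;
}

int main(void)
{
    dfsFS fs;
    tCompletionCode cc = dfsConnect("localhost", 9000, &fs);
    if (cc.errCode < 0)
        fprintf(stderr, "dfsConnect failed: %s\n", cc.errDescription);
    return 0;
}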


