harmony-dev mailing list archives

From Oliver Deakin <oliver.dea...@googlemail.com>
Subject Re: Platform dependent code placement (was: Re: repo layout again)
Date Fri, 17 Feb 2006 14:55:00 GMT
Tim Ellison wrote:
> Andrey Chernyshev wrote:
>   
>   
<snip>
>> On the other hand, having separate source trees like linux32.sparc,
>> solaris64.sparc, win.IA32 for each specific platform combination may
>> lead to huge code duplication. We may need to be able to share code
>> across certain, but not all, platform combinations.
>>     
>
> Agreed.  The existing code layout for the classlib natives is certainly
> not a viable way to scale across multiple platforms.
>
> (The 'in-house' mechanism for managing multi-platform code is particular
> to IBM so not of great interest here, suffice to say that the win.IA32
> and linux.IA32 trees in classlib/trunk/native-code are the product of
> that mechanism with some manual tidy-up).
>
>   
Also agree. The current layout will not scale well when we move to a 
broader range of platforms.

>> To address that issue, I can suggest a pretty straightforward scheme
>> for platform-dependent code placement which looks as follows:
>>
>> 1. There is a fixed set of attributes which denotes a specific target
>> configuration. As a starter set, we may have OS (for operating system)
>> and, say, ARCH (for architecture) attributes. This set can be extended
>> later, but, as was suggested, let's not cross that bridge until we
>> come to it.
>>     
>
> Yes, the principal distinction is probably on OS & ARCH.
>
>   
>> 2. Files in the source tree are selected for compilation based on the
>> OS or ARCH attribute values which may (or may not) appear in a file or
>> directory name.
>> Some examples are:
>>
>> src\main\native\solaris\foo.cpp
>>     - means the file is applicable to any system running Solaris;
>>     
>
> yep (that was foo.c, right ;-) -- only teasing)
>
>   
>> src\main\native\win\foo_ia32.cpp
>>     - the file is applicable only to Windows / IA32;
>>     
>
> why has the ARCH flipped onto the file name?  why not win_ia32 ?
>
>   
>> src\main\native\foo_ia32_em64t.cpp
>>     - the file can be compiled for any OS on either the IA32 or EM64T
>> architecture, but nothing else.
>>     
>
> I agree with the approach, but left wondering why it is not something like:
>    src\main\native\
>                    common\
>                    unix\
>                    windows\
>                    zos\
>                    solaris\
>                    solaris_x86\
>                    solaris_sparc\
>                    windows_ifp\
>
> i.e. a taxonomy covering families of code (common, unix-like,
> windows-like) and increasingly specific discriminators.
>   
The idea is good; however, I think including both the OS and the arch in the
directory name is preferable. It is just as simple a convention, it gives the
coder an at-a-glance view of which OS/arch combinations have platform-specific
code associated with them, and it keeps the actual source filenames consistent
across platforms.

Was there a particular reason for attaching the architecture to the
filename and not the directory, Andrey?
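
Just to sketch what I have in mind (these directory names are only
illustrative, not a proposal for the actual set we would use):

    src\main\native\
                    common\
                    win\
                    win_ia32\
                    linux\
                    linux_ia32\
                    linux_em64t\

so foo.c keeps the same name everywhere, and the path alone tells you which
platform a given copy belongs to.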
>   
>> The formal file selection rule may look like:
>>
>> (1) A file is applicable for the given OS value if its pathname contains a
>> match of the regexp [\W_]${OS}[\W_], or its pathname doesn't contain any OS
>> value;
>>
>> (2) A file is applicable for the given ARCH value if its pathname contains a
>> match of the regexp [\W_]${ARCH}[\W_], or its pathname doesn't contain any
>> ARCH value;
>>
>> (3) A file is selected for compilation if it satisfies both criteria (1)
>> and (2).
>>     
>
> If we restrict the OS and ARCH identifiers to directories then it will
> allow us to use the gmake VPATH functionality to select the right file,
> so compiling on solaris x86 will have a
> VPATH='solaris_x86:solaris:unix:common' and so on.
>   
I agree that is a perfect scenario for VPATH. I think this would probably be
a simpler solution than using Ant (as suggested later), and it also would not
require you to have a JVM available to build the native code.
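
As a very rough sketch of what the gmake fragment might look like (the
variable names, directory names and library name below are just illustrative,
not taken from any existing build):

    # hypothetical fragment for a Solaris/x86 build of one native library
    HY_OS   := solaris
    HY_ARCH := x86

    # most specific directory first, so it wins the source search
    VPATH := $(HY_OS)_$(HY_ARCH):$(HY_OS):unix:common

    OBJS := foo.o bar.o

    libhyexample.so: $(OBJS)
    	$(CC) -shared -o $@ $(OBJS)

    %.o: %.c
    	$(CC) $(CFLAGS) -fPIC -c -o $@ $<

gmake then picks foo.c up from solaris_x86 if a copy exists there, otherwise
it falls back through solaris, unix and common, in that order.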

>   
>> One can see that this naming convention gives developers enough
>> freedom to lay out their code in the most convenient way (actually,
>> experience shows that the meaning of "convenient" may differ
>> significantly depending on the component type :). On the other hand, it
>> gives a well-defined (and hopefully intuitive enough) rule showing
>> whether a particular file is picked up by the compiler or not,
>> depending on the configuration.
>>     
>
> I like the idea -- if we agree to use gmake throughout then I think we
> get this functionality 'for free'.
>
>   
>> In addition to the above, code could also be selected for
>> compilation by means of #define directives in C/C++ files (this is
>> convenient when most of a file is platform-independent, with the
>> exception of just a few lines of code). The build system could set
>> up the OS and ARCH attributes as appropriate defines for the C/C++
>> code.
>> For example, for Windows/IA32 config, the following defines could be set:
>>
>>      #define OS WIN
>>      #define WIN
>>      #define ARCH IA32
>>      #define IA32
>>
>> Then the platform-dependent code sections may look like:
>>
>> #ifdef WIN
>> ….
>> #endif
>>
>> which is essentially the same as:
>>
>> #if OS == WIN
>> ….
>> #endif
>>
>> It is important that the OS/ARCH (or whatever additional) attribute
>> names and values are used consistently in file names and #define
>> directives.
>>     
>
> Using the names consistently will definitely help, but choosing whether
> to create a separate copy of the file in a platform-specific
> sub-directory, or to use #define's within a file in a shared-family
> sub-directory will likely come down to a case by case decision.  For
> example, 32-bit vs. 64-bit code may be conveniently #ifdef'ed in some .c
> files, but a .h file that defines pointer types etc. may need different
> versions of the entire file to keep things readable.
>   
This is a tricky one. I think in most cases the differences between 32-bit
and 64-bit code should be minor and mostly confined to header defines, as Tim
suggests. For those, #ifdefs will be sufficient. I would simply suggest that
we adopt a policy of always marking #else and #endif lines clearly to
indicate which condition they relate to.
However, there may be instances where using #ifdefs obfuscates the code. I
think most of the time this will be a judgement call on the part of the
coder - if you look at a piece of code and cannot tell what the preprocessor
is going to give you on a particular platform, you are probably looking at a
candidate for code separation.
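
To illustrate the kind of marking I mean (the macro and typedef names here
are only placeholders, not a proposal for actual names):

     #if defined(HY_WORD64)
     typedef signed long long hy_word_t;   /* 64-bit build */
     #else  /* !defined(HY_WORD64) */
     typedef signed int hy_word_t;         /* 32-bit build */
     #endif /* defined(HY_WORD64) */

so that even a long way down a header it is obvious which condition a given
#else or #endif belongs to.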
>   
>> Finally, I'd suggest that the platform dependent code can be organized
>> in 3 different ways:
>>
>> (1) Explicitly, via defining the appropriate file list. For example,
>> an Ant XML file may choose one fileset or another, depending on
>> the current OS and ARCH property values. This approach is most
>> convenient, for example, whenever third-party code is compiled or
>> the file names cannot be changed for some reason.
>>     
>
> Ant ?!  ;-)  or platform-specific makefile #includes?
>
>   
>> (2) Via the file path naming convention. This is the preferred
>> approach and works well whenever distinctive files for different
>> platforms can be identified.
>>     
>
> yep (modulo discussion of filenames vs. dir names to enable vpath)
>
>   
>> (3) By means of preprocessor directives. This could be convenient
>> if only a few lines of code need to vary across platforms. However,
>> preprocessor directives make the code less readable, hence they
>> should be used with care.
>>
>> In terms of the build process, it means that the code has to pass all 3
>> stages of filtering before it is selected for compilation.
>>     
>
> I like it.  Let's just discuss what tools do the selection -- but I
> agree with the approach.
>
>   
>> The point is that components at Harmony could be very different,
>> especially if we take into account that they may belong to both the
>> Class Libraries and the VM world.
>>     
>
> There will be files that it makes sense to share for sure (like vmi.h
> and jni.h etc.) but they should be stable-API types that can be
> refreshed across the boundary as necessary.
>
>   
>> Hence, the most efficient (in terms of code
>> sharing and readability) code placement requires maximum
>> flexibility, while preserving some well-defined rules. The scheme
>> based on directory/file name matching seems flexible enough.
>>
>> How does the above proposal sound?
>>     
>
>   
Sounds good :) It makes a lot of sense to organise the code in a way 
that promotes reuse across platforms.
+1 from me


-- 
Oliver Deakin
IBM United Kingdom Limited



> Cool, perhaps we can discuss if it should be gmake + vpath or ant.
>
> Thanks for resurrecting this thread.
>
> Regards,
> Tim
>
>
>   
>>>> Maybe in some components we would want to include a window manager
>>>> family too, though let's cross that bridge...
>>>>
>>>> I had a quick hunt round for a recognized standard or convention for OS
>>>> and CPU family names, but it seems there are enough subtle differences
>>>> around that we should just define them for ourselves.
>>>>
>>>>         
>>> My VM's config script maintains CPU type, OS name, and word size as three
>>> independent values.  These are combined in various ways in the source code
>>> and support scripts depending on the particular need.  The distribution
>>> script names the 'tar' files for the binaries with all three as part of the
>>> file name, ending in "...-CPU-OS-WORD.tar".  (NB:  I am going to simplify
>>> the distribution scripts shortly into a single script that creates the
>>> various pieces, binaries, source, and documentation.  This will be out soon.)
>>>
>>> Does this help?
>>>
>>> Dan Lydick
>>>
>>>       
>>>> Regards,
>>>> Tim
>>>>
>>>>
>>>> --
>>>>
>>>> Tim Ellison (t.p.ellison@gmail.com)
>>>> IBM Java technology centre, UK.
>>>>         
>>>
>>> Dan Lydick
>>>
>>>       
>
>   

