Hi Janardhan,

We support Kepler and higher. Our kernels are compiled for sm_30, and we depend on CUDA 8 and cuDNN 5, both of which support Kepler and higher. Please see https://en.wikipedia.org/wiki/CUDA#GPUs_supported

>> 1. we have tuned a kernel only for a specific hw configuration(eg. Maxwell) with `asm()` code, can we integrate it into our project
>> 2. And can we ignore this hw specific optimized kernel for other incompatible hardware and simply use the present kernels.

I would recommend that you first attempt to implement the kernel in .cu and move to hardware-specific optimization only if necessary. If you plan to do the latter because it is absolutely necessary, I recommend that
- you optimize for newer generations (such as Pascal) rather than older generations
- and ensure that you optimize the common cases (in typical DL networks) first.

That will make a better case for integrating hardware-specific kernels into SystemML.

For the second question, I refer you to http://www.jcuda.org/jcuda/doc/index.html.
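
To illustrate the fallback idea, here is a minimal host-side sketch in C++ of how a launcher might pick a hardware-specific kernel only on matching devices and use the generic kernel everywhere else. This is not how SystemML does it today; the kernel names and the `tunedKernelAvailable` flag are hypothetical, and the compute capability is passed in as a plain integer so that the logic can be read without a GPU. It relies only on the fact that Maxwell devices report compute capability 5.x.

```cpp
#include <string>

// Hypothetical kernel names, chosen for illustration only.
const std::string GENERIC_KERNEL = "my_op_generic";  // portable .cu kernel (sm_30 and up)
const std::string MAXWELL_KERNEL = "my_op_maxwell";  // hand-tuned variant for compute capability 5.x

// Returns the name of the kernel to launch on a device whose compute
// capability has the given major version. In a real program, `major` would
// be queried from the device (for example the `major` field of
// cudaDeviceProp in the CUDA runtime API).
// `tunedKernelAvailable` is a hypothetical flag saying whether the tuned
// variant was actually built and loaded.
std::string selectKernel(int major, bool tunedKernelAvailable) {
    if (major == 5 && tunedKernelAvailable) {
        return MAXWELL_KERNEL;  // Maxwell: use the tuned variant
    }
    return GENERIC_KERNEL;      // Kepler, Pascal, Volta, ...: fall back
}
```

With this shape, adding a tuned kernel for another generation later is one more branch, and devices without a tuned variant keep using the present kernels.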

Thanks,

Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At us.ibm.com
http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar

From: Janardhan Pulivarthi <janardhan.pulivarthi@gmail.com>
To: dev@systemml.apache.org, Niketan Pansare <npansar@us.ibm.com>, Nakul Jindal <nakul02@gmail.com>, reinwald@us.ibm.com, Matthias Boehm <mboehm7@googlemail.com>
Date: 03/28/2018 08:57 AM
Subject: GPU hardware support for SystemML bleeding edge.





Greetings,

What hardware (hw) configurations does SystemML (an open-source
project) need to support?

i.e., do we have support for all of these?
- Kepler
- Maxwell
- Pascal
- Volta

Let's say,
1. we have tuned a kernel for only one specific hw configuration (e.g., Maxwell)
with `asm()` code. Can we integrate it into our project?
2. And can we ignore this hw-specific optimized kernel on other,
incompatible hardware and simply use the present kernels there?

@Niketan - If the answer to both 1 & 2 is yes, then we first tune for
Maxwell and then support the others case by case, as they (OpenAI) have done.

Thanks,
Janardhan