I can see from profiling that a lot of the time in both reading and writing is spent on ByteBuffer comparisons of the column names (for long rows with many columns).

I looked at ByteBufferUtil.unsignedCompareByteBuffer(); it has basically the same structure as the standard JVM ByteBuffer.compareTo(),
looping over each byte and doing a ByteBuffer.get().
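
For reference, a minimal sketch of that byte-at-a-time structure. This is not the actual ByteBufferUtil source; the class and method names here are hypothetical and only illustrate the loop being described.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration, not the real ByteBufferUtil code.
class ByteWiseCompare {
    // Lexicographic comparison treating each byte as unsigned (0..255).
    // Uses absolute get(index), so neither buffer's position is modified.
    static int compareUnsigned(ByteBuffer a, ByteBuffer b) {
        int aPos = a.position();
        int bPos = b.position();
        int n = Math.min(a.remaining(), b.remaining());
        for (int i = 0; i < n; i++) {
            int x = a.get(aPos + i) & 0xFF; // mask to get the unsigned value
            int y = b.get(bPos + i) & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        // Common prefix is equal: the shorter buffer sorts first.
        return a.remaining() - b.remaining();
    }
}
```

Every iteration pays for one get() call (with its bounds check) per buffer, which is where the per-byte cost comes from.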

Is there a faster (probably hardware-based) compare? I tried comparing 8 bytes at a time with getLong(), and it actually seems slower.
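
A rough sketch of what an 8-bytes-at-a-time comparison can look like, for anyone who wants to reproduce the measurement. It is not the exact code that was profiled. Assumptions: both buffers use the default BIG_ENDIAN byte order (so an unsigned long comparison matches lexicographic byte order), Java 8+ is available for Long.compareUnsigned, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of a word-at-a-time unsigned comparison.
class WordWiseCompare {
    // Assumes both buffers are in BIG_ENDIAN order (the ByteBuffer default).
    static int compareUnsigned(ByteBuffer a, ByteBuffer b) {
        int aPos = a.position();
        int bPos = b.position();
        int n = Math.min(a.remaining(), b.remaining());
        int i = 0;
        // Compare 8 bytes per step while a full word remains in both buffers.
        for (; i + 8 <= n; i += 8) {
            long x = a.getLong(aPos + i);
            long y = b.getLong(bPos + i);
            if (x != y) {
                // Big-endian words: unsigned long order equals byte order.
                return Long.compareUnsigned(x, y);
            }
        }
        // Compare the remaining 0..7 bytes one at a time.
        for (; i < n; i++) {
            int x = a.get(aPos + i) & 0xFF;
            int y = b.get(bPos + i) & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        // Common prefix is equal: the shorter buffer sorts first.
        return a.remaining() - b.remaining();
    }
}
```

Whether this beats the per-byte loop probably depends on the buffer type (heap vs. direct), the JVM version, and how short the column names are; for names under 8 bytes the word loop never runs at all.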