Remove CUDA-specific hacks
We introduced artificial intermediate functions to accommodate CUDA limitations. One of those limitations was that a kernel could not be declared in constructors and/or private member functions (?).
This might not be the case anymore, so we may be able to remove these intermediate functions and simplify things.