In all the GPU-powered apps I have seen running, the problem was not the speed of the GPU (although for things like video transcoding I have not seen a marked improvement from using the GPU). The problem was that with GPU rendering the flexibility of the software disappeared. The best video card transcoder I found was a $3K app whose name I forget; CCA springs to mind, but that's not right. I call it the best because it worked on either AMD or nVidia and offered a lot more features than your typical GPGPU software.
It still did not compete with HandBrake in either functionality or performance.
It boils down to this...
A GPU can process large amounts of in-order data. However, when doing anything complicated it actually becomes a very poor performer.
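One common reason for this, as far as I understand it, is branch divergence: a group of GPU lanes runs in lockstep, so when lanes take different branches the hardware runs each branch one after another with the other lanes idle. Here is a toy Python model of that, not real GPU code. The function name and the assumption that every branch costs the same are mine, purely for illustration.

```python
def simd_utilization(lane_paths):
    """Toy model: fraction of lane-time doing useful work in a lockstep group.

    lane_paths holds one branch id per lane. Assumption: the group makes one
    serial pass per distinct branch, every pass costs the same, and lanes
    that are not on the current branch sit idle during that pass.
    """
    passes = len(set(lane_paths))
    return 1.0 / passes


uniform = [0] * 32                       # every lane takes the same branch
divergent = [i % 4 for i in range(32)]   # lanes split across 4 branches

print(simd_utilization(uniform))     # 1.0
print(simd_utilization(divergent))   # 0.25
```

So in this simplified picture, simple in-order work keeps every lane busy, while branchy code with four different paths leaves three quarters of the hardware waiting.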
Most of a CPU's construction is actually cache. For instance, an Athlon II X4 CPU uses about 300 million transistors, while the Phenom II X4 chip uses over 750 million. The difference? Mostly the 6MB of L3 cache (as far as I know, the L1 and L2 per core are the same on both). This is because a CPU swaps things in and out of memory to perform its functions, which a GPU is not capable of doing as effectively.
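A rough sanity check on those transistor numbers, assuming the L3 is built from standard 6-transistor SRAM cells (that cell type is my assumption, and this counts only the data bits, not tags or control logic):

```python
MIB = 1024 * 1024
TRANSISTORS_PER_BIT = 6  # assumption: classic 6T SRAM cell


def sram_data_transistors(size_bytes, per_bit=TRANSISTORS_PER_BIT):
    """Transistors needed for the data array of an SRAM of the given size."""
    return size_bytes * 8 * per_bit


l3 = sram_data_transistors(6 * MIB)   # 6MB of L3
gap = 750_000_000 - 300_000_000       # rough figures from the post

print(l3)                   # 301989888
print(round(l3 / gap, 2))   # 0.67
```

So the raw storage for 6MB of L3 alone is about 300 million transistors, roughly two thirds of the gap between the two chips, before counting any of the supporting logic.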
TBH I do not really understand it in great detail... I assumed the 200 GB/s of VRAM bandwidth was there to make up for not having much cache.
Also, CPU-assisted GPGPU should be arriving soon, with AMD driving it. With the GPU built into the CPU and a shared cache system, you should theoretically get the flexibility of CPU execution, the power of GPU parallel processing, and a cache size that can handle both.
http://www.bit-tech.net/news/hardwar...rmance-boost/1