You are currently browsing the monthly archive for December 2008.

A fellow at the German Hadoop user meeting (thanks to Isabel for organizing it again) pointed out to me that the GPUs on graphics cards basically work like server grids.
He mentioned that there are some research papers in this field. I spent some time reading through what I could find, and it was quite interesting. Let me cite some of the facts from the two most interesting papers:

+ “A Map Reduce Framework for Programming Graphics Processors” by Bryan Catanzaro, Narayanan Sundaram and Kurt Keutzer, UC Berkeley
+ “Mars: A MapReduce Framework on Graphics Processors” by Bingsheng He, Wenbin Fang, Qiong Luo, Naga K. Govindaraju, Tuyong Wang

First, let's compare some facts (via Wenbin Fang's slides).

------------------------------------------------------------------------------------------
What                       | GPU                         | CPU                           |
------------------------------------------------------------------------------------------
Memory bandwidth           | ~80 GB/s                    | ~10 GB/s                      |
Floating point performance | ~500 GFLOPS                 | ~50 GFLOPS                    |
Parallelism                | ~10,000 lightweight threads | Optimized for sequential code |
Performance improvement    | ~2.5x to 3x per year        | ~1.5x per year                |
------------------------------------------------------------------------------------------
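
To put these numbers in perspective, here is a small sketch of the arithmetic behind the table. It uses only the approximate figures above. The assumption that the yearly improvement rates stay constant for three years is mine, purely for illustration, and the function names are made up.

```python
# Approximate figures from the table above (2008-era hardware).
# The GPU growth factor uses the lower bound of the quoted 2.5x to 3x range.
gpu = {"bandwidth_gb_s": 80, "gflops": 500, "yearly_growth": 2.5}
cpu = {"bandwidth_gb_s": 10, "gflops": 50, "yearly_growth": 1.5}


def ratio(metric):
    """How many times larger the GPU figure is than the CPU figure."""
    return gpu[metric] / cpu[metric]


def projected_gflops(device, years):
    """Compound the yearly growth factor (assumes the rate stays constant)."""
    return device["gflops"] * device["yearly_growth"] ** years


bandwidth_ratio = ratio("bandwidth_gb_s")
flops_ratio = ratio("gflops")
gap_after_3_years = projected_gflops(gpu, 3) / projected_gflops(cpu, 3)

print(f"Memory bandwidth: GPU is {bandwidth_ratio:.0f}x the CPU")
print(f"Floating point:   GPU is {flops_ratio:.0f}x the CPU")
print(f"Floating point gap after 3 years at these rates: {gap_after_3_years:.1f}x")
```

If the growth rates held, the floating point gap would widen from roughly 10x to about 46x in three years. That is a naive extrapolation, not a prediction.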

Obviously, GPUs provide a lot of horsepower. The problem so far has been that programming for GPUs is quite difficult. For example, higher-level language constructs such as variable-length data types and recursion do not exist. Also, all GPU APIs are highly vendor specific, but things are moving forward, as I found out.
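
To illustrate what the lack of variable-length data types means in practice, here is a minimal sketch of a common workaround: pack all records into one flat buffer and keep a separate directory of (offset, length) pairs. As far as I understand, Mars does something along these lines for its keys and values, but this is plain Python for illustration only, not their code, and the names `pack_records` and `read_record` are hypothetical.

```python
def pack_records(records):
    """Flatten variable-length byte strings into one buffer plus a directory.

    The directory holds one fixed-size (offset, length) pair per record,
    so a worker can find record i without scanning the buffer.
    """
    buffer = bytearray()
    directory = []
    for record in records:
        directory.append((len(buffer), len(record)))
        buffer.extend(record)
    return bytes(buffer), directory


def read_record(buffer, directory, index):
    """Look up one record through the directory."""
    offset, length = directory[index]
    return buffer[offset:offset + length]


words = [b"gpu", b"hadoop", b"mars"]
buffer, directory = pack_records(words)

print(buffer)
print(directory)
print(read_record(buffer, directory, 1))
```

The point of the layout is that both arrays have a size known up front, which fits a device that cannot allocate strings on the fly.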
Both papers try to validate their claims by implementing MapReduce, but it looks like Mars is much further along and still under development (last Mars release: August 2008). Both groups use NVIDIA CUDA as the development platform, which means they require an NVIDIA graphics card.

From my limited perspective, a GPU looks quite a lot like a Hadoop cluster.
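
The analogy can be sketched with a tiny word count. This is plain sequential Python, not CUDA and not the Hadoop API, and all function names are made up for illustration. The idea is only that each input chunk is mapped independently, which is the part a cluster hands to a node and a GPU could hand to a thread.

```python
from collections import defaultdict


def map_word_count(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word, 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the intermediate values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce: sum all counts for one key."""
    return key, sum(values)


def run_map_reduce(chunks):
    # Map phase: chunks do not depend on each other, so this loop is the
    # part that would run in parallel on cluster nodes or GPU threads.
    intermediate = []
    for chunk in chunks:
        intermediate.extend(map_word_count(chunk))
    grouped = shuffle(intermediate)
    # Reduce phase: each key is independent as well.
    return dict(reduce_counts(key, values) for key, values in grouped.items())


chunks = ["gpu grid gpu", "grid hadoop", "gpu hadoop hadoop"]
counts = run_map_reduce(chunks)
print(counts)
```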

GpuGrid.jpg

Both papers attempt some performance comparisons with CPU-based processing. The results have to be taken with the required grain of salt, but they sound pretty impressive.

gputrainingstime.jpg

gpuclassificationtime.jpg

marsspeedup.jpg

 

Of course, I checked whether there are any Java bindings for CUDA, and the great news is yes: there is JCublas. I didn't have time to try it out, though.

Unfortunately, there is too much on my to-do list, but it would be very interesting either to integrate Mars with Hadoop, so that compute-intensive map and reduce tasks can run on a GPU, or to port Lucene indexing to Mars.

I will keep watching this topic, and who knows, maybe a project will come up where I can find time to investigate this field further.
