Monday 2 July 2012

Getting Started with A GPU Cluster Instance On AWS

I recently received some free credit for Amazon Web Services (AWS), so I thought I may as well try it out! AWS is an absolutely incredible resource at a ridiculously low price. More specifically, I mean EC2 (Elastic Compute Cloud). The stats that Amazon comes out with about it are insane; for example, this snippet from the HPC page:
...a 1064 instance (17024 cores) cluster of cc2.8xlarge instances was able to achieve 240.09 TeraFLOPS for the High Performance Linpack benchmark, placing the cluster at #42 in the November 2011 Top500 list.
Pretty impressive!

I decided that it would be interesting to have a play with the GPU instances that one can use on AWS EC2. With each of these you get 2x Tesla M2050 GPUs at $2 per hour, which is very good value for money (though relatively expensive compared to other instance types).

So, I created my instance using the wizard; most of this is click-through, common-sense stuff. The only thing to be careful with is to select the correct AMI on the first page: choose the one with GPU in the title!
Then the rest is just a matter of clicking through the wizard and downloading your *.pem file for passwordless ssh login (you need to run chmod 400 file.pem on the newly downloaded file to be able to use it):
ssh -i file.pem ec2-user@server-ip

You can get your server IP from the EC2 management console interface under "Public DNS".

OK, so at this stage I'm logged into my GPU instance. The first thing I run is nvidia-smi, whereupon I am greeted with the following message:
NVIDIA: could not open the device file /dev/nvidiactl (No such file or directory).
Nvidia-smi has failed because it couldn't communicate with NVIDIA driver. Make sure that latest NVIDIA driver is installed and running.

Hmm, that's not very friendly!
After some digging, it turns out I had been using the wrong instance type! During my rapid clicking, I should have selected cg1.4xlarge rather than cc1.4xlarge in one of the dialogs:



Rookie error.
After that, everything seems to be running as normal. Now to run some code.

 

Compiling

So once we're logged in with nvidia-smi up and running, it's time to start compiling some code. This requires a couple of things, namely the headers and the libs!
After some scraping around, it turns out the CUDA headers and libs live at:
/opt/nvidia/cuda/include/ 
/opt/nvidia/cuda/lib64/
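With those paths in hand, a tiny program makes a good sanity check of the toolchain. This is just a sketch of my own (the file name, kernel name, and compile line are my assumptions, not anything shipped on the AMI; I'm also assuming nvcc is on the PATH):

```cuda
// vec_add.cu -- minimal toolchain sanity check (my own sketch).
// Assumed compile line, using the paths found above:
//   nvcc -I/opt/nvidia/cuda/include -L/opt/nvidia/cuda/lib64 vec_add.cu -o vec_add
#include <cstdio>

// Trivial kernel: add 1.0 to each element of x.
__global__ void add_one(float *x, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += 1.0f;
}

int main(void)
{
    const int n = 256;
    float h[256];
    for (int i = 0; i < n; ++i) h[i] = (float)i;

    float *d;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);

    add_one<<<1, n>>>(d, n);

    cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d);

    // On a working GPU, h[i] should now be i + 1.
    printf("h[0] = %g, h[%d] = %g\n", h[0], n - 1, h[n - 1]);
    return 0;
}
```

If it compiles and prints sensible numbers, the driver, runtime, and headers are all wired up.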

While the OpenCL headers and libs live at:
/opt/nvidia/cuda/include/CL/
/usr/lib64/
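Likewise for OpenCL, a quick platform/device query is a handy smoke test. Again a sketch of my own; the file name and compile line are assumptions based on the paths above:

```c
/* cl_query.c -- list OpenCL platforms and their GPU device counts (my own sketch).
 * Assumed compile line, using the paths found above:
 *   gcc cl_query.c -I/opt/nvidia/cuda/include -L/usr/lib64 -lOpenCL -o cl_query
 */
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    /* First call: ask how many platforms exist. */
    cl_uint num_platforms = 0;
    clGetPlatformIDs(0, NULL, &num_platforms);

    cl_platform_id platforms[8];
    if (num_platforms > 8) num_platforms = 8;
    clGetPlatformIDs(num_platforms, platforms, NULL);

    for (cl_uint p = 0; p < num_platforms; ++p) {
        char name[256];
        clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME,
                          sizeof(name), name, NULL);

        cl_uint num_devices = 0;
        clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_GPU,
                       0, NULL, &num_devices);

        printf("Platform %u: %s (%u GPU device(s))\n",
               p, name, num_devices);
    }
    return 0;
}
```

On one of these instances you'd expect an NVIDIA platform reporting the two Tesla M2050s.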

From there, it's plain sailing. The performance is as good as I've ever seen; I did wonder whether they might be virtualising the GPUs across multiple users, but it seems not.
Happy GPGPUing!



2 comments:

  1. This comment has been removed by the author.

  2. Thank you, this post was very useful. I made the same mistake by selecting the incorrect instance type. I was still unable to build ImageMagick with OpenCL support. What a shame.

