With the growing number of GPU-based supercomputing platforms and GPU-enabled applications, the ability to accurately model the performance of such applications is becoming increasingly important. Most current performance models for GPU-enabled applications are limited to single-node performance. In this project, we are developing performance models that are both accurate and easily applicable to any distributed GPU application. We will use both analytical and empirical approaches to build either qualitative models, which aim to identify an application's bottleneck for further optimization, or quantitative models, which predict an application's elapsed time on a given hardware platform.