Optimizing Collectives with Large Payloads on GPU-based Supercomputers

Updated: