    A Hybrid Tensor-Expert-Data Parallelism Approach to Optimize Mixture-of-Experts Training

    Tags: expert parallelism, mixture-of-experts, parallel deep learning, tensor parallelism

    Updated: June 1, 2023

    Brendan Iribe Center for Computer Science and Engineering
        8125 Paint Branch Drive, College Park, MD 20742
    © 2025 PSSG at the University of Maryland.