🛠️ Steven Gong

Mar 23, 2025, 1 min read

Model Parallel

Tensor Parallelism

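Tensor parallelism is a technique for fitting a model that is too large for a single GPU across multiple GPUs. It splits individual weight tensors into shards, so each GPU stores and computes only its own slice of a layer and the partial results are then combined.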

https://huggingface.co/docs/text-generation-inference/en/conceptual/tensor_parallelism
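
As a rough illustration of the idea, here is a minimal sketch of one common form of tensor parallelism: the weight matrix of a linear layer is split column-wise, each shard is multiplied independently, and the partial outputs are concatenated. Plain Python lists stand in for GPU tensors, and the names `matmul`, `split_columns`, and `tensor_parallel_linear` are hypothetical helpers written for this note, not part of any library.

```python
# Sketch of column-wise tensor parallelism for one linear layer, Y = X @ W.
# Assumption: plain Python lists stand in for tensors, and each "device"
# is just an entry in a list rather than a real GPU.

def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def split_columns(w, num_devices):
    """Split w into num_devices equal-width column shards."""
    n = len(w[0])
    assert n % num_devices == 0, "column count must divide evenly across devices"
    width = n // num_devices
    return [[row[d * width:(d + 1) * width] for row in w] for d in range(num_devices)]

def tensor_parallel_linear(x, w, num_devices):
    """Compute x @ w by sharding w column-wise across num_devices."""
    shards = split_columns(w, num_devices)
    # Each partial product would run on its own GPU, using only its shard of w.
    partials = [matmul(x, shard) for shard in shards]
    # Stand-in for an all-gather: concatenate the partial outputs along columns.
    return [sum((p[i] for p in partials), []) for i in range(len(x))]

x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]

full = matmul(x, w)                                     # single-device reference
sharded = tensor_parallel_linear(x, w, num_devices=2)   # two-way sharded result
print(sharded == full)
```

Each simulated device only ever holds half of the columns of `w`, which is the memory saving that lets a large layer fit. On real hardware the concatenation step is a collective communication between GPUs, so the saving comes at the cost of extra inter-GPU traffic.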
