
Colossal-AI
Learn about the distributed techniques of Colossal-AI to maximize the runtime performance of your large neural networks.
Quick Demo - Colossal-AI
Colossal-AI is an integrated large-scale deep learning system with efficient parallelization techniques. The system can accelerate model training on distributed systems with multiple GPUs by applying these parallelization techniques, and it can also run on systems with only a single GPU.
Introduction - Colossal-AI
The Colossal-AI system uses a device-mesh, similar to PyTorch's latest DTensor release, to manage its cluster. Colossal-AI uses a sharding-spec to annotate the storage status of each tensor.
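As a rough illustration of these two ideas, the sketch below arranges device ids into a logical 2-D grid and annotates, per tensor dimension, whether that dimension is sharded along a mesh axis or replicated. This is a conceptual sketch only; the class names `DeviceMesh` and `ShardingSpec` here are illustrative and do not reproduce Colossal-AI's actual API.

```python
# Conceptual sketch of a 2-D device mesh and a sharding spec.
# Illustrative only -- NOT Colossal-AI's real API.

class DeviceMesh:
    """Arrange device ids into a logical 2-D grid, e.g. 4 GPUs as 2x2."""
    def __init__(self, device_ids, shape):
        rows, cols = shape
        assert len(device_ids) == rows * cols
        self.shape = shape
        self.grid = [device_ids[r * cols:(r + 1) * cols] for r in range(rows)]

class ShardingSpec:
    """Annotate each tensor dimension as sharded along mesh axis 0 ('S0'),
    sharded along mesh axis 1 ('S1'), or replicated ('R')."""
    def __init__(self, dim_specs):
        self.dim_specs = dim_specs  # e.g. ['S0', 'R'] = shard rows, replicate cols

    def local_shape(self, global_shape, mesh):
        """Shape of the shard each device holds under this annotation."""
        local = list(global_shape)
        for dim, spec in enumerate(self.dim_specs):
            if spec == 'S0':
                local[dim] //= mesh.shape[0]
            elif spec == 'S1':
                local[dim] //= mesh.shape[1]
        return tuple(local)

mesh = DeviceMesh([0, 1, 2, 3], shape=(2, 2))
spec = ShardingSpec(['S0', 'S1'])  # shard both dims of a matrix
print(spec.local_shape((1024, 1024), mesh))  # -> (512, 512)
```

The annotation makes the storage status explicit: with `['S0', 'S1']` each of the four devices stores a 512x512 quarter of the matrix, while `['S0', 'R']` would give each mesh row a 512-row slice replicated across mesh columns.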
Colossal-AI Overview
Colossal-AI is designed to be a unified system that provides an integrated set of training skills and utilities to the user. You can find common training utilities such as mixed precision training and gradient accumulation. Beyond these, we provide an array of parallelism techniques, including data parallelism, tensor parallelism, and more.
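To make one of these utilities concrete, gradient accumulation sums gradients over several micro-batches and applies a single optimizer step with their average, simulating a larger effective batch size. The snippet below is a plain-Python sketch of that idea with a toy scalar loss; it is not Colossal-AI code, and `sgd_with_accumulation` is a hypothetical name.

```python
# Plain-Python sketch of gradient accumulation (hypothetical helper,
# not Colossal-AI's API): sum micro-batch gradients, update once.

def sgd_with_accumulation(w, micro_batches, grad_fn, lr=0.1, accum_steps=4):
    """Accumulate gradients over `accum_steps` micro-batches, then apply
    one parameter update with the averaged gradient."""
    accum = 0.0
    for step, batch in enumerate(micro_batches, start=1):
        accum += grad_fn(w, batch)           # sum micro-batch gradients
        if step % accum_steps == 0:
            w -= lr * (accum / accum_steps)  # one update per window
            accum = 0.0
    return w

# Toy loss 0.5*(w - t)^2 per target t; its gradient averaged over a
# batch is mean(w - t).
grad = lambda w, batch: sum(w - t for t in batch) / len(batch)
w = sgd_with_accumulation(0.0, [[1.0], [2.0], [3.0], [4.0]], grad)
print(w)  # -> 0.25, one update from the average gradient of all four micro-batches
```

The benefit is memory: only one micro-batch's activations are live at a time, while the parameter update behaves as if the full batch had been processed at once.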
Setup - Colossal-AI
The version of Colossal-AI will be in line with the main branch of the repository. Feel free to raise an issue if you encounter any problems.

git clone https://github.com/hpcaitech/ColossalAI.git
Reading Roadmap - Colossal-AI
These tutorials will cover the basic usage of Colossal-AI to realize simple functions such as data parallelism and mixed precision training. Lastly, if you wish to apply more complicated techniques …
Distributed Training - Colossal-AI
Distributed training has become common practice as researchers and engineers develop AI models, and several factors drive this trend: model sizes are increasing rapidly.
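The simplest form of distributed training, data parallelism, can be sketched in a single process: each "worker" computes gradients on its own shard of the global batch, the gradients are averaged (standing in for an all-reduce), and every replica applies the same update. The names below are hypothetical and there is no real communication backend; this is only a sketch of the technique.

```python
# Minimal single-process sketch of data parallelism -- illustrative
# only, with a plain average standing in for the all-reduce.

def split_batch(batch, num_workers):
    """Give each worker a contiguous shard of the global batch."""
    shard = len(batch) // num_workers
    return [batch[i * shard:(i + 1) * shard] for i in range(num_workers)]

def local_gradient(w, shard):
    # Toy model: gradient of 0.5*(w - t)^2 averaged over the shard.
    return sum(w - t for t in shard) / len(shard)

def data_parallel_step(w, batch, num_workers=2, lr=0.1):
    grads = [local_gradient(w, s) for s in split_batch(batch, num_workers)]
    avg = sum(grads) / len(grads)   # stands in for all-reduce
    return w - lr * avg             # identical update on every replica

w = data_parallel_step(0.0, [1.0, 2.0, 3.0, 4.0])
print(w)  # -> 0.25, same result as one step on the full batch
```

Because equal shards give an average of per-shard mean gradients equal to the full-batch mean gradient, the parallel step reproduces the single-device update while letting each worker touch only its own slice of the data.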