Requirements
1. Minimum of 5 years of experience in C++ and Python programming.
2. Strong understanding of CPU, GPU, and custom ASIC (NPU, TPU, etc.) architectures, along with low-level optimization techniques.
3. Demonstrated expertise with deep learning frameworks such as PyTorch or TensorFlow, and familiarity with deep learning models.
4. Proven experience in the training and deployment of ML models.
5. Background in distributed systems development, parallel programming, or distributed ML workloads.
6. Knowledge of software development best practices, including testing, profiling, debugging, documentation, version control, and issue tracking.
Desirable
1. Familiarity with emerging AI trends and the latest advancements in AI technology.
2. Experience contributing to open-source AI projects or technologies.
3. Proficiency in tuning and optimizing AI hardware accelerators or custom chips.