Course Outline

Introduction to Biren GPU Architecture

  • Biren overview and use cases
  • Hardware layout: cores, memory, compute clusters
  • Comparison with NVIDIA and AMD GPUs

Setting Up the Biren Programming Environment

  • Installing Biren SDK and runtime
  • Understanding the toolchain and compiler model
  • Basic project structure and build process

GPU Programming with the Biren Stack

  • Thread and block models
  • Memory management and data transfers
  • Kernel development and launch patterns (illustrated in the sketch after this module)
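
To ground the thread/block model, memory transfers, and launch pattern covered in this module, here is a minimal sketch. Because the Biren stack's exact API names are not given in this outline, it is written in standard CUDA, the baseline the later porting module starts from; the saxpy kernel, the buffer names, and the 256-thread block size are illustrative choices, not part of the course material.

    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    // SAXPY kernel: each thread handles one element, indexed by block and thread IDs.
    __global__ void saxpy(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);

        // Host buffers.
        float *hx = (float *)malloc(bytes);
        float *hy = (float *)malloc(bytes);
        for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

        // Device buffers and host-to-device transfers.
        float *dx, *dy;
        cudaMalloc(&dx, bytes);
        cudaMalloc(&dy, bytes);
        cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

        // Launch: a 1D grid of blocks, 256 threads per block.
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        saxpy<<<blocks, threads>>>(n, 2.0f, dx, dy);
        cudaDeviceSynchronize();

        // Copy the result back and spot-check one element.
        cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);
        printf("y[0] = %f (expected 4.0)\n", hy[0]);

        cudaFree(dx); cudaFree(dy);
        free(hx); free(hy);
        return 0;
    }

Each thread computes exactly one element, and the grid size is derived from the problem size, so the same launch pattern scales with the data.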

Porting from CUDA to Biren

  • Translation techniques for CUDA code (see the sketch after this module)
  • Common API mappings and adaptations
  • Code conversion labs and practice
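
As a starting point for the conversion labs, the sketch below annotates a minimal CUDA host routine with the categories of calls that typically need a one-to-one mapping when retargeting a CUDA-like runtime: device allocation, host/device copies, the kernel launch syntax, synchronization, and deallocation. The Biren-side counterparts are not listed here, and scale_on_device is a hypothetical helper name used only for illustration.

    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    // Simple element-wise kernel used as the porting example.
    __global__ void scale(float *data, float factor, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= factor;
    }

    // Each commented call is a typical mapping point when moving to another
    // CUDA-like runtime: the structure stays, the API prefix and launch syntax change.
    void scale_on_device(float *host, int n, float factor) {
        float *dev = nullptr;
        size_t bytes = n * sizeof(float);

        cudaMalloc(&dev, bytes);                               // device allocation
        cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);  // host-to-device copy

        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        scale<<<blocks, threads>>>(dev, factor, n);            // triple-chevron launch
        cudaDeviceSynchronize();                               // device synchronization

        cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);  // device-to-host copy
        cudaFree(dev);                                         // device free
    }

    int main() {
        const int n = 1024;
        float *h = (float *)malloc(n * sizeof(float));
        for (int i = 0; i < n; ++i) h[i] = 1.0f;
        scale_on_device(h, n, 3.0f);
        printf("h[0] = %f (expected 3.0)\n", h[0]);
        free(h);
        return 0;
    }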

Debugging and Profiling

  • Using Biren’s debugger and profiler
  • Identifying bottlenecks (see the timing sketch after this module)
  • Memory access patterns and optimization
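
A common first step in locating bottlenecks is separating transfer time from kernel time. The sketch below does that with standard CUDA events; the same measurement would then be repeated with Biren's own profiler, whose interface is not described in this outline. The busy kernel is an artificial workload chosen only to produce measurable compute time.

    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    __global__ void busy(float *x, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            float v = x[i];
            for (int k = 0; k < 100; ++k) v = v * 1.0001f + 0.0001f;  // artificial work
            x[i] = v;
        }
    }

    int main() {
        const int n = 1 << 22;
        size_t bytes = n * sizeof(float);
        float *h = (float *)malloc(bytes);
        for (int i = 0; i < n; ++i) h[i] = 1.0f;

        float *d;
        cudaMalloc(&d, bytes);

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        // Time the host-to-device transfer.
        cudaEventRecord(start);
        cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        float ms_copy = 0.0f;
        cudaEventElapsedTime(&ms_copy, start, stop);

        // Time the kernel itself.
        cudaEventRecord(start);
        busy<<<(n + 255) / 256, 256>>>(d, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        float ms_kernel = 0.0f;
        cudaEventElapsedTime(&ms_kernel, start, stop);

        printf("H2D copy: %.3f ms, kernel: %.3f ms\n", ms_copy, ms_kernel);

        cudaEventDestroy(start);
        cudaEventDestroy(stop);
        cudaFree(d);
        free(h);
        return 0;
    }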

Optimization Techniques

  • Thread scheduling and instruction pipelining
  • Loop unrolling and shared memory use (see the sketch after this module)
  • Advanced kernel tuning for throughput
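
The sketch below combines shared-memory tiling with loop unrolling in a block-level sum reduction, again written in standard CUDA because the Biren kernel syntax is not given in this outline; block_sum and the block size of 256 are illustrative. Staging the tile in shared memory keeps the reduction's repeated reads off global memory, and fixing the block size at compile time lets the unrolled loop remove per-iteration control overhead.

    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    // Block-level sum reduction: each block loads a tile into shared memory,
    // then reduces it in a log2(BLOCK) loop; with BLOCK fixed at compile time,
    // the loop is a standard candidate for unrolling.
    template <int BLOCK>
    __global__ void block_sum(const float *in, float *out, int n) {
        __shared__ float tile[BLOCK];
        int tid = threadIdx.x;
        int i = blockIdx.x * BLOCK + tid;

        tile[tid] = (i < n) ? in[i] : 0.0f;
        __syncthreads();

        #pragma unroll
        for (int stride = BLOCK / 2; stride > 0; stride >>= 1) {
            if (tid < stride) tile[tid] += tile[tid + stride];
            __syncthreads();
        }
        if (tid == 0) out[blockIdx.x] = tile[0];
    }

    int main() {
        const int BLOCK = 256;
        const int n = 1 << 20;
        const int blocks = (n + BLOCK - 1) / BLOCK;
        size_t bytes = n * sizeof(float);

        float *h = (float *)malloc(bytes);
        for (int i = 0; i < n; ++i) h[i] = 1.0f;

        float *d_in, *d_out;
        cudaMalloc(&d_in, bytes);
        cudaMalloc(&d_out, blocks * sizeof(float));
        cudaMemcpy(d_in, h, bytes, cudaMemcpyHostToDevice);

        block_sum<BLOCK><<<blocks, BLOCK>>>(d_in, d_out, n);

        // Finish the reduction of the per-block partial sums on the host.
        float *partial = (float *)malloc(blocks * sizeof(float));
        cudaMemcpy(partial, d_out, blocks * sizeof(float), cudaMemcpyDeviceToHost);
        double total = 0.0;
        for (int b = 0; b < blocks; ++b) total += partial[b];
        printf("sum = %.0f (expected %d)\n", total, n);

        cudaFree(d_in); cudaFree(d_out);
        free(h); free(partial);
        return 0;
    }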

Case Study and Application Examples

  • Training a model with Biren accelerators
  • Porting and profiling a vision or NLP model
  • Comparing performance against CUDA on NVIDIA GPUs

Summary and Next Steps

Prerequisites

  • An understanding of GPU architecture and parallel processing
  • Experience with CUDA, OpenCL, or similar GPU programming environments
  • Familiarity with deep learning frameworks such as PyTorch or TensorFlow

Audience

  • HPC developers
  • AI infrastructure engineers
  • Performance optimization specialists

Duration

21 Hours
