Kubeflow on AWS Training Course
Kubeflow is a framework for running Machine Learning workloads on Kubernetes. TensorFlow is a machine learning library and Kubernetes is an orchestration platform for managing containerized applications.
This instructor-led, live training (online or onsite) is aimed at engineers who wish to deploy Machine Learning workloads to an AWS EC2 server.
By the end of this training, participants will be able to:
- Install and configure Kubernetes, Kubeflow and other needed software on AWS.
- Use EKS (Elastic Kubernetes Service) to simplify the work of initializing a Kubernetes cluster on AWS.
- Create and deploy a Kubernetes pipeline for automating and managing ML models in production.
- Train and deploy TensorFlow ML models across multiple GPUs and machines running in parallel.
- Leverage other AWS managed services to extend an ML application.
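As a rough illustration of the multi-GPU, multi-machine training objective above, the sketch below builds a Kubeflow TFJob manifest as a plain Python dictionary. It assumes the Kubeflow training operator's `kubeflow.org/v1` TFJob resource. The function name `build_tfjob`, the job name, and the image URL are hypothetical placeholders, not part of the course material.

```python
import json


def build_tfjob(name, image, workers, gpus_per_worker):
    """Return a TFJob manifest (as a dict) for distributed TensorFlow training.

    All argument values are supplied by the caller; nothing here talks to a cluster.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")

    container = {
        "name": "tensorflow",  # container name the TFJob operator looks for
        "image": image,
    }
    if gpus_per_worker > 0:
        # Request GPUs through the NVIDIA device plugin resource name.
        container["resources"] = {"limits": {"nvidia.com/gpu": gpus_per_worker}}

    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name},
        "spec": {
            "tfReplicaSpecs": {
                "Worker": {
                    "replicas": workers,
                    "restartPolicy": "OnFailure",
                    "template": {"spec": {"containers": [container]}},
                }
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical job name and image, used only for illustration.
    manifest = build_tfjob(
        "demo-train", "example.com/demo/train:latest", workers=2, gpus_per_worker=1
    )
    print(json.dumps(manifest, indent=2))
```

The printed manifest could be saved to a file and submitted with `kubectl apply -f`, assuming the training operator is installed in the cluster.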
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Course Outline
Introduction
- Kubeflow on AWS vs on-premise vs on other public cloud providers
Overview of Kubeflow Features and Architecture
Activating an AWS Account
Preparing and Launching GPU-enabled AWS Instances
Setting up User Roles and Permissions
Preparing the Build Environment
Selecting a TensorFlow Model and Dataset
Packaging Code and Frameworks into a Docker Image
Setting up a Kubernetes Cluster Using EKS
Staging the Training and Validation Data
Configuring Kubeflow Pipelines
Launching a Training Job using Kubeflow in EKS
Visualizing the Training Job in Runtime
Cleaning up After the Job Completes
Troubleshooting
Summary and Conclusion
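To make the "Setting up a Kubernetes Cluster Using EKS" step in the outline more concrete, here is a minimal sketch that assembles a cluster definition following the `eksctl.io/v1alpha5` ClusterConfig schema. The cluster name, region, node group name, and instance type are illustrative assumptions. The function `build_cluster_config` is hypothetical and only produces a dictionary.

```python
import json


def build_cluster_config(name, region, instance_type, node_count):
    """Return an eksctl ClusterConfig (as a dict) with one GPU node group."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {
        "apiVersion": "eksctl.io/v1alpha5",
        "kind": "ClusterConfig",
        "metadata": {"name": name, "region": region},
        "nodeGroups": [
            {
                "name": "gpu-workers",  # hypothetical node group name
                "instanceType": instance_type,
                "desiredCapacity": node_count,
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values; choose a region and GPU instance type that suit the workload.
    config = build_cluster_config("kubeflow-demo", "us-west-2", "p3.2xlarge", 2)
    print(json.dumps(config, indent=2))
```

eksctl config files are normally written in YAML, so the same structure would typically be serialized to YAML and passed to `eksctl create cluster -f <file>`. JSON is used here only to keep the example within the standard library.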
Requirements
- An understanding of machine learning concepts.
- Knowledge of cloud computing concepts.
- A general understanding of containers (Docker) and orchestration (Kubernetes).
- Some Python programming experience is helpful.
- Experience working with a command line.
Audience
- Data science engineers.
- DevOps engineers interested in machine learning model deployment.
- Infrastructure engineers interested in machine learning model deployment.
- Software engineers wishing to integrate and deploy machine learning features with their application.
Runs with a minimum of 4+ people. For 1-to-1 or private group training, request a quote.
Testimonials (1)
The quality of the explanations and the large number of topics covered
Hugo SECHIER - Expleo France
Course - Kubeflow on AWS
Related Courses
DeepSeek: Advanced Model Optimization and Deployment
14 Hours
This instructor-led, live training in Canada (online or onsite) is aimed at advanced-level AI engineers and data scientists with intermediate-to-advanced experience who wish to enhance DeepSeek model performance, minimize latency, and deploy AI solutions efficiently using modern MLOps practices.
By the end of this training, participants will be able to:
- Optimize DeepSeek models for efficiency, accuracy, and scalability.
- Implement best practices for MLOps and model versioning.
- Deploy DeepSeek models on cloud and on-premise infrastructure.
- Monitor, maintain, and scale AI solutions effectively.
AWS IoT Core
14 Hours
This instructor-led, live training in Canada (onsite or remote) is aimed at engineers who wish to deploy and manage IoT devices on AWS.
By the end of this training, participants will be able to build an IoT platform that includes the deployment and management of a backend, gateway, and devices on top of AWS.
Amazon Web Services (AWS) IoT Greengrass
21 Hours
This instructor-led, live training in Canada (online or onsite) is aimed at developers who wish to install, configure, and manage AWS IoT Greengrass capabilities to create applications for various devices.
By the end of this training, participants will be able to use AWS IoT Greengrass to build, deploy, manage, secure, and monitor applications on intelligent devices.
AWS Lambda for Developers
14 Hours
This instructor-led, live training in Canada (onsite or remote) is aimed at developers who wish to use AWS Lambda to build and deploy services and applications to the cloud, without needing to worry about provisioning the execution environment (servers, VMs and containers, availability, scalability, storage, etc.).
By the end of this training, participants will be able to:
- Configure AWS Lambda to execute a function.
- Understand FaaS (Functions as a Service) and the advantages of serverless development.
- Build, upload and execute AWS Lambda functions.
- Integrate Lambda functions with different event sources.
- Package, deploy, monitor, and troubleshoot Lambda-based applications.
Mastering DevOps with AWS Cloud9
21 Hours
This instructor-led, live training in Canada (online or onsite) is aimed at advanced-level professionals who wish to deepen their understanding of DevOps practices and streamline development processes using AWS Cloud9.
By the end of this training, participants will be able to:
- Set up and configure AWS Cloud9 for DevOps workflows.
- Implement continuous integration and continuous delivery (CI/CD) pipelines.
- Automate testing, monitoring, and deployment processes using AWS Cloud9.
- Integrate AWS services such as Lambda, EC2, and S3 into DevOps workflows.
- Utilize source control systems like GitHub or GitLab within AWS Cloud9.
Docker for MLOps: End-to-End Pipeline Containerization
21 Hours
Docker is a containerization platform used to build reproducible, portable, and scalable environments for ML systems.
This instructor-led, live training (online or onsite) is aimed at intermediate-level to advanced-level technical professionals who wish to containerize and operationalize complete ML pipelines using Docker.
Upon completion of this training, participants will be able to:
- Containerize ML training, validation, and inference workloads.
- Design and orchestrate end-to-end ML pipelines using Docker and supporting tools.
- Implement versioning, reproducibility, and CI/CD for ML components.
- Deploy, monitor, and scale ML services in containerized environments.
Format of the Course
- Interactive lectures supported by practical demonstrations.
- Hands-on exercises focused on building real ML pipeline components.
- Live-lab implementation for end-to-end containerized workflows.
Course Customization Options
- For customized training aligned with specific ML infrastructure needs, please contact us to discuss options.
Developing Serverless Applications on AWS Cloud9
14 Hours
This instructor-led, live training in Canada (online or onsite) is aimed at intermediate-level professionals who wish to learn how to effectively build, deploy, and maintain serverless applications on AWS Cloud9 and AWS Lambda.
By the end of this training, participants will be able to:
- Understand the fundamentals of serverless architecture.
- Set up AWS Cloud9 for serverless application development.
- Develop, test, and deploy serverless applications using AWS Lambda.
- Integrate AWS Lambda with other AWS services such as API Gateway and S3.
- Optimize serverless applications for performance and cost efficiency.
Industrial Training IoT (Internet of Things) with Raspberry PI and AWS IoT Core 「4 Hours Remote」
4 Hours
Summary:
- Basics of IoT architecture and functions
- “Things”, “Sensors”, Internet and the mapping between business functions of IoT
- Essentials of all IoT software components: hardware, firmware, middleware, cloud, and mobile app
- IoT functions: fleet manager, data visualization, SaaS-based FM and DV, alert/alarm, sensor onboarding, “thing” onboarding, geo-fencing
- Basics of IoT device communication with cloud with MQTT.
- Connecting IoT devices to AWS with MQTT (AWS IoT Core).
- Connecting AWS IoT Core with an AWS Lambda function for computation and data storage.
- Connecting Raspberry PI with AWS IoT Core and simple data communication.
- Alerts and events
- Sensor calibration
Industrial Training IoT (Internet of Things) with Raspberry PI and AWS IoT Core 「8 Hours Remote」
8 Hours
Summary:
- Basics of IoT architecture and functions
- “Things”, “Sensors”, Internet and the mapping between business functions of IoT
- Essentials of all IoT software components: hardware, firmware, middleware, cloud, and mobile app
- IoT functions: fleet manager, data visualization, SaaS-based FM and DV, alert/alarm, sensor onboarding, “thing” onboarding, geo-fencing
- Basics of IoT device communication with cloud with MQTT.
- Connecting IoT devices to AWS with MQTT (AWS IoT Core).
- Connecting AWS IoT Core with an AWS Lambda function for computation and data storage using DynamoDB.
- Connecting Raspberry PI with AWS IoT Core and simple data communication.
- Hands on with Raspberry PI and AWS IoT Core to build a smart device.
- Sensor data visualization and communication with web interface.
Kubeflow
35 Hours
This instructor-led, live training in Canada (online or onsite) is aimed at developers and data scientists who wish to build, deploy, and manage machine learning workflows on Kubernetes.
By the end of this training, participants will be able to:
- Install and configure Kubeflow on premise and in the cloud using AWS EKS (Elastic Kubernetes Service).
- Build, deploy, and manage ML workflows based on Docker containers and Kubernetes.
- Run entire machine learning pipelines on diverse architectures and cloud environments.
- Use Kubeflow to spawn and manage Jupyter notebooks.
- Build ML training, hyperparameter tuning, and serving workloads across multiple platforms.
Kubeflow on Azure
28 Hours
This instructor-led, live training in Canada (online or onsite) is aimed at engineers who wish to deploy Machine Learning workloads to Azure cloud.
By the end of this training, participants will be able to:
- Install and configure Kubernetes, Kubeflow and other needed software on Azure.
- Use Azure Kubernetes Service (AKS) to simplify the work of initializing a Kubernetes cluster on Azure.
- Create and deploy a Kubernetes pipeline for automating and managing ML models in production.
- Train and deploy TensorFlow ML models across multiple GPUs and machines running in parallel.
- Leverage other Azure managed services to extend an ML application.
MLflow
21 Hours
This instructor-led, live training (online or onsite) is aimed at data scientists who wish to go beyond building ML models and optimize the ML model creation, tracking, and deployment process.
By the end of this training, participants will be able to:
- Install and configure MLflow and related ML libraries and frameworks.
- Appreciate the importance of trackability, reproducibility, and deployability of an ML model.
- Deploy ML models to different public clouds, platforms, or on-premise servers.
- Scale the ML deployment process to accommodate multiple users collaborating on a project.
- Set up a central registry to experiment with, reproduce, and deploy ML models.
MLOps: CI/CD for Machine Learning
35 Hours
This instructor-led, live training in Canada (online or onsite) is aimed at engineers who wish to evaluate the approaches and tools available today to make an intelligent decision on the path forward in adopting MLOps within their organization.
By the end of this training, participants will be able to:
- Install and configure various MLOps frameworks and tools.
- Assemble the right kind of team with the right skills for constructing and supporting an MLOps system.
- Prepare, validate and version data for use by ML models.
- Understand the components of an ML Pipeline and the tools needed to build one.
- Experiment with different machine learning frameworks and servers for deploying to production.
- Operationalize the entire Machine Learning process so that it's reproducible and maintainable.
MLOps for Azure Machine Learning
14 Hours
This instructor-led, live training (online or onsite) is aimed at machine learning engineers who wish to use Azure Machine Learning and Azure DevOps to facilitate MLOps practices.
By the end of this training, participants will be able to:
- Build reproducible workflows and machine learning models.
- Manage the machine learning lifecycle.
- Track and report model version history, assets, and more.
- Deploy production-ready machine learning models anywhere.
MLOps on Kubernetes: CI/CD Pipelines for Machine Learning
14 Hours
MLOps on Kubernetes is a framework for automating the training, validation, packaging, and deployment of machine learning models using containerized pipelines and GitOps workflows.
This instructor-led, live training (online or onsite) is aimed at intermediate-level practitioners who wish to build automated, scalable MLOps pipelines on Kubernetes.
After completing this training, participants will be equipped to:
- Design end-to-end CI/CD pipelines for machine learning.
- Implement GitOps workflows for model deployment and versioning.
- Automate training, testing, and packaging of ML models.
- Integrate monitoring, alerting, and rollback strategies.
Format of the Course
- Instructor-guided presentations and technical deep dives.
- Hands-on exercises that build real-world CI/CD workflows.
- Live-lab practice deploying ML workloads to Kubernetes.
Course Customization Options
- Organizations may request tailored content aligned with their internal MLOps tools and infrastructure.