Full Stack Machine Learning Engineer (Datacentre AI Engineering) - Riyadh, KSA

20 days ago

Contract Type: Full-time
Workplace Type: On-site
Location: Riyadh

Job Description

About the Role

Qualcomm Middle East Information Technology Company LLC is seeking a skilled Full Stack Machine Learning Engineer to join its team in Riyadh, Saudi Arabia. This role focuses on developing and engineering AI solutions for Qualcomm's AI Inference Suite and large-scale data center deployments. As Saudi Arabia advances its digital transformation, Qualcomm is investing in computing and data center capabilities to support AI, cloud, and advanced connectivity. This position offers an opportunity to contribute to a technology hub supporting critical environments and shaping data center operations.

In this role, you will be responsible for designing, delivering, and supporting end-to-end AI services, agentic workflows, and fine-tuning pipelines. You will enable lifecycle automation, orchestration, and observability for data center environments, utilizing full-stack engineering, machine learning, and infrastructure knowledge to build scalable AI systems.

Key Responsibilities

Build and optimize API serving layers for AI inference workloads, focusing on model and hardware efficiency.
Develop intelligent agents and retrieval-augmented generation (RAG) workflows using frameworks like LangChain and *****.
Implement production-grade bring-your-own-model and fine-tuning flows, covering dataset ingestion, orchestration, evaluation, and deployment.
Integrate with LLM runtimes such as vLLM, Dynamo, and llm-d, and apply inference optimization techniques.
Contribute to the development of AI Inference Suite SDKs (Python/TypeScript/Java/Rust), command-line interface (CLI) tools, and reference applications.
Design and maintain AI cluster management software for provisioning, orchestration, and monitoring.
Integrate out-of-band management via Redfish/IPMI and in-band telemetry using Prometheus/OpenTelemetry.
Develop workflows using MAAS, Terraform, and Ansible for bare-metal and containerized deployments.
Enable Kubernetes and Helm-based orchestration for inference clusters and multi-tenancy environments.
Build dashboards for monitoring rack health, inventory, and Service Level Agreement (SLA) compliance.
Stay informed about Generative AI trends, rack-scale AI orchestration, and data center best practices.

Qualifications and Requirements

Bachelor's degree in Computer Science, Engineering, or a related field.
A minimum of 5 years of software engineering experience, with at least 3 years focused on Machine Learning (ML) or High-Performance Computing (HPC) environments.
Strong programming proficiency in Python, Rust/Go, and TypeScript, with a solid understanding of software development fundamentals.
Deep comprehension of data structures and algorithms within distributed systems and high-performance computing.
Hands-on experience with Kubernetes, Helm, Prometheus/OpenTelemetry, and Ansible/Terraform.
Practical experience with LLM runtimes, agent frameworks, and rack-scale orchestration.
Master's degree in Computer Science, Machine Learning, or a related field is preferred.
Experience building inference and fine-tuning pipelines, as well as agentic workflows, is preferred.
Knowledge of data center resource lifecycle management, out-of-band protocols (Redfish/IPMI), and MAAS/OpenStack is preferred.
Exposure to scale-up data center networking technologies such as RoCE, RDMA, and NVLink is preferred.
Contributions to inference and Generative AI model performance optimization are preferred.

Required Skills

API Development & Optimization
Agentic Workflows & RAG Pipelines
Model Lifecycle Management
LLM Runtime Integration
SDK & Tooling Contributions
Cluster Management
Telemetry & Observability
Infrastructure-as-Code
Kubernetes Orchestration
Monitoring & Dashboards
Continuous Innovation
Programming Languages: Python, Rust, Go, TypeScript
Orchestration & Automation Tools: Kubernetes, Helm, Ansible, Terraform, MAAS, OpenStack
Monitoring & Telemetry: Prometheus, OpenTelemetry, Redfish, IPMI
ML Frameworks & Runtimes: LLM runtimes, Agent frameworks, LangChain, ****, vLLM, Dynamo, llm-d
Networking Technologies: RoCE, RDMA, NVLink

Work Environment and Location

This is a full-time position based in Riyadh, Saudi Arabia. The role is part of Qualcomm's expanding team focused on AI engineering within data centers.