{"id":16088,"date":"2025-10-14T11:15:49","date_gmt":"2025-10-14T11:15:49","guid":{"rendered":"https:\/\/dmsretail.com\/RetailNews\/gpuaas-on-cisco-ai-pods-with-rafay\/"},"modified":"2025-10-14T11:15:49","modified_gmt":"2025-10-14T11:15:49","slug":"gpuaas-on-cisco-ai-pods-with-rafay","status":"publish","type":"post","link":"https:\/\/dmsretail.com\/RetailNews\/gpuaas-on-cisco-ai-pods-with-rafay\/","title":{"rendered":"GPUaaS on Cisco AI PODs with Rafay"},"content":{"rendered":"<div>\n<p>Enterprises are making bold moves into AI, and Cisco AI PODs provide a powerful, pre-validated foundation for deploying AI infrastructure at scale. They bring together compute, storage, and networking in a modular design that simplifies procurement and deployment. However, deploying hardware is only the beginning. The next critical step is making this powerful infrastructure consumable as a service.<\/p>\n<p>This is where Rafay complements Cisco AI PODs. Rafay\u2019s GPU Platform as a Service (PaaS) adds the critical consumption layer, turning the hardware into a governed, self-service GPU cloud. 
Together, Cisco and Rafay enable organizations to operationalize AI faster by offering secure, multi-tenant access, standardized workload SKUs, and policy-driven governance.<\/p>\n<p>This post explores how this joint solution transforms raw GPU power into a production-ready AI platform, enabling developer self-service while maintaining enterprise-grade control.<\/p>\n<h2><strong>From Infrastructure to Consumption: The Platform Challenge<\/strong><\/h2>\n<p>Organizations have accelerated investments in AI infrastructure, deploying platforms like Cisco AI PODs with the latest NVIDIA hardware to enable generative AI, Retrieval-Augmented Generation (RAG), and large-scale inference. As adoption grows, a new challenge emerges: how to enable multiple teams to safely and efficiently consume this shared infrastructure.<\/p>\n<p>Platform teams must balance access across different groups, each with unique needs and security requirements. Without a standardized consumption layer, this leads to several problems:<\/p>\n<ul>\n<li><strong>Underutilized GPUs:<\/strong> Industry benchmarks report that average GPU utilization often falls below 30%. This is partly because AI workloads are \u201cbursty\u201d and most environments lack the mechanisms to slice and share GPU resources efficiently. Expensive GPUs that sit idle represent a significant opportunity cost.<\/li>\n<li><strong>Manual Provisioning:<\/strong> Platform teams often rely on manual configurations, ad hoc scripts, and service tickets to manage access. These workflows slow down delivery, introduce inconsistencies, and make it difficult to enforce governance.<\/li>\n<li><strong>Siloed Resources:<\/strong> Without a unified platform, GPU infrastructure often becomes siloed by team, limiting sharing and preventing a holistic view of utilization and costs. 
Developers and researchers must navigate complex internal processes just to run a job.<\/li>\n<\/ul>\n<p>To solve this, enterprises need to operate their GPU infrastructure as a service\u2014one that supports shared resources, multitenant isolation, and automated policy enforcement.<\/p>\n<h2>The Joint Solution: Cisco AI PODs + Rafay GPU PaaS<\/h2>\n<p>Cisco and Rafay have collaborated to deliver a modular, fully validated GPU cloud architecture. This solution combines Cisco\u2019s best-in-class AI POD infrastructure with Rafay\u2019s GPU Platform as a Service, transforming GPU hardware into a secure, self-service, multitenant cloud.<\/p>\n<ul>\n<li><strong>Cisco AI PODs<\/strong> provide the compute, fabric, storage, and pre-validated design. Based on Cisco Validated Designs (CVDs), they integrate next-generation Cisco UCS platforms (like the C885A M8 Server) and the latest NVIDIA GPUs to power the entire AI lifecycle.<\/li>\n<li><strong>Rafay GPU PaaS<\/strong> delivers the orchestration, policy enforcement, and developer abstraction layer. 
It transforms the foundational hardware into a production-grade GPU cloud that is simple to consume.<\/li>\n<\/ul>\n<p>This combined architecture enables organizations to rapidly launch and operate GPU clouds with full-stack orchestration, declarative SKU provisioning, and built-in cost attribution.<\/p>\n<h2><strong>Developer Self-Service Through a Curated Catalog<\/strong><\/h2>\n<p>At the core of Rafay\u2019s platform is the <strong>SKU Studio<\/strong>, a purpose-built catalog system that empowers platform teams to deliver AI-ready infrastructure and applications as reusable SKUs.<\/p>\n<h3><strong>Each SKU is a modular abstraction that bundles:<\/strong><\/h3>\n<ul>\n<li><strong>Compute Configuration:<\/strong> GPU\/MIG profiles, CPU, memory, and storage.<\/li>\n<li><strong>Application Stack:<\/strong> Pre-integrated tools like vLLM, Triton, or Jupyter Notebooks.<\/li>\n<li><strong>Policy Controls:<\/strong> Time-to-live (TTL) limits, RBAC, multitenancy, and quotas.<\/li>\n<li><strong>Billing Metadata:<\/strong> Usage units and cost attribution.<\/li>\n<\/ul>\n<p>Developers can access GPU environments on demand through a self-service portal (GUI, API, or CLI) without filing support tickets. For example, a data scientist can select an \u201cH100-Inference-vLLM\u201d SKU, which automatically provisions a specific GPU slice, deploys a secure container, and applies a 48-hour TTL. This streamlines workflows and ensures security best practices are applied consistently.<\/p>\n<h2><strong>Secure Multi-Tenancy and Governance<\/strong><\/h2>\n<p>Sharing expensive GPU resources requires strict isolation and governance. 
Rafay provides native, secure multi-tenancy that allows teams to safely share infrastructure without interference.<\/p>\n<h3><strong>Key security controls are automatically enforced:<\/strong><\/h3>\n<ul>\n<li><strong>Hierarchical RBAC:<\/strong> Defines permissions and access scope for tenants, projects, and workspaces.<\/li>\n<li><strong>Namespace Isolation:<\/strong> Ensures workloads are separated at the cluster and network level.<\/li>\n<li><strong>Resource Quotas:<\/strong> Prevents any single team or job from monopolizing resources.<\/li>\n<li><strong>Centralized Audit Logs:<\/strong> Provides a complete audit trail of user actions for compliance.<\/li>\n<\/ul>\n<p>These built-in protections allow platform teams to maintain complete oversight and control while empowering developers with the freedom they need to innovate.<\/p>\n<h2><strong>Comprehensive GPU Management and Visibility<\/strong><\/h2>\n<p>To maximize ROI, you need to know how your GPUs are being used. Rafay provides end-to-end visibility, metering, and cost attribution tailored for multitenant environments.<\/p>\n<p>Platform teams can use declarative blueprints to standardize GPU operator configurations and slicing strategies (like MIG) across all clusters. Multi-tenant dashboards offer detailed insights into:<\/p>\n<ul>\n<li>GPU inventory and allocation<\/li>\n<li>SKU usage patterns<\/li>\n<li>Instance-level activity and user attribution<\/li>\n<li>Health status and uptime trends<\/li>\n<\/ul>\n<p>A billing metrics API aggregates usage data, calculates billable compute, and generates auditable reports, enabling chargebacks and financial accountability.<\/p>\n<h2>Who Benefits from a Unified GPU Cloud?<\/h2>\n<p>This jointly validated solution is designed for a diverse range of customers who need to operationalize GPU infrastructure with security, speed, and scale.<\/p>\n<ul>\n<li><strong>Enterprise IT Teams:<\/strong> Gain federated self-service, quota enforcement, and centralized visibility. 
This reduces infrastructure duplication and embeds governance into daily operations.<\/li>\n<li><strong>Sovereign &amp; Public Sector Organizations:<\/strong> Meet compliance needs in air-gapped environments with secure multitenancy, policy enforcement, and centralized audit logging.<\/li>\n<li><strong>Cloud &amp; Managed Service Providers:<\/strong> Monetize GPU infrastructure with a white-labeled, multitenant platform that includes automated tenant onboarding and built-in chargeback metering.<\/li>\n<li><strong>Existing Cisco Customers:<\/strong> Extend the ROI of current UCS deployments by adding GPU orchestration as a seamless overlay with no re-architecture required.<\/li>\n<li><strong>Greenfield AI Builders:<\/strong> Start fresh with a pre-validated, fully integrated solution that reduces the time from procurement to operational AI services from months to weeks.<\/li>\n<\/ul>\n<h2><strong>Operationalize Your AI Infrastructure Today<\/strong><\/h2>\n<p>Pairing Cisco\u2019s validated AI infrastructure with Rafay\u2019s GPU PaaS control plane allows organizations to transform GPU systems into fully governed internal platforms. The result is a consumption-driven architecture where developers gain self-service access, operators enforce quotas and track consumption, and the business maximizes the value of its AI investments.<\/p>\n<p>This architecture offers a clear path forward: deliver GPU infrastructure as a service, enable secure and compliant multitenancy, and make consumption predictable and cost-aligned from day one.<\/p>\n<p>To see this powerful solution in action, join our upcoming webinar. Experts from Cisco and Rafay will demonstrate how to transform your GPU infrastructure into a production-ready AI service.<\/p>\n<p><strong>Live Webinar: From AI PODs to GPU Cloud<\/strong><br \/>October 21, 2025 at 8:00 a.m. PT \/ 3:00 p.m. 
GMT<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Enterprises are making bold moves into AI, and Cisco AI PODs provide a powerful, pre-validated foundation for deploying AI infrastructure at scale. 
They bring together [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":16089,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[],"class_list":["post-16088","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/posts\/16088","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/comments?post=16088"}],"version-history":[{"count":0,"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/posts\/16088\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/media\/16089"}],"wp:attachment":[{"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/media?parent=16088"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/categories?post=16088"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dmsretail.com\/RetailNews\/wp-json\/wp\/v2\/tags?post=16088"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}