SAN JOSE, California (Mar 2026) — Running AI in a business used to mean sending everything to the cloud and hoping the costs stayed manageable. Lenovo and NVIDIA are pushing hard against that model with a new wave of hardware and software designed to bring production-grade AI directly into company infrastructure.
The announcements came at NVIDIA GTC 2026 in San Jose and span everything from individual workstations to data centers to what Lenovo is calling gigawatt-scale AI factories — massive cloud deployments built around NVIDIA’s next-generation Vera Rubin platform.
The driving force behind all of it is inferencing, the process by which AI models generate real-time responses and decisions. Training AI models gets most of the attention, but inferencing is where the day-to-day business value actually lives, and it is the bottleneck most enterprises are now trying to solve.
According to a 2026 IDC study commissioned by Lenovo, 84% of organizations expect to run AI across on-premises or edge environments alongside the cloud, which is pushing demand for validated hybrid platforms that work reliably outside of hyperscaler infrastructure.
AI that runs on your desk or in your server room
On the device side, Lenovo is rolling out a new generation of workstations powered by NVIDIA RTX PRO Blackwell GPUs. The lineup includes the ultralight ThinkPad P14s Gen 7, the ThinkPad P16s Gen 5, and the premium ThinkPad P1 Gen 9 on the laptop side, and the ThinkStation P5 Gen 2 desktop with support for up to two NVIDIA RTX PRO 6000 Blackwell Max-Q GPUs.
The standout is the ThinkStation PGX, which Lenovo positions as an AI developer device capable of running models with up to 200 billion parameters at 1 petaflop of AI compute. For organizations that need secure, private, on-premises AI development without routing data through external servers, that is a meaningful capability.
Lenovo also claims its on-premises Hybrid AI platforms are now delivering return on investment in under six months, with cost per token running up to eight times lower than equivalent cloud infrastructure as a service.
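To see how an under-six-months ROI figure could follow from an eight-times cost-per-token gap, here is a back-of-envelope sketch. Every number in it is a hypothetical placeholder, not Lenovo or NVIDIA pricing; the article only states the "up to 8x" multiplier.

```python
# Illustrative cloud vs. on-premises inference cost comparison.
# All figures are hypothetical placeholders, not vendor pricing.

CLOUD_COST_PER_MTOK = 2.00                        # hypothetical cloud cost per million tokens ($)
ONPREM_COST_PER_MTOK = CLOUD_COST_PER_MTOK / 8    # the claimed up-to-8x reduction
MONTHLY_TOKENS_M = 5_000                          # hypothetical workload: 5B tokens/month
HARDWARE_COST = 50_000                            # hypothetical up-front on-prem spend ($)

monthly_savings = MONTHLY_TOKENS_M * (CLOUD_COST_PER_MTOK - ONPREM_COST_PER_MTOK)
breakeven_months = HARDWARE_COST / monthly_savings

print(f"Monthly savings: ${monthly_savings:,.0f}")
print(f"Break-even after {breakeven_months:.1f} months")
```

With these made-up inputs the break-even lands just under six months, but the real answer depends entirely on workload volume, utilization, and actual pricing on both sides.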
Bigger deployments for bigger workloads
For enterprises running larger AI operations, Lenovo expanded its ThinkSystem and ThinkEdge server lineup with inferencing-optimized configurations. Two new Hybrid AI platforms cover different use cases: one built around NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs for multi-modal inferencing at scale, and another powered by NVIDIA Blackwell Ultra for model training, fine-tuning, and large-scale inference.
A starter inferencing platform using NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs is also available, with claims of up to three times better vision AI performance and up to four times better content generation performance than the previous NVIDIA L4 generation.
The gigafactory tier
At the top end, Lenovo is a launch partner for the NVIDIA Vera Rubin NVL72, a fully liquid-cooled rack-scale AI system aimed at hyperscale and sovereign AI cloud providers. Lenovo says the platform delivers up to ten times higher throughput and up to ten times lower cost per token than the previous generation, numbers that matter significantly at the scale at which these deployments operate.
Lenovo is also collaborating with Nscale to power hyperscale AI deployments optimized for large-scale inference and agentic workloads.
Lenovo and NVIDIA are demonstrating the full Hybrid AI Advantage portfolio at NVIDIA GTC 2026, booth 431, at the San Jose McEnery Convention Center through March 19.
