AWS and NVIDIA Extend Collaboration to Advance Generative AI Innovation

  • AWS to offer Amazon EC2 instances based on NVIDIA Grace Blackwell GPUs, along with NVIDIA DGX Cloud, to accelerate building and running inference on multi-trillion-parameter LLMs
  • Integrating AWS Nitro System, Elastic Fabric Adapter encryption, and AWS Key Management Service with Blackwell encryption gives customers end-to-end control of their training data and model weights, providing even stronger security for customers' AI applications on AWS
  • Project Ceiba, a collaboration to build one of the world's fastest AI supercomputers exclusively on AWS with DGX Cloud, will feature 20,736 GB200 Superchips capable of processing 414 exaflops for NVIDIA's own AI research and development
  • Amazon SageMaker's integration with NVIDIA NIM inference microservices helps customers further optimize price performance for foundation models running on GPUs
  • AWS and NVIDIA Collaboration Accelerates AI Innovation in Healthcare and Life Sciences

GTC: Amazon Web Services (AWS), an Amazon.com company (NASDAQ: AMZN), and NVIDIA (NASDAQ: NVDA) today announced that the new NVIDIA Blackwell GPU platform, unveiled by NVIDIA at GTC 2024, is coming to AWS. AWS will offer the NVIDIA GB200 Grace Blackwell Superchip and B100 Tensor Core GPUs, extending the companies' long-standing strategic collaboration to deliver the most secure and advanced infrastructure, software, and services to help customers unlock new generative artificial intelligence (AI) capabilities.

NVIDIA and AWS continue to bring together the best of their technologies, including NVIDIA's newest multi-node systems featuring the next-generation NVIDIA Blackwell platform and AI software, the advanced security of AWS Nitro System and AWS Key Management Service (AWS KMS), Elastic Fabric Adapter (EFA) petabit-scale networking, and Amazon Elastic Compute Cloud (Amazon EC2) UltraCluster hyperscale clustering. Together, they deliver the infrastructure and tools that enable customers to build and run real-time inference on multi-trillion-parameter large language models (LLMs) faster, at massive scale, and at a lower cost than previous-generation NVIDIA GPUs on Amazon EC2.

“The deep collaboration between our two organizations dates back more than 13 years, when together we launched the world's first GPU cloud instance on AWS, and today we offer the broadest range of NVIDIA GPU solutions to customers,” said Adam Selipsky, CEO of AWS. “NVIDIA's next-generation Grace Blackwell processor marks a significant step forward in generative AI and GPU computing. When combined with AWS's powerful Elastic Fabric Adapter networking, the hyperscale clustering of Amazon EC2 UltraClusters, and the advanced virtualization and security capabilities of our unique Nitro System, we make it possible for customers to build and run multi-trillion-parameter large language models faster, at massive scale, and more securely than anywhere else. Together, we continue to innovate to make AWS the best place to run NVIDIA GPUs in the cloud.”

“AI is driving advances at an unprecedented pace, leading to new applications, business models and innovation across industries,” said Jensen Huang, founder and CEO of NVIDIA. “Our collaboration with AWS is accelerating new generative AI capabilities and giving customers unprecedented computing power to push the boundaries of what's possible.”

Latest innovations from AWS and NVIDIA accelerate training of cutting-edge LLMs that can surpass one trillion parameters
AWS will offer the NVIDIA Blackwell platform, featuring GB200 NVL72, with 72 Blackwell GPUs and 36 Grace CPUs interconnected by fifth-generation NVIDIA NVLink™. When connected by Amazon's powerful networking (EFA) and supported by advanced virtualization (AWS Nitro System) and hyperscale clustering (Amazon EC2 UltraClusters), customers can scale to thousands of GB200 Superchips. NVIDIA Blackwell on AWS delivers a major step forward in accelerating inference workloads for resource-intensive, multi-trillion-parameter language models.

Building on the success of NVIDIA H100-powered EC2 P5 instances, which customers can reserve for short durations through Amazon EC2 Capacity Blocks for Machine Learning, AWS plans to offer EC2 instances with the new B100 GPUs deployed in EC2 UltraClusters to accelerate generative AI training and inference at massive scale. The GB200 will also be available in NVIDIA DGX™ Cloud, an AI platform co-engineered on AWS, giving enterprise developers dedicated access to the infrastructure and software needed to build and deploy advanced generative AI models. DGX Cloud instances powered by Blackwell on AWS will accelerate the development of cutting-edge generative AI and LLMs that can reach beyond 1 trillion parameters.

Increase AI security with AWS Nitro System, AWS KMS, EFA encryption, and Blackwell encryption
As customers move rapidly to implement AI in their organizations, they need to know that their data is handled securely throughout their training workflow. Securing model weights (the parameters a model learns during training, which determine its ability to make predictions) is critical to protecting customers' intellectual property, preventing model tampering, and maintaining model integrity.

AWS AI infrastructure and services already have security features in place to give customers control over their data and ensure it is not shared with third-party model providers. The combination of the AWS Nitro System and the NVIDIA GB200 takes AI security even further by preventing unauthorized individuals from accessing model weights. The GB200 enables physical encryption of the NVLink connections between GPUs and encrypts data transfer from the Grace CPU to the Blackwell GPU, while EFA encrypts data between servers for distributed training and inference. The GB200 will also benefit from the AWS Nitro System, which offloads I/O for functions from the host CPU/GPU to specialized AWS hardware to deliver more consistent performance, while its enhanced security protects customer code and data during processing, on both the customer side and the AWS side. This capability, available only on AWS, has been independently verified by NCC Group, a leading cybersecurity firm.

With GB200 on Amazon EC2, AWS will enable customers to create a trusted execution environment alongside their EC2 instance, using AWS Nitro Enclaves and AWS KMS. Nitro Enclaves allows customers to encrypt their training data and weights with KMS, using key material under their control. The enclave can be loaded from within the GB200 instance and can communicate directly with the GB200 Superchip. This allows KMS to communicate directly with the enclave and pass key material to it in a cryptographically secure manner. The enclave can then pass that material to the GB200, protected from the customer instance and preventing AWS operators from ever accessing the key or decrypting the training data or model weights, giving customers unparalleled control over their data.

Project Ceiba Taps Blackwell to Power NVIDIA's Future Generative AI Innovation on AWS
Announced at AWS re:Invent 2023, Project Ceiba is a collaboration between NVIDIA and AWS to build one of the world's fastest AI supercomputers. Hosted exclusively on AWS, the supercomputer is available for NVIDIA's own research and development. This first-of-its-kind supercomputer with 20,736 B200 GPUs is being built using the new NVIDIA GB200 NVL72, a system featuring fifth-generation NVLink, which scales to 20,736 B200 GPUs connected to 10,368 NVIDIA Grace CPUs. The system scales out using fourth-generation EFA networking, providing up to 800 Gbps per Superchip of low-latency, high-bandwidth networking throughput, and is capable of processing a massive 414 exaflops of AI, a 6x performance increase over earlier plans to build Ceiba on the Hopper architecture. NVIDIA research and development teams will use Ceiba to advance AI for LLMs, graphics (image/video/3D generation) and simulation, digital biology, robotics, autonomous vehicles, NVIDIA Earth-2 climate prediction, and more, helping NVIDIA drive future generative AI innovation.
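
The Ceiba figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming the cluster is composed entirely of whole GB200 NVL72 systems (72 GPUs and 36 Grace CPUs each, as described earlier). The system count and the per-GPU average are derived values, not numbers stated in the announcement, and the function and key names are purely illustrative.

```python
# Figures quoted in the announcement.
TOTAL_B200_GPUS = 20_736      # B200 GPUs in Project Ceiba
GPUS_PER_NVL72 = 72           # Blackwell GPUs per GB200 NVL72 system
CPUS_PER_NVL72 = 36           # Grace CPUs per GB200 NVL72 system
TOTAL_AI_EXAFLOPS = 414       # quoted aggregate AI performance


def ceiba_breakdown():
    """Derive per-system and per-GPU figures from the quoted totals.

    Assumption (not from the announcement): the cluster consists only of
    whole NVL72 systems, so the GPU count must divide evenly by 72.
    """
    systems, remainder = divmod(TOTAL_B200_GPUS, GPUS_PER_NVL72)
    if remainder:
        raise ValueError("GPU count is not a whole number of NVL72 systems")

    grace_cpus = systems * CPUS_PER_NVL72
    # 1 exaflop = 1,000 petaflops; this is an average implied by the totals.
    petaflops_per_gpu = TOTAL_AI_EXAFLOPS * 1_000 / TOTAL_B200_GPUS

    return {
        "nvl72_systems": systems,
        "grace_cpus": grace_cpus,
        "petaflops_per_gpu": petaflops_per_gpu,
    }


if __name__ == "__main__":
    for name, value in ceiba_breakdown().items():
        print(f"{name}: {value}")
```

Under that assumption, the derived Grace CPU count matches the 10,368 quoted in the text, and the aggregate figure works out to roughly 20 petaflops of AI performance per GPU on average.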

AWS and NVIDIA collaboration accelerates the development of generative AI applications and advanced use cases in healthcare and life sciences
AWS and NVIDIA have joined forces to deliver high-performance, low-cost inference for generative AI through the integration of Amazon SageMaker with NVIDIA NIM™ inference microservices, available with NVIDIA AI Enterprise. Customers can use this combination to quickly deploy pre-built foundation models (FMs) optimized to run on NVIDIA GPUs in SageMaker, reducing time to market for generative AI applications.

AWS and NVIDIA have partnered to expand computer-aided drug discovery with new NVIDIA BioNeMo™ foundation models for generative chemistry, protein structure prediction, and understanding how drug molecules interact with their targets. These new models will soon be available on AWS HealthOmics, a purpose-built service that helps healthcare and life sciences organizations store, query, and analyze genomic, transcriptomic, and other omics data.

The AWS HealthOmics and NVIDIA Healthcare teams are also working together to launch generative AI microservices to advance drug discovery, medical technology, and digital health, offering a new catalog of GPU-accelerated cloud endpoints for data from biology, chemistry, imaging, and healthcare so healthcare companies can take advantage of the latest advances in generative AI on AWS.

