Intel and Aible Accelerate GenAI Workloads

June 26, 2024 at 10:36 pm IST

What's New: Intel and Aible, an end-to-end serverless generative AI (GenAI) and augmented analytics enterprise solution, now offer solutions to shared customers to run advanced GenAI and retrieval-augmented generation (RAG) use cases on multiple generations of Intel® Xeon® CPUs. The collaboration, which includes engineering optimizations and a benchmarking program, enhances Aible's ability to deliver GenAI results at a low cost for enterprise customers and helps developers embed AI intelligence into applications. Together, the companies offer scalable and efficient AI solutions that draw on high-performing hardware to help customers solve challenges with AI and Intel.

"Customers are looking for efficient, enterprise-grade solutions to harness the power of AI. Our collaboration with Aible shows how we're closely working with the industry to deliver innovation in AI and lowering the barrier to entry for many customers to run the latest GenAI workloads using Intel Xeon processors."

-Mishali Naik, Intel senior principal engineer, Data Center and AI Group

About Xeon's GenAI Performance: Aible's solutions demonstrate how CPUs can significantly enhance performance across a range of the latest AI workloads, from running language models to RAG. Optimized for Intel processors, Aible's technology utilizes an efficient serverless end-to-end approach for AI, consuming resources only when there are active user requests. For example, the vector database activates for just a few seconds to retrieve information relevant to a user query, and the language model similarly powers up briefly to process and respond to the request. This on-demand operation helps reduce the total cost of ownership (TCO).
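The on-demand pattern described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Aible's implementation: the corpus, the bag-of-words `embed` function, and the `generate` placeholder are all hypothetical stand-ins for a real vector database, embedding model and language model. The point it shows is that retrieval and generation both happen inside the request handler, so compute is consumed only while a request is active.

```python
import math
import time
from collections import Counter

# Hypothetical in-memory corpus; a real deployment would query a vector database.
DOCUMENTS = [
    "Xeon processors support AVX-512 vector instructions.",
    "Serverless functions consume resources only while handling a request.",
    "Retrieval-augmented generation adds retrieved context to a prompt.",
]


def embed(text):
    """Toy bag-of-words 'embedding' (assumed stand-in for an embedding model)."""
    cleaned = text.lower().replace(".", "").replace("?", "")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, k=1):
    """Build the index on demand, rank the documents, return the top k."""
    index = [(doc, embed(doc)) for doc in DOCUMENTS]
    query_vec = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(query_vec, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def generate(query, context):
    """Placeholder for a language-model call; it only assembles a prompt."""
    return f"Context: {' '.join(context)}\nQuestion: {query}"


def handle_request(query):
    """Do all work inside the request and report how long it was active."""
    start = time.perf_counter()
    context = retrieve(query)
    answer = generate(query, context)
    active_seconds = time.perf_counter() - start
    return answer, active_seconds


if __name__ == "__main__":
    answer, seconds = handle_request("What does retrieval-augmented generation add to a prompt?")
    print(answer)
    print(f"active for {seconds:.6f} s")
```

In a serverless setting, the measured active time per request is what drives billing, which is why keeping both stages short-lived matters for cost.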

While RAG is often implemented using GPUs (graphics processing units) and accelerators to leverage their parallel processing capabilities, Aible's serverless technique, combined with Intel® Xeon® Scalable processors, allows RAG use cases to be powered entirely by CPUs. The performance data shows that multiple generations of Intel Xeon processors can run RAG workloads efficiently.

Results may vary. Configuration details below.

Why It Matters: Aible enables customers to lower the operational costs of GenAI projects by running exclusively on CPUs in serverless form, so that the same underlying compute resources can be shared more securely across multiple customers. The savings are comparable to buying electricity only when it is used rather than renting an electricity generator. Moreover, as demand for generative AI grows, optimizing both performance and energy consumption becomes more crucial. Aible's CPU-based services offer customers a cost-effective and energy-efficient solution.
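The electricity analogy comes down to simple arithmetic: a dedicated server is billed for every hour it is provisioned, while a serverless service is billed only for the seconds it is active. The sketch below works through that comparison. Every rate and volume in it is a hypothetical input chosen for illustration; none comes from Aible's or Intel's benchmark data, and real pricing models include other terms (memory size, request fees, minimum billing increments).

```python
def dedicated_cost(hourly_rate, hours_provisioned):
    """Cost of an always-on server: billed for every provisioned hour."""
    return hourly_rate * hours_provisioned


def on_demand_cost(rate_per_second, active_seconds_per_request, requests):
    """Cost of a pay-per-use service: billed only while requests are active."""
    return rate_per_second * active_seconds_per_request * requests


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    dedicated = dedicated_cost(hourly_rate=2.0, hours_provisioned=24 * 30)
    serverless = on_demand_cost(
        rate_per_second=0.001,
        active_seconds_per_request=4.0,
        requests=20_000,
    )
    print(f"dedicated:  {dedicated:.2f}")
    print(f"on demand:  {serverless:.2f}")
    print(f"ratio:      {dedicated / serverless:.1f}x")
```

The ratio depends entirely on utilization: the more idle time a dedicated server has, the larger the gap, and a service that is busy around the clock would see little or no saving.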

How Aible Solutions Help Customers Lower Costs: According to Aible's benchmark analysis, customers can realize up to a 55x cost saving when running RAG models on Aible's CPU-based serverless solutions¹. This cost reduction is a testament to the effectiveness of Aible's CPU-exclusive approach, which sidesteps the need for more expensive GPU-based infrastructure, whether delivered as shared services or dedicated servers.

How Intel Collaborates with Aible: Intel - including Intel Labs - has worked with Aible to optimize AI workloads on Xeon processors. Notably, by optimizing Aible's code for AVX-512, Aible saw significant performance gains and improved its throughput on Xeon processors, highlighting the impact of strategic software optimizations on overall efficiency.
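Code optimized for AVX-512 only benefits hosts whose CPUs expose those instructions, so a common first step is to check for them. A minimal sketch, assuming a Linux host where the kernel lists CPU feature flags in /proc/cpuinfo; the helper names are hypothetical, and this is not taken from Aible's or Intel's tooling.

```python
from pathlib import Path


def avx512_flags(cpuinfo_text):
    """Return the sorted AVX-512 feature flags found in cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(f for f in value.split() if f.startswith("avx512"))
    return sorted(flags)


def host_avx512_flags(path="/proc/cpuinfo"):
    """Read the host's flags; returns an empty list if the file is absent."""
    cpuinfo = Path(path)
    return avx512_flags(cpuinfo.read_text()) if cpuinfo.exists() else []


if __name__ == "__main__":
    found = host_avx512_flags()
    print("AVX-512 flags:", ", ".join(found) if found else "none detected")
```

An empty result means either that the CPU lacks AVX-512 or that the platform does not report flags this way (for example, non-Linux systems).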

The combination of RAG models with Intel Xeon processors, facilitated by platforms like Aible, can enable applications such as:

Natural language processing (NLP)
Recommendation systems
Decision support systems
Content generation

Intel's collaboration with Aible began with the launch of 4th Gen Xeon processors. The two companies have since optimized AI workloads, code and libraries for Xeon processors to increase performance for Aible's product offerings.

What's Next: Intel and Aible will demonstrate their solutions at the Amazon Web Services Summit in Washington, D.C., on June 26 and 27. Aible's solutions run on AWS Lambda and are available in the AWS Marketplace.

More Context: Read the full report (Aible.com) | 30 Days to AI Value: Development Best Practices from Intel and Aible (Intel.com) | Impact from AI in 30 Days (Aible Case Study) | Intel AI Analytics Toolkit

The Small Print:

¹ Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.

Configuration details:

1-node, 2x Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz, 28 cores, HT On, Turbo On, NUMA 2, Integrated Accelerators Available [used]: DLB 0 [0], DSA 0 [0], IAA 0 [0], QAT 0 [0], Total Memory 384GB (12x32GB DDR4 2933 MT/s [2934 MT/s]), BIOS SE5C620.86B.02.01.0017.110620230543, microcode 0x5003604, 2x Ethernet Connection X722 for 10GBASE-T, 1x 894.3G INTEL SSDSC2KB96, 1x 1.8T INTEL SSDPE2KX020T8, 2x 3.7T INTEL SSDPE2KX040T8, Red Hat Enterprise Linux 8.9 (Ootpa), 4.18.0-513.18.1.el8_9.x86_64, WORKLOAD=Aible End-to-end RAG-LLM, Model=Mistral-7B-OpenOrca-GGUF, all-MiniLM-L6-v2, gcc 12.2.0, IntelLLVM 2024.0.2, llama.cpp, ChromaDB, Langchain, oneAPI base container 2024.0.1-devel-ubuntu22.04. Tested by Intel on 03/07/24.

1-node, 2x Intel(R) Xeon(R) Platinum 8462Y+, 32 cores, HT On, Turbo On, NUMA 2, Integrated Accelerators Available [used]: DLB 2 [0], DSA 2 [0], IAA 2 [0], QAT 2 [0], Total Memory 512GB (16x32GB DDR5 4800 MT/s [4800 MT/s]), BIOS 05.12.00, microcode 0x2b0004d0, 2x BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller, 2x Ethernet Controller E810-C for QSFP, 2x 3.5T SAMSUNG MZQL23T8HCLS-00B7C, 1x 1.8T SAMSUNG MZ1L21T9HCLS-00A07, Red Hat Enterprise Linux 8.9 (Ootpa), 4.18.0-513.18.1.el8_9.x86_64, WORKLOAD=Aible End-to-end RAG-LLM, Model=Mistral-7B-OpenOrca-GGUF, all-MiniLM-L6-v2, gcc 12.2.0, IntelLLVM 2024.0.2, llama.cpp, ChromaDB, Langchain, oneAPI base container 2024.0.1-devel-ubuntu22.04. Tested by Intel on 03/07/24.

1-node, 2x INTEL(R) XEON(R) PLATINUM 8562Y+, 32 cores, HT On, Turbo On, NUMA 2, Integrated Accelerators Available [used]: DLB 2 [0], DSA 2 [0], IAA 2 [0], QAT 2 [0], Total Memory 512GB (16x32GB DDR5 5600 MT/s [5600 MT/s]), BIOS 3B05.TEL4P1, microcode 0x21000161, 2x Ethernet Controller X710 for 10GBASE-T, 2x Ethernet Controller E810-C for QSFP, 1x 894.3G INTEL SSDSC2KG96, 1x 3.5T SAMSUNG MZQL23T8HCLS-00A07, 3x 3.5T SAMSUNG MZQL23T8HCLS-00B7C, Red Hat Enterprise Linux 8.9 (Ootpa), 4.18.0-513.18.1.el8_9.x86_64, WORKLOAD=Aible End-to-end RAG-LLM, Model=Mistral-7B-OpenOrca-GGUF, all-MiniLM-L6-v2, gcc 12.2.0, IntelLLVM 2024.0.2, llama.cpp, ChromaDB, Langchain, oneAPI base container 2024.0.1-devel-ubuntu22.04. Tested by Intel on 03/07/24.

Disclaimer

Intel Corporation published this content on 26 June 2024 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 26 June 2024 17:05:48 UTC.
