AI Performance Engineering: From GPU Kernels to LLM Inference

★★★★☆ 4.0 98 reviews

US$11.58

Price when purchased online

Free shipping Free 30-day returns

Sold and shipped by onpointchs.com

Shipping	Arrives Jun 28	Free
Pickup	Check nearby
Delivery	Not available

Product details

Management number	231975251
Release Date	2026/06/18
List Price	US$11.58
Model Number	231975251
Category	Books > Computers & Technology > Computer Science > AI & Machine Learning > Machine Theory

A hands-on guide to making AI systems fast, from GPU kernels to production LLM inference.

Most AI systems run well below the speed their hardware allows: GPUs idle waiting on data, LLMs serve a fraction of their throughput, and adding hardware sometimes makes things slower. AI Performance Engineering: From GPU Kernels to LLM Inference is a practitioner's guide to diagnosing, profiling, and fixing those bottlenecks systematically, with real tools and runnable code, from hardware first principles to production LLM serving.

What You Will Learn

GPU architecture and the roofline model: classify any kernel as compute- or memory-bound, from first principles.
Professional profiling: Nsight Systems and Compute, torch.profiler, Linux perf, eBPF, and CPU flame graphs.
PyTorch optimization: mixed precision, quantization, torch.compile, CUDA Graphs, and DataLoader tuning.
LLM inference: prefill vs decode, the KV cache and grouped-query attention, PagedAttention, continuous batching, and speculative decoding.
Distributed inference and training: tensor and pipeline parallelism, NCCL cost, FSDP, Mixture-of-Experts, and disaggregated serving.
Honest benchmarking: avoid the five common mistakes and build throughput-latency curves that survive review.
2024-2026 hardware: NVIDIA Blackwell, AMD MI300X/ROCm, Intel Gaudi 3, AWS Trainium, Apple Silicon, and CXL memory.
Production operation: vLLM serving, observability with DCGM/Prometheus/Grafana, multi-GPU scaling, and cost per token.

Hands-On From Start to Finish

Every chapter pairs concepts with runnable Python, with no toy examples. Seven end-to-end capstone projects mirror real production work, and the companion repository ships 82 runnable exercises, most with CPU fallbacks.

Interview Preparation

Appendix C provides 50 interview questions with model answers across GPU architecture, profiling, LLM inference, distributed systems, and benchmarking, organized by domain for targeted study.

Inside the Book

Nine parts, 31 chapters, seven capstone projects, six appendices, and a glossary: roughly 330 pages, from CPU caches and NUMA through the CUDA execution model and LLM inference internals to production fleet economics.

Who This Book Is For

ML engineers, AI infrastructure and platform engineers, and senior software and systems engineers who profile and optimize AI workloads in Python and PyTorch. GPU experience helps but is not required; a CUDA-capable GPU is needed for the GPU-programming chapters, and the rest run on CPU.

ASIN	B0H2ZC9JGM
ISBN13	979-8198692480
Language	English
Publisher	Independently published
Dimensions	7 x 0.76 x 10 inches
Item Weight	1.61 pounds
Print length	336 pages
Publication date	May 26, 2026

Customer ratings & reviews

4 out of 5

★★★★☆

98 ratings | 40 reviews

5 stars	75% (74)
4 stars	8% (8)
3 stars	4% (4)
2 stars	2% (2)
1 star	11% (11)

There are currently no written reviews for this product.

Shipping Rates

Order Amount	Shipping Fee	Handling Fee
Under $99	$12.99	$24.00
$99 - $499	FREE	$24.00
$500 and above	FREE	FREE

Delivery Time

Standard Shipping: 5-7 business days
Express Shipping: 2-3 business days (additional $15)
Overnight Shipping: Next business day (additional $35)

Available Regions

We ship to all 50 US states, Canada, and select international destinations through our partner Neokyo.

Bestseller ranking

Machine Theory

Advances in Computing and Information - ICCI '90: International Conference on Computing and Information Niagara Falls, Canada, May 23-26, 1990. Proceedings (Lecture Notes in Computer Science, 468)

Deep Learning with TensorFlow and Keras: Build and deploy supervised, unsupervised, deep, and reinforcement learning models, 3rd Edition

Foundations of Software Science and Computation Structures: Second International Conference, FOSSACS'99, Held as Part of the Joint European ... (Lecture Notes in Computer Science, 1578)

Theoretical Aspects of Computing - ICTAC 2015: 12th International Colloquium, Cali, Colombia, October 29-31, 2015, Proceedings (Theoretical Computer Science and General Issues)

Real-Time: Theory in Practice: REX Workshop, Mook, The Netherlands, June 3-7, 1991. Proceedings (Lecture Notes in Computer Science, 600)

Mathematical Foundations of Computer Science 1990: Banska Bystrica, Czechoslovakia, August 27-31, 1990 Proceedings (Lecture Notes in Computer Science, 452)
