Practical GPU Programming : High-performance computing with CUDA, CuPy, and Python on modern GPUs - Maris Fenlor

Practical GPU Programming

High-performance computing with CUDA, CuPy, and Python on modern GPUs

By: Maris Fenlor

Paperback | 20 February 2025


RRP $109.99

$90.75

17% OFF

Ships in 5 to 7 business days

If you're a Python pro looking to get the most out of your code with GPUs, then Practical GPU Programming is the right book for you. This book will walk you through the basics of GPU architectures, show you hands-on parallel programming techniques, and give you the know-how to confidently speed up real workloads in data processing, analytics, and engineering.

You'll start by setting up the environment, installing CUDA, and getting a handle on Python libraries like PyCUDA and CuPy. You'll then dive into memory management, kernel execution, and parallel patterns like reductions and histogram computations. Next come sorting and search techniques, with a focus on how GPU acceleration transforms business data processing. We'll also put a strong emphasis on linear algebra to show you how to supercharge classic vector and matrix operations with cuBLAS and CuPy. Plus, with batched computations, efficient broadcasting, custom kernels, and mixed-library workflows, you can tackle both standard and advanced problems with ease.

Throughout, we evaluate numerical accuracy and performance side by side, so you can understand both the strengths and limitations of GPU-based solutions. The book covers nearly every essential skill and modern toolkit for practical GPU programming, but it's not going to turn you into a master overnight.

Key Learnings

  • Boost processing speed and efficiency for data-intensive tasks.
  • Use CuPy and PyCUDA to write and execute custom CUDA kernels.
  • Maximize GPU occupancy and throughput by choosing suitable thread block and grid configurations.
  • Reduce global memory bottlenecks in kernels by using shared memory and coalesced access patterns.
  • Perform dynamic kernel compilation to ensure tailored performance.
  • Use CuPy to carry out custom, high-speed elementwise GPU operations and expressions.
  • Implement bitonic and radix sort algorithms for large or batch integer datasets.
  • Execute parallel linear search kernels to detect patterns rapidly.
  • Scale matrix operations using Batched GEMM and high-level cuBLAS routines.

Table of Contents

  1. Introduction to GPU Fundamentals
  2. Setting up GPU Programming Environment
  3. Basic Data Transfers and Memory Types
  4. Simple Parallel Patterns
  5. Introduction to Kernel Optimization
  6. Working with PyCUDA and CuPy Features
  7. Practical Sorting and Search
  8. Linear Algebra Essentials on GPU
