Serverless ETL and Analytics with AWS Glue

Design scalable data lakes, optimize ETL pipelines, and accelerate analytics on AWS

By: Noritaka Sekiyama, Albert Quiroga, Tomohiro Tanaka, Subramanya Vajiraya, Ishan Gaur

eBook | 11 September 2026

At a Glance

RRP $54.99, now $49.49 (10% off)

Available: 11th September 2026

Preorder. Download available after release.

Build scalable, cost-efficient data platforms with AWS Glue and automate complex ETL pipelines to power modern analytics and governed data lakes

Key Features

  • Master production-grade AWS Glue jobs on serverless Spark for enterprise-scale ETL
  • Design governed data lakes using Glue Data Catalog and Lake Formation
  • Written by AWS Glue community members, this practical guide shows you how to implement AWS Glue in no time

Book Description

Building a modern data platform is no longer just about moving data. Organizations must scale reliably, control costs, enforce governance, and accelerate analytics. This book shows you how to design and operate production-grade data platforms using AWS Glue and related AWS analytics services.

You will begin with core data management concepts before moving into ingestion from diverse sources, data preparation strategies, metadata management, security controls, and cross-account data sharing. You will learn how to design efficient data layouts, orchestrate pipelines, implement CI/CD practices, and manage the full lifecycle of data integration workloads. This updated edition expands coverage of open table formats such as Apache Hudi, Delta Lake, and Apache Iceberg, along with performance tuning, observability, cost optimization, and real-world troubleshooting. You will also explore integrations with machine learning and generative AI workflows powered by Glue and SageMaker.

Written by AWS engineers and architects with deep hands-on experience in large-scale enterprise data lakes, this guide blends architecture principles with practical implementation insight. By the end of this book, you will be able to design, deploy, monitor, and optimize scalable serverless ETL pipelines and governed data platforms on AWS.

What you will learn

  • Design scalable serverless ETL pipelines with AWS Glue
  • Ingest data from files, streams, SaaS, and JDBC sources
  • Optimize data layout and storage for faster analytics
  • Implement metadata governance and lineage tracking
  • Secure data with access control, encryption, and auditing
  • Automate CI/CD for production data pipelines
  • Tune performance and troubleshoot Glue workloads
  • Apply open table formats for modern data lakes

Who this book is for

This book is for data engineers, ETL developers, cloud architects, and analytics professionals building scalable data platforms on AWS. If you are designing serverless data lakes, optimizing Spark-based ETL pipelines, or implementing governance and cost controls across analytics workloads, this guide is for you. A foundational understanding of AWS services and data processing concepts will help you get the most from this book.
