Description
Introduction to Apache Spark
What is Apache Spark?
Evolution from Hadoop to Spark
Spark vs. Traditional Data Processing
Spark Ecosystem Overview
Cluster Architecture (Driver, Executors, Tasks)
Setting Up Databricks & Spark
Introduction to Databricks Platform
Creating a Databricks Workspace
Databricks Notebooks & Commands
Clusters: Types, Configurations & Autoscaling
Databricks File System (DBFS)
Connecting Databricks with Cloud Storage (Azure)
Spark DataFrames & Transformations
Introduction to Spark DataFrames
Schema Inference & Explicit Schema Definition
Transformations & Actions (map, filter, reduce, groupBy, etc.)
Joins, Aggregations & Window Functions
Handling Missing Data & Data Cleansing
Caching & Performance Optimization
Spark SQL & Delta Lake
Writing SQL Queries in Spark
Spark DataFrames vs. Spark SQL
Hive Metastore Integration
Introduction to Delta Lake
Schema Evolution & Time Travel
ACID Transactions in Delta Lake
Optimizing Delta Tables (Z-Ordering, Auto-Optimize, Vacuum)
Spark Streaming on Databricks
Introduction to Spark Streaming
Structured Streaming Architecture
Streaming Sources (Kafka, Event Hubs, Delta Tables)
Stream Processing & Window Operations
Stateful Processing & Checkpointing
Writing Streaming Data to Delta Lake
Advanced Spark Optimization
Lazy Evaluation & Catalyst Optimizer
Tungsten Execution Engine
Broadcast Joins & Skew Handling
Adaptive Query Execution (AQE)
Partitioning & File Format Optimization
Best Practices for Spark Job Performance
Microsoft Fabric and DP-700 Certification Preparation
Microsoft Certified: Fabric Data Engineer Associate
Implement and Manage an Analytics Solution
Configure Microsoft Fabric workspace settings
Configure Spark workspace settings
Configure domain workspace settings
Configure OneLake workspace settings
Configure data workflow workspace settings
Implement lifecycle management in Fabric
Configure version control
Implement database projects
Create and configure deployment pipelines
Configure Security and Governance
Implement workspace-level access controls
Implement item-level access controls
Implement row-level, column-level, object-level, and file-level access controls
Implement dynamic data masking
Apply sensitivity labels to items
Endorse items
Orchestrate Processes
Choose between a pipeline and a notebook
Design and implement schedules and event-based triggers
Implement orchestration patterns with notebooks and pipelines, including parameters and dynamic expressions
Ingest and Transform Data
Design and implement loading patterns
Design and implement full and incremental data loads
Prepare data for loading into a dimensional model
Design and implement a loading pattern for streaming data
Monitor and Optimize an Analytics Solution
Monitor Fabric items
Monitor data ingestion
Monitor data transformation
Monitor semantic model refresh
Configure alerts
Identify and resolve errors
Optimize performance across lakehouses, pipelines, and data warehouses
DP-700 Exam Preparation
Comprehensive Review of Course Content
Practice Tests & Tips for Exam Success
Q&A Sessions with Instructors
Mock Exams and Real-World Problem Solving
Duration & Timings:
Duration – 75 Hours.
Training Type: Instructor Led Live Interactive Sessions.
Faculty: Experienced.
Weekday Session: Mon–Thu, 8:30 PM to 10:30 PM (EST), 10 Weeks. April 28, 2025.
USA: +1 734 418 2465 | India: +91 40 4018 1306