A single Fabric Capacity is easy to set up. But what about the risk of a single user taking down your entire data estate with a poorly written query? In this session, we’ll explore how to split and configure Fabric Capacities for different workloads. We’ll cover autoscaling, auto-pause, bursting, smoothing, monitoring, and more. By the end of this talk, you’ll be ready to build a scalable and robust Fabric Workspace and Capacity design for the entire organization.
In these slides
Intro
Why multiple Workspaces
Capacities
Throttling
Avoiding throttling
Conclusion
FABRIC CAPACITIES Sam Debruyn Data Saturday Holland October 2024 DIVIDE & CONQUER
Thank you sponsors!
Rate Data Saturday Holland 1 review = 1 € Towards beating pancreatic cancer
Who am I? Sam Debruyn 📍 Heist-op-den-Berg, BE 💼 Consultant / Data & Cloud Architect 5⃣ years in data 🔟 years in software / architecture / cloud 🫶 Fabric, Azure, modern data stack
What we'll talk about Intro Why multiple Workspaces Capacities Throttling Avoiding throttling Conclusion SLIDES AVAILABLE AT THE END
Setting the stage… Fabric medallion architecture example
The 3 Layers of the Medallion Architecture
The 3 Layers of the Medallion Architecture Curated/gold Purpose : high-quality data supporting business reporting, advanced analytics. Pre-aggregated and tailored to analytical needs.
Overview Overview: entire platform (example)
Did I invent this? No, this is also how Microsoft recommends it
Easy to extend
Workspaces & Capacities
Fabric concepts: Workspaces & Capacities
Capacity
• pool of Capacity Units
• matches a certain amount of compute power
• to be spread amongst one or more Workspaces
Workspace
• logical grouping of items
• Lakehouses, Warehouses, Reports, KQL, …
• possible access control boundary
Why should you create separate Workspaces?
Workspace Configuration
Why should you create separate Workspaces?
Capacities
Capacity SKUs
Bursting & smoothing
Example background Example interactive
Bursting & smoothing
SKU    CUs   Available CUs, interactive   Available CUs, background   Actual workload duration
             workloads (10 min)           workloads (24 h)            & consumption
F2     2     1,200                        172,800                     ASAP*
F4     4     2,400                        345,600                     ASAP*
F8     8     4,800                        691,200                     ASAP*
F16    16    9,600                        1,382,400                   ASAP*
F32    32    19,200                       2,764,800                   ASAP*
F64    64    38,400                       5,529,600                   ASAP*
F128   128   76,800                       11,059,200                  ASAP*
…      …     …                            …                           …
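The figures on this slide can be reproduced by multiplying the SKU's CU count by the length of the smoothing window in seconds. The sketch below assumes the values are best read as CU-seconds per window (the slide labels the columns "Available CUs"); the function and constant names are illustrative only.

```python
# Smoothing windows as named on the slide: 10 minutes for interactive
# workloads, 24 hours for background workloads.
INTERACTIVE_WINDOW_S = 10 * 60
BACKGROUND_WINDOW_S = 24 * 60 * 60


def smoothing_budget(capacity_units: int) -> tuple[int, int]:
    """Return (interactive, background) CU-seconds available per window."""
    return (
        capacity_units * INTERACTIVE_WINDOW_S,
        capacity_units * BACKGROUND_WINDOW_S,
    )


if __name__ == "__main__":
    # Print the same rows as the slide, F2 through F128.
    for cu in (2, 4, 8, 16, 32, 64, 128):
        interactive, background = smoothing_budget(cu)
        print(f"F{cu:<4} {interactive:>8,} {background:>12,}")
```

Doubling the SKU doubles both budgets, which is why moving a heavy background job to a larger Capacity buys headroom linearly.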
Bursting & smoothing
Impact of SKU choice
Capacities determine feature availability
• E.g. Copilot, Power BI only F64 or higher
Capacities determine how features are available
• Nodes and cores/node in Spark (2 vCores per CU – burst factor 3 | 0.25 nodes per CU)
• Compute nodes in Data Warehouse
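As a rough illustration of the Spark figures on this slide, the sketch below applies the three ratios (2 vCores per CU, burst factor 3, 0.25 nodes per CU) to a given SKU. It uses only the numbers from the slide; the function name and the returned keys are hypothetical, and actual limits should be checked against current Microsoft documentation.

```python
VCORES_PER_CU = 2    # from the slide
BURST_FACTOR = 3     # from the slide
NODES_PER_CU = 0.25  # from the slide


def spark_limits(capacity_units: int) -> dict[str, float]:
    """Estimate Spark compute for a Capacity using the slide's ratios."""
    base_vcores = capacity_units * VCORES_PER_CU
    return {
        "base_vcores": base_vcores,
        "burst_vcores": base_vcores * BURST_FACTOR,
        "nodes": capacity_units * NODES_PER_CU,
    }


if __name__ == "__main__":
    for cu in (2, 8, 64):
        print(f"F{cu}: {spark_limits(cu)}")
```

For small SKUs the node count drops below one, which gives a feel for why Spark workloads behave very differently on an F2 than on an F64.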
supported regions Capacity level settings
Bursting & smoothing
Throttling
How to think of a Capacity
What to do when throttled
What to do when throttled (F Capacity only!)
What to do when throttled (P Capacity only!)
Optimizing CUs usage and cost
Optimizing CUs usage and cost
Reservations <> Capacities!
Reservation: pool of available CUs to be used in one or more Capacities
E.g. Reservation of 64 CUs:
• 1x F64
• 2x F32
• 2x F16 + 1x F32
• 4x F4 + 2x F8 + 2x F16
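The example splits on this slide all add up to the same 64 reserved CUs. A small check like the following makes that explicit; it assumes the number in an F SKU name equals its CU count (as in the SKU table earlier in the deck), and the helper name is illustrative.

```python
RESERVED_CUS = 64


def cus_used(allocation: dict[str, int]) -> int:
    """Sum the CUs of an allocation such as {"F16": 2, "F32": 1}.

    Assumes the number after "F" in the SKU name is the CU count.
    """
    return sum(int(sku[1:]) * count for sku, count in allocation.items())


# The four splits listed on the slide.
examples = [
    {"F64": 1},
    {"F32": 2},
    {"F16": 2, "F32": 1},
    {"F4": 4, "F8": 2, "F16": 2},
]

if __name__ == "__main__":
    for allocation in examples:
        total = cus_used(allocation)
        print(allocation, total, "fits" if total <= RESERVED_CUS else "exceeds")
```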
Optimizing CUs usage and cost
Reservation <> max amount of CUs
💡 HINT: Reserve what you’re using on a daily basis
Combine with PAYG for infrequent usage
Create Logic App to auto-pause PAYG Capacity
Resume PAYG when needed
E.g. can even be through Fabric API from a Notebook in the Capacity using reserved CUs
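One way to pause or resume a PAYG Capacity from code is to call the suspend and resume actions that Azure Resource Manager exposes for Fabric capacities. The sketch below only builds the request; it is not necessarily what the speaker's Logic App or Notebook does. The API version is an assumption to verify against the current Azure REST reference, and acquiring the bearer token (for example through a managed identity) is left out.

```python
import urllib.request

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2023-11-01"  # assumption: check the current Azure REST reference


def capacity_action_url(
    subscription_id: str, resource_group: str, capacity_name: str, action: str
) -> str:
    """Build the ARM URL for suspending or resuming a Fabric Capacity."""
    if action not in ("suspend", "resume"):
        raise ValueError("action must be 'suspend' or 'resume'")
    return (
        f"{ARM_ENDPOINT}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Fabric/capacities/{capacity_name}"
        f"/{action}?api-version={API_VERSION}"
    )


def build_request(url: str, bearer_token: str) -> urllib.request.Request:
    """Prepare an authenticated POST with an empty body; the caller sends it."""
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"Authorization": f"Bearer {bearer_token}"},
    )


# Usage (placeholder identifiers, not executed here):
# req = build_request(capacity_action_url("<sub-id>", "<rg>", "<capacity>", "suspend"), token)
# urllib.request.urlopen(req)
```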
Optimizing CUs usage and cost Require Power BI Premium? PPU is still available! Move Power BI reports to separate Workspace(s) + activate PPU on these Workspace(s)
More ideas
Use data from Capacity Metrics App or time-based triggers to:
• Automatically scale up/down Capacity
• Automatically move Workspaces to different Capacities
• Automatically pause/resume Capacities
• Automatically create new Capacities
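The scale up/down idea can be sketched as a simple decision rule over a utilization figure taken from the Capacity Metrics App. Everything here is hypothetical: the 80% and 30% thresholds, the SKU ladder, and the function name are placeholders to tune, and applying the chosen SKU to the Capacity is a separate step.

```python
# SKU ladder expressed as CU counts (F2 ... F128), for illustration only.
SKUS = [2, 4, 8, 16, 32, 64, 128]


def next_sku(
    current_cus: int,
    utilization: float,
    scale_up_at: float = 0.8,    # hypothetical threshold
    scale_down_at: float = 0.3,  # hypothetical threshold
) -> int:
    """Return the CU count to move to, one step at a time.

    utilization is the fraction of the Capacity in use (0.0 to 1.0 or higher).
    """
    index = SKUS.index(current_cus)
    if utilization >= scale_up_at and index < len(SKUS) - 1:
        return SKUS[index + 1]
    if utilization <= scale_down_at and index > 0:
        return SKUS[index - 1]
    return current_cus


if __name__ == "__main__":
    print(next_sku(8, 0.9))  # busy Capacity: step up
    print(next_sku(8, 0.1))  # idle Capacity: step down
```

Moving one step at a time keeps the rule predictable; a time-based trigger could call the same function with a fixed target instead of a measured utilization.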
Capacity Metrics App
Announced at FabConEurope…
Why should you create separate Workspaces?
How access can be managed in Fabric
Workspace level roles: Admin, Member, Contributor, Viewer
Item sharing: Read, Edit, Share
Data sharing: Read, ReadData, ReadAll
OneLake RBAC (preview)
Note: this will probably be improved with the introduction of OneSecurity
Quiz
Recap
RECAP: Medallion layers: bronze, silver, gold
Recap Split Workspaces by type of workload and role in the data fabric
Recap Single Capacities are good for trials, but we should avoid them for actual implementations
Recap Access control can be complex, start by managing access on the Workspace level
Sam’s 5 golden rules for Workspace & Capacity design in Fabric
Slides
Slides available at https://debruyn.dev/dsh24
Rate this session
Questions? sam@debruyn.dev https://debruyn.dev