Microsoft Fabric Medallion Data Engineering Data Platform Architecture Capacity Management OneLake Power BI Security

Fabric Capacities: Divide and Conquer

Data Saturday Holland — Kinepolis Jaarbeurs, Utrecht, the Netherlands
About this talk

A single Fabric Capacity is easy to set up. But what about the risk of a single user taking down your entire data estate with a poorly written query? In this session, we’ll explore how to split and configure Fabric Capacities for different workloads. Autoscaling, auto-pause, bursting, smoothing, monitoring, and more: we’ll cover it all. By the end of this talk, you’ll be ready to build a scalable and robust Fabric Workspace and Capacity design for the entire organization.

In these slides
  1. Intro
  2. Why multiple Workspaces
  3. Capacities
  4. Throttling
  5. Avoiding throttling
  6. Conclusion
FABRIC CAPACITIES: DIVIDE & CONQUER
Sam Debruyn, Data Saturday Holland, October 2024

Thank you sponsors!
Rate Data Saturday Holland: 1 review = 1 € towards beating pancreatic cancer

Who am I?
Sam Debruyn
📍 Heist-op-den-Berg, BE
💼 Consultant / Data & Cloud Architect
5⃣ years in data
🔟 years in software / architecture / cloud
🫶 Fabric, Azure, modern data stack

What we'll talk about: Intro, Why multiple Workspaces, Capacities, Throttling, Avoiding throttling, Conclusion
SLIDES AVAILABLE AT THE END

Setting the stage…
Fabric medallion architecture example
The 3 Layers of the Medallion Architecture
Curated/gold. Purpose: high-quality data supporting business reporting and advanced analytics. Pre-aggregated and tailored to analytical needs.
Overview: entire platform (example)
Did I invent this? No, this is also how Microsoft recommends it
Easy to extend

Workspaces & Capacities
Fabric concepts: Workspaces & Capacities
Capacity
• pool of Capacity Units
• matches a certain amount of compute power
• to be spread amongst one or more Workspaces
Workspace
• logical grouping of items
• Lakehouses, Warehouses, Reports, KQL, …
• possible access control boundary
Why should you create separate Workspaces?
Workspace Configuration

Capacities
Capacity SKUs
Bursting & smoothing
Example background
Example interactive

SKU | CUs | Available CUs for interactive 10min workloads | Available CUs for background 24h workloads | Actual workload duration & consumption
F2 | 2 | 1,200 | 172,800 | ASAP*
F4 | 4 | 2,400 | 345,600 | ASAP*
F8 | 8 | 4,800 | 691,200 | ASAP*
F16 | 16 | 9,600 | 1,382,400 | ASAP*
F32 | 32 | 19,200 | 2,764,800 | ASAP*
F64 | 64 | 38,400 | 5,529,600 | ASAP*
F128 | 128 | 76,800 | 11,059,200 | ASAP*
… | … | … | … | …

Impact of SKU choice
Capacities determine feature availability
• E.g. Copilot, Power BI only F64 or higher
Capacities determine how features are available
• Nodes and cores/node in Spark (2 vCores per CU, burst factor 3 | 0.25 nodes per CU)
• Compute nodes in Data Warehouse
• supported regions
Capacity level settings

Throttling
How to think of a Capacity
What to do when throttled
F Capacity only!
P Capacity only!

Optimizing CU usage and cost
Reservations <> Capacities!
Reservation: pool of available CUs to be used in one or more Capacities
E.g. Reservation of 64 CUs:
• 1x F64
• 2x F32
• 2x F16 + 1x F32
• 4x F4 + 2x F8 + 2x F16
Reservation <> max amount of CUs
💡 HINT: Reserve what you’re using on a daily basis
• Combine with PAYG for infrequent usage
• Create Logic App to auto-pause PAYG Capacity
• Resume PAYG when needed
• E.g. can even be through Fabric API from a Notebook in the Capacity using reserved CUs
Require Power BI Premium? PPU is still available!
Move Power BI reports to separate Workspace(s) + activate PPU on these Workspace(s)

More ideas
Use data from Capacity Metrics App or time-based triggers to:
• Automatically scale up/down Capacity
• Automatically move Workspaces to different Capacities
• Automatically pause/resume Capacities
• Automatically create new Capacities
Capacity Metrics App
Announced at FabConEurope…
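The figures on the bursting & smoothing slide appear to be the SKU's CU count multiplied by the length of the smoothing window in seconds (10 minutes for interactive, 24 hours for background). The sketch below reproduces those numbers, the Spark vCore ratio quoted on the SKU slide, and the reservation split of 64 CUs. The function names are hypothetical and exist only for this illustration; reading the table values as CU-seconds is an assumption based on the arithmetic, not something the slides state.

```python
# Assumption: the table values are CU-seconds, i.e. CUs x window length in seconds.
INTERACTIVE_WINDOW_S = 10 * 60       # 10-minute window for interactive workloads
BACKGROUND_WINDOW_S = 24 * 60 * 60   # 24-hour window for background workloads
SPARK_VCORES_PER_CU = 2              # ratio quoted on the slide
SPARK_BURST_FACTOR = 3               # burst factor quoted on the slide


def sku_cus(sku: str) -> int:
    """Return the Capacity Units of an F SKU name, e.g. 'F64' -> 64."""
    if not sku.startswith("F") or not sku[1:].isdigit():
        raise ValueError(f"not an F SKU: {sku!r}")
    return int(sku[1:])


def smoothing_budget(sku: str) -> tuple[int, int]:
    """Return (interactive, background) CU-seconds available per window."""
    cus = sku_cus(sku)
    return cus * INTERACTIVE_WINDOW_S, cus * BACKGROUND_WINDOW_S


def spark_vcores(sku: str, burst: bool = False) -> int:
    """Return Spark vCores for a SKU, optionally with the burst factor applied."""
    vcores = sku_cus(sku) * SPARK_VCORES_PER_CU
    return vcores * SPARK_BURST_FACTOR if burst else vcores


def reservation_headroom(reserved_cus: int, capacities: list[str]) -> int:
    """Return reserved CUs minus the CUs of the listed Capacities.

    Zero means the reservation is fully used; a negative value means the
    Capacities add up to more CUs than were reserved.
    """
    return reserved_cus - sum(sku_cus(sku) for sku in capacities)


if __name__ == "__main__":
    for sku in ("F2", "F64", "F128"):
        interactive, background = smoothing_budget(sku)
        print(f"{sku}: interactive={interactive} background={background}")
    split = ["F4"] * 4 + ["F8"] * 2 + ["F16"] * 2
    print("headroom:", reservation_headroom(64, split))
```

Each split listed on the reservation slide leaves a headroom of zero, which is one way to sanity-check a planned set of Capacities against a reservation before creating them.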
How access can be managed in Fabric
• Workspace level roles: Admin, Member, Contributor, Viewer
• Item sharing: Read, Edit, Share
• Data sharing: Read, ReadData, ReadAll
• OneLake RBAC (preview)
Note: this will probably be improved with the introduction of OneSecurity

Quiz

Recap
• Medallion layers: bronze, silver, gold
• Split Workspaces by type of workload and role in the data fabric
• Single Capacities are good for trials, but we should avoid them for actual implementations
• Access control can be complex, start by managing access on the Workspace level

Sam’s 5 golden rules for Workspace & Capacity design in Fabric

Slides
Slides available at https://debruyn.dev/dsh24

Rate this session
Questions? sam@debruyn.dev
https://debruyn.dev