<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Data Lake on Sam Debruyn</title><link>https://debruyn.dev/tags/data-lake/</link><description>Recent content in Data Lake on Sam Debruyn</description><generator>Hugo</generator><language>en-us</language><copyright>© Copyright Debruyn Consultancy</copyright><lastBuildDate>Fri, 30 May 2025 12:04:58 +0200</lastBuildDate><atom:link href="https://debruyn.dev/tags/data-lake/index.xml" rel="self" type="application/rss+xml"/><item><title>Fabric: Lakehouse or Data Warehouse?</title><link>https://debruyn.dev/2023/fabric-lakehouse-or-data-warehouse/</link><pubDate>Thu, 19 Oct 2023 10:24:43 +0200</pubDate><guid>https://debruyn.dev/2023/fabric-lakehouse-or-data-warehouse/</guid><description>&lt;p&gt;There are 2 kinds of companies currently active in the Microsoft data space: those who are migrating to Microsoft Fabric, and those who will &lt;em&gt;soon&lt;/em&gt; be planning their migration to Microsoft Fabric. 😅 One question that often comes back is&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Should I focus on the Lakehouse or the Data Warehouse?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Let&amp;rsquo;s answer that in this post. I can already tell you this: you&amp;rsquo;re asking the wrong question 😉&lt;/p&gt;</description></item><item><title>Is Microsoft Fabric just a rebranding?</title><link>https://debruyn.dev/2023/is-microsoft-fabric-just-a-rebranding/</link><pubDate>Mon, 02 Oct 2023 10:53:24 +0200</pubDate><guid>https://debruyn.dev/2023/is-microsoft-fabric-just-a-rebranding/</guid><description>&lt;p&gt;It&amp;rsquo;s a question I see popping up every now and then. Is Microsoft Fabric just a rebranding of existing Azure services like Synapse, Data Factory, Event Hubs, Stream Analytics, etc.? Is it something more? Or is it something entirely new?&lt;/p&gt;
&lt;p&gt;I hate clickbait titles as much as you do. So, before we dive in, let me answer the question right away. &lt;strong&gt;No, Fabric is not just a rebranding.&lt;/strong&gt; I would not even describe Fabric as an &lt;em&gt;evolution&lt;/em&gt; (as Microsoft often does), but rather as a &lt;em&gt;&lt;strong&gt;revolution&lt;/strong&gt;&lt;/em&gt;! Now, let&amp;rsquo;s find out why.&lt;/p&gt;</description></item><item><title>My take-aways from Big Data London: Delta Lake &amp; the open lakehouses</title><link>https://debruyn.dev/2023/my-take-aways-from-big-data-london-delta-lake-the-open-lakehouses/</link><pubDate>Mon, 25 Sep 2023 10:20:35 +0100</pubDate><guid>https://debruyn.dev/2023/my-take-aways-from-big-data-london-delta-lake-the-open-lakehouses/</guid><description>Last week I attended Big Data London. Both days were filled with interesting sessions, most of them focussing on one of the vendors exhibiting at the conference. There are 2 things I am taking away from this conference: Delta Lake has won the data format wars, and your next data platform is either Snowflake or an open Lakehouse.</description></item><item><title>How to use service principal authentication to access Microsoft Fabric's OneLake</title><link>https://debruyn.dev/2023/how-to-use-service-principal-authentication-to-access-microsoft-fabrics-onelake/</link><pubDate>Tue, 01 Aug 2023 10:28:11 +0200</pubDate><guid>https://debruyn.dev/2023/how-to-use-service-principal-authentication-to-access-microsoft-fabrics-onelake/</guid><description>&lt;p&gt;Microsoft recently added support for authenticating to OneLake using service principals and managed identities. This allows users to access OneLake from applications without having to use a user account. 
Let&amp;rsquo;s see how this works.&lt;/p&gt;</description></item><item><title>Microsoft Fabric's Auto Discovery: a closer look</title><link>https://debruyn.dev/2023/microsoft-fabrics-auto-discovery-a-closer-look/</link><pubDate>Wed, 28 Jun 2023 09:00:00 +0200</pubDate><guid>https://debruyn.dev/2023/microsoft-fabrics-auto-discovery-a-closer-look/</guid><description>&lt;p&gt;In &lt;a
 href="https://debruyn.dev/tags/fabric/"
 &gt;previous posts&lt;/a&gt;
, I dug deeper into Microsoft Fabric&amp;rsquo;s SQL-based features and we even &lt;a
 href="https://debruyn.dev/2023/exploring-onelake-with-microsoft-azure-storage-explorer/"
 &gt;explored OneLake using Azure Storage Explorer&lt;/a&gt;
. In this post, I&amp;rsquo;ll take a closer look at Fabric&amp;rsquo;s &lt;strong&gt;auto-discovery&lt;/strong&gt; feature using Shortcuts. Auto-discovery, what&amp;rsquo;s that?&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Fabric&amp;rsquo;s Lakehouses can automatically discover all the datasets already present in your data lake and expose these as tables in Lakehouses (and Warehouses).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Cool, right? At the time of writing, there is a single condition: the tables must be stored in the Delta Lake format. Let&amp;rsquo;s take a closer look.&lt;/p&gt;</description></item><item><title>Exploring OneLake with Microsoft Azure Storage Explorer</title><link>https://debruyn.dev/2023/exploring-onelake-with-microsoft-azure-storage-explorer/</link><pubDate>Tue, 20 Jun 2023 09:30:57 +0200</pubDate><guid>https://debruyn.dev/2023/exploring-onelake-with-microsoft-azure-storage-explorer/</guid><description>&lt;h2 id="recap-onelake--delta-lake"&gt;Recap: OneLake &amp;amp; Delta Lake&lt;/h2&gt;
&lt;p&gt;One of the coolest things about &lt;a
 href="https://www.microsoft.com/en-us/microsoft-fabric/" data-umami-event="outbound_link_click" data-umami-event-url="https://www.microsoft.com/en-us/microsoft-fabric/" target="_blank" rel="noreferrer noopener"
 &gt;Microsoft Fabric&lt;/a&gt;
 is that it nicely decouples storage and compute and is very transparent about the storage: everything ends up in OneLake. This is a huge advantage over other data platforms since you don&amp;rsquo;t have to worry about moving data around: it is always available, wherever you need it.&lt;/p&gt;</description></item><item><title>Welcome to the 3rd generation: SQL in Microsoft Fabric</title><link>https://debruyn.dev/2023/welcome-to-the-3rd-generation-sql-in-microsoft-fabric/</link><pubDate>Thu, 15 Jun 2023 20:15:11 +0200</pubDate><guid>https://debruyn.dev/2023/welcome-to-the-3rd-generation-sql-in-microsoft-fabric/</guid><description>&lt;p&gt;&lt;img src="fabric_header.jpeg" alt="Fabric header"&gt;&lt;/p&gt;
&lt;p&gt;While typing this blog post, I&amp;rsquo;m flying back from the &lt;a
 href="https://dataplatformnextstep.com/" data-umami-event="outbound_link_click" data-umami-event-url="https://dataplatformnextstep.com/" target="_blank" rel="noreferrer noopener"
 &gt;Data Platform Next Step&lt;/a&gt;
 conference where I gave a talk about using &lt;a
 href="https://www.getdbt.com/" data-umami-event="outbound_link_click" data-umami-event-url="https://www.getdbt.com/" target="_blank" rel="noreferrer noopener"
 &gt;dbt&lt;/a&gt;
 with &lt;a
 href="https://learn.microsoft.com/en-us/fabric/" data-umami-event="outbound_link_click" data-umami-event-url="https://learn.microsoft.com/en-us/fabric/" target="_blank" rel="noreferrer noopener"
 &gt;Microsoft Fabric&lt;/a&gt;
. DP Next Step was the first conference focussed on Microsoft data services right after the announcement of Microsoft Fabric, so many of the speakers were Microsoft employees and most of the talks had some Fabric content.&lt;/p&gt;
&lt;p&gt;&lt;img src="fabric.png" alt="Microsoft Fabric logo"&gt;&lt;/p&gt;
&lt;p&gt;Fabric, Fabric, Fabric: what is it all about? In this post, I&amp;rsquo;ll go deeper into what it is and why you should care, focussing specifically on the SQL aspect of Fabric.&lt;/p&gt;</description></item><item><title>dbt &amp; Fabric: better together</title><link>https://debruyn.dev/speaking/data-platform-next-step-dbt-fabric/</link><pubDate>Fri, 09 Jun 2023 00:00:00 +0000</pubDate><guid>https://debruyn.dev/speaking/data-platform-next-step-dbt-fabric/</guid><description>&lt;p&gt;I gave a talk at the &lt;a
 href="https://dataplatformnextstep.com/breakout-sessions/" data-umami-event="outbound_link_click" data-umami-event-url="https://dataplatformnextstep.com/breakout-sessions/" target="_blank" rel="noreferrer noopener"
 &gt;Data Platform Next Step&lt;/a&gt;
 conference in Billund, Denmark. It was the first conference with sessions about &lt;a
 href="https://www.microsoft.com/microsoft-fabric" data-umami-event="outbound_link_click" data-umami-event-url="https://www.microsoft.com/microsoft-fabric" target="_blank" rel="noreferrer noopener"
 &gt;Microsoft Fabric&lt;/a&gt;
 right after the launch of the public preview at &lt;a
 href="https://build.microsoft.com/" data-umami-event="outbound_link_click" data-umami-event-url="https://build.microsoft.com/" target="_blank" rel="noreferrer noopener"
 &gt;Microsoft Build&lt;/a&gt;
.&lt;/p&gt;
&lt;p&gt;dbt is the new data transformation tool taking the world by storm. It lowers the barrier to entry into the world of data analytics for everyone who has ever written a line of SQL. Did you know it integrates quite well with all Microsoft SQL products and even with Fabric? Join this session to follow in the footsteps of thousands of analytics engineers and fall in love with dbt. Learn more about how dbt works with Fabric and Azure SQL from the maintainer of the official dbt adapter! We’ll use Fabric and VS Code to build our first Hello Fabric project.&lt;/p&gt;</description></item><item><title>Deploy a data lake on Azure in less than an hour</title><link>https://debruyn.dev/speaking/deploy-a-data-lake-on-azure-in-less-than-an-hour/</link><pubDate>Fri, 05 Jun 2020 00:00:00 +0000</pubDate><guid>https://debruyn.dev/speaking/deploy-a-data-lake-on-azure-in-less-than-an-hour/</guid><description>&lt;p&gt;This talk was given on multiple occasions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;During a &lt;a
 href="https://www.meetup.com/dataroots-research/events/270741567/" data-umami-event="outbound_link_click" data-umami-event-url="https://www.meetup.com/dataroots-research/events/270741567/" target="_blank" rel="noreferrer noopener"
 &gt;lunch webinar&lt;/a&gt;
 at &lt;a
 href="https://dataroots.io" data-umami-event="outbound_link_click" data-umami-event-url="https://dataroots.io" target="_blank" rel="noreferrer noopener"
 &gt;dataroots&lt;/a&gt;
.&lt;/li&gt;
&lt;li&gt;At Data Science Leuven (&lt;a
 href="https://www.meetup.com/data-science-leuven/events/273372837/" data-umami-event="outbound_link_click" data-umami-event-url="https://www.meetup.com/data-science-leuven/events/273372837/" target="_blank" rel="noreferrer noopener"
 &gt;event details&lt;/a&gt;
).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A data lake is fundamental to a modern big data approach so it’s important to set it up the right way. But how can you do that without having to spend hours on research and then losing days configuring every component of the data lake? How can you gain a lot of time, while still deploying a fully functional data lake with all the necessary components?&lt;/p&gt;</description></item><item><title>Fabric training programs</title><link>https://debruyn.dev/services/fabric-training/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://debruyn.dev/services/fabric-training/</guid><description>
&lt;p&gt;I&amp;rsquo;ve been using Microsoft Fabric since early 2023 and have been awarded the Microsoft Most Valuable Professional (MVP) award specifically for Fabric. I have a deep understanding of the platform and its capabilities, and I&amp;rsquo;m excited to share my knowledge with you. I&amp;rsquo;ve worked out several training programs that can help you and your team get started with Fabric or deepen your knowledge of the platform.&lt;/p&gt;</description></item></channel></rss>