📐 Implement & Manage Semantic Models

Based on: Microsoft Fabric documentation (Microsoft Learn)

📁 ← Back to Home

Domain 3 accounts for 25–30 % of the DP-600 exam. It spans semantic-model design (storage modes, star schemas, relationships, DAX, calculation groups, composite models) and enterprise-scale optimization (DAX tuning, Direct Lake configuration, incremental refresh).

Table of contents

🏗️ Design and Build Semantic Models
⚡ Optimize Enterprise-Scale Semantic Models
📋 Scenario-Based Quick Reference

🏗️ Design and Build Semantic Models

💾 Choose a Storage Mode

Power BI semantic models support three storage modes. The right choice depends on data volume, latency requirements, and whether Microsoft Fabric is in play.

Aspect	Import	DirectQuery	Direct Lake
Data location	Compressed in-memory (VertiPaq)	Stays in source; queries sent live	Delta tables in OneLake, loaded on demand
Performance	Fastest queries	Slower — depends on source	Near-Import speed, no copy
Data freshness	Stale until refresh	Real-time	Near-real-time (framing)
Model size limit	SKU RAM / Premium capacity	No hard limit	SKU guardrails apply
Transformation layer	Power Query (M) in dataset	Limited PQ; views in source	Notebooks / Dataflows → Lakehouse
Best for	Small–mid datasets needing speed	Real-time on relational sources	Fabric-native analytics at scale

flowchart TD
    A[Start: Choose Storage Mode] --> B{Data in OneLake<br/>Delta tables?}
    B -- Yes --> C{Need real-time<br/>to the second?}
    C -- No --> D[Direct Lake]
    C -- Yes --> E[Direct Lake +<br/>DirectQuery fallback]
    B -- No --> F{Data volume<br/>> capacity RAM?}
    F -- Yes --> G[DirectQuery]
    F -- No --> H{Need real-time?}
    H -- Yes --> G
    H -- No --> I[Import]

🎯 Exam Tip: Direct Lake is the preferred mode for Fabric workloads. It reads Parquet files directly from OneLake — no data copy, no scheduled refresh in the Import sense. Know that it requires Delta tables in a Lakehouse or Warehouse.

⚠️ Exam Caveat: Direct Lake is only available in Microsoft Fabric capacities (F SKUs) and Power BI Premium (P SKUs). It is not available in Pro-only workspaces or shared capacity.

⭐ Implement a Star Schema

A star schema organises the semantic model around fact tables (events / measures) surrounded by dimension tables (descriptive attributes). This is the foundation for performant DAX and clean reports.

erDiagram
    DIM_DATE ||--o{ FACT_SALES : "DateKey"
    DIM_PRODUCT ||--o{ FACT_SALES : "ProductKey"
    DIM_CUSTOMER ||--o{ FACT_SALES : "CustomerKey"
    DIM_STORE ||--o{ FACT_SALES : "StoreKey"

    FACT_SALES {
        int DateKey
        int ProductKey
        int CustomerKey
        int StoreKey
        decimal SalesAmount
        int Quantity
    }
    DIM_DATE {
        int DateKey
        date FullDate
        string MonthName
        int Year
    }
    DIM_PRODUCT {
        int ProductKey
        string ProductName
        string Category
    }
    DIM_CUSTOMER {
        int CustomerKey
        string CustomerName
        string Region
    }
    DIM_STORE {
        int StoreKey
        string StoreName
        string City
    }

Star vs Snowflake:

Star Schema	Snowflake Schema
Dimensions fully denormalised	Dimensions normalised into sub-tables
Fewer joins → faster VertiPaq scans	More joins → harder for the engine to optimise
Recommended for Power BI	Acceptable at source; flatten before model

🎯 Exam Tip: The exam strongly favours star schemas. If a question describes a normalised or snowflake source, the correct answer usually involves flattening dimensions in Power Query or the Lakehouse layer before loading into the model.

🔗 Implement Relationships

Core Relationship Properties

Property	Options	Notes
Cardinality	One-to-many (1:), Many-to-one (:1), One-to-one (1:1), Many-to-many (:)	1:* is the default and preferred
Cross-filter direction	Single, Both (bi-directional)	Both enables filtering from fact → dimension; use sparingly
Active / Inactive	One active per path; others inactive	Invoke inactive relationships with `USERELATIONSHIP`

Bridge Tables and Many-to-Many

When a fact table has multiple values per dimension row (e.g., a patient with many diagnoses), insert a bridge table between them. Set the bridge-to-fact side as many-to-many and enable bi-directional filtering — or better, use DAX measures with CALCULATE + CROSSFILTER.

Role-Playing Dimensions with USERELATIONSHIP

A Date dimension often plays multiple roles (Order Date, Ship Date, Due Date). Only one relationship can be active. Use USERELATIONSHIP in measures for the others:

Ship Date Sales =
CALCULATE(
    SUM( Sales[SalesAmount] ),
    USERELATIONSHIP( Sales[ShipDateKey], DimDate[DateKey] )
)

⚠️ Exam Caveat: USERELATIONSHIP only works inside CALCULATE (or CALCULATETABLE). It cannot be used standalone. Expect questions that test whether you know this constraint.

✍️ Write DAX Calculations

Variables and CALCULATE

Variables improve readability and prevent repeated evaluation:

Profit Margin % =
VAR _Revenue = SUM( Sales[SalesAmount] )
VAR _Cost    = SUM( Sales[CostAmount] )
RETURN
    IF(
        _Revenue = 0,
        BLANK(),
        DIVIDE( _Revenue - _Cost, _Revenue )
    )

CALCULATE is the most important DAX function — it evaluates an expression under modified filter context:

All-Region Sales =
CALCULATE(
    SUM( Sales[SalesAmount] ),
    ALL( DimStore[Region] )
)

Region % of Total =
VAR _RegionSales = SUM( Sales[SalesAmount] )
VAR _TotalSales  = CALCULATE( SUM( Sales[SalesAmount] ), ALL( DimStore[Region] ) )
RETURN
    DIVIDE( _RegionSales, _TotalSales )

Iterator Functions (SUMX, AVERAGEX, MAXX)

Iterators evaluate an expression row by row over a table, then aggregate:

Weighted Avg Price =
SUMX(
    Sales,
    Sales[Quantity] * RELATED( DimProduct[UnitPrice] )
) / SUM( Sales[Quantity] )

Max Line Total =
MAXX( Sales, Sales[Quantity] * Sales[UnitPrice] )

🎯 Exam Tip: Know the difference between SUM (aggregator — works on a single column) and SUMX (iterator — can evaluate an expression per row). Questions often test whether a scenario needs an iterator.

Table Filtering: FILTER, ALL, ALLEXCEPT

Function	Purpose
`ALL( table/column )`	Removes all filters from the specified table or columns
`ALLEXCEPT( table, col1, col2 )`	Removes filters from all columns except those listed
`FILTER( table, expression )`	Returns a table of rows that satisfy the expression (row context)
`KEEPFILTERS`	Adds filters without overriding existing context inside CALCULATE

Top Category Sales =
CALCULATE(
    SUM( Sales[SalesAmount] ),
    FILTER(
        ALL( DimProduct[Category] ),
        [Total Sales] > 1000000
    )
)

⚠️ Exam Caveat: Using FILTER on a large table with millions of rows is a performance anti-pattern. The exam may present this as the “correct but slow” option — prefer column predicates inside CALCULATE directly when possible.

Windowing Functions (OFFSET, WINDOW, INDEX)

These functions (introduced in late 2022) enable row-relative calculations without complex earlier workarounds:

Previous Month Sales =
CALCULATE(
    SUM( Sales[SalesAmount] ),
    OFFSET(
        -1,
        ALLSELECTED( DimDate[MonthYear] ),
        ORDERBY( DimDate[MonthYear], ASC )
    )
)

Function	Use Case
`OFFSET`	Access a value N rows before/after in a sorted partition
`WINDOW`	Define a sliding or absolute range of rows
`INDEX`	Access a specific ordinal row position

Information Functions

Function	Returns	Common Use
`ISBLANK( value )`	TRUE if value is BLANK	Guard against division errors
`HASONEVALUE( column )`	TRUE if exactly one value in filter context	Conditional headers / formatting
`SELECTEDVALUE( column, alt )`	The single value in context, or alt	Dynamic titles, parameter captures

🧮 Calculation Groups, Dynamic Format Strings, and Field Parameters

Calculation Groups

Calculation groups let you define reusable DAX transformations that apply to any measure at evaluation time. They are created through Tabular Editor or XMLA endpoints.

Example — a Time Intelligence calculation group:

-- Calculation Group: Time Intelligence
-- Calculation Item: YTD
CALCULATE(
    SELECTEDMEASURE(),
    DATESYTD( DimDate[FullDate] )
)

-- Calculation Item: PY (Prior Year)
CALCULATE(
    SELECTEDMEASURE(),
    SAMEPERIODLASTYEAR( DimDate[FullDate] )
)

-- Calculation Item: YoY %
VAR _Current = SELECTEDMEASURE()
VAR _PY = CALCULATE(
    SELECTEDMEASURE(),
    SAMEPERIODLASTYEAR( DimDate[FullDate] )
)
RETURN
    DIVIDE( _Current - _PY, _PY )

Calculation Groups	Individual Measures
One definition applies to all measures	Each measure needs its own YTD, PY, YoY copy
Maintained centrally via Tabular Editor / XMLA	Maintained individually in Power BI Desktop
Can include dynamic format strings	Format strings are per-measure
Reduces measure proliferation	Can explode to hundreds of measures

Dynamic Format Strings

Attached to a calculation item, a format string expression changes the display format contextually:

-- Format string for YoY % item
IF(
    ISSELECTEDMEASURE( [Total Sales] ),
    "#,##0.0%",
    "#,##0"
)

Field Parameters

Field parameters allow report consumers to swap dimensions or measures on a visual dynamically. Created in Power BI Desktop via the modelling ribbon, they generate a disconnected table with a DAX expression listing the fields.

🎯 Exam Tip: Calculation groups require the model to be at compatibility level 1500+ and are created outside Power BI Desktop (Tabular Editor, XMLA). Field parameters, by contrast, are created directly in Desktop.

📦 Large Semantic Model Storage Format

When enabled, the semantic model can exceed the default per-dataset size limit by storing segments on Premium capacity storage rather than solely in memory.

When to enable:

Models approaching or exceeding the default size limit (e.g., > 10 GB on P1/F64)
Incremental refresh with many partitions
Need XMLA read/write endpoint access for third-party tooling

Implications:

Requires Premium Per User, Premium capacity (P SKU), or Fabric capacity (F SKU)
Once enabled, the model can be managed through the XMLA endpoint (Tabular Editor, SSMS, ALM Toolkit)
Enables features such as object-level security, calculation groups via XMLA, and metadata-only deployments

⚠️ Exam Caveat: Enabling large model storage format is a one-way setting — once turned on it cannot be reverted to small format without recreating the dataset.

🧩 Design and Build Composite Models

Composite models combine multiple storage modes in a single semantic model. Tables can individually be set to Import, DirectQuery, or Direct Lake.

flowchart LR
    subgraph Composite Model
        direction TB
        A[DimDate<br/>Import] --- F[Fact_Sales<br/>Direct Lake]
        B[DimProduct<br/>Import] --- F
        F --- G[Fact_Budget<br/>DirectQuery<br/>SQL DB]
    end
    F -. reads .-> DL[(OneLake<br/>Delta Tables)]
    G -. queries .-> SQL[(Azure SQL<br/>Database)]

Key design rules:

Import + DirectQuery — classic composite model; aggregation tables in Import accelerate DQ queries.
Direct Lake + DirectQuery — Fabric scenario; core facts from Lakehouse, supplemental data from external SQL.
Relationships across storage modes form a limited relationship (DirectQuery semantics apply to that join).

🎯 Exam Tip: In a composite model, any relationship that crosses storage-mode boundaries is evaluated using DirectQuery semantics, even if one side is Import. This can affect performance — the exam tests awareness of this.

⚡ Optimize Enterprise-Scale Semantic Models

🚀 Improve Query and Visual Performance

Technique	Detail
Reduce visual count	Aim for ≤ 8 visuals per page; each visual fires a separate query
Avoid high-cardinality columns in visuals	Showing millions of distinct values forces large result sets
Use aggregation tables	Pre-aggregated Import tables sit in front of DirectQuery detail tables
Set report page type to Tooltip or Drillthrough	Reduces default-load query count
Use Performance Analyzer	Capture DAX queries generated by each visual; identify slow ones

⚙️ Improve DAX Performance

Best Practice	Anti-Pattern
Use variables — evaluated once, reused	Repeating the same sub-expression multiple times
Column predicates in `CALCULATE` directly	Wrapping large tables in `FILTER()`
Use `KEEPFILTERS` to intersect, not override	Using `FILTER( ALL(...) )` when intersection is intended
Avoid `DISTINCTCOUNT` on very high-cardinality columns	—
Use `DIVIDE( a, b )` instead of `a / b`	Manual `IF` checks for zero
Minimise row-level iteration with `SUMX` over huge tables	Nested iterators (iterator inside iterator)

-- Anti-pattern: FILTER on entire table
Bad Example =
CALCULATE(
    SUM( Sales[Amount] ),
    FILTER( Sales, Sales[Region] = "West" )
)

-- Better: direct column predicate
Good Example =
CALCULATE(
    SUM( Sales[Amount] ),
    Sales[Region] = "West"
)

🎯 Exam Tip: The Performance Analyzer in Power BI Desktop shows three timings for each visual: DAX query, visual rendering, and other. For DAX tuning, copy the DAX query and test it in DAX Studio or the Fabric portal query view.

🔷 Configure Direct Lake

Architecture Overview

flowchart LR
    LH[(Lakehouse<br/>Delta Tables<br/>V-Order Parquet)] -- framing --> DL[Direct Lake<br/>Semantic Model]
    DL -- query --> PBI[Power BI<br/>Reports]
    DL -. fallback .-> DQ[DirectQuery<br/>SQL Endpoint]

Framing and V-Order

Framing is the process by which the Direct Lake model takes a snapshot of the current Delta table version. A new frame is triggered on refresh or automatically when the model detects new data (depending on settings).
V-Order is a write-time optimisation applied to Parquet files in the Lakehouse that aligns data for fast VertiPaq reads. Always ensure V-Order is enabled for Direct Lake tables.

Fallback Behaviour

Setting	Behaviour
Automatic fallback (default)	If data exceeds guardrails or unsupported DAX is used, the engine silently falls back to DirectQuery via the SQL endpoint
Manual / disabled fallback	Queries that cannot be served from Direct Lake will fail instead of falling back — useful when you want to guarantee in-memory speed

Guardrails (per SKU) define thresholds for row count per table, column count, and model size. Exceeding them triggers fallback or failure.

⚠️ Exam Caveat: If Direct Lake fallback is disabled and a query exceeds guardrails, users will see an error — not a slow result. The exam may test your understanding of when to enable vs disable fallback.

Direct Lake on OneLake vs SQL Endpoint

Aspect	Direct Lake on OneLake (default)	Direct Lake via SQL Endpoint
Data source	Delta Parquet files directly	SQL analytical endpoint of Lakehouse/Warehouse
Performance	Best — no translation layer	Slight overhead from SQL layer
Use case	Standard Fabric analytics	When you need SQL views, security, or transformations
Fallback target	SQL endpoint (DirectQuery)	Same SQL endpoint

🎯 Exam Tip: Direct Lake always reads the Delta files for primary queries. The SQL endpoint is used only as a fallback target when DirectQuery mode kicks in. Questions may try to confuse these two paths.

🔄 Implement Incremental Refresh

Incremental refresh partitions a table by date so that only recent data is refreshed, while historical partitions are untouched.

Setup steps:

Create RangeStart and RangeEnd parameters (type DateTime) in Power Query.
Filter the source table to rows between these parameters.
Define the refresh policy: archive period (e.g., 3 years), incremental period (e.g., 30 days).
Optionally enable real-time data with DirectQuery — this adds a DirectQuery partition for the latest data that is always live.

Configuration	Effect
Archive period = 3 years	Historical partitions covering 3 years are loaded once and not refreshed
Incremental window = 30 days	Only the most recent 30 days of partitions are refreshed each cycle
Detect data changes	Only refresh incremental partitions where source rows changed (requires a `LastModified` column)
Real-time + DirectQuery	A DQ partition covers data newer than the latest Import partition

⚠️ Exam Caveat: The RangeStart and RangeEnd parameters must be of type DateTime and must be named exactly RangeStart and RangeEnd (case-sensitive). This is a frequent exam trick.

🎯 Exam Tip: Incremental refresh combined with the large model storage format is required when the number of partitions grows large. Without large format, you may hit partition-count limits.

📋 Scenario-Based Quick Reference

#	Scenario	Answer
1	Data sits in a Fabric Lakehouse and you want the fastest query speed without copying data	Direct Lake storage mode
2	External Azure SQL DB must show real-time data in reports	DirectQuery to the SQL DB
3	Need both Lakehouse facts and live Azure SQL budget data in one model	Composite model — Direct Lake + DirectQuery
4	Date dimension plays Order Date and Ship Date roles	One active relationship; use USERELATIONSHIP in measures for the inactive one
5	Hundreds of measures each need YTD, PY, and YoY variants	Create a calculation group with three calculation items
6	Report visual shows “query exceeded guardrails” error	Direct Lake fallback is disabled; either enable fallback or reduce data/columns below SKU limits
7	Model size approaching 10 GB on P1 / F64	Enable large semantic model storage format
8	Need to refresh only the last 7 days of a 5-year sales table	Configure incremental refresh with 5-year archive, 7-day incremental window
9	DAX measure runs slowly — wraps entire Sales table in FILTER	Replace `FILTER( Sales, ... )` with a column predicate inside `CALCULATE`
10	Report users want to switch between Revenue, Cost, and Profit on one visual	Implement a field parameter
11	Bridge table connects patients to multiple diagnoses	Many-to-many relationship through bridge; consider bi-directional filter or DAX with CROSSFILTER
12	You want Prior Year to show as a percentage format but Current Year as currency	Use dynamic format strings on the calculation group items
13	Need third-party tool (Tabular Editor) to deploy model metadata	Enable XMLA read/write endpoint (requires Premium / Fabric capacity)
14	Direct Lake model must guarantee no silent performance degradation	Disable automatic fallback — queries that exceed guardrails will error instead of falling back to DQ
15	Incremental refresh partitions keep growing and model won’t publish	Enable large model storage format to support higher partition counts

These notes cover the “Implement and manage semantic models” domain of the DP-600 exam. For full coverage, pair these notes with hands-on practice in a Fabric trial or capacity environment.

← 02 — Prepare Data

04 — Quick Reference Cheatsheet →