⚡ Quick Reference Cheatsheet

Based on: Microsoft Fabric documentation (Microsoft Learn)

📁 ← Back to Home

Last-minute review sheet for the DP-600 Microsoft Fabric Analytics Engineer Associate exam. Print it, bookmark it, or skim it in the parking lot.

Table of contents

🔢 1 — Key Numbers & Limits
💾 2 — Storage Mode Decision Matrix
📐 3 — DAX Quick Reference
🔒 4 — Security Layers Cheatsheet
🗄️ 5 — Data Store Selection
📥 6 — Ingestion Method Selection
🔷 7 — Direct Lake Deep Dive
- 📌 Key Facts
⚠️ 8 — Common Exam Traps
✅ 9 — Pre-Exam Checklist

🔢 1 — Key Numbers & Limits

Item	Value / Threshold
Max tables per Direct Lake semantic model (F2)	500
Max tables per Direct Lake semantic model (F64+)	1,000
Max rows per table (Direct Lake, F2)	300 million
Max rows per table (Direct Lake, F64)	1.5 billion
Max columns per table (Direct Lake)	500
XMLA read/write endpoint requirement	Premium / Fabric F64+ capacity (read available at lower SKUs)
Deployment pipeline stages (max)	10 (Dev, Test, Prod, plus up to 7 custom stages)
Workspace roles	4: Admin, Member, Contributor, Viewer
Sensitivity labels — who can apply	Users with Information Protection license + label publishing policy
Incremental refresh — max partitions (Import)	10,000 per table
Max measures in a single DAX query	Depends on memory; no hard cap but keep below 100 for performance
OneLake file format for Direct Lake	Delta / Parquet (V-Order optimized)
Max scheduled refreshes per day (Pro)	8
Max scheduled refreshes per day (Premium)	48

Exam Tip: SKU-specific guardrails (row limits, table counts) are the most common “trick number” questions. Know at least the F2 and F64 tiers.

💾 2 — Storage Mode Decision Matrix

Criteria	Import	DirectQuery	Direct Lake	Composite
Data freshness	Scheduled / on-demand refresh	Real-time	Near real-time (framing)	Mixed
Query performance	Fastest (in-memory)	Depends on source	Fast (columnar from OneLake)	Varies by table
Data volume	Limited by capacity memory	No size limit	Guardrails per SKU	Mixed
Data source	Any supported	Any supported	Delta tables in OneLake only	Any mix
DAX support	Full	Full (slower)	Full (with fallback)	Full
Use case	Small-to-mid datasets, max perf	Enforce source RLS, large data	Fabric-native analytics	Migration, mixed sources
Refresh needed?	Yes	No	Framing only	Partial

flowchart TD
    A[Choose Storage Mode] --> B{Data in OneLake<br/>as Delta tables?}
    B -- Yes --> C{Need real-time<br/>to the second?}
    C -- No --> D["**Direct Lake**<br/>Best of both worlds"]
    C -- Yes --> E["**DirectQuery**<br/>to Lakehouse SQL endpoint"]
    B -- No --> F{Data source<br/>supports DirectQuery?}
    F -- No --> G["**Import**<br/>Schedule refresh"]
    F -- Yes --> H{Need max<br/>performance?}
    H -- Yes --> G
    H -- No --> I["**DirectQuery**<br/>or Composite"]

Exam Caveat: Direct Lake is only available for Delta tables stored in OneLake (Lakehouse or Warehouse). If data is in Azure SQL or Synapse, you cannot use Direct Lake.

📐 3 — DAX Quick Reference

🔧 CALCULATE & Filter Modifiers

Pattern	What It Does	Example
`CALCULATE(expr, filter)`	Evaluates expression in modified filter context	`CALCULATE(SUM(Sales[Amount]), Product[Color]="Red")`
`ALL(table/col)`	Removes all filters from table or column	`CALCULATE(SUM(Sales[Amount]), ALL(Product))`
`ALLEXCEPT(table, col)`	Removes all filters except specified columns	`CALCULATE(…, ALLEXCEPT(Product, Product[Category]))`
`REMOVEFILTERS(col)`	Alias for ALL — clearer intent	`CALCULATE(…, REMOVEFILTERS(Date[Year]))`
`KEEPFILTERS(filter)`	Adds filter without overriding existing context	`CALCULATE(…, KEEPFILTERS(Product[Color]="Red"))`

🔄 Iterators

Function	Purpose
`SUMX(table, expr)`	Row-by-row sum
`AVERAGEX(table, expr)`	Row-by-row average
`COUNTX(table, expr)`	Count rows where expr is not blank
`RANKX(table, expr)`	Rank each row by expression
`MAXX / MINX`	Row-by-row max/min

📊 Table Functions

Function	Purpose
`FILTER(table, condition)`	Returns filtered table (iterator — slower)
`ADDCOLUMNS(table, name, expr)`	Adds calculated columns to a table expression
`SUMMARIZE(table, groupCol)`	Groups by column(s) — avoid adding measures here
`SUMMARIZECOLUMNS(col, filter, name, expr)`	Preferred for grouping + measures (optimized)
`SELECTCOLUMNS(table, name, expr)`	Projects specific columns
`DISTINCT(col)`	Unique values of a column
`VALUES(col)`	Unique values including blank row from broken relationships

⏰ Time Intelligence

Function	Purpose
`TOTALYTD(expr, dateCol)`	Year-to-date total
`TOTALQTD / TOTALMTD`	Quarter / month-to-date
`SAMEPERIODLASTYEAR(dateCol)`	Shifts dates back one year
`DATEADD(dateCol, -1, YEAR)`	Shifts dates by interval
`DATESYTD(dateCol)`	Returns dates from start of year to current date
`PARALLELPERIOD(dateCol, -1, QUARTER)`	Full parallel period

Exam Caveat: Time intelligence functions require a contiguous Date table marked as a date table. Gaps in dates break these functions.

🪟 Windowing Functions (New in Fabric)

Function	Purpose
`OFFSET(n, relation, orderBy)`	Access row N positions away
`WINDOW(from, to, relation, orderBy)`	Define a sliding window
`INDEX(n, relation, orderBy)`	Access the Nth row

📌 Variables Pattern & USERELATIONSHIP

-- VAR / RETURN pattern (always prefer for readability & performance)
Profit Margin =
VAR TotalRevenue = SUM(Sales[Revenue])
VAR TotalCost   = SUM(Sales[Cost])
RETURN
    DIVIDE(TotalRevenue - TotalCost, TotalRevenue, 0)

-- USERELATIONSHIP for role-playing dimensions
Ship Date Sales =
CALCULATE(
    SUM(Sales[Amount]),
    USERELATIONSHIP(Sales[ShipDateKey], Date[DateKey])
)

Exam Tip: USERELATIONSHIP activates an inactive relationship for that measure only. The relationship must already exist in the model as inactive.

🔒 4 — Security Layers Cheatsheet

Layer	RLS (Row-Level Security)	CLS (Column-Level Security)	OLS (Object-Level Security)
What it restricts	Rows visible to user	Columns visible to user	Tables or columns hidden from user
Defined in	Semantic model (DAX filter)	Semantic model (role config)	Semantic model (role config)
Implementation	DAX expression on table	Column permissions in role	`None` or `Read` metadata permission
Tooling	Power BI Desktop / XMLA / Tabular Editor	XMLA / Tabular Editor only	XMLA / Tabular Editor only
Works with Direct Lake?	Yes	Yes	Yes
Test method	“View as Role” in service	XMLA query as role	XMLA query as role

flowchart LR
    subgraph "Security Enforcement Stack"
        A["Workspace Roles<br/>(Admin, Member,<br/>Contributor, Viewer)"] --> B["Semantic Model<br/>RLS / CLS / OLS"]
        B --> C["Sensitivity Labels<br/>(Microsoft Purview)"]
    end
    style A fill:#1a3a5c,stroke:#4a9eff
    style B fill:#1a3a5c,stroke:#4a9eff
    style C fill:#1a3a5c,stroke:#4a9eff

Exam Caveat: RLS is defined in the semantic model, NOT in the Lakehouse or Warehouse. Users with Admin/Member roles bypass RLS when browsing data in the service.

🗄️ 5 — Data Store Selection

Criteria	Lakehouse	Warehouse	KQL Database	Eventhouse
Query language	Spark SQL, PySpark	T-SQL	KQL (Kusto)	KQL
File format	Delta (Parquet)	Managed tables	Columnar time-series	Columnar time-series
Best for	Data engineering, ML, ELT	Enterprise BI, complex joins, stored procs	Log/telemetry analytics	Streaming + real-time analytics
Schema	Schema-on-read & schema-on-write	Schema-on-write (strict)	Semi-structured, schemaless	Semi-structured
Transactions	ACID via Delta	Full T-SQL transactions	Append-optimized	Append-optimized
SQL endpoint?	Yes (read-only auto-generated)	Native	No (KQL only)	No (KQL only)
Direct Lake support	Yes	Yes	No	No

Exam Tip: If the question mentions T-SQL stored procedures or views with complex joins, the answer is Warehouse. If it mentions streaming or IoT telemetry, think Eventhouse / KQL Database.

📥 6 — Ingestion Method Selection

Criteria	Dataflow Gen2	Notebook (Spark)	Copy Activity (Pipeline)	Shortcut
Skill level	Low-code (Power Query)	Pro-code (Python/Scala)	Low-code (config)	No-code
Transformation	Yes (M / Power Query)	Yes (full Spark)	Minimal (mapping only)	None (pass-through)
Best for	Simple transforms, small data	Complex ETL, ML prep, large data	Bulk copy between stores	Virtualized access, zero-copy
Scheduling	Pipeline or standalone	Pipeline	Pipeline	Always live
Compute	Fabric Dataflow compute	Spark pool	Pipeline IR	None
Writes Delta?	Yes (to Lakehouse)	Yes	Yes	No (reads in-place)

Exam Caveat: Shortcuts do not copy data — they create a virtualized reference. Data stays in the source (ADLS, S3, or another OneLake location). This is critical for questions about data residency and latency.

🔷 7 — Direct Lake Deep Dive

📌 Key Facts

Fact	Detail
Required format	Delta tables with V-Order optimization
Where data lives	OneLake (Lakehouse or Warehouse)
Framing	Snapshot of Delta log metadata; determines which Parquet files to read
Refresh type	Framing (lightweight metadata update, not data copy)
Automatic framing	Occurs when Fabric detects schema/data changes
Fallback behavior	Falls back to DirectQuery when guardrails are exceeded
Fallback triggers	Row count exceeds SKU limit, column count exceeds limit, unsupported DAX, or timeout
Disable fallback	`DirectLakeBehavior = DontFallback` (query fails instead of going to DQ)
V-Order	Special sort order applied at write time; improves read performance dramatically

flowchart TD
    Q["DAX Query Arrives"] --> G{Within SKU<br/>guardrails?}
    G -- Yes --> DL["Direct Lake<br/>(read Parquet via memory)"]
    G -- No --> FB{Fallback<br/>enabled?}
    FB -- Yes --> DQ["DirectQuery<br/>(SQL endpoint)"]
    FB -- No --> ERR["Query Error"]
    DL --> R["Return Results"]
    DQ --> R

Exam Caveat: Direct Lake does not import data into the model. It reads Parquet files directly from OneLake into memory on demand. Framing is NOT the same as a refresh — it only updates the Delta log pointer.

⚠️ 8 — Common Exam Traps

RLS is defined in the semantic model, not in the Lakehouse, Warehouse, or data source.
Direct Lake requires Delta tables in OneLake — no CSV, no Azure SQL, no external Parquet.
Viewers cannot build reports on shared semantic models without Build permission.
Workspace Admin and Member roles bypass RLS when browsing data directly in the service.
XMLA read/write requires at minimum Power BI Premium Per User (PPU) or Fabric F64+ for write; read is available at lower SKUs.
Deployment pipelines compare content between stages — they do not version-control code (use Git integration for that).
Incremental refresh in Import mode creates partitions automatically — you configure the RangeStart and RangeEnd parameters, not partition logic.
Sensitivity labels require Microsoft Purview Information Protection; they propagate downstream when data is exported.
SUMMARIZE should not be used to add new measure columns — use SUMMARIZECOLUMNS or ADDCOLUMNS(SUMMARIZE(...)) instead.
Composite models can mix Import and DirectQuery tables, but Direct Lake tables cannot be mixed with Import tables in the same model.
Shortcuts are read-only virtual references — they do not move, copy, or transform data.
CLS and OLS cannot be configured in Power BI Desktop — they require XMLA endpoint or Tabular Editor.
Time intelligence functions require a Date table marked as a date table with no gaps.
V-Order is a write-time optimization — it must be applied when data is written to Delta, not at query time.
KEEPFILTERS does not remove context; it intersects with existing filters. Contrast with CALCULATE which overrides.
Direct Lake fallback to DirectQuery uses the SQL analytics endpoint — if that endpoint is down, queries fail.
Git integration in Fabric works at the workspace level, not at individual item level.
Dataflow Gen2 outputs to Lakehouse/Warehouse; it does not load directly into a semantic model like legacy Power BI dataflows could.
USERELATIONSHIP only works with inactive relationships — you cannot reference a relationship that does not exist.
Large format datasets (over 10 GB) require Premium or Fabric capacity — Pro licenses are capped.

Exam Tip: When two answers both seem correct, ask yourself: “Where is this configured?” The exam loves testing whether something is set in the semantic model, the workspace, the data source, or the admin portal.

✅ 9 — Pre-Exam Checklist

🔒 Domain 1 — Maintain a Data Analytics Solution (25-30%)

I can explain the four workspace roles and their permissions
I know how deployment pipelines compare content across stages
I understand Git integration scope (workspace-level, supported item types)
I can configure sensitivity labels and explain downstream inheritance
I know when to use XMLA endpoints and which SKUs support read vs. read/write
I can describe the difference between RLS, CLS, and OLS and where each is configured
I understand how workspace Admin/Member roles interact with RLS

🔄 Domain 2 — Prepare Data (45-50%)

I can choose between Lakehouse, Warehouse, KQL Database, and Eventhouse
I know when to use Dataflow Gen2 vs. Notebook vs. Pipeline Copy Activity vs. Shortcut
I can design a star schema with fact and dimension tables
I understand how Shortcuts work and their limitations (read-only, no transformation)
I can write basic Spark SQL, T-SQL, and KQL queries
I know the incremental refresh parameter pattern (RangeStart / RangeEnd)
I can describe Delta Lake format, V-Order, and their role in Direct Lake

📐 Domain 3 — Implement and Manage Semantic Models (25-30%)

I can choose between Import, DirectQuery, Direct Lake, and Composite storage modes
I know the Direct Lake guardrails, framing process, and fallback behavior
I can write CALCULATE with ALL, ALLEXCEPT, REMOVEFILTERS, and KEEPFILTERS
I understand iterator functions (SUMX, AVERAGEX, COUNTX, RANKX)
I know when to use SUMMARIZECOLUMNS vs. SUMMARIZE
I can implement time intelligence with a proper Date table
I understand USERELATIONSHIP for role-playing dimensions
I can configure RLS with DAX and test it with “View as Role”
I know how to optimize a model (remove unused columns, avoid bi-directional filters, use variables)

Exam Tip: The DP-600 is heavily weighted toward Domain 2 (Prepare Data) at 45-50%. Spend the most study time on data ingestion patterns, star schema design, and query languages.

← 03 — Implement and Manage Semantic Models

Back to Home →