⚡ Quick Reference Cheatsheet
- Based on: Microsoft Fabric documentation (Microsoft Learn)
- 📁 ← Back to Home
Last-minute review sheet for the DP-600 Microsoft Fabric Analytics Engineer Associate exam. Print it, bookmark it, or skim it in the parking lot.
Table of contents
🔢 1 — Key Numbers & Limits
| Item | Value / Threshold |
|---|---|
| Max tables per Direct Lake semantic model (F2) | 500 |
| Max tables per Direct Lake semantic model (F64+) | 1,000 |
| Max rows per table (Direct Lake, F2) | 300 million |
| Max rows per table (Direct Lake, F64) | 1.5 billion |
| Max columns per table (Direct Lake) | 500 |
| XMLA read/write endpoint requirement | Premium / Fabric F64+ capacity (read available at lower SKUs) |
| Deployment pipeline stages (max) | 10 (Dev, Test, Prod, plus up to 7 custom stages) |
| Workspace roles | 4: Admin, Member, Contributor, Viewer |
| Sensitivity labels — who can apply | Users with Information Protection license + label publishing policy |
| Incremental refresh — max partitions (Import) | 10,000 per table |
| Max measures in a single DAX query | Depends on memory; no hard cap but keep below 100 for performance |
| OneLake file format for Direct Lake | Delta / Parquet (V-Order optimized) |
| Max scheduled refreshes per day (Pro) | 8 |
| Max scheduled refreshes per day (Premium) | 48 |
Exam Tip: SKU-specific guardrails (row limits, table counts) are the most common “trick number” questions. Know at least the F2 and F64 tiers.
💾 2 — Storage Mode Decision Matrix
| Criteria | Import | DirectQuery | Direct Lake | Composite |
|---|---|---|---|---|
| Data freshness | Scheduled / on-demand refresh | Real-time | Near real-time (framing) | Mixed |
| Query performance | Fastest (in-memory) | Depends on source | Fast (columnar from OneLake) | Varies by table |
| Data volume | Limited by capacity memory | No size limit | Guardrails per SKU | Mixed |
| Data source | Any supported | Any supported | Delta tables in OneLake only | Any mix |
| DAX support | Full | Full (slower) | Full (with fallback) | Full |
| Use case | Small-to-mid datasets, max perf | Enforce source RLS, large data | Fabric-native analytics | Migration, mixed sources |
| Refresh needed? | Yes | No | Framing only | Partial |
flowchart TD
A[Choose Storage Mode] --> B{Data in OneLake<br/>as Delta tables?}
B -- Yes --> C{Need real-time<br/>to the second?}
C -- No --> D["**Direct Lake**<br/>Best of both worlds"]
C -- Yes --> E["**DirectQuery**<br/>to Lakehouse SQL endpoint"]
B -- No --> F{Data source<br/>supports DirectQuery?}
F -- No --> G["**Import**<br/>Schedule refresh"]
F -- Yes --> H{Need max<br/>performance?}
H -- Yes --> G
H -- No --> I["**DirectQuery**<br/>or Composite"]
Exam Caveat: Direct Lake is only available for Delta tables stored in OneLake (Lakehouse or Warehouse). If data is in Azure SQL or Synapse, you cannot use Direct Lake.
📐 3 — DAX Quick Reference
🔧 CALCULATE & Filter Modifiers
| Pattern | What It Does | Example |
|---|---|---|
CALCULATE(expr, filter) |
Evaluates expression in modified filter context | CALCULATE(SUM(Sales[Amount]), Product[Color]="Red") |
ALL(table/col) |
Removes all filters from table or column | CALCULATE(SUM(Sales[Amount]), ALL(Product)) |
ALLEXCEPT(table, col) |
Removes all filters except specified columns | CALCULATE(…, ALLEXCEPT(Product, Product[Category])) |
REMOVEFILTERS(col) |
Alias for ALL — clearer intent | CALCULATE(…, REMOVEFILTERS(Date[Year])) |
KEEPFILTERS(filter) |
Adds filter without overriding existing context | CALCULATE(…, KEEPFILTERS(Product[Color]="Red")) |
🔄 Iterators
| Function | Purpose |
|---|---|
SUMX(table, expr) |
Row-by-row sum |
AVERAGEX(table, expr) |
Row-by-row average |
COUNTX(table, expr) |
Count rows where expr is not blank |
RANKX(table, expr) |
Rank each row by expression |
MAXX / MINX |
Row-by-row max/min |
📊 Table Functions
| Function | Purpose |
|---|---|
FILTER(table, condition) |
Returns filtered table (iterator — slower) |
ADDCOLUMNS(table, name, expr) |
Adds calculated columns to a table expression |
SUMMARIZE(table, groupCol) |
Groups by column(s) — avoid adding measures here |
SUMMARIZECOLUMNS(col, filter, name, expr) |
Preferred for grouping + measures (optimized) |
SELECTCOLUMNS(table, name, expr) |
Projects specific columns |
DISTINCT(col) |
Unique values of a column |
VALUES(col) |
Unique values including blank row from broken relationships |
⏰ Time Intelligence
| Function | Purpose |
|---|---|
TOTALYTD(expr, dateCol) |
Year-to-date total |
TOTALQTD / TOTALMTD |
Quarter / month-to-date |
SAMEPERIODLASTYEAR(dateCol) |
Shifts dates back one year |
DATEADD(dateCol, -1, YEAR) |
Shifts dates by interval |
DATESYTD(dateCol) |
Returns dates from start of year to current date |
PARALLELPERIOD(dateCol, -1, QUARTER) |
Full parallel period |
Exam Caveat: Time intelligence functions require a contiguous Date table marked as a date table. Gaps in dates break these functions.
🪟 Windowing Functions (New in Fabric)
| Function | Purpose |
|---|---|
OFFSET(n, relation, orderBy) |
Access row N positions away |
WINDOW(from, to, relation, orderBy) |
Define a sliding window |
INDEX(n, relation, orderBy) |
Access the Nth row |
📌 Variables Pattern & USERELATIONSHIP
-- VAR / RETURN pattern (always prefer for readability & performance)
Profit Margin =
VAR TotalRevenue = SUM(Sales[Revenue])
VAR TotalCost = SUM(Sales[Cost])
RETURN
DIVIDE(TotalRevenue - TotalCost, TotalRevenue, 0)
-- USERELATIONSHIP for role-playing dimensions
Ship Date Sales =
CALCULATE(
SUM(Sales[Amount]),
USERELATIONSHIP(Sales[ShipDateKey], Date[DateKey])
)
Exam Tip:
USERELATIONSHIPactivates an inactive relationship for that measure only. The relationship must already exist in the model as inactive.
🔒 4 — Security Layers Cheatsheet
| Layer | RLS (Row-Level Security) | CLS (Column-Level Security) | OLS (Object-Level Security) |
|---|---|---|---|
| What it restricts | Rows visible to user | Columns visible to user | Tables or columns hidden from user |
| Defined in | Semantic model (DAX filter) | Semantic model (role config) | Semantic model (role config) |
| Implementation | DAX expression on table | Column permissions in role | None or Read metadata permission |
| Tooling | Power BI Desktop / XMLA / Tabular Editor | XMLA / Tabular Editor only | XMLA / Tabular Editor only |
| Works with Direct Lake? | Yes | Yes | Yes |
| Test method | “View as Role” in service | XMLA query as role | XMLA query as role |
flowchart LR
subgraph "Security Enforcement Stack"
A["Workspace Roles<br/>(Admin, Member,<br/>Contributor, Viewer)"] --> B["Semantic Model<br/>RLS / CLS / OLS"]
B --> C["Sensitivity Labels<br/>(Microsoft Purview)"]
end
style A fill:#1a3a5c,stroke:#4a9eff
style B fill:#1a3a5c,stroke:#4a9eff
style C fill:#1a3a5c,stroke:#4a9eff
Exam Caveat: RLS is defined in the semantic model, NOT in the Lakehouse or Warehouse. Users with Admin/Member roles bypass RLS when browsing data in the service.
🗄️ 5 — Data Store Selection
| Criteria | Lakehouse | Warehouse | KQL Database | Eventhouse |
|---|---|---|---|---|
| Query language | Spark SQL, PySpark | T-SQL | KQL (Kusto) | KQL |
| File format | Delta (Parquet) | Managed tables | Columnar time-series | Columnar time-series |
| Best for | Data engineering, ML, ELT | Enterprise BI, complex joins, stored procs | Log/telemetry analytics | Streaming + real-time analytics |
| Schema | Schema-on-read & schema-on-write | Schema-on-write (strict) | Semi-structured, schemaless | Semi-structured |
| Transactions | ACID via Delta | Full T-SQL transactions | Append-optimized | Append-optimized |
| SQL endpoint? | Yes (read-only auto-generated) | Native | No (KQL only) | No (KQL only) |
| Direct Lake support | Yes | Yes | No | No |
Exam Tip: If the question mentions T-SQL stored procedures or views with complex joins, the answer is Warehouse. If it mentions streaming or IoT telemetry, think Eventhouse / KQL Database.
📥 6 — Ingestion Method Selection
| Criteria | Dataflow Gen2 | Notebook (Spark) | Copy Activity (Pipeline) | Shortcut |
|---|---|---|---|---|
| Skill level | Low-code (Power Query) | Pro-code (Python/Scala) | Low-code (config) | No-code |
| Transformation | Yes (M / Power Query) | Yes (full Spark) | Minimal (mapping only) | None (pass-through) |
| Best for | Simple transforms, small data | Complex ETL, ML prep, large data | Bulk copy between stores | Virtualized access, zero-copy |
| Scheduling | Pipeline or standalone | Pipeline | Pipeline | Always live |
| Compute | Fabric Dataflow compute | Spark pool | Pipeline IR | None |
| Writes Delta? | Yes (to Lakehouse) | Yes | Yes | No (reads in-place) |
Exam Caveat: Shortcuts do not copy data — they create a virtualized reference. Data stays in the source (ADLS, S3, or another OneLake location). This is critical for questions about data residency and latency.
🔷 7 — Direct Lake Deep Dive
📌 Key Facts
| Fact | Detail |
|---|---|
| Required format | Delta tables with V-Order optimization |
| Where data lives | OneLake (Lakehouse or Warehouse) |
| Framing | Snapshot of Delta log metadata; determines which Parquet files to read |
| Refresh type | Framing (lightweight metadata update, not data copy) |
| Automatic framing | Occurs when Fabric detects schema/data changes |
| Fallback behavior | Falls back to DirectQuery when guardrails are exceeded |
| Fallback triggers | Row count exceeds SKU limit, column count exceeds limit, unsupported DAX, or timeout |
| Disable fallback | DirectLakeBehavior = DontFallback (query fails instead of going to DQ) |
| V-Order | Special sort order applied at write time; improves read performance dramatically |
flowchart TD
Q["DAX Query Arrives"] --> G{Within SKU<br/>guardrails?}
G -- Yes --> DL["Direct Lake<br/>(read Parquet via memory)"]
G -- No --> FB{Fallback<br/>enabled?}
FB -- Yes --> DQ["DirectQuery<br/>(SQL endpoint)"]
FB -- No --> ERR["Query Error"]
DL --> R["Return Results"]
DQ --> R
Exam Caveat: Direct Lake does not import data into the model. It reads Parquet files directly from OneLake into memory on demand. Framing is NOT the same as a refresh — it only updates the Delta log pointer.
⚠️ 8 — Common Exam Traps
- RLS is defined in the semantic model, not in the Lakehouse, Warehouse, or data source.
- Direct Lake requires Delta tables in OneLake — no CSV, no Azure SQL, no external Parquet.
- Viewers cannot build reports on shared semantic models without Build permission.
- Workspace Admin and Member roles bypass RLS when browsing data directly in the service.
- XMLA read/write requires at minimum Power BI Premium Per User (PPU) or Fabric F64+ for write; read is available at lower SKUs.
- Deployment pipelines compare content between stages — they do not version-control code (use Git integration for that).
- Incremental refresh in Import mode creates partitions automatically — you configure the
RangeStartandRangeEndparameters, not partition logic. - Sensitivity labels require Microsoft Purview Information Protection; they propagate downstream when data is exported.
- SUMMARIZE should not be used to add new measure columns — use
SUMMARIZECOLUMNSorADDCOLUMNS(SUMMARIZE(...))instead. - Composite models can mix Import and DirectQuery tables, but Direct Lake tables cannot be mixed with Import tables in the same model.
- Shortcuts are read-only virtual references — they do not move, copy, or transform data.
- CLS and OLS cannot be configured in Power BI Desktop — they require XMLA endpoint or Tabular Editor.
- Time intelligence functions require a Date table marked as a date table with no gaps.
- V-Order is a write-time optimization — it must be applied when data is written to Delta, not at query time.
KEEPFILTERSdoes not remove context; it intersects with existing filters. Contrast withCALCULATEwhich overrides.- Direct Lake fallback to DirectQuery uses the SQL analytics endpoint — if that endpoint is down, queries fail.
- Git integration in Fabric works at the workspace level, not at individual item level.
- Dataflow Gen2 outputs to Lakehouse/Warehouse; it does not load directly into a semantic model like legacy Power BI dataflows could.
USERELATIONSHIPonly works with inactive relationships — you cannot reference a relationship that does not exist.- Large format datasets (over 10 GB) require Premium or Fabric capacity — Pro licenses are capped.
Exam Tip: When two answers both seem correct, ask yourself: “Where is this configured?” The exam loves testing whether something is set in the semantic model, the workspace, the data source, or the admin portal.
✅ 9 — Pre-Exam Checklist
🔒 Domain 1 — Maintain a Data Analytics Solution (25-30%)
- I can explain the four workspace roles and their permissions
- I know how deployment pipelines compare content across stages
- I understand Git integration scope (workspace-level, supported item types)
- I can configure sensitivity labels and explain downstream inheritance
- I know when to use XMLA endpoints and which SKUs support read vs. read/write
- I can describe the difference between RLS, CLS, and OLS and where each is configured
- I understand how workspace Admin/Member roles interact with RLS
🔄 Domain 2 — Prepare Data (45-50%)
- I can choose between Lakehouse, Warehouse, KQL Database, and Eventhouse
- I know when to use Dataflow Gen2 vs. Notebook vs. Pipeline Copy Activity vs. Shortcut
- I can design a star schema with fact and dimension tables
- I understand how Shortcuts work and their limitations (read-only, no transformation)
- I can write basic Spark SQL, T-SQL, and KQL queries
- I know the incremental refresh parameter pattern (
RangeStart/RangeEnd) - I can describe Delta Lake format, V-Order, and their role in Direct Lake
📐 Domain 3 — Implement and Manage Semantic Models (25-30%)
- I can choose between Import, DirectQuery, Direct Lake, and Composite storage modes
- I know the Direct Lake guardrails, framing process, and fallback behavior
- I can write CALCULATE with ALL, ALLEXCEPT, REMOVEFILTERS, and KEEPFILTERS
- I understand iterator functions (SUMX, AVERAGEX, COUNTX, RANKX)
- I know when to use SUMMARIZECOLUMNS vs. SUMMARIZE
- I can implement time intelligence with a proper Date table
- I understand USERELATIONSHIP for role-playing dimensions
- I can configure RLS with DAX and test it with “View as Role”
- I know how to optimize a model (remove unused columns, avoid bi-directional filters, use variables)
Exam Tip: The DP-600 is heavily weighted toward Domain 2 (Prepare Data) at 45-50%. Spend the most study time on data ingestion patterns, star schema design, and query languages.