🔎 Appendix A — KQL Reference & Exam Caveats
- Based on: Kusto Query Language documentation (Microsoft Learn)
- 📁 ← Back to Home
Focused syntax reference for KQL (Kusto Query Language) as it appears on the DP-600 exam. KQL powers Eventhouse, KQL Databases, and Real-Time Analytics in Fabric. You will not be asked to write long queries, but you will be asked to read a query and pick the correct operator, order, or result.
Table of contents
- 🧭 1 — Where KQL Shows Up in Fabric
- 🔧 2 — Core Query Structure
- 🎛️ 3 — Filtering & Row Operators
- 📊 4 — Aggregation with
summarize - ⏱️ 5 — Time-Series & the
bin()Function - 🔗 6 — Joins & Lookups
- 🧱 7 — Variables, Functions &
render - 📥 8 — Ingestion & Materialized Views (know the concepts)
- ⚠️ 9 — KQL Exam Traps (Rapid Fire)
- 🧾 10 — KQL ↔ T-SQL Quick Translation
🧭 1 — Where KQL Shows Up in Fabric
| Surface | Uses KQL? | Notes |
|---|---|---|
| KQL Database | ✅ Native | Stored in an Eventhouse; columnar, time-series optimized |
| Eventhouse | ✅ Native | Container for one or more KQL Databases |
| KQL Queryset | ✅ | Saved .kql queries + results, shareable |
| Real-Time Dashboard | ✅ | Tiles are backed by KQL queries |
| Lakehouse / Warehouse | ❌ | Use Spark SQL / T-SQL instead |
| Semantic model (DAX) | ❌ | KQL results can feed a model, but the model queries in DAX |
Exam Caveat: KQL is read-optimized for append-only, high-volume, time-stamped data (logs, telemetry, IoT). If a scenario mentions frequent updates or deletes of individual rows, KQL is the wrong store — point to a Warehouse or Lakehouse instead.
🔧 2 — Core Query Structure
A KQL query starts with a table name and pipes (|) data through a sequence of tabular operators. Data flows left to right, top to bottom.
StormEvents // 1. source table
| where StartTime > ago(7d) // 2. filter rows
| where State == "FLORIDA" // 3. filter again (AND)
| summarize Count = count() by EventType // 4. aggregate
| sort by Count desc // 5. order
| take 10 // 6. limit
Key Point: Order matters.
wherebeforesummarizefilters raw rows (fast, uses indexes).whereaftersummarizefilters aggregated results. Filtering early is both correct semantics and the performant choice — a common exam distractor puts the filter after the aggregation.
🎛️ 3 — Filtering & Row Operators
| Operator | Purpose | Example |
|---|---|---|
where |
Filter rows on a predicate | \| where Level == "Error" |
take / limit |
Return N rows (no ordering guarantee) | \| take 100 |
top |
Top N rows by a column (sorts) | \| top 5 by Duration desc |
sort by / order by |
Sort rows | \| sort by Timestamp asc |
distinct |
Unique combinations of columns | \| distinct DeviceId, City |
project |
Select / rename / compute columns | \| project City, Temp = TempC |
project-away |
Drop specific columns | \| project-away _internalId |
project-keep |
Keep only matching columns | \| project-keep City*, Temp |
extend |
Add a new calculated column | \| extend TempF = TempC * 9/5 + 32 |
Exam Caveat:
projectreplaces the column set — anything not listed is dropped.extendadds a column and keeps everything else. Mixing these up is the single most common KQL trap: if a later operator references a column, make sure aprojectdidn’t already remove it.
String operators (case sensitivity)
| Operator | Case-sensitive? | Meaning |
|---|---|---|
== |
✅ Yes | Exact match |
=~ |
❌ No | Case-insensitive equals |
has |
❌ No | Whole-term match (indexed, fast) |
contains |
❌ No | Substring match (not indexed, slower) |
startswith / endswith |
❌ No | Prefix / suffix |
matches regex |
✅ Yes | Regular expression |
Exam Tip: Prefer
hasovercontainswhen matching a full word/term —hasuses the term index and is dramatically faster on large tables.containsscans and should be reserved for genuine substring needs.
📊 4 — Aggregation with summarize
summarize groups rows and computes aggregates. Columns after by become the grouping keys.
Telemetry
| summarize
AvgTemp = avg(Temperature),
MaxTemp = max(Temperature),
Readings = count()
by DeviceId, bin(Timestamp, 1h) // group per device, per hour
| Aggregation function | Returns |
|---|---|
count() |
Row count |
dcount(col) |
Approx. distinct count (fast, ~1% error) |
countif(pred) |
Count where predicate is true |
sum(col) / avg(col) |
Sum / average |
min(col) / max(col) |
Extremes |
percentile(col, 95) |
Nth percentile |
make_list(col) / make_set(col) |
Aggregate values into an array |
arg_max(col, *) |
Row with the max of col (keep all columns) |
Exam Caveat:
count()takes no argument;count(col)is invalid KQL (unlike T-SQL’sCOUNT(col)). To count non-nulls of a column usecountif(isnotnull(col)). Anddcount()is approximate by design — if a scenario demands an exact distinct count, usecount()over adistinctresult, notdcount().
⏱️ 5 — Time-Series & the bin() Function
Time-bucketing is the heart of KQL. bin() (alias floor) rounds a value down to the nearest multiple — most often a time interval.
Signups
| where Timestamp > ago(30d)
| summarize DailyUsers = dcount(UserId) by bin(Timestamp, 1d)
| render timechart
| Expression | Meaning |
|---|---|
ago(7d) |
Timestamp 7 days before now |
now() |
Current UTC time |
bin(T, 1h) |
Round T down to the hour |
datetime_diff("day", end, start) |
Whole units between two datetimes |
startofday(T) / startofmonth(T) |
Truncate to boundary |
todatetime("2026-01-01") |
Parse to datetime |
format_datetime(T, "yyyy-MM-dd") |
Format as string |
Time units: d (day), h (hour), m (minute), s (second), ms, microsecond, tick. There is no w, M, or y literal — use functions like startofweek() / startofmonth() for those.
Exam Caveat: All KQL datetimes are UTC. There is no time-zone-aware datetime type. Convert for display only, and remember
ago()/now()are UTC — a scenario about “yesterday in local time” needs an explicit offset.
flowchart LR
A["Raw events<br/>(one row each)"] --> B["bin(Timestamp, 1h)"]
B --> C["Rows grouped<br/>into hourly buckets"]
C --> D["summarize count()<br/>per bucket"]
D --> E["render timechart"]
🔗 6 — Joins & Lookups
| Operator | Use when | Notes |
|---|---|---|
join kind=inner |
Match rows in both tables | Default kind is innerunique, not inner |
join kind=leftouter |
Keep all left rows | Right columns null when no match |
lookup |
Enrich from a small dimension table | Optimized for dimension lookups |
union |
Stack rows from multiple tables | Columns unioned by name |
Sales
| join kind=inner (
Products
| project ProductId, Category
) on ProductId
Exam Caveat: KQL’s default join is
innerunique— it de-duplicates the left-side key before joining, which can silently drop rows compared to a SQLINNER JOIN. If you need true SQL inner-join semantics, you must writekind=innerexplicitly. This is a favourite “spot the subtle bug” question.
Exam Tip: For best join performance, put the smaller table on the left side of
join. This is the opposite instinct from some SQL engines — KQL broadcasts the left table.
🧱 7 — Variables, Functions & render
let threshold = 100; // scalar variable
let RecentErrors = // tabular variable
Logs | where Level == "Error" and Timestamp > ago(1d);
RecentErrors
| where Count > threshold
| render columnchart
letbinds a name to a scalar or a whole table. Statements are separated by;and the final statement produces the result.renderis a client-side visualization hint (timechart,barchart,columnchart,piechart,table). It does not change the data.
Exam Caveat:
renderonly affects how a result is drawn in a KQL queryset or dashboard tile — it produces no new columns and cannot be consumed downstream. A question implyingrendertransforms or filters data is wrong.
📥 8 — Ingestion & Materialized Views (know the concepts)
| Concept | What to know for the exam |
|---|---|
| Update policy | Runs a KQL transform automatically on ingestion (source table → target table) |
| Materialized view | Pre-aggregated, always-current view over a source table (e.g. hourly rollups) |
.ingest |
Management command to load data (batch or streaming) |
| Retention / caching policy | Controls how long data is kept vs. how much is hot-cached in memory |
| OneLake availability | A KQL Database can mirror its data to OneLake as Delta for cross-engine access |
Exam Tip: Management commands start with a dot (
.create table,.ingest,.alter). Anything starting with.is a control command, not a query — you cannot pipe query operators into it.
⚠️ 9 — KQL Exam Traps (Rapid Fire)
projectreplaces columns;extendadds them — the #1 source of “column not found” distractors.count()takes no argument —count(col)is invalid; usecountif(isnotnull(col)).- Default
joinisinnerunique, which de-dupes the left key — specifykind=innerfor SQL-style joins. dcount()is approximate — usedistinct+count()when exactness is required.hasis indexed and fast;containsscans — preferhasfor whole-term matches.- All datetimes are UTC — no time-zone type;
ago()/now()are UTC. ==is case-sensitive;=~is not — a common cause of “why is my filter empty” scenarios.- Filter (
where) beforesummarizefor correctness and performance. - Put the smaller table on the left of a
joinfor performance. - Control commands start with
.and are not part of the tabular query pipeline. - KQL is append-optimized — update/delete-heavy workloads belong in Warehouse/Lakehouse, not a KQL Database.
renderis display-only — it never reshapes or filters the result set.
Exam Tip: When a KQL item shows a full query and asks for the result, trace the pipe order carefully: figure out the column set after each
project/extend, then confirm every later operator’s referenced columns still exist.
🧾 10 — KQL ↔ T-SQL Quick Translation
| Intent | T-SQL | KQL |
|---|---|---|
| Filter | WHERE x = 1 |
\| where x == 1 |
| Select columns | SELECT a, b |
\| project a, b |
| Add column | SELECT *, a+b AS c |
\| extend c = a + b |
| Group & aggregate | GROUP BY g |
\| summarize ... by g |
| Sort | ORDER BY c DESC |
\| sort by c desc |
| Top N | SELECT TOP 5 ... ORDER BY |
\| top 5 by c desc |
| Distinct | SELECT DISTINCT a |
\| distinct a |
| Count rows | COUNT(*) |
count() |
Exam Caveat: KQL uses
==for equality (single=is assignment insideproject/extend/summarize). Writingwhere x = 1is a syntax error, not a filter.