Domain 4 · 10–15%

Manage compliance using Microsoft Purview

Classifying data, governing its lifecycle with retention, protecting it with sensitivity labels and encryption, and preventing it from leaking with Data Loss Prevention across every Microsoft 365 workload.

Two objectives: 4.1 Information protection & data lifecycle, and 4.2 Data Loss Prevention (DLP).

Everything here lives in the Microsoft Purview portal (purview.microsoft.com). Many features require Microsoft 365 E5 / E5 Compliance; manual labels & baseline DLP are in E3.

4.1 Information protection & data lifecycle

Microsoft frames information protection as a four-stage cycle — recognise which feature serves which stage:

flowchart LR
  K["Know your data<br/>SITs · classifiers · Content explorer"] --> P["Protect your data<br/>sensitivity labels · encryption · DLP"]
  P --> G["Govern your data<br/>retention · records · disposition"]
  G --> M["Monitor<br/>Activity explorer · alerts · reports"]
  M --> K
Know → Protect → Govern → Monitor: the Microsoft Purview information-protection lifecycle.

Data classification — how Purview finds sensitive data

| Classifier | How it matches | Use for |
| --- | --- | --- |
| Sensitive Info Type (SIT) | Pattern: keywords, regex, checksums, proximity, confidence level | Credit cards, SSNs, passport & tax IDs |
| Trainable classifier | Machine-learned from sample content | Categories like contracts, resumes, source code |
| EDM (Exact Data Match) | Matches against a hashed database of exact values | Your own customer/employee records |
| Keyword / keyword list / dictionary | Exact term matching | Project code words, product names |

Sensitive information types ship with 200+ built-ins. Build a custom SIT from keywords, a keyword list, a regular expression, plus supporting evidence within a character proximity and a confidence level (low/medium/high) and instance count.

Caveat — confidence & instance count tuning

To reduce false positives, raise the confidence level and require supporting keywords nearby. To catch bulk leaks, set a min/max instance count (e.g. "10 to 500 credit card numbers"). These knobs appear in both SIT and DLP questions.
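The interplay of pattern, checksum, supporting keywords, confidence, and instance count can be sketched in a few lines of Python. This is a conceptual model only, not how Purview implements its built-in SITs: the regex, the keyword list, the 300-character proximity window, and the function names are all assumptions chosen for illustration.

```python
import re

# Assumed pattern: 16 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){15}\d\b")
KEYWORDS = ("visa", "mastercard", "card number", "expiry")  # hypothetical evidence
PROXIMITY = 300  # characters searched on each side of the match (assumed)
LEVELS = {"low": 0, "medium": 1, "high": 2}


def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_cards(text: str):
    """Return (digits, confidence) for every checksum-valid match."""
    hits = []
    lowered = text.lower()
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if not luhn_ok(digits):
            continue  # pattern matched but checksum failed: not a hit
        window = lowered[max(0, m.start() - PROXIMITY): m.end() + PROXIMITY]
        # Supporting evidence nearby raises the confidence of the match.
        confidence = "high" if any(k in window for k in KEYWORDS) else "low"
        hits.append((digits, confidence))
    return hits


def rule_matches(text, min_count=1, max_count=None, min_confidence="high"):
    """Apply a confidence floor and an instance-count range to the hits."""
    n = sum(1 for _, c in find_cards(text) if LEVELS[c] >= LEVELS[min_confidence])
    return n >= min_count and (max_count is None or n <= max_count)
```

Raising `min_confidence` trades recall for fewer false positives, while `min_count` and `max_count` let one rule target single disclosures and another target bulk leaks.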

Exact Data Match (EDM) — the workflow

  1. Define a schema (the columns of your sensitive table, e.g. SSN, account #) and choose which fields are searchable.
  2. Hash & upload the sensitive data file with the EDM Upload Agent — only salted hashes leave your premises (the actual values never reach Microsoft).
  3. Create an EDM SIT referencing the schema, then use it in DLP/auto-labeling like any SIT.
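
The idea behind steps 1 to 3 (only hashes of searchable fields are stored, and candidates are hashed the same way before comparison) can be illustrated with a small Python sketch. The hashing scheme here (SHA-256 over salt plus a trimmed, lower-cased value) and every function name are assumptions for illustration; the actual EDM Upload Agent format is not described in these notes.

```python
import hashlib
import secrets


def hash_value(value: str, salt: bytes) -> str:
    """Salted hash of a normalised value (normalisation rule is assumed)."""
    normalized = value.strip().lower()
    return hashlib.sha256(salt + normalized.encode("utf-8")).hexdigest()


def build_hashed_table(rows, searchable_fields, salt):
    """Keep only hashes of the searchable columns; raw values are discarded."""
    table = set()
    for row in rows:
        for field in searchable_fields:
            table.add((field, hash_value(row[field], salt)))
    return table


def matches(candidate: str, field: str, table, salt: bytes) -> bool:
    """Hash the candidate the same way and look it up in the hashed table."""
    return (field, hash_value(candidate, salt)) in table


# Hypothetical usage: one record, with only the SSN column searchable.
salt = secrets.token_bytes(16)
rows = [{"ssn": "123-45-6789", "account": "AC-1001"}]
hashed = build_hashed_table(rows, ["ssn"], salt)
print(matches("123-45-6789", "ssn", hashed, salt))
```

Because the lookup is an exact match on a hash, a value that merely looks like an SSN but is not in the uploaded table does not match, which is where EDM's low false-positive rate comes from.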

When to choose which classifier

SIT = well-known patterns (card numbers). EDM = match your exact records with very low false positives at scale. Trainable classifier = categorise by meaning where there's no fixed pattern. Trainable classifiers come pre-trained (e.g. Resumes, Source code, Harassment, Profanity), or you can build a custom one by seeding it with 50–500 sample documents and then testing it.

Retention labels & retention policies (data lifecycle)

Retention answers "keep this for X, then delete (or review)" and "don't delete before X".

| | Retention policy | Retention label |
| --- | --- | --- |
| Scope | Applied at container level (whole mailbox, site, Teams, etc.) | Applied at item level (specific doc/email) |
| Applied how | Auto, location-wide | Manually by users, auto (by SIT/keyword/trainable), or default for a library |
| Extra power | Broad, simple coverage | Records management: mark as record / regulatory record, disposition review, event-based retention |

Caveat — principles of retention (conflict resolution)

When multiple retention settings apply, the outcome follows this order:

  1. Retention wins over deletion (keep beats delete).
  2. Longest retention period wins.
  3. Explicit (label) wins over implicit (policy) for deletions.
  4. Shortest deletion period wins.

Memorise this; it's a classic exam item.
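
The four principles can be expressed as a small resolver. This is a simplified sketch under stated assumptions (periods in whole years, one `explicit` flag to distinguish labels from policies, and hypothetical names throughout); it is not a description of Purview's internal evaluation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RetentionSetting:
    name: str
    retain_years: Optional[int] = None  # None = no "retain" action
    delete_years: Optional[int] = None  # None = no "delete" action
    explicit: bool = False              # True = retention label, False = policy


def resolve(settings: List[RetentionSetting]) -> Tuple[Optional[int], Optional[int]]:
    """Return (keep_until, delete_at) in years; None means 'not set'."""
    retain = [s.retain_years for s in settings if s.retain_years is not None]
    keep_until = max(retain) if retain else None           # principle 2

    deleters = [s for s in settings if s.delete_years is not None]
    explicit = [s for s in deleters if s.explicit]
    pool = explicit or deleters                            # principle 3
    delete_at = min(s.delete_years for s in pool) if pool else None  # principle 4

    if keep_until is not None and delete_at is not None:
        delete_at = max(delete_at, keep_until)             # principle 1
    return keep_until, delete_at


# A 5-year retain policy and a 3-year delete policy: the item is kept for 5 years.
print(resolve([
    RetentionSetting("retain-5y", retain_years=5),
    RetentionSetting("delete-3y", delete_years=3),
]))
```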

Adaptive vs static policy scopes

Static scope = you manually pick the locations/sites. Adaptive scope = a query on user/group/site attributes (e.g. department = Legal) that auto-updates membership as attributes change. Prefer adaptive for large or changing orgs so the policy stays current without edits.
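
The difference is easiest to see as data versus query. In the sketch below, a static scope is a fixed set of members, while an adaptive scope is re-evaluated against the directory each time; the directory records, attribute names, and the `adaptive_members` helper are hypothetical stand-ins, not a Purview or Entra API.

```python
def adaptive_members(directory, attribute, value):
    """Re-evaluate the scope query against the current directory state."""
    return {u["upn"] for u in directory if u.get(attribute) == value}


# Hypothetical directory entries.
directory = [
    {"upn": "ana@contoso.com", "department": "Legal"},
    {"upn": "ben@contoso.com", "department": "Sales"},
]

static_scope = {"ana@contoso.com"}  # picked by hand; changes only when edited
before = adaptive_members(directory, "department", "Legal")

directory[1]["department"] = "Legal"  # Ben moves to Legal
after = adaptive_members(directory, "department", "Legal")

print(sorted(before), sorted(after), sorted(static_scope))
```

After the attribute change, the adaptive result includes both users while the static set still holds only the one picked by hand.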

Caveat — records vs regulatory records

A retention label can mark an item as a record (can't be edited/deleted, but a privileged admin can unlock it) or a regulatory record (locked — label can never be removed or downgraded, even by an admin). Regulatory records are irreversible — a favourite distractor.

Sensitivity labels & label policies

Sensitivity labels classify and protect content and containers. A label can apply encryption, content marking (header/footer/watermark), and access/scope controls. Labels are persistent and travel with the file.

flowchart LR
  C["Create labels<br/>Public · Internal · Confidential · Highly Confidential"] --> S["Set protection:<br/>encryption · marking · container settings"]
  S --> P["Publish via label policy<br/>to users / groups"]
  P --> A{Apply}
  A -->|Manual| U[User picks label]
  A -->|Auto / recommended| AI[Match SIT / classifier]
  A -->|Default| D[Default label on docs/sites]
Labels are created, optionally protect content, then published to users via a label policy.

Encryption & access control on a label

Caveat — sensitivity vs retention labels

Sensitivity label = classify + protect (encrypt, mark, restrict). Retention label = govern lifecycle (keep/delete). An item can have one sensitivity label and one retention label at the same time. Don't mix them up.

Monitor label usage — Content explorer, Activity explorer, reports

| Tool | Shows | Answers |
| --- | --- | --- |
| Content explorer | Current snapshot: where labeled/sensitive items live right now | "What sensitive content do I have and where?" |
| Activity explorer | Historical activities on labeled content (label applied/changed/removed, file actions) | "What happened to labeled content over time?" |
| Label / classifier reports | Adoption & distribution of labels | "How widely are labels used?" |

Permissions

Viewing Content explorer needs the Content Explorer List Viewer role (to see item lists) and the Content Explorer Content Viewer role (to open item contents); Activity explorer needs the relevant data classification reader roles. Least privilege still applies.

4.2 Data Loss Prevention (DLP)

DLP detects sensitive content (via SITs, sensitivity labels, classifiers) and enforces protective actions when users share or move it.

DLP policies across Microsoft 365 workloads

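A DLP policy = locations + rules (conditions → actions). Supported locations include Exchange Online, SharePoint/OneDrive, Teams chat & channels, endpoint devices, Microsoft 365 Copilot, and Power BI / on-premises repositories (via the scanner).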

| Rule element | Examples |
| --- | --- |
| Conditions | Content contains SIT / sensitivity label / classifier; shared with people outside the org; instance count thresholds |
| Actions | Block access / block only external; restrict sharing; encrypt; restrict access on device |
| User notifications | Policy tips in Outlook/Office/Teams to educate the user inline |
| User overrides | Allow override with business justification or false-positive report (configurable) |
| Incident reports | Email alert to admins, severity, aggregated alerts |
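
The rule elements above fit together as "conditions decide whether the rule fires, actions and overrides decide what the user experiences". A minimal sketch of that flow follows; the `Rule` fields, outcome strings, and `evaluate` function are hypothetical names for illustration, not Purview configuration keys.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    min_instances: int    # condition: instance count threshold
    external_only: bool   # condition: fire only when shared outside the org
    block: bool           # action: block (True) or only notify (False)
    allow_override: bool  # user override with business justification


def evaluate(rule: Rule, instance_count: int, shared_externally: bool,
             justification: Optional[str] = None) -> str:
    """Return the outcome the user would see for one sharing attempt."""
    matched = (instance_count >= rule.min_instances
               and (shared_externally or not rule.external_only))
    if not matched:
        return "allow"
    if not rule.block:
        return "notify"               # policy tip only
    if rule.allow_override and justification:
        return "allow-with-override"  # user proceeds; the reason is recorded
    return "block"


soft = Rule(min_instances=10, external_only=True, block=True, allow_override=True)
print(evaluate(soft, 25, True))                       # blocked until justified
print(evaluate(soft, 25, True, "Approved by Legal"))  # proceeds with a reason
```

Setting `allow_override=False` on the same rule turns the soft block into a hard stop.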

Caveat — policy tip + override behaviour

"Policy tips" require user notifications to be enabled. If a question wants users blocked but able to proceed with a reason, enable allow overrides with justification. To hard-stop, disable overrides. Test new policies in simulation/test mode (with or without tips) before turning on enforcement; this is the same pattern as report-only mode in Conditional Access (CA).

Per-workload behaviour to remember:

| Workload | What DLP can do |
| --- | --- |
| Exchange Online | Block/encrypt outbound mail, policy tips in Outlook, override with justification |
| SharePoint / OneDrive | Block external/everyone sharing of matching files; auto-remove access |
| Teams chat & channels | Block sharing sensitive messages; requires the content to be sent (not just typed) |
| Endpoint (devices) | Control copy/print/USB/upload (see below) |
| Microsoft 365 Copilot | Exclude content with a chosen sensitivity label from being processed/summarised by Copilot |
| Power BI / on-prem (scanner) | Detect & alert on sensitive datasets / repositories |

Policy priority

DLP policies and the rules inside them have a priority order (0 = highest), and rules are evaluated in that order. When content matches multiple rules, the most restrictive action across the matching rules is enforced; if several matching rules are equally restrictive, the one with the highest priority applies.
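
The "most restrictive action wins" part of this can be sketched as a selection over the matched rules. The restrictiveness ranking and the tie-break by priority number are assumptions made for illustration, as are the action names.

```python
# Assumed ranking from least to most restrictive.
RESTRICTIVENESS = {"audit": 0, "notify": 1, "block_with_override": 2, "block": 3}


def enforced_rule(matched_rules):
    """matched_rules: list of (priority, action) tuples; priority 0 is highest.

    Picks the most restrictive action; on a tie, the lower priority number wins.
    """
    if not matched_rules:
        return None
    return max(matched_rules, key=lambda r: (RESTRICTIVENESS[r[1]], -r[0]))


# The block rule is enforced even though it has the lowest priority.
print(enforced_rule([(0, "notify"), (1, "block_with_override"), (2, "block")]))
```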

Endpoint DLP

Extends DLP to activities on Windows 10/11 & macOS devices that are onboarded (the same onboarding as Defender for Endpoint — Intune or script).

Caveat — Endpoint DLP prerequisites

Endpoint DLP requires devices to be onboarded (Intune or local script) and works best on Entra-joined/registered Windows. For the best browser coverage, use Microsoft Edge, or Chrome/Firefox with the Microsoft Purview extension installed; "unallowed browsers" can be blocked from accessing sensitive items.

Review & respond to DLP alerts, events & reports

Licensing quick map (Purview)

E3: manual sensitivity/retention labels, baseline DLP (Exchange/SP/OneDrive/Teams), message encryption. E5 / E5 Compliance: automatic labeling, Endpoint DLP, advanced SITs (EDM/trainable classifiers), records management, Insider Risk, advanced auditing.