Manage compliance using Microsoft Purview
This domain covers classifying data, governing its lifecycle with retention, protecting it with sensitivity labels and encryption, and preventing leaks with Data Loss Prevention across every Microsoft 365 workload.
- Information protection & data lifecycle — SITs, retention, sensitivity labels, explorers
- Data Loss Prevention (DLP) — workload policies, Endpoint DLP, alerts & reports
Everything here lives in the Microsoft Purview portal (purview.microsoft.com). Many features require Microsoft 365 E5 / E5 Compliance; manual labels and baseline DLP are in E3.
4.1 Information protection & data lifecycle
Microsoft frames information protection as a four-stage cycle — recognise which feature serves which stage:
Know your data (SITs · classifiers · Content explorer) → Protect your data (sensitivity labels · encryption · DLP) → Govern your data (retention · records · disposition) → Monitor (Activity explorer · alerts · reports) → back to the first stage.
Data classification — how Purview finds sensitive data
| Classifier | How it matches | Use for |
|---|---|---|
| Sensitive Info Type (SIT) | Pattern: keywords, regex, checksums, proximity, confidence level | Credit cards, SSNs, passport & tax IDs |
| Trainable classifier | Machine-learned from sample content | Categories like contracts, resumes, source code |
| EDM (Exact Data Match) | Matches against a hashed database of exact values | Your own customer/employee records |
| Keyword / keyword list / dictionary | Exact term matching | Project code words, product names |
Purview ships with 200+ built-in sensitive information types. A custom SIT is built from a primary pattern (keywords, a keyword list, or a regular expression) plus optional supporting evidence within a set character proximity, and is tuned with a confidence level (low/medium/high) and an instance count.
Caveat — confidence & instance count tuning
To reduce false positives, raise the confidence level and require supporting keywords nearby. To catch bulk leaks, set a min/max instance count (e.g. "10 to 500 credit card numbers"). These knobs appear in both SIT and DLP questions.
Exact Data Match (EDM) — the workflow
- Define a schema (the columns of your sensitive table, e.g. SSN, account #) and choose which fields are searchable.
- Hash & upload the sensitive data file with the EDM Upload Agent — only salted hashes leave your premises (the actual values never reach Microsoft).
- Create an EDM SIT referencing the schema, then use it in DLP/auto-labeling like any SIT.
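The idea behind the workflow above (compare hashes, never raw values) can be sketched in a few lines. This is a conceptual illustration under stated assumptions: the salt, the normalisation rules, and the use of HMAC-SHA256 are choices made for this example and do not describe the hashing scheme the EDM Upload Agent actually uses.

```python
import hashlib
import hmac

SALT = b"example-salt"  # hypothetical; not the real agent's salt handling


def hash_value(value: str, salt: bytes = SALT) -> str:
    """Normalise a value, then return its salted hash as hex."""
    normalised = value.strip().replace("-", "").replace(" ", "").lower()
    return hmac.new(salt, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(rows: list[dict], searchable_field: str) -> set[str]:
    """Hash the searchable column; only these hashes would be uploaded."""
    return {hash_value(row[searchable_field]) for row in rows}


def contains_exact_match(candidates: list[str], index: set[str]) -> bool:
    """Hash each candidate found in content and look it up in the index."""
    return any(hash_value(c) in index for c in candidates)


# Usage: the index holds no clear-text values, yet exact records still match.
rows = [{"ssn": "123-45-6789", "account": "A1"}]
index = build_index(rows, "ssn")
print(contains_exact_match(["123 45 6789"], index))  # formatting differs, value matches
```

Because only exact values in the index can match, a random number that merely looks like an SSN does not trigger, which is where the low false-positive rate comes from.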
When to choose which classifier
SIT = well-known patterns (card numbers). EDM = match your exact records with very low false positives at scale. Trainable classifier = categorise by meaning where there's no fixed pattern. Trainable classifiers come pre-trained (e.g. Resumes, Source code, Harassment, Profanity), or you can build a custom one by seeding it with 50–500 sample documents and then testing its predictions.
Retention labels & retention policies (data lifecycle)
Retention answers "keep this for X, then delete (or review)" and "don't delete before X".
| | Retention policy | Retention label |
|---|---|---|
| Scope | Applied at container level (whole mailbox, site, Teams, etc.) | Applied at item level (specific doc/email) |
| Applied how | Auto, location-wide | Manually by users, auto (by SIT/keyword/trainable), or default for a library |
| Extra power | Broad, simple coverage | Records management: mark as record / regulatory record, disposition review, event-based retention |
Caveat — principles of retention (conflict resolution)
When multiple retention settings apply to the same item, the outcome follows this order: 1) Retention wins over deletion (keep beats delete) → 2) Longest retention period wins → 3) For deletion, explicit (a retention label) wins over implicit (a retention policy) → 4) Shortest deletion period wins. Memorise this; it's a classic exam item.
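The four principles can be worked through as a small resolver. This is a simplified model for study purposes, not Purview's implementation: the `RetentionSetting` type, its fields, and the day-based arithmetic are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetentionSetting:
    source: str                              # "label" (explicit) or "policy" (implicit)
    retain_days: Optional[int] = None        # keep at least this long
    delete_after_days: Optional[int] = None  # delete once this old


def resolve(settings: list[RetentionSetting]) -> tuple[Optional[int], Optional[int]]:
    """Return (retain_until_day, delete_on_day); either may be None."""
    # Principle 2: the longest retention period wins.
    retain = [s.retain_days for s in settings if s.retain_days is not None]
    retain_until = max(retain) if retain else None

    # Principle 3: a label's delete setting beats any policy's delete setting.
    label_deletes = [s.delete_after_days for s in settings
                     if s.source == "label" and s.delete_after_days is not None]
    policy_deletes = [s.delete_after_days for s in settings
                      if s.source == "policy" and s.delete_after_days is not None]
    if label_deletes:
        delete_on = min(label_deletes)
    elif policy_deletes:
        delete_on = min(policy_deletes)  # Principle 4: shortest deletion wins
    else:
        delete_on = None

    # Principle 1: retention wins, so deletion waits until retention ends.
    if delete_on is not None and retain_until is not None:
        delete_on = max(delete_on, retain_until)
    return retain_until, delete_on
```

For example, a policy that retains for 5 years combined with another policy that deletes after 1 year resolves to deletion at the 5-year mark, because the item cannot be removed while any retention still applies.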
- A retention label policy publishes labels to locations so users can apply them; an auto-apply label policy applies them based on conditions (SITs, keywords, trainable classifiers).
- Retained content in SharePoint/OneDrive is kept in the Preservation Hold library; in Exchange it is kept in the Recoverable Items folder.
- Disposition review routes items to reviewers at end of retention instead of auto-deleting; reviewers can extend, relabel, or approve permanent deletion.
- Retention period can start from when created / last modified / labeled, or from an event (event-based retention — e.g. employee leaves, contract expires).
Adaptive vs static policy scopes
Static scope = you manually pick the locations/sites. Adaptive scope = a query on user/group/site attributes (e.g. department = Legal) that auto-updates membership as attributes change. Prefer adaptive for large or changing orgs so the policy stays current without edits.
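The difference is easiest to see as data: a static scope is a fixed list, while an adaptive scope is a query that is re-run. The sketch below is a toy model; the `User` type and `adaptive_members` function are hypothetical names, and real adaptive scopes are defined as attribute queries in the Purview portal.

```python
from dataclasses import dataclass


@dataclass
class User:
    upn: str
    department: str


def adaptive_members(users: list[User], attribute: str, value: str) -> set[str]:
    """Re-evaluated on every call, so membership tracks attribute changes."""
    return {u.upn for u in users if getattr(u, attribute) == value}


users = [User("ana@contoso.com", "Legal"), User("ben@contoso.com", "Sales")]
static_scope = {"ana@contoso.com"}  # picked by hand; never changes on its own

before = adaptive_members(users, "department", "Legal")
users[1].department = "Legal"       # Ben moves to Legal
after = adaptive_members(users, "department", "Legal")
```

After the department change, `after` includes Ben while `static_scope` still lists only Ana, which is why a static scope needs manual edits to stay current.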
Caveat — records vs regulatory records
A retention label can mark an item as a record (can't be edited/deleted, but a privileged admin can unlock it) or a regulatory record (locked — label can never be removed or downgraded, even by an admin). Regulatory records are irreversible — a favourite distractor.
Sensitivity labels & label policies
Sensitivity labels classify and protect content/containers. A label can apply: encryption, content marking (header/footer/watermark), and access/scope controls. Labels are persistent and travel with the file.
Labels (Public · Internal · Confidential · Highly Confidential) → Set protection (encryption · marking · container settings) → Publish via label policy to users / groups → Apply: Manual (user picks label), Auto / recommended (match SIT / classifier), or Default (default label on docs/sites).
- Label order = priority: the label lowest in the list is the most restrictive/highest sensitivity.
- Labels can be scoped to files & emails, groups & sites (container protection — privacy, external sharing, unmanaged-device access for Teams/SharePoint/M365 Groups), schematized data assets, and Power BI.
- Auto-labeling: client-side (recommended/automatic in Office apps, needs E5) and service-side (auto-labeling policies on data at rest in SharePoint/OneDrive/Exchange).
- Microsoft Purview Message Encryption (OME) can be applied via a label or a mail flow rule for encrypted/rights-protected email to internal & external recipients.
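The "label order = priority" rule can be modelled as a position in a list. This is a minimal sketch assuming the four example label names used in this section; the helper names are hypothetical and nothing here calls a Purview API.

```python
# Labels in portal order, top to bottom; a later position means higher priority.
LABELS = ["Public", "Internal", "Confidential", "Highly Confidential"]


def priority(label: str) -> int:
    """Position in the list doubles as the label's priority."""
    return LABELS.index(label)


def most_restrictive(labels: list[str]) -> str:
    """Pick the highest-priority label among several candidates."""
    return max(labels, key=priority)


def is_downgrade(old: str, new: str) -> bool:
    """True when the new label sits higher in the list than the old one."""
    return priority(new) < priority(old)
```

Comparing positions this way is also how a change from "Confidential" to "Internal" can be recognised as lowering the classification.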
Encryption & access control on a label
- Encryption can assign permissions now (named users/groups with rights like View, Edit, Copy, Print — Co-Author / Co-Owner / Viewer presets) or let users assign permissions (Do Not Forward, encrypt-only).
- Encryption is enforced by Azure Rights Management in the Microsoft Purview/Entra back end; rights travel with the file everywhere.
- Double Key Encryption (DKE) — for the most sensitive data, a customer-held key + a Microsoft key are both required (Microsoft cannot decrypt).
- Encrypted documents support co-authoring & AutoSave when "Enable co-authoring for labeled files" is on tenant-wide.
Caveat — sensitivity vs retention labels
Sensitivity label = classify + protect (encrypt, mark, restrict). Retention label = govern lifecycle (keep/delete). An item can have one sensitivity label and one retention label at the same time. Don't mix them up.
Monitor label usage — Content explorer, Activity explorer, reports
| Tool | Shows | Answers |
|---|---|---|
| Content explorer | Current snapshot — where labeled/sensitive items live right now | "What sensitive content do I have and where?" |
| Activity explorer | Historical activities on labeled content (label applied/changed/removed, file actions) | "What happened to labeled content over time?" |
| Label / classifier reports | Adoption & distribution of labels | "How widely are labels used?" |
Permissions
Viewing Content explorer needs the Content Explorer List Viewer role group (to see item lists) or the Content Explorer Content Viewer role group (to open item contents); Activity explorer needs the relevant data classification reader roles. Least privilege still applies.
4.2 Data Loss Prevention (DLP)
DLP detects sensitive content (via SITs, sensitivity labels, classifiers) and enforces protective actions when users share or move it.
DLP policies across Microsoft 365 workloads
A DLP policy = locations + rules (conditions → actions). Supported locations now include:
- Exchange Online email, SharePoint Online, OneDrive, Teams chat & channel messages, Devices (Endpoint DLP), Power BI, on-prem repositories, and Microsoft 365 Copilot (restrict labeled content from being summarised/processed).
| Rule element | Examples |
|---|---|
| Conditions | Content contains SIT / sensitivity label / classifier; shared with people outside the org; instance count thresholds |
| Actions | Block access / block only external; restrict sharing; encrypt; restrict access on device |
| User notifications | Policy tips in Outlook/Office/Teams to educate the user inline |
| User overrides | Allow override with business justification or false-positive report (configurable) |
| Incident reports | Email alert to admins, severity, aggregated alerts |
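The rule elements in the table above (conditions, actions, overrides) can be sketched as a small evaluator. This is a simplified model under stated assumptions: the `Rule` and `ShareEvent` types, the action names, and the result strings are invented for this example and do not mirror the real DLP rule schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    sit: str                     # sensitive info type the rule looks for
    min_count: int               # instance count threshold
    external_only: bool          # only match when shared outside the org
    action: str                  # "audit" or "block"
    allow_override: bool = False


@dataclass(frozen=True)
class ShareEvent:
    sit_counts: dict = field(default_factory=dict)
    shared_externally: bool = False
    justification: str = ""


def evaluate(rule: Rule, event: ShareEvent) -> str:
    """Return "allow", "audit", "block", or "allow_with_override"."""
    count_ok = event.sit_counts.get(rule.sit, 0) >= rule.min_count
    scope_ok = event.shared_externally or not rule.external_only
    if not (count_ok and scope_ok):
        return "allow"  # conditions not met, rule does not apply
    if rule.action == "audit":
        return "audit"
    if rule.allow_override and event.justification.strip():
        return "allow_with_override"  # user supplied a business justification
    return "block"
```

With `allow_override=False` the same event is a hard stop regardless of any justification text, which is the distinction exam questions tend to probe.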
Caveat — policy tip + override behaviour
"Policy tips" require user notifications enabled. If a question wants users blocked but able to proceed with a reason, enable allow overrides with justification. To hard-stop, disable overrides. Test new policies in simulation/test mode (with or without tips) before turning on enforcement — same pattern as CA report-only.
Per-workload behaviour to remember:
| Workload | What DLP can do |
|---|---|
| Exchange Online | Block/encrypt outbound mail, policy tips in Outlook, override with justification |
| SharePoint / OneDrive | Block external/everyone sharing of matching files; auto-remove access |
| Teams chat & channels | Block sharing sensitive messages; requires the content to be sent (not just typed) |
| Endpoint (devices) | Control copy/print/USB/upload (see below) |
| Microsoft 365 Copilot | Exclude content with a chosen sensitivity label from being processed/summarised by Copilot |
| Power BI / on-prem (scanner) | Detect & alert on sensitive datasets / repositories |
Policy priority
DLP policies, and the rules inside each policy, are evaluated in priority order (0 = highest). When content matches multiple rules, the most restrictive action among the matching rules is enforced; a rule can also be configured to stop processing further rules once it matches.
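As a rough illustration of "most restrictive action wins", the sketch below ranks the actions of all matching policies and picks the strictest. The action names, their ranking, and the tie-break on priority number are assumptions for this example, not documented Purview behaviour.

```python
# Hypothetical ranking of actions from least to most restrictive.
RESTRICTIVENESS = {"audit": 0, "notify": 1, "block_with_override": 2, "block": 3}


def enforced_match(matches: list[tuple[int, str, str]]) -> tuple[int, str, str]:
    """matches: (priority, policy_name, action) tuples for one item.

    Picks the most restrictive action; ties go to the lower priority number
    (an assumption made for this sketch).
    """
    return max(matches, key=lambda m: (RESTRICTIVENESS[m[2]], -m[0]))


matches = [
    (0, "PCI audit", "audit"),
    (1, "PCI external", "block"),
    (2, "Finance", "block_with_override"),
]
winner = enforced_match(matches)
```

Here the priority 1 policy wins even though a priority 0 policy also matched, because its block action is stricter than audit.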
Endpoint DLP
Extends DLP to activities on Windows 10/11 & macOS devices that are onboarded (the same onboarding as Defender for Endpoint — Intune or script).
- Monitors/controls: copy to USB / removable media, copy to network share, print, copy to clipboard, upload to (unallowed) browser/cloud, paste, access by unallowed apps, Bluetooth transfer, Remote Desktop copy.
- Configure DLP settings: allowed/unallowed browsers & apps, restricted app groups, printer groups, removable USB device groups, and evidence collection.
Caveat — Endpoint DLP prerequisites
Endpoint DLP requires devices to be onboarded (Intune or local script) and works best on Entra-joined/registered Windows. For best browser coverage, use Microsoft Edge or install the Microsoft Purview extension in Chrome or Firefox; "unallowed browsers" can be blocked from accessing sensitive items.
Review & respond to DLP alerts, events & reports
- DLP alerts dashboard (Purview → Data loss prevention → Alerts): triage by severity, view matched content, the rule hit, and user; set status (Investigating / Resolved / Dismissed).
- Activity explorer shows DLP-related activities and policy matches over time.
- DLP reports: policy matches, false positives & overrides, incident trends — feed tuning of conditions/confidence to cut noise.
- Integrated with Defender XDR alerts and can flow to Insider Risk Management for higher-risk users.
Licensing quick map (Purview)
E3: manual sensitivity/retention labels, baseline DLP (Exchange/SP/OneDrive/Teams), message encryption. E5 / E5 Compliance: automatic labeling, Endpoint DLP, advanced SITs (EDM/trainable classifiers), records management, Insider Risk, advanced auditing.