Wire a New Alert Consumer to PSI Notify Bot
Step-by-step for hooking a new PSI automation into the shared PSI Notify Bot. Uses egnyte-stp-sync as the worked example — the first persistent-chat consumer — but the steps apply to any new consumer (scheduled tasks, web apps, CI/CD workflows, ad-hoc scripts).
If you haven’t read it yet: Teams Notifications covers the architecture; this page covers the wiring.
Decision tree before you start
- Persistent chat or per-event?
  - Per-event (one chat per discrete event, scoped audience that won’t recur) → like Deploy ProApps. You’ll create a fresh chat each time and discard the chat ID afterward.
  - Persistent (one long-lived chat for ongoing alerts) → like IT Critical Notifications. You’ll bootstrap once, store the chat ID in KV, post to it forever.
  - If you can’t decide, default to persistent. Per-event chats are right for collaboration around a single event; everything else is persistent.
- What chat ID will you target?
  - Reuse an existing persistent chat (e.g. IT Critical Notifications) → look up its KV secret name in Teams Notifications → Active persistent chats.
  - Need a new chat → you’ll bootstrap one with Initialize-NotifyChat.ps1.
- Where do your secrets come from?
  - App Service → Key Vault references (@Microsoft.KeyVault(...))
  - PS-PROXY scheduled task → managed identity + Get-AzKeyVaultSecret
  - GitHub Actions → OIDC + az keyvault secret show
  - Ad-hoc operator script → az keyvault secret show with the user’s own credentials
All four shapes ultimately read the same psi-notify--* secrets out of ps-certificates-kv.
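For the App Service shape, each secret becomes an app setting whose value is a Key Vault reference string. A minimal sketch of building those strings, assuming the VaultName/SecretName form of the reference syntax; the helper name New-KeyVaultReference and the app setting names are hypothetical and only illustrate the pattern.

```powershell
# Hypothetical helper: builds an App Service Key Vault reference for one secret.
function New-KeyVaultReference {
    param(
        [Parameter(Mandatory)][string]$SecretName,
        [string]$VaultName = 'ps-certificates-kv'
    )
    # Reference form: @Microsoft.KeyVault(VaultName=<vault>;SecretName=<secret>)
    return '@Microsoft.KeyVault(VaultName={0};SecretName={1})' -f $VaultName, $SecretName
}

# App setting names below are placeholders; use whatever the web app reads.
$appSettings = [ordered]@{
    PSI_NOTIFY_TENANT_ID     = New-KeyVaultReference 'psi-notify--tenant-id'
    PSI_NOTIFY_CLIENT_ID     = New-KeyVaultReference 'psi-notify--client-id'
    PSI_NOTIFY_CLIENT_SECRET = New-KeyVaultReference 'psi-notify--client-secret'
    PSI_NOTIFY_CHAT_ID       = New-KeyVaultReference 'psi-notify--chat-it-critical'
}
```

The app then reads plain environment variables, and the platform resolves the references with the app's own identity, so no secret value is stored in the app configuration.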
Worked example: egnyte-stp-sync
The system: scheduled task on PS-PROXY runs every 5 minutes. Pulls files from the Egnyte APPROVED folder, validates them, moves them to a DFS share. Three alert conditions defined in the spec:
| Trigger | When | Rate limit |
|---|---|---|
| File stuck | ≥5 failed attempts OR >24h in APPROVED | Once per file per day |
| Token refresh failure | Egnyte OAuth refresh returns non-200 | Once per 30 min |
| DFS access failure | Can’t list APPROVED folder | Once per 30 min |
Target chat: existing IT Critical Notifications (psi-notify--chat-it-critical). Decision: persistent — ops alerts for a long-running system.
Secrets shape: PS-PROXY scheduled task → managed identity → KV.
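The file-stuck condition from the trigger table reduces to a small predicate. A minimal sketch, assuming the sync tracks a failure count and the time each file first appeared in APPROVED; Test-FileStuck and its parameter names are hypothetical and not taken from the egnyte-stp-sync spec.

```powershell
# Hypothetical predicate for the "File stuck" trigger:
# >= 5 failed attempts OR more than 24h in APPROVED.
function Test-FileStuck {
    param(
        [Parameter(Mandatory)][int]$FailedAttempts,
        [Parameter(Mandatory)][datetime]$FirstSeenInApproved,
        [datetime]$Now = (Get-Date)   # injectable so the check can be tested with fixed times
    )
    $age = $Now - $FirstSeenInApproved
    return ($FailedAttempts -ge 5) -or ($age -gt [timespan]::FromHours(24))
}

# Example: two failures so far, but the file has sat in APPROVED for 26 hours.
$isStuck = Test-FileStuck -FailedAttempts 2 -FirstSeenInApproved ((Get-Date).AddHours(-26))
```

Keeping the predicate separate from the send makes it easy to exercise both branches of the OR without touching Egnyte or Teams.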
Step 1 — Pull the module at task startup
The PSI.Notify module lives in ProgressiveSurface/psi-notify-bot on GHE. Three options for how to make it available to the egnyte-stp-sync task:
| Option | When to use |
|---|---|
| Git submodule | Recommended for repo-scoped consumers — pin to a known-good version, update on demand |
| Fresh clone in task wrapper | Simple, but every run hits GHE. Fine for low-frequency tasks; wasteful for the 5-min egnyte schedule |
| Local install on PS-PROXY | One-time git clone to C:\Tools\psi-notify-bot; the scheduled task imports from there. Lowest runtime overhead |
For egnyte-stp-sync (5-min schedule on PS-PROXY), pick local install. Wrapper script does:
Import-Module C:\Tools\psi-notify-bot\src\PSI.Notify.psd1 -Force
A monthly git pull (or a dedicated update task) keeps PS-PROXY’s copy current.
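With a local install, a missing or half-updated clone otherwise surfaces as a generic import error in the task log. A minimal sketch of a fail-fast import guard for the wrapper; the function name Import-NotifyModule is hypothetical, and the default path is the local-install path used on this page.

```powershell
# Hypothetical wrapper: fail with a clear message if the local install is missing.
function Import-NotifyModule {
    param(
        [string]$ManifestPath = 'C:\Tools\psi-notify-bot\src\PSI.Notify.psd1'
    )
    if (-not (Test-Path -LiteralPath $ManifestPath -PathType Leaf)) {
        throw "PSI.Notify manifest not found at '$ManifestPath'. Clone or update the local install first."
    }
    # -ErrorAction Stop turns a broken manifest into a terminating error as well.
    Import-Module $ManifestPath -Force -ErrorAction Stop
}
```

Calling Import-NotifyModule at the top of the wrapper means a broken install stops the run before any sync work starts.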
Step 2 — Fetch secrets via managed identity
Add this to the egnyte-stp-sync config-load:
# Authenticate to KV using PS-PROXY's managed identity. No client secret on disk.
Connect-AzAccount -Identity -ErrorAction Stop | Out-Null
function Get-NotifySecret {
    param([string]$Name)
    Get-AzKeyVaultSecret -VaultName ps-certificates-kv -Name $Name -AsPlainText
}
$notify = @{
    TenantId          = Get-NotifySecret "psi-notify--tenant-id"
    ClientId          = Get-NotifySecret "psi-notify--client-id"
    ClientSecret      = Get-NotifySecret "psi-notify--client-secret"
    TeamsAppCatalogId = Get-NotifySecret "psi-notify--teams-app-catalog-id"
    ChatId            = Get-NotifySecret "psi-notify--chat-it-critical"
}
If PS-PROXY’s managed identity doesn’t already have Key Vault Secrets User on ps-certificates-kv, grant it once:
$proxyMi = (Get-AzVM -Name PS-PROXY).Identity.PrincipalId
New-AzRoleAssignment -ObjectId $proxyMi `
    -RoleDefinitionName "Key Vault Secrets User" `
    -Scope "/subscriptions/<sub>/resourceGroups/PS-RG-01/providers/Microsoft.KeyVault/vaults/ps-certificates-kv"
Step 3 — Implement SyncNotifier.psm1
The stub becomes a thin wrapper around PSI.Notify:
# SyncNotifier.psm1
Import-Module C:\Tools\psi-notify-bot\src\PSI.Notify.psd1 -Force
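# Token cache: Bot Framework tokens live ~1h; avoid re-grabbing per alert.
# Script-scoped, so it only lasts for the lifetime of the current PowerShell process.
$script:botToken = $null
$script:botTokenExpiresAt = [datetime]::MinValue
function Get-BotTokenCached {
    param($Cfg)
    # Reuse the cached token until 5 minutes before expiry. The margin is added to "now"
    # because [datetime]::MinValue.AddMinutes(-5) throws on the first call.
    if ($script:botToken -and (Get-Date).AddMinutes(5) -lt $script:botTokenExpiresAt) { return $script:botToken }
    $script:botToken = Get-NotifyBotToken -TenantId $Cfg.TenantId -ClientId $Cfg.ClientId -ClientSecret $Cfg.ClientSecret
    $script:botTokenExpiresAt = (Get-Date).AddHours(1)
    return $script:botToken
}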
function Send-AlertMessage {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)][hashtable]$NotifyConfig, # the four bot secrets + ChatId
        [Parameter(Mandatory)][string]$Subject,
        [Parameter(Mandatory)][string]$BodyHtml
    )
    $token = Get-BotTokenCached -Cfg $NotifyConfig
    $fullBody = "<b>STP Egnyte Sync: $Subject</b><br><br>$BodyHtml"
    Send-NotifyMessage -BotToken $token -ChatId $NotifyConfig.ChatId -BodyHtml $fullBody
}
Export-ModuleMember -Function Send-AlertMessage
Step 4 — Rate-limit alerts at the trigger layer
Don’t push rate limiting into the notifier — keep it next to the trigger logic so the dedupe key is obvious. For egnyte-stp-sync:
# rate-limit state file (JSON on disk; keyed by trigger + scope)
$rateStatePath = "C:\ProgramData\PSI\StpEgnyteSync\rate-state.json"
function Test-RateLimit {
    param([string]$Key, [timespan]$MinInterval)
    # Note: ConvertFrom-Json -AsHashtable requires PowerShell 6 or later.
    $state = if (Test-Path $rateStatePath) { Get-Content $rateStatePath -Raw | ConvertFrom-Json -AsHashtable } else { @{} }
    $lastSent = if ($state.ContainsKey($Key)) { [datetime]$state[$Key] } else { [datetime]::MinValue }
    if ((Get-Date) - $lastSent -lt $MinInterval) { return $false } # too soon, skip
    # Record the send time before returning so the next run sees it.
    $state[$Key] = (Get-Date).ToString("o")
    $state | ConvertTo-Json | Set-Content $rateStatePath -Encoding UTF8
    return $true # ok to send
}
# Usage at each trigger:
if (Test-RateLimit -Key "stuck-file:$($file.Name)" -MinInterval ([timespan]::FromHours(24))) {
    Send-AlertMessage -NotifyConfig $notify -Subject "file stuck" -BodyHtml @"
<b>File:</b> $($file.Name)<br>
<b>Age in APPROVED:</b> $age<br>
<b>Last error:</b> $($file.LastError)<br>
<b>Log:</b> <code>$logPath</code>
"@
}
Step 5 — Alert body style
Match the house style. Three concrete bodies for egnyte:
File stuck:
<b>STP Egnyte Sync: file stuck</b><br>
<br>
<b>File:</b> Sample-12345.pdf<br>
<b>Attempts:</b> 5 failures since 2026-05-12 09:13<br>
<b>Age in APPROVED:</b> 26h 14m<br>
<b>Last error:</b> 502 Bad Gateway from Egnyte<br>
<b>Log:</b> <code>\\PS-PROXY\logs\stp-egnyte-sync\2026-05-13.log</code>
Token refresh failure:
<b>STP Egnyte Sync: OAuth token refresh failed</b><br>
<br>
<b>When:</b> 2026-05-13 14:22 EDT<br>
<b>Endpoint:</b> POST https://progressivesurface.egnyte.com/puboauth/token<br>
<b>HTTP:</b> 401 — invalid_grant<br>
<b>Impact:</b> No new files will sync until refresh succeeds. Auto-retry every 5 min.<br>
<b>Action:</b> Rotate the refresh token in <code>ps-certificates-kv/stp-egnyte--refresh-token</code> if this persists past 30 min.
DFS access failure:
<b>STP Egnyte Sync: DFS access failure</b><br>
<br>
<b>When:</b> 2026-05-13 14:25 EDT<br>
<b>Path:</b> <code>\\ad.ptihome.com\DFS\STPEgnyteApproved\</code><br>
<b>Error:</b> Logon failure: unknown user name or bad password<br>
<b>Impact:</b> Sync cannot list source folder. No data movement until resolved.<br>
<b>Action:</b> Check PS-PROXY domain trust / scheduled task credentials.
Step 6 — Smoke test before going live
Before scheduling the task:
- Run the wrapper manually once with a forced trigger (e.g. set the rate-limit state to allow, simulate a stuck file).
- Confirm the message lands in IT Critical Notifications.
- Confirm the rate-limit state file is updated.
- Repeat the forced trigger immediately — confirm the second send is suppressed.
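The suppression check in the last bullet can also be rehearsed offline, without posting to the chat. A minimal sketch, assuming the same JSON state-file layout as Test-RateLimit; the function Test-RateLimitAt and its -StatePath and -Now parameters are additions for testability, not part of the page's implementation. It reads the state without -AsHashtable, so it should also run on Windows PowerShell 5.1.

```powershell
# Hypothetical variant of Test-RateLimit with an injectable state path and clock.
function Test-RateLimitAt {
    param(
        [Parameter(Mandatory)][string]$StatePath,
        [Parameter(Mandatory)][string]$Key,
        [Parameter(Mandatory)][timespan]$MinInterval,
        [datetime]$Now = (Get-Date)
    )
    # Load existing state (if any) into a plain hashtable.
    $state = @{}
    if (Test-Path -LiteralPath $StatePath) {
        $raw = Get-Content -LiteralPath $StatePath -Raw
        if ($raw) {
            $json = $raw | ConvertFrom-Json
            foreach ($p in $json.PSObject.Properties) { $state[$p.Name] = $p.Value }
        }
    }
    $lastSent = if ($state.ContainsKey($Key)) { [datetime]$state[$Key] } else { [datetime]::MinValue }
    if (($Now - $lastSent) -lt $MinInterval) { return $false }   # too soon, suppress
    $state[$Key] = $Now.ToString('o')
    $state | ConvertTo-Json | Set-Content -LiteralPath $StatePath -Encoding UTF8
    return $true                                                # ok to send
}

# Smoke test against a throwaway state file: first call allowed, immediate repeat suppressed.
$tmp    = Join-Path ([IO.Path]::GetTempPath()) ('rate-state-{0}.json' -f [guid]::NewGuid())
$first  = Test-RateLimitAt -StatePath $tmp -Key 'stuck-file:smoke.pdf' -MinInterval ([timespan]::FromHours(24))
$second = Test-RateLimitAt -StatePath $tmp -Key 'stuck-file:smoke.pdf' -MinInterval ([timespan]::FromHours(24))
Remove-Item -LiteralPath $tmp
```

If $first is true and $second is false, the dedupe key and interval behave as intended. The live smoke test against IT Critical Notifications is still needed to confirm delivery.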
Step 7 — Register as a consumer
Add a row to:
- Teams Notifications → Consumers (this wiki page)
- psi-notify-bot README → Consumers section
So future consumers can see what’s already wired and how.
Template: minimum consumer wiring
For consumers other than egnyte, the same shape applies:
# 1. Import module (path varies by deployment shape)
Import-Module $modulePath/src/PSI.Notify.psd1 -Force
# 2. Pull secrets (source varies by runtime)
$cfg = @{
TenantId = ...
ClientId = ...
ClientSecret = ...
ChatId = ... # from psi-notify--chat-<purpose>
}
# 3. Get a token (cache it if you'll send more than ~10 messages/hour)
$bot = Get-NotifyBotToken -TenantId $cfg.TenantId -ClientId $cfg.ClientId -ClientSecret $cfg.ClientSecret
# 4. Send the alert (HTML body matching house style)
Send-NotifyMessage -BotToken $bot -ChatId $cfg.ChatId -BodyHtml $html
That’s the whole contract. The module handles the three Graph pitfalls for you; you handle the message shape, rate limiting, and trigger logic.
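Since the message shape is the consumer's job, a small formatter keeps bodies consistent with the label/value style used in the egnyte examples. A minimal sketch: Format-AlertBody and its -CodeFields parameter are hypothetical, not part of PSI.Notify. It HTML-encodes labels and values so that error text containing angle brackets cannot break the markup.

```powershell
# Hypothetical formatter: renders ordered label/value pairs as "<b>Label:</b> value<br>" lines.
function Format-AlertBody {
    param(
        [Parameter(Mandatory)][System.Collections.Specialized.OrderedDictionary]$Fields,
        [string[]]$CodeFields = @()   # labels whose values should be wrapped in <code>
    )
    $lines = foreach ($name in $Fields.Keys) {
        # Encode both label and value; values are treated as plain text, not markup.
        $label = [System.Net.WebUtility]::HtmlEncode([string]$name)
        $value = [System.Net.WebUtility]::HtmlEncode([string]$Fields[$name])
        if ($CodeFields -contains $name) { $value = "<code>$value</code>" }
        "<b>${label}:</b> $value"
    }
    return ($lines -join "<br>`n")
}

# Example values are illustrative only.
$html = Format-AlertBody -CodeFields 'Log' -Fields ([ordered]@{
    File         = 'Sample-12345.pdf'
    'Last error' = '502 Bad Gateway from Egnyte'
    Log          = '\\PS-PROXY\logs\stp-egnyte-sync\2026-05-13.log'
})
```

The resulting $html can be passed as -BodyHtml. The subject line stays with the caller, as in Send-AlertMessage above.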
Common mistakes
- Sending raw error stacks. Recipients skim. Lead with the subject, give the user something to act on, then put the gory details below.
- No rate limiting. A bad day will flood the chat. Pick a sensible dedupe key (trigger + scope) and a sensible interval (24h for per-resource issues, 30min for system-wide).
- Logging the bot token. Tokens expire after about 1h, but don’t log them anyway. Log $bot.Substring(0,12) + '...' if you must.
- Hardcoding the chat ID in the consumer. Always pull from KV. Chat IDs may change (rebootstrap, migration); the KV secret name is the stable handle.
- Reusing a per-event chat ID. A per-event chat (deploy-style) gets closed/forgotten; messages sent there later might be invisible. Persistent-chat consumers should only use psi-notify--chat-<purpose> IDs.
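For the token-logging item, calling Substring directly fails on a null token and throws when the token is shorter than the requested prefix. A minimal sketch of a safer mask, assuming the token is a plain string as the Substring example implies; Get-MaskedToken is a hypothetical helper name.

```powershell
# Hypothetical helper: returns a log-safe prefix of a token, never the full value.
function Get-MaskedToken {
    param(
        [AllowNull()][AllowEmptyString()][string]$Token,
        [int]$Visible = 12
    )
    if ([string]::IsNullOrEmpty($Token)) { return '(empty)' }
    # If the token is not longer than the visible prefix, show nothing at all.
    if ($Token.Length -le $Visible) { return '...' }
    return $Token.Substring(0, $Visible) + '...'
}

# Example with a dummy value; never paste a real token into a script.
$masked = Get-MaskedToken -Token 'abcdefghijklmnopqrstuvwxyz'
```

Routing every token-related log line through one helper makes it easier to audit that no full token reaches the task log.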
See Also
- Teams Notifications — architecture, secrets, pitfalls, consumer registry
- PSI Notify Bot — bot resource page
- Deploy ProApps — per-event chat consumer example
- psi-notify-bot repo — module source
Last updated: 2026-05-13 — written alongside the egnyte-stp-sync spec.