Wire a New Alert Consumer to PSI Notify Bot

Step-by-step for hooking a new PSI automation into the shared PSI Notify Bot. Uses egnyte-stp-sync as the worked example — the first persistent-chat consumer — but the steps apply to any new consumer (scheduled tasks, web apps, CI/CD workflows, ad-hoc scripts).

If you haven’t read it yet: Teams Notifications covers the architecture; this page covers the wiring.


Decision tree before you start

  1. Persistent chat or per-event?

    • Per-event (one chat per discrete event, scoped audience that won’t recur) → like Deploy ProApps. You’ll create a fresh chat each time and discard the chat ID afterward.
    • Persistent (one long-lived chat for ongoing alerts) → like IT Critical Notifications. You’ll bootstrap once, store the chat ID in KV, post to it forever.
    • If you can’t decide, default to persistent. Per-event chats are right for collaboration around a single event; everything else is persistent.
  2. What chat ID will you target?

  3. Where do your secrets come from?

    • App Service → Key Vault references (@Microsoft.KeyVault(...))
    • PS-PROXY scheduled task → managed identity + Get-AzKeyVaultSecret
    • GitHub Actions → OIDC + az keyvault secret show
    • Ad-hoc operator script → az keyvault secret show with the user’s own credentials

All four shapes ultimately read the same psi-notify--* secrets out of ps-certificates-kv.
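
For the two az-based shapes (GitHub Actions and ad-hoc operator scripts), a fetch helper could be sketched roughly as follows. The function name Get-NotifySecretViaCli is hypothetical, and the sketch assumes the az CLI is on PATH and already authenticated (OIDC login in a workflow, az login for an operator).

```powershell
# Hypothetical helper: reads one psi-notify--* secret through the az CLI.
# Assumes `az` is on PATH and already authenticated.
function Get-NotifySecretViaCli {
    param(
        [Parameter(Mandatory)][string]$Name,
        [string]$VaultName = 'ps-certificates-kv'
    )
    # --query value -o tsv returns only the secret value, with no JSON wrapping
    $value = az keyvault secret show --vault-name $VaultName --name $Name --query value -o tsv
    if ($LASTEXITCODE -ne 0 -or [string]::IsNullOrWhiteSpace($value)) {
        throw "Could not read secret '$Name' from vault '$VaultName'."
    }
    return $value
}
```

Checking $LASTEXITCODE matters here because a failing native command does not raise a PowerShell exception by default.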


Worked example: egnyte-stp-sync

The system: a scheduled task on PS-PROXY that runs every 5 minutes. It pulls files from the Egnyte APPROVED folder, validates them, and moves them to a DFS share. The spec defines three alert conditions:

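Trigger               | When                                    | Rate limit
----------------------|-----------------------------------------|----------------------
File stuck            | ≥5 failed attempts OR >24h in APPROVED  | Once per file per day
Token refresh failure | Egnyte OAuth refresh returns non-200    | Once per 30 min
DFS access failure    | Can’t list APPROVED folder              | Once per 30 min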

Target chat: existing IT Critical Notifications (psi-notify--chat-it-critical). Decision: persistent — ops alerts for a long-running system.

Secrets shape: PS-PROXY scheduled task → managed identity → KV.

Step 1 — Pull the module at task startup

The PSI.Notify module lives in ProgressiveSurface/psi-notify-bot on GHE. Three options for how to make it available to the egnyte-stp-sync task:

Option                      | When to use
----------------------------|------------------------------------------------------------
Git submodule               | Recommended for repo-scoped consumers: pin to a known-good version, update on demand
Fresh clone in task wrapper | Simple, but every run hits GHE. Fine for low-frequency tasks; wasteful for the 5-min egnyte schedule
Local install on PS-PROXY   | One-time git clone to C:\Tools\psi-notify-bot, scheduled task imports from there. Lowest-runtime overhead

For egnyte-stp-sync (5-min schedule on PS-PROXY), pick the local install. The wrapper script imports the module from the local clone:

Import-Module C:\Tools\psi-notify-bot\src\PSI.Notify.psd1 -Force

A monthly git pull (or a dedicated update task) keeps PS-PROXY’s copy current.
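
A minimal sketch of such an update task, assuming git is on PATH for the task’s account and the clone has no local edits. The function name Update-NotifyModuleCopy is hypothetical.

```powershell
# Hypothetical update script for a monthly scheduled task on PS-PROXY.
function Update-NotifyModuleCopy {
    param([string]$RepoPath = 'C:\Tools\psi-notify-bot')
    # --ff-only refuses to create a merge commit, so local edits surface as a failure
    git -C $RepoPath pull --ff-only | Out-Host
    if ($LASTEXITCODE -ne 0) {
        throw "git pull failed in $RepoPath (exit code $LASTEXITCODE)."
    }
    # Confirm the manifest still parses before the next sync run imports it
    $null = Test-ModuleManifest -Path (Join-Path $RepoPath 'src\PSI.Notify.psd1') -ErrorAction Stop
}
```

Failing loudly on a non-fast-forward pull keeps PS-PROXY from silently running a diverged copy of the module.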

Step 2 — Fetch secrets via managed identity

Add this to the egnyte-stp-sync config-load:

# Authenticate to KV using PS-PROXY's managed identity. No client secret on disk.
Connect-AzAccount -Identity -ErrorAction Stop | Out-Null
 
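# Returns the secret value as plain text. -AsPlainText needs a reasonably recent Az.KeyVault module.
function Get-NotifySecret {
    param([Parameter(Mandatory)][string]$Name)
    Get-AzKeyVaultSecret -VaultName ps-certificates-kv -Name $Name -AsPlainText
}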
 
$notify = @{
    TenantId          = Get-NotifySecret "psi-notify--tenant-id"
    ClientId          = Get-NotifySecret "psi-notify--client-id"
    ClientSecret      = Get-NotifySecret "psi-notify--client-secret"
    TeamsAppCatalogId = Get-NotifySecret "psi-notify--teams-app-catalog-id"
    ChatId            = Get-NotifySecret "psi-notify--chat-it-critical"
}
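
Get-AzKeyVaultSecret typically returns nothing, rather than throwing, when a secret name does not exist, so a typo in a secret name can leave a blank entry in the config. A small guard along these lines would catch that at config-load time. Assert-NotifyConfig is a hypothetical name, and the key list simply mirrors the hashtable above.

```powershell
# Hypothetical guard: fail fast if any expected secret came back missing or blank.
function Assert-NotifyConfig {
    param(
        [Parameter(Mandatory)][hashtable]$Config,
        [string[]]$RequiredKeys = @('TenantId', 'ClientId', 'ClientSecret', 'TeamsAppCatalogId', 'ChatId')
    )
    # [string]$null becomes '', so absent keys and empty values are treated alike
    $missing = @($RequiredKeys | Where-Object { [string]::IsNullOrWhiteSpace([string]$Config[$_]) })
    if ($missing.Count -gt 0) {
        throw "Missing notify secrets: $($missing -join ', ')"
    }
}

# Usage: call right after building the hashtable, e.g. Assert-NotifyConfig -Config $notify
```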

If PS-PROXY’s managed identity doesn’t already have Key Vault Secrets User on ps-certificates-kv, grant it once:

$proxyMi = (Get-AzVM -Name PS-PROXY).Identity.PrincipalId
New-AzRoleAssignment -ObjectId $proxyMi `
    -RoleDefinitionName "Key Vault Secrets User" `
    -Scope "/subscriptions/<sub>/resourceGroups/PS-RG-01/providers/Microsoft.KeyVault/vaults/ps-certificates-kv"

Step 3 — Implement SyncNotifier.psm1

The stub becomes a thin wrapper around PSI.Notify:

# SyncNotifier.psm1
Import-Module C:\Tools\psi-notify-bot\src\PSI.Notify.psd1 -Force
 
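# Token cache: Bot Framework tokens live ~1h; avoid re-grabbing per alert.
# The cache is module-scoped, so it only helps within a single task run.
$script:botToken = $null
$script:botTokenExpiresAt = [datetime]::MinValue

function Get-BotTokenCached {
    param($Cfg)
    # Reuse the token until 5 minutes before expiry. Compare against "now + 5 min" instead of
    # subtracting from the expiry: [datetime]::MinValue.AddMinutes(-5) throws on the first call.
    if ($script:botToken -and (Get-Date).AddMinutes(5) -lt $script:botTokenExpiresAt) { return $script:botToken }
    $script:botToken = Get-NotifyBotToken -TenantId $Cfg.TenantId -ClientId $Cfg.ClientId -ClientSecret $Cfg.ClientSecret
    $script:botTokenExpiresAt = (Get-Date).AddHours(1)   # assumes the ~1h lifetime noted above
    return $script:botToken
}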
 
function Send-AlertMessage {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)][hashtable]$NotifyConfig,  # the four bot secrets + ChatId
        [Parameter(Mandatory)][string]$Subject,
        [Parameter(Mandatory)][string]$BodyHtml
    )
    $token = Get-BotTokenCached -Cfg $NotifyConfig
    $fullBody = "<b>STP Egnyte Sync: $Subject</b><br><br>$BodyHtml"
    Send-NotifyMessage -BotToken $token -ChatId $NotifyConfig.ChatId -BodyHtml $fullBody
}
 
Export-ModuleMember -Function Send-AlertMessage

Step 4 — Rate-limit alerts at the trigger layer

Don’t push rate limiting into the notifier — keep it next to the trigger logic so the dedupe key is obvious. For egnyte-stp-sync:

# rate-limit state file (JSON on disk; keyed by trigger + scope)
$rateStatePath = "C:\ProgramData\PSI\StpEgnyteSync\rate-state.json"

function Test-RateLimit {
    param([string]$Key, [timespan]$MinInterval)
    # ConvertFrom-Json -AsHashtable requires PowerShell 6 or later (not Windows PowerShell 5.1)
    $state = if (Test-Path $rateStatePath) { Get-Content $rateStatePath -Raw | ConvertFrom-Json -AsHashtable } else { @{} }
    $lastSent = if ($state.ContainsKey($Key)) { [datetime]$state[$Key] } else { [datetime]::MinValue }
    if (((Get-Date) - $lastSent) -lt $MinInterval) { return $false }   # too soon, skip
    $state[$Key] = (Get-Date).ToString("o")
    # Create the state folder on first run; Set-Content does not create missing parent folders
    $null = New-Item -ItemType Directory -Path (Split-Path $rateStatePath) -Force
    $state | ConvertTo-Json | Set-Content $rateStatePath -Encoding UTF8
    return $true                                                     # ok to send
}
 
# Usage at each trigger:
if (Test-RateLimit -Key "stuck-file:$($file.Name)" -MinInterval ([timespan]::FromHours(24))) {
    Send-AlertMessage -NotifyConfig $notify -Subject "file stuck" -BodyHtml @"
<b>File:</b> $($file.Name)<br>
<b>Age in APPROVED:</b> $age<br>
<b>Last error:</b> $($file.LastError)<br>
<b>Log:</b> <code>$logPath</code>
"@
}
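
The usage above covers the per-file trigger. One way to keep all three spec triggers consistent is a small lookup that returns the dedupe key and interval for each. This is a sketch: Get-RateLimitRule and the key names token-refresh and dfs-access are hypothetical, while the intervals come from the alert table in the spec.

```powershell
# Hypothetical helper: maps each spec trigger to its dedupe key and minimum interval.
function Get-RateLimitRule {
    param(
        [Parameter(Mandatory)]
        [ValidateSet('stuck-file', 'token-refresh', 'dfs-access')]
        [string]$Trigger,
        [string]$Scope   # file name for the per-file trigger; unused for system-wide ones
    )
    switch ($Trigger) {
        'stuck-file'    { @{ Key = "stuck-file:$Scope"; MinInterval = [timespan]::FromHours(24) } }
        'token-refresh' { @{ Key = 'token-refresh';     MinInterval = [timespan]::FromMinutes(30) } }
        'dfs-access'    { @{ Key = 'dfs-access';        MinInterval = [timespan]::FromMinutes(30) } }
    }
}

# Usage: the returned hashtable can be splatted straight into Test-RateLimit, e.g.
#   $rule = Get-RateLimitRule -Trigger 'token-refresh'
#   if (Test-RateLimit @rule) { ...send the alert... }
```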

Step 5 — Alert body style

Match the house style. Three concrete bodies for egnyte:

File stuck:

<b>STP Egnyte Sync: file stuck</b><br>
<br>
<b>File:</b> Sample-12345.pdf<br>
<b>Attempts:</b> 5 failures since 2026-05-12 09:13<br>
<b>Age in APPROVED:</b> 26h 14m<br>
<b>Last error:</b> 502 Bad Gateway from Egnyte<br>
<b>Log:</b> <code>\\PS-PROXY\logs\stp-egnyte-sync\2026-05-13.log</code>

Token refresh failure:

<b>STP Egnyte Sync: OAuth token refresh failed</b><br>
<br>
<b>When:</b> 2026-05-13 14:22 EDT<br>
<b>Endpoint:</b> POST https://progressivesurface.egnyte.com/puboauth/token<br>
<b>HTTP:</b> 401 — invalid_grant<br>
<b>Impact:</b> No new files will sync until refresh succeeds. Auto-retry every 5 min.<br>
<b>Action:</b> Rotate the refresh token in <code>ps-certificates-kv/stp-egnyte--refresh-token</code> if this persists past 30 min.

DFS access failure:

<b>STP Egnyte Sync: DFS access failure</b><br>
<br>
<b>When:</b> 2026-05-13 14:25 EDT<br>
<b>Path:</b> <code>\\ad.ptihome.com\DFS\STPEgnyteApproved\</code><br>
<b>Error:</b> Logon failure: unknown user name or bad password<br>
<b>Impact:</b> Sync cannot list source folder. No data movement until resolved.<br>
<b>Action:</b> Check PS-PROXY domain trust / scheduled task credentials.
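
The bodies above share one shape: a bold label, a value, and a line break. A builder along these lines could generate that shape and HTML-encode the values, so that an error message containing < or & cannot break the markup. New-AlertBodyHtml is a hypothetical name, not part of PSI.Notify.

```powershell
# Hypothetical builder for the "<b>Label:</b> value<br>" house style.
function New-AlertBodyHtml {
    param(
        # Use [ordered]@{...} so fields render in the order they were written
        [Parameter(Mandatory)][System.Collections.Specialized.OrderedDictionary]$Fields
    )
    $lines = foreach ($label in $Fields.Keys) {
        # Encode each value; labels are trusted literals supplied by the caller
        $value = [System.Net.WebUtility]::HtmlEncode([string]$Fields[$label])
        "<b>${label}:</b> $value"
    }
    $lines -join "<br>`n"
}

# Usage:
#   New-AlertBodyHtml -Fields ([ordered]@{ File = $file.Name; 'Last error' = $file.LastError })
```

Because every value is encoded, fields that should render inside <code> tags (log paths, KV references) would need to be wrapped separately after encoding.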

Step 6 — Smoke test before going live

Before scheduling the task:

  1. Run the wrapper manually once with a forced trigger (e.g. set the rate-limit state to allow, simulate a stuck file).
  2. Confirm the message lands in IT Critical Notifications.
  3. Confirm the rate-limit state file is updated.
  4. Repeat the forced trigger immediately — confirm the second send is suppressed.
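
Items 3 and 4 can be checked in one go with a tiny helper that calls the same gate twice. This is a sketch only: Test-SuppressionOnRepeat is a hypothetical name, and the usage line assumes Test-RateLimit from Step 4 is loaded and the chosen key has no recent entry in the state file.

```powershell
# Hypothetical smoke-test helper: runs the same gate twice and reports whether
# the first call was allowed and the immediate repeat was suppressed.
function Test-SuppressionOnRepeat {
    param([Parameter(Mandatory)][scriptblock]$Gate)
    $first  = [bool](& $Gate)
    $second = [bool](& $Gate)
    [pscustomobject]@{
        FirstAllowed     = $first
        SecondSuppressed = -not $second
        Passed           = $first -and (-not $second)
    }
}

# Usage:
#   Test-SuppressionOnRepeat -Gate {
#       Test-RateLimit -Key 'smoke-test' -MinInterval ([timespan]::FromMinutes(30))
#   }
```

Running it leaves a smoke-test entry in the rate-limit state file, which is harmless but worth knowing about when reading the file later.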

Step 7 — Register as a consumer

Add a row to:

So future consumers can see what’s already wired and how.


Template: minimum consumer wiring

For consumers other than egnyte, the same shape applies:

# 1. Import module (path varies by deployment shape)
Import-Module $modulePath/src/PSI.Notify.psd1 -Force
 
# 2. Pull secrets (source varies by runtime)
$cfg = @{
    TenantId     = ...
    ClientId     = ...
    ClientSecret = ...
    ChatId       = ...   # from psi-notify--chat-<purpose>
}
 
# 3. Get a token (cache it if you'll send more than ~10 messages/hour)
$bot = Get-NotifyBotToken -TenantId $cfg.TenantId -ClientId $cfg.ClientId -ClientSecret $cfg.ClientSecret
 
# 4. Send the alert (HTML body matching house style)
Send-NotifyMessage -BotToken $bot -ChatId $cfg.ChatId -BodyHtml $html

That’s the whole contract. The module handles the three Graph pitfalls for you; you handle the message shape, rate limiting, and trigger logic.


Common mistakes

  • Sending raw error stacks. Recipients skim. Lead with the subject, give the user something to act on, then put the gory details below.
  • No rate limiting. A bad day will flood the chat. Pick a sensible dedupe key (trigger + scope) and a sensible interval (24h for per-resource issues, 30 min for system-wide).
  • Logging the bot token. Tokens expire after about an hour, but they are live credentials until then, so don’t log them. If you must log something, log a truncated prefix: $bot.Substring(0,12) + '...'.
  • Hardcoding the chat ID in the consumer. Always pull from KV. Chat IDs may change (rebootstrap, migration); the KV secret name is the stable handle.
  • Reusing a per-event chat ID. A per-event chat (deploy-style) gets closed/forgotten; messages sent there later might be invisible. Persistent-chat consumers should only use psi-notify--chat-<purpose> IDs.
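
For the token-logging point, note that Substring(0,12) throws if the string is shorter than 12 characters. A slightly more defensive redaction helper might look like this; Get-RedactedToken is a hypothetical name.

```powershell
# Hypothetical helper: returns a short, log-safe prefix of a token.
function Get-RedactedToken {
    param([string]$Token, [int]$VisibleChars = 12)
    if ([string]::IsNullOrEmpty($Token)) { return '<empty>' }
    # Never reveal a whole token just because it happens to be short
    if ($Token.Length -le $VisibleChars) { return '...' }
    $Token.Substring(0, $VisibleChars) + '...'
}

# Usage: Write-Verbose "Using bot token $(Get-RedactedToken $bot)"
```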


Last updated: 2026-05-13 — written alongside the egnyte-stp-sync spec.