PRGJSMES (Development MES)

Web-based Manufacturing Execution System replacing 161+ legacy Microsoft Access databases for PRG thermal spray coating operations. Built with .NET 8 API + React 19 + SQL Server.

Repository: ProgressiveSurface/PRGJSMES (private)

For AI agents / new contributors: start at CLAUDE_ONBOARDING.md in the repo root — single-file cold-start guide covering current pilot state, non-obvious rules (no direct push to main, inline ExecuteSqlRaw schema pattern, three-place permission seeding, expand-contract for breaking changes), and task-routing pointers into the wiki + knowledge base.


Overview

Property | Value
Production URL | psmes.progressivesurface.com
Tech Stack | .NET 8 API + React 19 + SQL Server
API Port (Dev) | http://localhost:5262
Frontend Port (Dev) | http://localhost:3000
Database | Azure SQL: procserv-proddata.database.windows.net / PRGJSMES
Repository | ProgressiveSurface/PRGJSMES
Production Deployment | MES: Azure App Service prgjsmes-prod

Architecture

Backend

  • ASP.NET Core 8.0 Web API
  • Entity Framework Core with SQL Server
  • 48 API Controllers covering all MES modules
  • 111 EF Core Entity Models for production data
  • Swagger UI at http://localhost:5262/swagger

Frontend

  • React 19 with TypeScript
  • React Router DOM 7 for navigation
  • Axios for API calls (request signals wired via AbortController across operator pages)
  • Recharts for analytics dashboards
  • 74 page components (58 top-level + 16 Job Shop) across 45+ modules
  • 27 shared components (PageHeader, WorkInstructionModal, ScaleStatus, etc.)
  • Touch-screen optimized for production terminals; React.memo on hot header, useMemo on derived lists, slice-based pagination on large tables
  • PWA support — service worker, installable on mobile devices (Zebra TC52)

Database

  • Azure SQL — procserv-proddata.database.windows.net
  • Database Name: PRGJSMES
  • 103 SQL migration scripts for schema and seed data
  • Connection: Windows Authentication (dev), Entra ID (prod)

Authentication & Authorization

  • Entra ID (Azure AD) — Easy Auth in production, dev bypass locally
  • RBAC System — 10 roles, 41 permissions across 10 categories
  • Entra Group Sync — map Entra ID security groups to application roles
  • Permission-based dashboard — each user sees only the modules their role grants

Role-Based Access Control

The dashboard dynamically filters modules based on the logged-in user’s permissions. Roles are managed through the Admin UI and can be mapped to Entra ID groups for automatic assignment.

Roles

Role | Type | Description
Admin | System | Full system access including user and role administration
Manager | Custom | Full operational access: everything except user/role admin
Supervisor | System | Approve lot closes, sign-offs, corrections; see all operational modules
Scheduler | Custom | Manage order entry, scheduling, run orders, line assignments
QA | Custom | Quality assurance: thermal, argo, QA dashboard, skid scan, coating analysis
Operator | System | Default catch-all: thermal + argo + basic production views
Thermal Operator | Custom | Thermal spray line operators: thermal dashboards, ASP blasting, DP bagging
Argo Operator | Custom | Argo label station: pack boxes, create labels, manage crates
Material Handler | Custom | Receiving and outbound shipping
Line Leader | Custom | Senior operators with selected supervisor-level capabilities (currently: skid scan undo + reopen). Operator + Supervisor permission union to be filled in by admin.

Dashboard Sections

The home dashboard is organized into permission-gated sections:

Section | Contents | Gate Permission
Production Modules | Order Entry, Receiving, Scheduling, etc. | Per-card (e.g., orders.create, scheduling.view)
Quality | QA Dashboard, Skid Scan + Label/Defect Type config | argo.correct_box / admin.manage_config
Material Handling | Warehouse, Receiving, Receiving Recon, Shipping | Per-card (e.g., warehouse.view, shipping.view)
Production Analytics | Production Metrics (live KPIs), rendered to the right of Material Handling via parallelRow flex | production.view
Job Shop | Job Shop entry tile; its own section alongside Production Analytics | job_shop.view
ASP Blasting / DP Bagging | Blasting and bagging group management | thermal.view
Thermal Operations | Dynamic machine buttons (loaded from DB) | thermal.view
Argo Label Makers | Dynamic line buttons (loaded from DB) | argo.view
Materials & Orders | Manage Orders, Raw Materials, Parts, Part Material Usage | Per-card (e.g., orders.edit, raw_materials.view)
System Configuration | Bucket Types, Routings | admin.manage_config / admin.manage_routings
Administration | User & Role Admin only | admin.manage_users / admin.manage_roles

Empty sections are hidden entirely. Users with no roles see a “No Modules Assigned” message.
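
The gating rule above can be sketched as a small pure function. This is a minimal illustration, not the dashboard's actual code: the `Card` and `Section` shapes and the `visibleSections` name are assumptions made for the example; only the permission strings come from the table.

```typescript
// Hypothetical shapes; the real dashboard types are not shown on this page.
interface Card { title: string; permission: string }
interface Section { name: string; cards: Card[] }

// Keep only the cards the user holds a permission for, then drop empty sections.
function visibleSections(all: Section[], granted: ReadonlySet<string>): Section[] {
  return all
    .map((s) => ({ ...s, cards: s.cards.filter((c) => granted.has(c.permission)) }))
    .filter((s) => s.cards.length > 0);
}

const demoSections: Section[] = [
  {
    name: "Production Modules",
    cards: [
      { title: "Order Entry", permission: "orders.create" },
      { title: "Line Scheduling", permission: "scheduling.view" },
    ],
  },
  { name: "Job Shop", cards: [{ title: "Job Shop", permission: "job_shop.view" }] },
];

const shownSections = visibleSections(demoSections, new Set<string>(["scheduling.view"]));
// An empty result is where a "No Modules Assigned" message would render.
const noModulesAssigned = visibleSections(demoSections, new Set<string>()).length === 0;
```

Sections gated by a single permission (for example thermal.view) fit the same model as a section whose cards all share that permission.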


Audit & Remote Support

The audit subsystem captures who did what in PRGJSMES — every API call that changes state, every database write with field-level before→after diffs, every page navigation, and (for tagged buttons) every click. Supporters can investigate incidents in three ways: by user, by entity (order / cart / box / etc.), or by glancing at who’s currently active.

Use cases this exists for:

  • “Operator says order 12345 has wrong status — who changed it and when?”
  • “Operator reports the system is broken — where are they in the app and what did they do last?”
  • “Quality finds a defect on a lot — who closed that lot and what did they enter?”
  • “Operator on the floor needs help — let them request it from inside the app, with full context attached.”

Supporter surfaces

URL | Purpose | Gated by
/admin/audit | Investigate per-user timeline, per-entity history, or watch a single user live (5s polling) | admin.view_audit_log
/admin/presence | See who’s actively using the system right now and on what page (10s polling) | admin.view_audit_log
History tab on /orders (and others as they’re adopted) | Inline audit timeline for the entity you’re already looking at | admin.view_audit_log

Default holders of admin.view_audit_log: Admin, Manager, Supervisor.

Operator surface

Every page has a Get Help button in PageHeader (top-right). Operators describe their issue in plain English; the system bundles their identity, current page, and last 10 audit events, persists a HelpRequests row (Status=Open), and posts a Teams card into the persistent PRGJSMES Help chat via the shared PSI Notify Bot. The card carries a “Resolve this request →” deep-link to /help/resolve/{id}; a triager (Admin / Manager) opens it, writes a resolution note, and submits — the row is marked Resolved and a follow-up card posts in the same chat. No gate on the Get Help submit; the resolve page is gated by support.triage. Cards are HTML over the Bot Framework REST API; bot identity + chat ID resolve from psi-notify--* Key Vault secrets via App Service references.
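
As a rough sketch of the bundling step, the payload could be assembled like this. The field names and the `buildHelpRequest` helper are hypothetical; the source states only that identity, current page, and the last 10 audit events are attached. The sketch assumes events arrive oldest first.

```typescript
// Hypothetical payload shape for illustration only.
interface AuditEventSummary { at: string; action: string }
interface HelpRequestPayload {
  userName: string;
  page: string;
  description: string;
  recentEvents: AuditEventSummary[];
}

// Bundle the operator's text with identity, current page, and the last 10 events.
function buildHelpRequest(
  userName: string,
  page: string,
  description: string,
  events: AuditEventSummary[], // assumed to be ordered oldest first
): HelpRequestPayload {
  return {
    userName,
    page,
    description: description.trim(),
    recentEvents: events.slice(-10),
  };
}

const demoEvents: AuditEventSummary[] = [];
for (let i = 0; i < 12; i++) {
  demoEvents.push({ at: `2026-05-13T08:00:${String(i).padStart(2, "0")}Z`, action: `event-${i}` });
}
const demoRequest = buildHelpRequest("jdoe", "/thermal", "  Scale shows no weight  ", demoEvents);
```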

How it works (under the hood)

                              ┌───────────────────────┐
                              │  AuditEvents          │
   Server-side:               │  SQL table            │
   - AuditMiddleware → every  │  (user, entity, route │
     POST/PUT/PATCH/DELETE    │   diff, timestamp...) │
   - EF SaveChangesInterceptor│                       │
     → every entity write     └───────────▲───────────┘
     with old→new diff                    │
                                          │ bulk insert (batched)
   Frontend:                              │
   - AuditProvider buffer ───────► API ───┘
   - useAudit() hook (manual)             AuditFlusherHostedService
   - data-audit attr (auto-capture clicks)
   - AuditNavTracker (page.view events)

Critical property: the audit pipeline is best-effort and never blocks the request path. The backend queue is bounded with drop-oldest backpressure; producer paths swallow their own exceptions. If audit degrades, the app keeps working — operators just lose audit fidelity temporarily.
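
The drop-oldest behaviour can be illustrated with a small in-memory queue. This is a TypeScript sketch of the technique only; the backend itself is C#, and the class and method names here are invented for the example.

```typescript
// Bounded queue with drop-oldest backpressure: enqueue never throws and never waits.
class DropOldestQueue<T> {
  private readonly capacity: number;
  private items: T[] = [];
  private dropped = 0;

  constructor(capacity: number) {
    this.capacity = Math.max(1, capacity); // a capacity below 1 makes no sense here
  }

  // When the queue is full, the oldest item is discarded to make room.
  enqueue(item: T): void {
    if (this.items.length >= this.capacity) {
      this.items.shift();
      this.dropped++;
    }
    this.items.push(item);
  }

  // Remove and return up to `max` items, oldest first, for one batched insert.
  drain(max: number): T[] {
    return this.items.splice(0, max);
  }

  get droppedCount(): number {
    return this.dropped;
  }
}

const auditQueue = new DropOldestQueue<number>(3);
for (let n = 1; n <= 5; n++) auditQueue.enqueue(n);
const auditBatch = auditQueue.drain(10); // [3, 4, 5]; items 1 and 2 were dropped
```

The trade-off is the one the text describes: under load the newest events survive and the request path is never slowed, at the cost of losing the oldest unflushed events.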

Permission

admin.view_audit_log — seeded via the inline-startup pattern (see “Database Schema Changes” section below). Default role assignments: Admin, Manager, Supervisor. Operators and QA roles excluded by default.

Extending audit coverage (for future feature work)

When you add a new state-changing endpoint that mutates an investigatable entity:

[AuditEntity("Cart", "{cartId}")]
[HttpPut("cart/{cartId}/do-thing")]
[RequirePermission("thermal.do_thing")]
public async Task<IActionResult> DoThing(int cartId, ...) { ... }

The [AuditEntity] attribute opts the endpoint into per-entity timelines. Without it, the call still gets captured by middleware but won’t show up in “everything that happened to order 12345” queries. Convention: attribute order is [AuditEntity][Http…][RequirePermission].

When you add an important operator button:

<button onClick={handleFinishLot} data-audit="thermal.lot.finish_click" ...>
  Finish Lot
</button>

The global click listener in AuditProvider captures any element with data-audit. Naming convention: <module>.<thing>.<verb>_<noun>. Optional data-audit-<key>="value" attrs flow into the event’s context JSON.
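
A minimal sketch of how the attributes might be turned into an event, assuming the listener has already collected the element's attributes into a plain name/value map. The `toClickEvent` function and the `ClickAuditEvent` shape are hypothetical; only the data-audit and data-audit-<key> conventions come from the text above.

```typescript
// Hypothetical event shape; the real AuditProvider types are not shown here.
interface ClickAuditEvent { action: string; context: Record<string, string> }

// Returns null for elements that are not tagged with data-audit.
function toClickEvent(attrs: Record<string, string>): ClickAuditEvent | null {
  const action = attrs["data-audit"];
  if (!action) return null;

  const prefix = "data-audit-";
  const context: Record<string, string> = {};
  for (const [name, value] of Object.entries(attrs)) {
    // data-audit-lot="L-1" becomes context.lot = "L-1"
    if (name.startsWith(prefix)) context[name.slice(prefix.length)] = value;
  }
  return { action, context };
}

const taggedClick = toClickEvent({
  "data-audit": "thermal.lot.finish_click",
  "data-audit-lot": "L-1",
  class: "btn",
});
const untaggedClick = toClickEvent({ class: "btn" });
```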

When you add a new entity detail view:

import { EntityHistoryPanel } from '../components/EntityHistoryPanel';
<EntityHistoryPanel entityType="Cart" entityId={cart.cartId} />

The component is permission-gated by admin.view_audit_log — renders nothing for non-admins.

Code map (where things live in the repo)

Backend (PRGJSMES.API/):

  • Auditing/AuditMiddleware.cs — captures state-changing API calls
  • Auditing/AuditSaveChangesInterceptor.cs — captures EF entity writes with diffs
  • Auditing/AuditWriter.cs + AuditFlusherHostedService.cs — bounded channel + background batch insert
  • Auditing/AuditEntityAttribute.cs + AuditEntityResolver.cs — opt-in entity correlation
  • Controllers/AuditController.cs: /api/audit/user/{id}, /entity/{type}/{id}, /client, /presence
  • Controllers/SupportController.cs + Services/HelpRequestService.cs — Get Help endpoint
  • Models/AuditEvent.cs + HelpRequest.cs — entity classes
  • Program.cs (~line 175) — DI wiring; (~line 500) — inline startup migration for the tables + permission seed

Frontend (prgjsmes-web/src/):

  • auth/AuditProvider.tsx — queue, periodic flush, fetch + keepalive on pagehide (NOT sendBeacon — endpoint requires Authorization header), global data-audit click capture
  • auth/AuditNavTracker.tsx — fires page.view on Router changes
  • pages/AdminAudit.tsx — the dashboard (3 tabs)
  • pages/AdminPresence.tsx — presence panel
  • components/EntityHistoryPanel.tsx — reusable per-entity timeline (currently embedded in OrdersManagement; adoption ongoing)
  • components/HelpButton.tsx — Get Help button in PageHeader

SQL (sql/):

  • applied/20260507_create_audit_and_help_tables.sql — canonical schema reference
  • fix_missing_permissions.sql SECTION 2.6 — admin.view_audit_log seed
  • sop_seed_data.sql SOP-PRGJSMES-060/061/062 — SOPs for the three new admin features

Detailed reference

Repo-local: docs/audit-and-remote-support.md — same content with deeper conventions, scenario walkthroughs, and pointers for future Claude instances. Build log + 10-item follow-up table: BUILD_LOG.md (2026-05-13 entry). Design spec + plan: docs/superpowers/specs/2026-05-07-remote-support-and-audit-design.md and docs/superpowers/plans/2026-05-07-phase-1-audit-foundation-and-help-on-ramp.md.


Badge-Scan Supervisor Approval & Terminal Registry

Two related “enhanced management” features layered on top of the core MES. Both are progressive enhancements: PSMES runs fully without either of them in place.

Badge-scan supervisor approval

The SupervisorApproval modal offers two paths for a non-supervisor to obtain approval, side by side: badge tap (preferred) or typed initials (fallback). Either path produces the same wire result (onApprove(initials, reason)); when both are populated the badge scan wins (physical-tap identity is more authoritative).

Where it appears today: ArgoInput, ThermalInput, PreGritInput, ThermalWeight, Thermal2Weight, DPBagging, ASPBlasting, QADashboard, AdminMachines.

Per-Confirm UX:

  • Badge resolved + has approval permission → supervisor name shown, Confirm enabled
  • Badge resolved but lacks the permission → red banner “Badge accepted, but X does not have the required approval authority.” (surfaced from ?permission= on the lookup so the operator sees it without a Confirm round-trip)
  • Typed initials path is always available — terminals without a reader, or any reader that fails mid-shift, keep working
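
The precedence rules above might be modelled as follows. This is a sketch under assumptions: the `ApprovalState` type and `resolveApproval` name are invented, a badge lookup with no hasPermission value is treated as acceptable, and a badge that lacks the permission is assumed to block Confirm even when initials are also typed (the source says only that the badge wins when both are populated).

```typescript
// Mirrors the documented lookup result { initials, displayName, hasPermission? }.
interface BadgeLookup { initials: string; displayName: string; hasPermission?: boolean }

type ApprovalState =
  | { kind: "ready"; initials: string; source: "badge" | "typed" }
  | { kind: "denied"; message: string }
  | { kind: "incomplete" };

function resolveApproval(badge: BadgeLookup | null, typedInitials: string): ApprovalState {
  if (badge) {
    // A resolved badge always takes precedence over typed initials.
    if (badge.hasPermission === false) {
      return {
        kind: "denied",
        message: `Badge accepted, but ${badge.displayName} does not have the required approval authority.`,
      };
    }
    return { kind: "ready", initials: badge.initials, source: "badge" };
  }
  const typed = typedInitials.trim();
  return typed ? { kind: "ready", initials: typed, source: "typed" } : { kind: "incomplete" };
}

const bothPaths = resolveApproval({ initials: "AB", displayName: "A. Brown", hasPermission: true }, "CD");
const typedOnly = resolveApproval(null, "  JS ");
const lacksAuthority = resolveApproval({ initials: "EF", displayName: "E. Fox", hasPermission: false }, "");
```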

Backend:

  • GET /api/admin/badge/{code}?permission=… — resolves a UID to { initials, displayName, hasPermission? }
  • PUT /api/admin/users/{id}/badge-code — capture/clear a UID for a user, with a one-card-one-user collision check (409 Conflict if the UID is already on another user)

PSI Badge Reader Service

A small local HTTP service that runs on each Windows terminal, reads HID Crescendo card UIDs from an Omnikey 5022 CL reader via PC/SC, and exposes localhost:5555/{read,health,clear}. The frontend (useOmnikeyBadge) polls /read while an approval modal is open.

  • Source: badge-reader-service/ in the PRGJSMES repo. Node + Express + pcsc-mini (prebuilt binaries, no VS Build Tools required).
  • Wrapped as a Windows service by WinSW v2 (PSIBadgeReader.exe) — plain node.exe doesn’t ack the SCM start request and SCM kills it after 85 s.
  • Logs: %ProgramData%\PSI\BadgeReaderService\logs\ (service.log + rotated stdout/stderr + WinSW wrapper log).
  • Deployment: built as an MSI by installer/build.ps1, wrapped as a Win32 .intunewin, pushed via Intune Line-of-Business app. MSI version tracks badge-reader-service/package.json (not Node) so service-only changes trigger MajorUpgrade. /health reports hostname, version, readerConnected, readerName, lastCardTimestamp.

Terminal registry

TerminalStatuses table (Hostname-keyed, unique index) accumulates device-reported telemetry: LastSeenAt, AppVersion, BadgeServiceUp, BadgeServiceVersion, ReaderConnected, ReaderName, LastCardAt, LastUserName. Inline ExecuteSqlRaw startup migration in Program.cs.

  • POST /api/terminals/heartbeat — any signed-in user; upserts by hostname (normalized to upper-invariant); the current user is stamped server-side from the JWT
  • GET /api/terminals/status — gated admin.manage_config; backs the admin table
  • TerminalHeartbeat.tsx (mounted in App) — every 60 s, fetches localhost:5555/health; if the badge service answers, posts a heartbeat. Silent degrade: no badge service → no heartbeat → terminal not in registry → app unaffected. If the service drops mid-session, keeps reporting with the cached hostname and badgeServiceUp:false so the admin page can show “terminal up, badge service down”.
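
The silent-degrade behaviour can be sketched as a small state holder that decides what, if anything, to post on each tick. The `HeartbeatBuilder` class and the payload field names are assumptions for illustration; the caller is assumed to pass the parsed /health reply, or null when the local service did not answer.

```typescript
// Subset of the /health fields listed earlier on this page.
interface BadgeHealth { hostname: string; version: string; readerConnected: boolean; readerName: string | null }

// Hypothetical heartbeat body.
interface Heartbeat {
  hostname: string;
  badgeServiceUp: boolean;
  badgeServiceVersion: string | null;
  readerConnected: boolean;
  readerName: string | null;
}

class HeartbeatBuilder {
  private cachedHostname: string | null = null;

  // Returns the heartbeat to post, or null when nothing should be sent.
  next(health: BadgeHealth | null): Heartbeat | null {
    if (health) {
      this.cachedHostname = health.hostname;
      return {
        hostname: health.hostname,
        badgeServiceUp: true,
        badgeServiceVersion: health.version,
        readerConnected: health.readerConnected,
        readerName: health.readerName,
      };
    }
    // The service has never answered: stay silent, the terminal is simply not registered.
    if (this.cachedHostname === null) return null;
    // The service dropped mid-session: keep reporting "terminal up, badge service down".
    return {
      hostname: this.cachedHostname,
      badgeServiceUp: false,
      badgeServiceVersion: null,
      readerConnected: false,
      readerName: null,
    };
  }
}

// Server-side key normalization, approximated with the locale-independent toUpperCase().
function normalizeHostname(hostname: string): string {
  return hostname.trim().toUpperCase();
}

const hb = new HeartbeatBuilder();
const beforeService = hb.next(null);
const serviceUp = hb.next({ hostname: "line4-argo", version: "1.2.0", readerConnected: true, readerName: "OMNIKEY 5022" });
const serviceDown = hb.next(null);
```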

Admin UI — unified Devices page (/admin/devices)

/admin/printers was retired (PR #284, 2026-06) and now redirects here. The operator self-service /printer-setup and /scale-setup pages redirect to /terminal-claim — hardware is no longer picked per browser; it flows from the Station the terminal claims.

The Devices page is the single inventory for all shop-floor hardware, split into three tables:

  • Workstations — Windows shop-floor PCs. Merges the Intune ManagedDevices cache (Graph poller, ~5-min cadence + manual “Sync from Intune”) with device-reported TerminalStatuses heartbeats. Columns: lifecycle status (In Use / Assigned / Unassigned), heartbeat health (Online / Stale / Offline), badge-service up/down + version, reader connected. Heartbeat-only rows get a “discovered” badge. Actions: Assign / Reassign (bind to a Station via a Terminal row keyed on EntraDeviceId), Unbind (drop the Terminal binding, keep the device), Remove (untrack + clear heartbeat rows; 409-guarded while a Terminal binding exists; PSMES-* group members re-track on the next sweep).
  • Peripherals (printers & scales — named to match the workcenter detail panel) — manual inventory with CRUD, Test (printer: test ZPL label; scale: live SSE weight stream), Label (body-sticker QR), Swap / Assign (peripheral slot binding), and lifecycle status (InUse / Spare / Broken / Retired — InUse vs Spare is derived from the peripheral binding, not stored). Adding a printer auto-generates its name from the model as PS-{MODEL}-{NN} (e.g. PS-ZT231-05); Location (stored as Building until #301’s column drop) is the shelf locator for spares. Deleting a device that an active slot or Station still resolves returns a clean 409 (PR #304).
  • Handhelds — Android TC52/TC53e devices. No admin-time assignment: they bind to a Station at runtime when the operator scans the station QR placard (StationSession). Columns show live presence (operator + station) and last session activity.
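
A sketch of the PS-{MODEL}-{NN} naming rule follows. How the real service picks NN is not stated here; this example assumes a two-digit, zero-padded number one higher than the highest existing name for the same model. The `nextPrinterName` function is hypothetical.

```typescript
// Assumption: NN = highest existing suffix for this model + 1, zero-padded to two digits.
function nextPrinterName(model: string, existingNames: string[]): string {
  const prefix = `PS-${model.toUpperCase()}-`;
  let highest = 0;
  for (const name of existingNames) {
    if (!name.startsWith(prefix)) continue;
    const suffix = name.slice(prefix.length);
    if (/^\d+$/.test(suffix)) highest = Math.max(highest, Number(suffix));
  }
  return prefix + String(highest + 1).padStart(2, "0");
}

const nextZt231 = nextPrinterName("ZT231", ["PS-ZT231-03", "PS-ZT231-04", "PS-ZT411-09"]);
const firstOfModel = nextPrinterName("zt231", []);
```

With those inputs the first call yields PS-ZT231-05, matching the example name in the list above.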

Which Intune devices appear is driven by PSMES-* Entra group membership (PSMES-Terminals / PSMES-Handhelds); per-device overrides live on /admin/devices/import.

Workcenters & Stations (/admin/workcenters)

The hardware-resolution model (PR-A→PR-E, 2026-05):

Workcenter (Line 4, Waterjet 1, …)
 ├── Stations (role-keyed: Argo, PreGrit, Thermal, …)
 │     └── own the printer / backup printer / scale / backup scale / scale mode
 │     └── Terminals CLAIM a station → browsers resolve hardware from it
 └── Peripherals (named slots, e.g. "Argo Box Label Printer")
       └── the supervisor swap workflow — see SOP-PRGJSMES-068

A Station’s Role is its function key: it drives the derived display name (<Workcenter> <Role>), flow-diagram ordering, mobile shift-picker grouping, and Copy-structure-from dedup. The workcenter detail page has the material-flow diagram, Station CRUD (+ “Copy structure from…” another line), the Peripherals swap panel, live handheld operator presence, and the Terminals claimed table (with a Move action to re-point a terminal at a different Station).

Deep documentation lives in the PRGJSMES repo wiki: wiki/admin/device-registry.md, wiki/admin/workcenters.md, wiki/admin/peripherals.md, wiki/hardware/spare-device-prep.md; SOPs 067–069 cover the operator procedures.

Badge enrollment (admin-driven)

Today, /admin/users → expand a user → Badge Code section → Tap Card to Fill (or Clear to remove). Card UIDs are read on the admin’s own workstation’s reader and saved to AppUsers.BadgeCode. The list endpoints return BadgeCode so admins can see and clear an assigned card from the UI. Self-service “My Badge” enrollment is proposed but not yet built.

Quick training overview: docs/training/admin-badge-enrollment.md. Cross-project integration guide (for adding badge-code capture to the PSI Badge Provisioner): docs/badge-code-read-and-store-guide.md.

Code map

Area | File
SupervisorApproval modal | prgjsmes-web/src/components/SupervisorApproval.tsx
Badge poll hook | prgjsmes-web/src/hooks/useOmnikeyBadge.ts
Badge lookup + verify API | PRGJSMES.API/Controllers/AdminController.cs (LookupByBadge, UpdateBadgeCode)
Badge reader service | badge-reader-service/service.js
Badge MSI installer | badge-reader-service/installer/{Package.wxs, build.ps1, PSIBadgeReader.xml}
Terminal registry table + migration | sql/20260518_terminal_status.sql + Program.cs inline startup block
Terminal heartbeat (client) | prgjsmes-web/src/auth/TerminalHeartbeat.tsx
Terminal API | PRGJSMES.API/Controllers/TerminalsController.cs
Unified device inventory | prgjsmes-web/src/pages/AdminDevices.tsx + PRGJSMES.API/Controllers/DevicesController.cs
Workcenter detail | prgjsmes-web/src/pages/AdminWorkcenterDetail.tsx + WorkcentersController.cs

Build-log references

BUILD_LOG.md entries: 2026-05-18 Intune MSI installer, 2026-05-18 Terminal registry, 2026-05-18 Badge-scan additive supervisor approval, 2026-05-19 Badge MSI versioning fix, 2026-05-19 BadgeCode returned by users endpoints, 2026-05-19 Configure auto-discovered terminals.

Known limitations / not yet built

  • Self-service “My Badge” enrollment — proposed, not built. Enrollment today requires an admin and the card physically on the admin’s reader.
  • Badge-reader Test button — proposed, not built. The closest current test is the capture flow on /admin/users.

(Previously listed here and since shipped: manual terminal claim + Android device identity → /terminal-claim, 2026-05-20; printer/scale assignment on the Terminal row → superseded by the Station model, PR-A→PR-E.)


Key Modules (42 modules)

Module | Route | Purpose
Dashboard | / | Role-filtered production overview with section-based layout
Order Entry | /order-entry | IC paste grid, ASP/TP simplified entry with auto-pallet-splitting and DP auto-create
Receiving | /receiving | Verify received parts against orders
Receiving Recon | /receiving-reconciliation | Expandable date sidebar, re-edit corrected orders, IncomingId auto-parsing
Line Scheduling | /line-scheduling | Sortable columns, DP auto-scheduling with parent TP
Coinstack Setup | /coinstack-setup | Configure coinstack parameters, sortable pending orders
Production Paperwork | /production-paperwork | Generate production travelers and lot traces
Run Order | /run-order | Line-level production execution
Line Switch | /line-switch | Transfer orders between lines
Raw Materials | /raw-materials | Powder lot and inventory management
Part Material Usage | /part-material-usage | Material consumption tracking
Thermal Dashboard | /thermal | Cross-line thermal operations overview (dynamic from machines DB)
Line Dashboard | /thermal/line/:lineId or /thermal/machine/:machineId | Single-line cart tracking with barcode scanner navigation
Pre-Grit Input | /thermal/line/:lineId/pre-grit or /thermal/machine/:machineId/pre-grit | Pre-grit weighing, exception tracking, reweight, lot close validation
Thermal Input | /thermal/line/:lineId/thermal or /thermal/machine/:machineId/thermal | Thermal spray data entry
Weight Pages | /thermal/pre-grit/:cartId | Per-cart weight capture
Coating Analysis | /thermal/coating-analysis | Cross-line coating thickness analytics
Argo Label | /argo/line/:lineId or /argo/machine/:machineId | Post-thermal packing and labeling with Mettler Toledo scale integration
QA Dashboard | /qa/dashboard | Label audit with scale integration, holds, reports (tab-based)
Skid Scan | /skidscan | Barcode scan, bucket classification, pallet building, crate labels. Operator corrections: Undo Scan (RGA/ASP single box, Bulk non-ASP crate batch) and Reopen Session (revert 92→90 with TicketStale flag). Physical verification only; data corrections (Pieces, BucketTypeId) go to QA Lot Audit.
Shipping | /shipping | Shipment creation, BOL/ASN tracking, load reports, invoicing
ASP Blasting | /asp-blasting | ASP blasting group management
DP Bagging | /dp-bagging | DP bagging group management
Warehouse | /warehouse | Rack storage, putaway, pick lists, truck loading
Orders Management | /orders | Manage existing orders (edit, cancel)
Rework | /rework | Rework cart creation for defective parts
Admin Dashboard | /admin | User management, role management, group management, diagnostics
Admin: Devices | /admin/devices | Unified inventory: printers, scales, Intune workstations, handhelds (replaced /admin/printers)
Admin: Devices Import | /admin/devices/import | Browse the Intune cache, per-device tracking overrides, PSMES-* group sweep
Admin: Workcenters | /admin/workcenters | Workcenter → Station model: station CRUD, peripheral swap panel, terminals claimed
Claim Terminal | /terminal-claim | Browser claims a Terminal/Station (replaced /printer-setup + /scale-setup)
Admin: Label Templates | /admin/label-templates | Visual card gallery with ZPL preview and test print for all 13 templates
Admin: Label Types | /admin/label-types | Configure label types with bucket routing and per-part assignment
Admin: Defect Types | /admin/defect-types | Configure QA defect categories with per-part assignment
Admin: Bucket Types | /admin/bucket-types | Configure shipping bucket structures and pallet rules
Production Config | /production-config | Hub page for label types, defect types, bucket types, freight carriers, and part material usage (Parts, Routings, Raw Materials are promoted to home-dashboard tiles)
Admin: Routings | /admin/routings | Production line routing configuration
Admin: Parts | /admin/parts | Part catalog management with weight limits and scale config
Admin: Machines | /admin/machines | Capital equipment tracking and line assignment
Admin: Bay Locations | /admin/bay-locations | Manage warehouse rack locations by building
Admin: Users | /admin/users | User account management and role assignment
Admin: Roles | /admin/roles | Role and permission management
Admin: Groups | /admin/groups | Entra ID group-to-role mapping
Supervisors | /supervisors | Authorized approver management
Admin: SOPs | /admin/sops | ISO 9001:2015 standard operating procedures (59 SOPs: 001-045 core, 046-059 Job Shop) with work instructions and production flowchart
Admin: Customers | /admin/customers | Customer master data with BC-compatible fields and addresses

Production Flow

Order Entry → Receiving → Line Scheduling → Coinstack Setup
→ Production Paperwork → Run Order → Pre-Grit → Thermal Spray
→ Argo Label → QA Audit → Skid Scan → Ship

Order Status Flow

0 (New) → 5 (ReceivingDiscrepancy) → 9 (AwaitingPutaway) → 10 (ReadyToSchedule)
→ 15 (Scheduled) → 20 (PaperworkPrinted) → 25 (PreGritComplete)
→ 35 (ThermalComplete) → 40 (ArgoInProgress) → 45 (ArgoComplete)
→ 90 (ReadyToSkidScan) → 92 (SkidScanComplete) → 95 (Shipped)

AEP two-pass thermal:

... → 25 (PreGritComplete) → 27 (AEPThermal1Complete)
→ 30 (AEPPreGrit2Complete) → 35 (ThermalComplete) → 40 → ...
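
The two flows above can be captured as lookup data. This sketch uses only the codes and names listed; the `nextStatus` helper is hypothetical, and it treats each flow as strictly linear, which is an assumption (for example, whether every order passes through 5, and the 92→90 Reopen Session revert, are not modelled).

```typescript
// Status codes and names exactly as listed in the flows above.
const ORDER_STATUS: Record<number, string> = {
  0: "New",
  5: "ReceivingDiscrepancy",
  9: "AwaitingPutaway",
  10: "ReadyToSchedule",
  15: "Scheduled",
  20: "PaperworkPrinted",
  25: "PreGritComplete",
  27: "AEPThermal1Complete",
  30: "AEPPreGrit2Complete",
  35: "ThermalComplete",
  40: "ArgoInProgress",
  45: "ArgoComplete",
  90: "ReadyToSkidScan",
  92: "SkidScanComplete",
  95: "Shipped",
};

const STANDARD_PATH: number[] = [0, 5, 9, 10, 15, 20, 25, 35, 40, 45, 90, 92, 95];
// AEP parts take a second pre-grit/thermal pass between 25 and 35.
const AEP_PATH: number[] = [0, 5, 9, 10, 15, 20, 25, 27, 30, 35, 40, 45, 90, 92, 95];

// Next status on the listed path, or null at the end or for an unknown code.
function nextStatus(current: number, isAep: boolean): number | null {
  const path = isAep ? AEP_PATH : STANDARD_PATH;
  const i = path.indexOf(current);
  if (i < 0 || i === path.length - 1) return null;
  return path[i + 1] ?? null;
}
```

For example, nextStatus(25, false) gives 35, while nextStatus(25, true) gives 27.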

DP Order Handling

DP (Donut/Plenum) orders are automatically created when TP orders are entered via Order Entry. DPs are linked to their parent TP via LinkedOrderId. In Line Scheduling, DPs are hidden from the table and auto-scheduled to the same line when their parent TP is scheduled. DPs do not create carts — they use bagging groups based on OutgoingBoxQty.


Production Travelers

Six traveler types generated from Production Paperwork:

Traveler | Part Types | Key Features
IC | 114747 | Pre-Grit, Thermal Spray, Post-Grit sections
FEP | 131298 | Pre-Grit, Thermal, Post-Grit, Recess Removal
AEP | 131299 | Two-pass thermal (Thermal 1/2, D Flat A/B)
TP | 76711, 141315 | Pre-Grit, Thermal, Post-Grit, Tab Removal; shows linked DP Skid #
DP Bagging | 142455 | Multi-per-page (5 groups/page) with barcodes; shows linked TP Skid #
ASP | 155606, 155607, etc. | Plug, Thermal, QC, D Flat A/B, Tab Removal, Argo

Traveler Features

  • TP↔DP cross-reference: TP travelers display linked DP skid number with DP’s skid color; DP bagging travelers display parent TP skid number with TP’s skid color
  • Cart QTY barcodes: All traveler types include a scannable Cart QTY barcode for quick data entry at Pre-Grit Input
  • Print order: ASP blasting sheets → DP bagging sheets → thermal travelers
  • Multi-per-page: ASP blasting (7/page) and DP bagging (5/page) pack multiple groups with scannable barcodes
  • Thermal columns: Machine #, Circle Wheel A/B, IN/WT, Avg Pick Up Weight, INITIAL
  • Pre-Grit columns: Machine #, Pre Weight (two boxes), Pre Weight Op Initials

Hardware Environment

  • Touch-screen terminals at 6 production lines
  • Zebra TC52 handhelds — Android mobile devices running PWA for warehouse/receiving
  • USB barcode scanners (wedge mode, 200ms buffer)
  • A&D scales (serial-to-USB) — Pre-Grit Input and Thermal Input weight capture
  • Mettler Toledo counting scales — dual-mode: USB via Web Serial API (default) or network TCP via SSE streaming. 8 stations: 6 Argo, 1 QA, 1 DP Bagging. MT-SICS protocol with auto-send weight + piece count. Article programmed per part with reference weight; scale calculates piece count from weight. Per-terminal mode toggle: web-serial, network, or none.
  • Scale Broadcast Relay (Raspberry Pi) — Pi appliance in front of any non-MT-SICS USB scale (A&D HID-keyboard, USB-CDC, etc.). Pi parses the scale’s native format and broadcasts PSMES-canonical JSON on TCP/5000. Backend reads it through the same (ScaleIp, ScalePort) model as Mettler scales — select Type = “Scale Broadcast Relay (Pi)” in Admin > Devices. First deployment: Line 4 with an A&D GX-8202M. Broadcast-only; tare/zero/article commands not yet supported. Pi source: C:\git\Scale Broadcast\.
  • Zebra label printers (3.75” labels) — dual-connected: USB to terminal + network IP for silent ZPL printing
  • Network connectivity to Azure SQL and API
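
One plausible reading of the "200ms buffer" for wedge-mode scanners is sketched below: keystrokes that arrive close together are collected as one scan, and a gap longer than the window discards what was buffered. This interpretation, the Enter terminator, and the `WedgeScanBuffer` class are all assumptions; the actual scanner handling is not described on this page.

```typescript
// Collects fast keystrokes into a barcode; slow (human) typing never completes a scan.
class WedgeScanBuffer {
  private chars = "";
  private lastAt = 0;
  private readonly maxGapMs: number;

  constructor(maxGapMs = 200) {
    this.maxGapMs = maxGapMs;
  }

  // Feed one key with its timestamp; returns the barcode on Enter, otherwise null.
  push(key: string, atMs: number): string | null {
    if (this.chars && atMs - this.lastAt > this.maxGapMs) this.chars = ""; // stale input
    this.lastAt = atMs;
    if (key === "Enter") {
      const code = this.chars;
      this.chars = "";
      return code || null;
    }
    if (key.length === 1) this.chars += key; // ignore Shift, Tab and other named keys
    return null;
  }
}

const scanBuffer = new WedgeScanBuffer();
scanBuffer.push("1", 0);
scanBuffer.push("2", 10);
scanBuffer.push("3", 20);
const scannedCode = scanBuffer.push("Enter", 30); // scanner burst completes

scanBuffer.push("a", 1000);
const typedSlowly = scanBuffer.push("Enter", 1500); // gap exceeded, nothing emitted
```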

Silent Label Printing (ZPL)

Labels are printed silently (no print dialog) via server-side ZPL generation sent directly to Zebra printers over raw TCP port 9100.

Architecture:

  1. Admins register physical printers in the Printers table (IP, port, DPI, model, building) via Admin > Devices (/admin/devices); names are auto-generated as PS-{MODEL}-{NN}
  2. The browser resolves its printer from the Station its Terminal claims (/terminal-claim → GET /api/terminals/by-hostname / by-entra-device); PrinterPromptModal appears at print time only for unclaimed browsers, and a localStorage override (prgjsmes_printer) exists for one-off / backup switchover
  3. On “Save & Print”, the frontend sends printerId to POST /api/print/argo-label (or other print endpoints)
  4. The API looks up the printer by ID, generates native ZPL II via ZplTemplateService, and sends to printer IP via PrintService (raw TCP:9100, 3-second timeout)
  5. If server-side printing fails, the app falls back to browser-based iframe printing automatically
  6. If the saved printer is deleted (404), the app auto-clears localStorage and re-prompts
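
Steps 3 to 6 amount to a small decision flow, sketched here with injected dependencies so that no real endpoint or storage API is assumed. The `PrintDeps` interface, the result names, and both functions are hypothetical.

```typescript
// Hypothetical outcome of the server-side print call.
type ServerPrintResult = "sent" | "failed" | "timeout" | "printer-not-found";
type PrintOutcome = "server" | "browser-fallback" | "reprompt";

// Pure decision: what the frontend does after the server-side attempt.
function afterServerPrint(result: ServerPrintResult): PrintOutcome {
  if (result === "sent") return "server";
  if (result === "printer-not-found") return "reprompt"; // saved printer was deleted (404)
  return "browser-fallback"; // failed or timed out: fall back to iframe printing
}

interface PrintDeps {
  serverPrint(printerId: number): Promise<ServerPrintResult>;
  browserPrint(): Promise<void>;
  clearSavedPrinter(): void;
}

async function saveAndPrint(printerId: number | null, deps: PrintDeps): Promise<PrintOutcome> {
  if (printerId === null) return "reprompt"; // no printer resolved yet
  let result: ServerPrintResult;
  try {
    result = await deps.serverPrint(printerId);
  } catch {
    result = "failed"; // network errors are treated like a failed print
  }
  const outcome = afterServerPrint(result);
  if (outcome === "reprompt") deps.clearSavedPrinter();
  if (outcome === "browser-fallback") await deps.browserPrint();
  return outcome;
}
```

Keeping the decision in a pure function makes the fallback rules easy to unit-test without a printer or a browser.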

Data model:

  • Printers / Scales — physical devices (name, IP, port, model, DPI/capacity, building, lifecycle status)
  • Workcenters / Stations: Stations own the printer/scale/backup/mode; Terminals claim a Station
  • Terminals — named workstations with StationId (+ EntraDeviceId identity); legacy direct Printer/Scale FKs are being dropped (PR #208)
  • Peripherals — named workcenter slots for the supervisor swap workflow (PeripheralSwapHistory is the audit trail)
  • PrintJobs — audit log of every print attempt (FK to Printer, status: sent/failed/timeout)

ZPL label features:

  • 13 label templates (12 label types + test label) with FitFont dynamic font sizing
  • 3.65” x 3.65” content area within 4x4 labels (36-dot margins via ^LS/^LT)
  • Centered Code 128 barcodes with accurate B↔C switch width estimation
  • All variable-length text fields (BE PN, MFG PN, PRG LOT#, PO#, label type names) dynamically shrink to fit their column
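
The margin arithmetic and the shrink-to-fit idea can be sketched as follows. The 203 dpi resolution is an assumption (it is consistent with 36-dot margins leaving roughly 3.65 inches on a 4-inch label), and `fitFontHeight` with its fixed width-per-character ratio is a simplified stand-in, not the FitFont implementation.

```typescript
const DPI = 203; // assumption: 8 dots/mm printer
const LABEL_DOTS = 4 * DPI; // 812 dots across a 4 in label
const MARGIN_DOTS = 36;
const CONTENT_DOTS = LABEL_DOTS - 2 * MARGIN_DOTS; // 740 dots, about 3.65 in

// Largest font height (in dots) whose estimated text width fits the column.
// widthRatio is a rough average character width as a fraction of font height.
function fitFontHeight(
  text: string,
  columnDots: number,
  maxHeight: number,
  minHeight: number,
  widthRatio = 0.6,
): number {
  for (let h = maxHeight; h > minHeight; h -= 2) {
    if (text.length * h * widthRatio <= columnDots) return h;
  }
  return minHeight; // never shrink below the legibility floor
}

const shortField = fitFontHeight("ABC", CONTENT_DOTS, 40, 16);
const longField = fitFontHeight("X".repeat(60), CONTENT_DOTS, 40, 16);
```

A short value keeps the maximum height, while a 60-character value shrinks until its estimated width drops under 740 dots.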

Supported label types via ZPL (13 templates):

  • 10 Argo label types (Good, Good-NT, A2, Donut/Plenum, Non-Conforming, PRG Scrap, Rework, Partial Box, High X-DIM, Rework Enclosed)
  • Argo crate labels
  • 2 generic templates (shipping and non-shipping) for dynamic label types from DB

Admin pages:

  • /admin/devices — unified device inventory (CRUD, test print, swap, labels) — replaced /admin/printers
  • /admin/workcenters/:id — Station hardware assignment + peripheral swap panel
  • /admin/label-templates — Visual card gallery with ZPL preview, Labelary rendering, and test print for all 13 templates
  • /terminal-claim — browser claims its Terminal/Station (replaced /printer-setup + /scale-setup)

Network Scale Management

Network scales (Mettler Toledo MT-SICS and Scale Broadcast Relay / Pi) are managed centrally via the backend ScaleController:

  • Scales table — registered network scales (name, IP, port, model, ScaleType = "MettlerToledo" or "Relay", weight unit, location)
  • Station-resolved — the browser’s scale (+ backup + mode) comes from the Station its Terminal claims; /scale-setup is retired (redirects to /terminal-claim)
  • ScaleReadings — audit log of every scale reading (weight, pieces, command type, errors). Relay JSON parse failures are also recorded here so protocol drift surfaces in the audit log.
  • SSE streaming — ScaleService holds persistent TCP connections to scale IPs, streams readings to browsers via Server-Sent Events with <100ms latency. MT-SICS scales get the SIR auto-stream; Relay scales are a passthrough for the Pi’s pre-formatted JSON broadcast. The browser uses fetch() + ReadableStream, not EventSource — the API’s auth policy accepts Bearer/PinAuth JWT only (no cookie scheme), and EventSource cannot attach the Authorization header. openAuthenticatedSseStream() in services/api.ts is the shared helper used by ScaleTestModal and useNetworkScale.
  • Heartbeat — 30-second ping to each connected scale; auto-reconnect on failure
  • Admin → Devices → Test runs a scale-type-aware pre-flight check before opening the live SSE: MT-SICS sends I1 and parses the reply; Relay does a pure TCP-connect reach test. A&D HID scales emit only on operator PRINT press, so the relay pre-flight intentionally does NOT wait for a line — silence is the normal pre-flight state, not a failure. Connect failure surfaces in the status row as “Cannot reach relay — …”.
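
Reading SSE through fetch() + ReadableStream means the client has to split the event stream itself. The sketch below shows only that parsing step, working on already-decoded text chunks; it is not the body of openAuthenticatedSseStream(), and the `SseParser` class and the JSON payload in the example are hypothetical. It handles `data:` fields and blank-line event boundaries and ignores other SSE fields.

```typescript
// Incremental parser for text/event-stream data: feed chunks, get completed payloads.
class SseParser {
  private buffer = "";

  feed(chunk: string): string[] {
    // Normalize CRLF on the whole buffer so a "\r\n" split across chunks is handled.
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const payloads: string[] = [];
    let end = this.buffer.indexOf("\n\n");
    while (end !== -1) {
      const block = this.buffer.slice(0, end);
      this.buffer = this.buffer.slice(end + 2);
      const data = block
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, "")) // drop one optional leading space
        .join("\n");
      if (data) payloads.push(data);
      end = this.buffer.indexOf("\n\n");
    }
    return payloads;
  }
}

const sse = new SseParser();
const firstChunk = sse.feed('data: {"weight":1.5}\n\nda'); // one complete event, one partial
const secondChunk = sse.feed("ta: x\r\n\r\n"); // completes the partial event
```

In a browser, the chunks would typically come from response.body.getReader() decoded with a TextDecoder, with the Authorization header set on the fetch() call.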

Deployment

| Property | Value |
|---|---|
| App Service | prgjsmes-prod in resource group PS-WEBAPPS |
| App Service Plan | psi-asp-windows — Premium v3 (P1v3), Windows, 1 instance, shared-tenant plan for future Windows webapps (was ps-mes-apps-plan Basic B3; migrated 2026-04-20) |
| Deployment Slots | staging slot — used by CI/CD for blue-green deploys via slot swap |
| URL | psmes.progressivesurface.com |
| CI/CD | GitHub Actions with self-hosted runner (ps-cicd-runner) |
| Auth | Azure Easy Auth (Entra ID) in production; dev bypass locally |
| Network | Private endpoints — all traffic routes through VNet |
| Triggers | Push to main (API/frontend paths) or manual dispatch |

CI/CD — Deployment Slots and Blue-Green Deploy

Deploy workflow (.github/workflows/deploy-production.yml) uses a slot-swap pattern for zero-downtime deploys with instant rollback.

Flow

push to main  →  build + test  →  deploy to `staging` slot  →  warm + smoke test staging
                                                                        │
                                                                        ▼
                                                    swap staging ↔ production (atomic)
                                                                        │
                                                                        ▼
                                                     post-swap smoke test on prod URL

| Slot | URL | Role |
|---|---|---|
| production (main) | psmes.progressivesurface.com | Live traffic |
| staging | prgjsmes-prod-staging.azurewebsites.net | Deploy target, pre-production verification |

Triggers

  • Push to main — full build + deploy to staging + auto-swap to production
  • Manual dispatch (workflow_dispatch) — same, with optional skip_swap=true to leave the new build in staging for manual QA before promotion

Swap semantics

  • Swap is atomic — traffic cuts over instantly; zero user-visible downtime
  • After the swap, the previous production code is now in the staging slot
  • Instant rollback: run the swap again (same command, same direction) — staging’s “old prod” code goes back to production:
    az webapp deployment slot swap \
      -g PS-WEBAPPS -n prgjsmes-prod \
      --slot staging --target-slot production

Slot settings (swap behavior)

All app settings currently swap with the slot (Key Vault refs, Azure AD config, connection strings). This is safe because both slots target the same prod DB, same Entra tenant, same Key Vault.

If a setting must stay bound to a specific slot (e.g. to point staging at a separate DB for pre-prod testing), mark it as a slot setting in the Portal or via:

az webapp config appsettings set -g PS-WEBAPPS -n prgjsmes-prod --slot staging \
  --slot-settings SOME_SETTING=staging-value

VNet integration

Both slots inherit the App Service Plan’s regional VNet integration (PS-WebApps subnet in PS-VNMAIN). Private endpoint access to procserv-proddata SQL works from both slots.

Manual deploys

Deploy a code change to staging without swapping (for manual testing first):

  1. Trigger workflow: GitHub → Actions → Deploy PRGJSMES to Azure App Service → Run workflow
  2. Set skip_swap = true
  3. After the run finishes, open https://prgjsmes-prod-staging.azurewebsites.net and verify
  4. When satisfied, swap via CLI (rollback via the same command if needed)

Rollback runbook

| Situation | Action |
|---|---|
| Bad deploy, broken in production | Run swap again (see command above) — old code returns in <30s |
| Smoke test failed automatically after swap | Workflow exits with error; prior production code is now in staging; re-swap to restore |
| Deploy succeeded but bug discovered later | Check whether staging still has the prior code; if yes, swap to roll back; if staging has been overwritten by a newer deploy, restore the site from the latest App Service daily backup |

Database Schema Changes (canonical pattern)

PRGJSMES applies schema changes via inline idempotent ExecuteSqlRaw blocks at startup in PRGJSMES.API/Program.cs. EF Core migrations exist for tooling and the model snapshot, but they are not applied to production — production schema is owned by the startup blocks.

Adding a schema change

  1. Author the SQL as a new file sql/<yyyymmdd>_<change>.sql for review. The file must be idempotent (IF NOT EXISTS guards on every CREATE).
  2. Add a matching inline ExecuteSqlRaw block inside the existing using (var scope = app.Services.CreateScope()) { ... } block in Program.cs, mirroring the file’s content. Each block:
    • Lives inside its own try { ... } catch (Exception ex) { startupLogger.LogWarning(...); }
    • Begins with a comment naming the feature and pointing at the mirror file: // Mirror of sql/<file>.sql (repointed to sql/applied/<file>.sql in step 5)
    • Uses IF NOT EXISTS / IF EXISTS guards so the block runs harmlessly on every deploy
  3. (Optional) Generate the matching EF migration via dotnet ef migrations add so the model snapshot stays in sync. The migration file is for code tracking only — it is not applied to production.
  4. Open the PR. Merge triggers a deploy to the staging slot → smoke test → swap to production. The startup block runs on the first request after swap.
  5. After production deploy lands, move the .sql file from sql/ to sql/applied/ and update the inline-block comment to point at the new path. This is the visible signal that the change is live.
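
To make the idempotency requirement concrete, here is a minimal sketch of the guard shape for an added column. The helper guardedAddColumnSql is hypothetical and exists only to render the SQL text; in the repo the SQL is hand-written in the .sql file and mirrored in Program.cs. The dbo schema and the column definition used in the usage note are assumptions.

```typescript
// Hypothetical helper, for illustration only: renders guarded T-SQL of the kind
// a startup block and its mirror .sql file would carry for an added column.
// Assumes the table lives in the dbo schema.
function guardedAddColumnSql(table: string, column: string, definition: string): string {
  return [
    "IF NOT EXISTS (",
    "  SELECT 1 FROM sys.columns",
    `  WHERE object_id = OBJECT_ID(N'dbo.${table}') AND name = N'${column}'`,
    ")",
    "BEGIN",
    `  ALTER TABLE dbo.${table} ADD ${column} ${definition};`,
    "END",
  ].join("\n");
}
```

For example, guardedAddColumnSql("SkidScanBoxes", "IsDeleted", "BIT NOT NULL DEFAULT 0") yields a statement that adds the column on the first run and does nothing on every later deploy, which is the property each startup block needs. The column type in that call is assumed, not taken from the actual migration.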

Permission seeds (same pattern)

Permission codes added via [RequirePermission(...)] or frontend usePermissions() follow the same inline-startup convention. The seed lives in three places:

  1. Inline ExecuteSqlRaw block in Program.cs — the actual application path. Idempotent INSERT into Permissions + IF NOT EXISTS-guarded role assignments to RolePermissions. This is what runs on every deploy and keeps schema in lockstep with code.
  2. sql/fix_missing_permissions.sql — canonical reference and recovery path if production ever desyncs from a fresh restore. Accumulates every permission code in the system.
  3. sql/FULL_MIGRATION.sql — same content as fix_missing_permissions.sql, applied during fresh-install bootstraps.

Adding a permission means adding it to all three. The inline block ensures the recurring “missing permission row = Admin silently loses that capability” failure mode can’t happen — the code that requires the permission ships in the same commit as the row that satisfies it. See CLAUDE.md “Permission System (CRITICAL)” for the full checklist.
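
The three-place rule is easy to check mechanically at review time. The sketch below is a hypothetical review aid, not something that exists in the repo: it takes the raw text of the three seed locations and reports which ones do not mention a given permission code. It assumes the code appears as a quoted literal in each file.

```typescript
// Hypothetical review aid: list the seed locations that do not mention a
// permission code. Inputs are the raw file contents.
interface PermissionSeedSources {
  programCs: string;                // PRGJSMES.API/Program.cs
  fixMissingPermissionsSql: string; // sql/fix_missing_permissions.sql
  fullMigrationSql: string;         // sql/FULL_MIGRATION.sql
}

function missingSeedLocations(code: string, sources: PermissionSeedSources): string[] {
  // Assumption: the code appears as a quoted literal, e.g. 'skidscan.reopen'.
  // Escape regex metacharacters so the "." in the code matches literally.
  const escaped = code.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const quotedLiteral = new RegExp(`['"]${escaped}['"]`);
  const missing: string[] = [];
  if (!quotedLiteral.test(sources.programCs)) missing.push("Program.cs");
  if (!quotedLiteral.test(sources.fixMissingPermissionsSql)) {
    missing.push("sql/fix_missing_permissions.sql");
  }
  if (!quotedLiteral.test(sources.fullMigrationSql)) missing.push("sql/FULL_MIGRATION.sql");
  return missing;
}
```

An empty result means the code is present in all three places; anything else names the file that still needs the seed.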

Why inline-at-startup instead of CI-driven SQL

| Concern | Inline startup | CI-driven SQL apply |
|---|---|---|
| Schema and code can’t diverge | Block ships with the code that needs it | Possible if SQL step is skipped or fails silently |
| Rollback semantics | Slot swap reverts code; inline blocks are idempotent so a prior-version restart is safe | Need an inverse SQL or accept schema-ahead-of-code drift |
| Auth surface | App’s managed identity (already has DB write) | Requires a separate CI service principal with schema-modify rights |
| Single-dev memory burden | None — block ships in the PR | High — someone has to remember to run the SQL |

The inline pattern is the trade-off the team chose. The Program.cs block near the bottom of service startup is the canonical place; new contributors should grep for ExecuteSqlRaw to see existing examples (legacy QALots FK drops, Printers + Terminals tables, AuditEvents + HelpRequests, etc.).

Operations & Disaster Recovery

Database backup and recovery posture for the production MES database.

Database Backup Configuration

| Setting | Value | Rationale |
|---|---|---|
| Server | procserv-proddata in North Central US | Shared with project-explorer, erp-migration-tool |
| Database | PRGJSMES — GP_S_Gen5_2 (General Purpose Serverless, 2 vCore, auto-pause OFF, min 0.5 vCore) | Serverless scales with shop-floor load |
| Point-In-Time Restore (PITR) | 35 days | Maximum for GP tier — covers delayed-discovery data corruption scenarios |
| Long-Term Retention (LTR) — Weekly | 4 weeks | Covers month-end discovery of issues |
| Long-Term Retention (LTR) — Monthly | 12 months | Quarterly audits + annual data recall |
| Long-Term Retention (LTR) — Yearly | 5 years (week 1 snapshot) | ISO 9001 / AS9100 record retention |
| Backup Storage Redundancy | GRS (Geo-Redundant) | Async copy to paired region (South Central US) — survives region loss |
| Zone Redundancy | Not applicable — NCUS has no availability zones | See azure-resources for region constraint |

Disaster Recovery Scenarios

| Scenario | Recovery mechanism | Expected RTO | Expected RPO |
|---|---|---|---|
| Application bug corrupts data (discovered same day) | PITR restore to point before corruption | <1 hour | <5 min |
| Data issue discovered days/weeks later | PITR (up to 35 days) or LTR weekly snapshot | 1–2 hours | 5 min (PITR) / 1 week (LTR) |
| Compliance / audit request for historical data | LTR monthly (12M) or yearly (5Y) | 1–2 hours | 1 month / 1 year |
| North Central US region outage | Restore latest geo-replicated backup in paired region (South Central US) | Several hours | Up to 1 hour (geo-replication lag) |
| Full server loss | Restore from geo-replicated backups into new server | Several hours | Up to 1 hour |

RTO/RPO improvement option: Add a failover group to a secondary DB in South Central US for warm-standby DR (<1 min failover, near-zero RPO). Cost: roughly 2× the DB compute. Not currently configured.

Backup / Restore Commands

See Backup Policies for the canonical az sql db str-policy / ltr-policy / restore commands.

Production Readiness Status

Tracked against PSI webapp compliance baseline. As of 2026-04-20:

| Area | Status | Notes |
|---|---|---|
| DB backup (PITR 35d) | ✅ Configured | 2026-04-20 |
| DB backup (LTR 4W/12M/5Y) | ✅ Configured | 2026-04-20 |
| DB backup storage (GRS) | ✅ Configured | 2026-04-20 |
| Zone-redundant DB | ❌ N/A in NCUS | Would require region migration |
| App Service Plan tier | ✅ Premium v3 (P1v3) | 2 vCPU dedicated, 8 GB RAM, SLA 99.95%, supports slots + auto-scale |
| App Service health check | ✅ Configured | /api/health wired — does a real db.Database.CanConnectAsync() check |
| App Service AutoHeal | ✅ Proactive AutoHeal + Crash Monitoring enabled | Auto-recovers on memory pressure / slow requests |
| App Service staging slot | staging slot created | Blue-green deploy target |
| Application Insights | ✅ Wired | Shared psi-webapps-insights via Key Vault refs |
| SQL Auditing | ✅ Enabled (DB-level) → Log Analytics | Targets DefaultWorkspace-NCUS |
| SQL Advanced Threat Protection | ✅ Enabled via Microsoft Defender for SQL | Subscription-level plan (~$15/mo flat, covers all servers in sub) |
| SQL diagnostic settings → Log Analytics | ✅ Configured | SQLInsights, Errors, Blocks, Deadlocks, Timeouts, QueryStore |
| App Service diagnostic settings → Log Analytics | ✅ Configured | HTTP/Console/App/Audit/IPSecAudit/Platform logs + AllMetrics |
| Alert rules | ✅ 7 rules live | Health-check fail, 5xx storm, plan CPU/memory, DB CPU/storage/failed-connection — all route to ag-prgjsmes-oncall action group |
| Scheduled App Service backup | ✅ Daily, 30-day retention | Stored in psiappbackups (GRS) / container webapp-backups; SAS rotates annually |
| Subscription budget | ✅ $6,500/mo with alerts at 50/80/100% actual + 110% forecast | Email to adevereaux@progressivesurface.com |
| Log Analytics retention | ✅ 90 days | Bumped from default 30 days for ISO audit trail |

Android APK CI/CD + Google Play Publishing

The build-android.yml workflow builds a signed release APK, uploads it as a GitHub Actions artifact, and publishes it to Managed Google Play — which syncs to Intune for automatic deployment to managed TC52 devices.

Trigger: Manual only (workflow_dispatch) with optional inputs:

  • version_name — e.g. 2.0.0 (defaults to 1.0.{run_number})
  • publish — publish to Google Play (default: true)
  • track — Google Play track: production or internal (default: production)

Workflow steps:

  1. Checkout code
  2. Setup JDK 21 (Temurin) via actions/setup-java@v4
  3. Setup Android SDK via android-actions/setup-android@v3
  4. Setup Node.js 22 + npm install
  5. Build React frontend with Android-specific env vars (absolute API URL for WebView)
  6. npx cap sync android — copies Capacitor plugins into native project
  7. Decode signing keystore from ANDROID_KEYSTORE_BASE64 secret
  8. Compute version code (run_number + 100 to avoid conflicts with manually uploaded APKs)
  9. ./gradlew assembleRelease bundleRelease — builds signed APK and AAB
  10. Upload APK and AAB as GitHub Actions artifacts
  11. Publish APK to Managed Google Play via r0adkll/upload-google-play@v1

Distribution pipeline:

GitHub Actions → Managed Google Play (production) → Intune → TC52 devices

Intune integration: PS MES is approved in Intune and assigned as Required to the Device - TC52 group. The Intune connector syncs with Managed Google Play successfully, and the device restrictions policy has app auto-updates set to “Always.”

Known issue — auto-update delay: Despite the “Always” auto-update policy, the Play Store on TC52 devices does not reliably install updates automatically. In testing (March 2026), published updates sat for 2+ days without auto-installing — the Play Store showed the “Update” button but did not install it until manually tapped. Workaround: open the Play Store on-device or use adb shell am start -a android.intent.action.VIEW -d "market://details?id=com.progressivesurface.mes" to trigger the update check, then tap Update via ADB or on-screen.

Versioning: Defaults to 1.0.{run_number}. Version code = run_number + 100 (offset avoids conflicts with earlier manual uploads).
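
The versioning rule is simple arithmetic on the workflow run number. A minimal sketch, assuming a hypothetical function name and that a blank version_name input is treated the same as an omitted one (the workflow itself computes these values in its own steps):

```typescript
interface AndroidVersion {
  versionName: string;
  versionCode: number;
}

// Hypothetical helper mirroring the documented rule:
// versionName defaults to 1.0.{run_number}; versionCode = run_number + 100.
function computeAndroidVersion(runNumber: number, versionNameInput?: string): AndroidVersion {
  const trimmed = versionNameInput?.trim();
  return {
    // Assumption: an empty or whitespace-only input falls back to the default.
    versionName: trimmed ? trimmed : `1.0.${runNumber}`,
    versionCode: runNumber + 100,
  };
}
```

With run number 57 and no input this gives version name 1.0.57 and version code 157. Passing version_name=2.0.0 changes only the name; the code stays tied to the run number, so it keeps increasing across builds.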

GitHub Secrets Required

| Secret | Purpose |
|---|---|
| ANDROID_KEYSTORE_BASE64 | Base64-encoded psmes.jks signing keystore |
| ANDROID_KEYSTORE_PASSWORD | Keystore store password |
| ANDROID_KEY_ALIAS | Key alias (psmes) |
| ANDROID_KEY_PASSWORD | Key password |
| GOOGLE_PLAY_SERVICE_ACCOUNT_JSON | GCP service account JSON key for Google Play publishing |

Also uses repository variables AZURE_CLIENT_ID, AZURE_TENANT_ID, and MOBILE_AUTH.

Google Play Setup

  • GCP project: civil-medley-489323-t8
  • Service account: psi-android-deploy@civil-medley-489323-t8.iam.gserviceaccount.com
  • Google Play Console: Managed Google Play enterprise account (org: progressivesurface.com)
  • API: Google Play Android Developer API (androidpublisher.googleapis.com) enabled in GCP project
  • Note: Managed Google Play enterprise accounts don’t have the standard “API access” settings page. Service accounts are added via Users and permissions > Invite new users with app-level “Release to production” permission.

Running a Build

# Build + publish with default version
gh workflow run build-android.yml
 
# Build + publish with custom version
gh workflow run build-android.yml -f version_name=2.0.0
 
# Build only (no publish)
gh workflow run build-android.yml -f publish=false
 
# Publish to internal track for testing
gh workflow run build-android.yml -f track=internal

Manual Install via ADB

APK artifacts are also available for direct download from GitHub Actions. For dev/debug installs:

adb install -r app-release.apk

Local Builds

The app/build.gradle signing config reads from environment variables with fallback defaults, so local builds still work without setting env vars (produces an unsigned 1.0-dev build).


Development Setup

# Clone repository
git clone https://progressivesurface.ghe.com/ProgressiveSurface/PRGJSMES
cd PRGJSMES
 
# Backend API
cd PRGJSMES.API
dotnet build
dotnet run  # Starts on http://localhost:5262
 
# Frontend (separate terminal)
cd prgjsmes-web
npm install
npm start   # Starts on http://localhost:3000

Database: Requires VPN access to Azure SQL (procserv-proddata.database.windows.net).

Auth Bypass: Dev mode skips Azure AD — both API and frontend run without authentication locally.


Recent Developments (May–June 2026)

Workcenter / Station / Devices consolidation (PR-A→PR-E + #247–#284, May 2026)

The device-admin model was restructured. Hardware resolution moved from per-Terminal FKs to a Workcenter → Station model (Stations own printer/scale/backups/mode; Terminals claim a Station). Named Peripheral slots give supervisors a mid-shift swap workflow with a full PeripheralSwapHistory audit trail (SOP-068). The Intune ManagedDevices cache (Graph poller + PSMES-* Entra group tracking) and the unified /admin/devices inventory replaced /admin/printers; /printer-setup + /scale-setup were replaced by /terminal-claim. See “Admin UI — unified Devices page” and “Workcenters & Stations” sections above.

Devices admin UX: workstation Remove, printer auto-name, Building, terminal Move (PR #302, 2026-06-04)

  • Workstation Remove on /admin/devices — stale duplicate rows (rename/re-image leftovers) are now deletable: heartbeat-only “Discovered” rows delete outright; Intune rows untrack + heartbeat-sweep, 409-guarded while a Terminal binding exists.
  • Printer name auto-generation — Add Printer derives the name from the model: PS-{MODEL}-{NN}, auto-incrementing per model.
  • Location → Building — the legacy free-text Location field was removed from the printer/scale modals; Building (shelf locator for spares) is now the editable location field, displayed as “Location” in the UI as of PR #304. Column drop (and a BuildingLocation schema rename) tracked as expand-contract follow-up (#301).
  • Terminal Move — the workcenter detail “Terminals claimed” table can re-point a terminal at a different Station (PUT /api/terminals/{id}/station).
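
The PS-{MODEL}-{NN} naming rule can be sketched as below. This is an illustration under stated assumptions, not the backend implementation: the function name is hypothetical, the model is upper-cased, NN is zero-padded to two digits, and the next number is one above the highest existing number for that model.

```typescript
// Hypothetical sketch of per-model printer name generation.
function nextPrinterName(model: string, existingNames: string[]): string {
  const prefix = `PS-${model.trim().toUpperCase()}-`;
  let highest = 0;
  for (const existing of existingNames) {
    if (!existing.toUpperCase().startsWith(prefix)) continue;
    const suffix = existing.slice(prefix.length);
    // Only purely numeric suffixes count toward the sequence.
    if (/^\d+$/.test(suffix)) highest = Math.max(highest, Number(suffix));
  }
  return `${prefix}${String(highest + 1).padStart(2, "0")}`;
}
```

Given existing printers PS-ZD621-01, PS-ZD621-02 and PS-ZT411-07 (example model values only), adding another ZD621 would produce PS-ZD621-03, while the first printer of a new model starts at 01.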

Badge reader service v1.3.x + badge UID Entra sync (2026-06-03)

Phone-home service heartbeat (POST /api/terminals/service-heartbeat, shared-secret auth, independent of browser sessions), log rotation, EPIPE/WiX install fixes (v1.3.1–1.3.3, Intune .intunewin pipeline), and badge UID read/write against the Entra schema extension (issue #289, Pilot 2 WS2.3).

Skid Scan — Undo Scan + Reopen Session (2026-05-19)

New operator-correction surface for mis-scans and post-completion corrections.

  • Two new endpoints
    • DELETE /api/skidscan/session/{sessionId}/box/{boxId} — undo a scan. RGA / non-Bulk / Bulk-for-ASP: single-row soft delete. Bulk non-ASP: batch delete via optional crateNumber body param (removes every SkidScanBox row sharing the crate in one transaction). Permission skidscan.undo_scan.
    • POST /api/skidscan/session/{sessionId}/reopen — revert a completed session. Sets session 92→90, order 92→90, carts→90 (mirror). Pallet counter is not rolled back — already-issued pallet numbers remain reserved (matches the project-wide rolling-counter rule). Sets TicketStale=true to prompt a reprint. Blocked with 409 if Order.StatusId > 92 (e.g. shipped — reverse the shipment first). Permission skidscan.reopen.
  • Soft-delete on SkidScanBoxes — new columns IsDeleted, DeletedAt, DeletedBy. Every read site in SkidScanController filters WHERE !IsDeleted so the deleted box reappears in pending automatically. Hard delete was avoided because it would break the pallet counter rewind logic in BuildPallets.
  • TicketStale flag on SkidScanSessions — persistent yellow banner in the UI; Print Skid Ticket button disabled while the flag is set; flag clears when the next ticket is fetched.
  • Audit — no new tables. Existing [AuditEntity("SkidScanBox", "{boxId}")] and [AuditEntity("SkidScanSession", "{sessionId}")] decoration captures supervisor initials + reason in AuditEvent.PayloadJson and before/after state in ChangesJson. SupervisorSignOff row written on reopen with SignOffType = "SkidScanReopen".
  • New Line Leader role — bootstrapped in Program.cs with skidscan.scan + skidscan.undo_scan + skidscan.reopen only. The broader Operator+Supervisor permission union (per docs/knowledge-base/domain-rules.md) is a separate admin ticket — Line Leader is created as an empty-membership role, members to be assigned by an admin.
  • Two new permissions: skidscan.undo_scan and skidscan.reopen, both assigned to Supervisor, QA, Line Leader, Material Handler, Scheduler. Operators get scan only. Permission count: 41 across 10 categories (was 39).
  • UI — expandable “Scanned Boxes” panel per bucket card in the Pallet Summary. RGA / non-Bulk / Bulk-for-ASP rows have “Undo Scan”. Bulk non-ASP rows are grouped by CrateNumber and have “Remove Crate”. Both flow through a shared confirmation modal (supervisor initials + reason, min 10 chars). “Reopen Session” button visible only when session status is 92 + user has skidscan.reopen.
  • Scope boundary baked into the spec — Skid Scan is physical verification, not classification. Parts/bucket/label-type edits live in QA Dashboard Lot Audit. The “send back to QA” path is the recovery flow when an operator notices upstream data is wrong. This boundary is also documented in the SOP and the Skid Scan wiki page.
  • SOP-PRGJSMES-020 bumped to revision 2 — steps 12 (Undo Scan), 13 (Reopen Session), 14 (Boundary statement) added; RACI now includes Undo and Reopen activities; NCCA process updated; new risks (undo misuse, reopen after shipment, stale ticket) added to the risk register.
  • Files — spec at docs/migration/skidscan-undo-scan-and-reopen.md. Migration: sql/20260519_skidscan_undo_scan_columns.sql. SOP update: sql/20260519_update_sop_skidscan_undo_reopen.sql. Permission seeds in Program.cs + sql/fix_missing_permissions.sql + sql/FULL_MIGRATION.sql.
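
The soft-delete filter and the reopen guard described above can be sketched in a few lines. The real logic lives in the C# SkidScanController; the TypeScript below only models the documented rules, and the type and function names are hypothetical.

```typescript
const STATUS_AFTER_REOPEN = 90; // session, order and carts return to 90
const STATUS_COMPLETED = 92;    // completed session / order

interface SkidScanBoxRow {
  boxId: number;
  isDeleted: boolean;
}

// Read sites exclude soft-deleted rows, so an undone box counts as pending again.
function activeBoxes(rows: SkidScanBoxRow[]): SkidScanBoxRow[] {
  return rows.filter((row) => !row.isDeleted);
}

type ReopenPlan =
  | { httpStatus: 409; reason: string }
  | {
      httpStatus: 200;
      sessionStatusId: number;
      orderStatusId: number;
      cartStatusId: number;
      ticketStale: boolean;
    };

function planReopen(sessionStatusId: number, orderStatusId: number): ReopenPlan {
  if (orderStatusId > STATUS_COMPLETED) {
    // Documented rule: blocked with 409 when the order is past 92.
    return { httpStatus: 409, reason: "Order is past 92 (e.g. shipped); reverse the shipment first" };
  }
  if (sessionStatusId !== STATUS_COMPLETED) {
    // Assumption: only completed sessions can be reopened. The source does not
    // say which status code the API returns in this case.
    return { httpStatus: 409, reason: "Session is not completed" };
  }
  return {
    httpStatus: 200,
    sessionStatusId: STATUS_AFTER_REOPEN,
    orderStatusId: STATUS_AFTER_REOPEN,
    cartStatusId: STATUS_AFTER_REOPEN,
    ticketStale: true, // prompts a reprint; pallet counter is left untouched
  };
}
```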

Recent Developments (April 2026)

Coinstack Audit + Aggregation (PR #51, #52, 2026-04-30)

Behavior change — coinstacks audit as one unit, not lot-for-lot.

  • QA Queue grouping — coinstacks render as a single row in the “Ready for QA” and “QA In Process” lists. A coinstack only appears in the queue when ALL member orders have reached StatusId = 45 (ArgoComplete). Standalone (non-coinstack) orders behave as before.
  • One-click batch initialize — new endpoint POST /api/qa/lots/coinstack/{coinstackId}/initialize advances every member lot to status 80 in a single transaction. Idempotent per-lot (validates all members at 45, mirrors status to carts).
  • Coinstack-wide totals — Parts Received, Pieces Packed, Defect Tally, and Reconciliation gap math all aggregate across siblings when the lot is coinstack-bound. New nullable DTO fields CoinstackPartsReceived, CoinstackPiecesPacked, CoinstackResolvedPcs populated only when Order.CoinstackId != null; UI uses coinstack* ?? perLot fallback so non-coinstack flows are untouched.
  • Tally header flips to “Coinstack N” when isCoinstack=true; print payload uses coinstack totals. Defect tally sheets are informational/print artifacts only — not persisted records.
  • What stayed per-lot — Argo “Finish Lot” close gate (partsToGo), history records, and hold tags. Each lot still closes independently when its own packed count meets its own received count.
  • SOP-PRGJSMES-019 procedure steps updated to document the new coinstack queue and audit grouping rules.
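
A minimal sketch of the two rules that drive the grouped queue: a coinstack is listed only when every member order is at status 45, and coinstack totals override per-lot values only when present. The type and function names are hypothetical, and "reached 45" is modeled as an exact match, which is an assumption.

```typescript
const STATUS_ARGO_COMPLETE = 45;

interface QueueOrder {
  orderId: number;
  statusId: number;
  coinstackId: number | null; // null for standalone orders
}

// Returns the coinstack IDs whose member orders are ALL at ArgoComplete.
function coinstacksReadyForQa(orders: QueueOrder[]): number[] {
  const allComplete = new Map<number, boolean>();
  for (const order of orders) {
    if (order.coinstackId === null) continue; // standalone orders are handled as before
    const soFar = allComplete.get(order.coinstackId) ?? true;
    allComplete.set(order.coinstackId, soFar && order.statusId === STATUS_ARGO_COMPLETE);
  }
  return Array.from(allComplete.entries())
    .filter(([, ready]) => ready)
    .map(([coinstackId]) => coinstackId);
}

interface LotTotalsDto {
  partsReceived: number;
  coinstackPartsReceived?: number | null; // populated only for coinstack-bound lots
}

// The UI-side fallback: coinstack total when present, per-lot value otherwise.
function displayPartsReceived(dto: LotTotalsDto): number {
  return dto.coinstackPartsReceived ?? dto.partsReceived;
}
```

Using ?? rather than || matters here: a coinstack total of 0 is a valid value and should not fall through to the per-lot number.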

Job Shop Work Instructions + 14 New SOPs (2026-04-22)

  • OpSheet / BatchEntry / OrderEntry photo galleries now refresh on upload: AttachmentUploader now passes onUploaded that bumps a per-entity counter; AttachmentGallery refetches via its refreshKey prop. Per-batch galleries use Record<batchId, number> state keyed by entity ID so uploading to one batch does not trigger refetches on unrelated batches.
  • Scheduling helpRoute fix: JobShop/Scheduling.tsx was wired with helpRoute="/job-shop/scheduling" but the actual app route is /job-shop/schedule; realigned to match.
  • 14 new Job Shop SOPs (SOP-PRGJSMES-046 through 059) — full ISO 9001:2015 ProcedureStepsJson, EquipmentJson, RiskAssessmentJson, KPIsJson for each Job Shop page: Hub, Op Sheets, Parts, Quotes, Orders List, Order Entry, Scheduling, Operator Dashboard, Batch Entry, Inspection, Shipping, Invoicing, Materials Inventory, Stock Count. Added to sql/sop_seed_data.sql, idempotent (IF NOT EXISTS guarded). Total SOP inventory now 59 (up from 45).
  • Root cause on “work instructions don’t work for Job Shop” — all 18 Job Shop pages already had helpRoute correctly wired via PageHeader. Silent failure mode: WorkInstructionModal fetches via sopApi.getByRoute(route), and missing SOP row = empty modal with no error. Same silent-failure pattern as the permission system — reason to always seed SOPs alongside new pages, not as a follow-up.
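
The per-entity refresh counter from the gallery fix can be sketched as a pure state update. The name bumpRefreshKey is hypothetical; in a component it would be called from the onUploaded callback, for example setRefreshKeys((prev) => bumpRefreshKey(prev, batchId)), with keys[batchId] passed to the gallery as its refreshKey.

```typescript
// Maps batch ID to the number of uploads seen for that batch.
type RefreshKeys = Record<number, number>;

// Returns a new map with only the given batch's counter incremented, so only
// the gallery whose refreshKey changed will refetch.
function bumpRefreshKey(keys: RefreshKeys, batchId: number): RefreshKeys {
  return { ...keys, [batchId]: (keys[batchId] ?? 0) + 1 };
}
```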

Production Metrics Fix + Cart LineId Backfill + UI Optimization Phases 1-3 (2026-04-22)

  • Root-cause fix for zero-valued Production Metrics Dashboard: MetricsController was firing ~63 queries per load (6 lines × 10+ status slices) and frequently timing out on Azure SQL. Rewritten to a single grouped query + dictionary lookup (~4 queries total). Now loads in under a second.
  • Cart LineId FK backfill — three cart-creating controllers (SchedulingController, ProductionPaperworkController, LineSwitchController) did not copy LineId from the parent Order to new Carts, leaving ~976 historical carts NULL. Metrics dashboard queries filter on LineId so the missing FK caused zero results. Fixed root cause in all 3 controllers and ran sql/20260421_backfill_cart_lineid.sql against production to repair historical rows. Idempotent — safe to re-run.
  • Weekly metrics endpoint: GET /api/metrics/weekly now returns the current work week (Monday 00:00 → now) instead of the last completed week. Matches supervisor intuition of “this week so far.”
  • Non-regressive status on pre-grit lot close: Math.Max(currentStatus, newStatus) prevents race conditions from regressing completed orders back to earlier states.
  • Phase 1 UI optimization: React.memo(PageHeader) (eliminates spurious header rerenders site-wide), AbortController on useEffect across ArgoInput / PreGritInput / ThermalInput / MetricsDashboard (cancels in-flight requests on unmount), AxiosRequestConfig threaded through 8 get* API methods so pages can pass cancellation signals, native title= tooltips on every Dashboard interactive element.
  • Phase 2 UI optimization: useMemo on Warehouse derived lists (empty locations, filtered inventory) and other large derived arrays; sort icons (↕ ▲ ▼ per UI standard) added to Shipping / ThermalInput / ThermalDashboard / DPBagging / RawMaterials / PreGritInput tables; title-attribute tooltip sweep on AdminGroups + remaining admin pages.
  • Phase 3 UI optimization — slice-based 50/page pagination on Warehouse inventory and QA holds tables (no virtualization dep, touch-terminal safe); AbortController rolled out to all high-frequency operator pages.
  • Dashboard split — Job Shop tile lifted out of Production Analytics into its own dedicated section, positioned alongside Production Analytics via the parallelRow flex pattern. Empty sections still auto-hide when the user has no granted permissions.
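
Three of the rules above are small enough to show directly: the "current work week" window, the non-regressive status update, and slice-based pagination. The sketch below is illustrative only. The function names are hypothetical, the first two rules are implemented server-side in C#, and the week start here uses the caller's local time zone, which is an assumption.

```typescript
// Monday 00:00 of the week containing `now` (local time).
function startOfWorkWeek(now: Date): Date {
  const start = new Date(now.getTime());
  const daysSinceMonday = (start.getDay() + 6) % 7; // Sunday=0 maps to 6, Monday=1 maps to 0
  start.setDate(start.getDate() - daysSinceMonday);
  start.setHours(0, 0, 0, 0);
  return start;
}

// A late or racing update can never move an order back to an earlier status.
function nonRegressiveStatus(currentStatus: number, newStatus: number): number {
  return Math.max(currentStatus, newStatus);
}

// Slice-based pagination: no virtualization dependency, 50 rows per page by default.
function pageSlice<T>(rows: T[], page: number, pageSize = 50): T[] {
  const start = (page - 1) * pageSize; // pages are 1-based
  return rows.slice(start, start + pageSize);
}
```

For Wednesday 2026-04-22 the window starts on Monday 2026-04-20 at 00:00, and a table of 120 rows yields pages of 50, 50 and 20.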

Production-Readiness Audit + Dashboard Restructure (2026-04-21)

  • Full frontend + backend audit — reviewed all 48 controllers and 74 page components (58 top-level + 16 Job Shop); closed permission-sync gaps, scanner-coverage gaps, and convention drift across pages
  • React.lazy code splitting across all 58 top-level pages + Job Shop subroutes — reduced initial bundle, pages load on demand
  • Dashboard restructure — Production Metrics and Job Shop lifted out of “Production Modules” into a dedicated Production Analytics section, positioned to the right of Material Handling via parallelRow flex layout. Empty-permission sections still auto-hide.
  • Production SQL migrations executed against procserv-proddata / PRGJSMES via sqlcmd -G (Azure AD interactive) — three 20260421_*.sql scripts plus fix_missing_permissions.sql for the new analytics permissions
  • CI gate learning — deploy fails the frontend lint step on --max-warnings 0; a single unused import blocks production deploy. Fix-forward pattern: npm run lint + npx tsc --noEmit locally before every push.

Job Shop Module (JSWJ) Integration

  • Job Shop MES subtree (/job-shop) — op sheets, parts, quotes, orders, scheduling, shipping, inspections, invoicing for the job shop operations that run alongside thermal spray
  • Separate permission domain (job_shop.view, etc.) keeps Job Shop tiles hidden from thermal-only operators
  • Shared Customer catalog with PRGJSMES via common Customers table
  • Entry point tile on the main Dashboard under the new Production Analytics section

Blue-Green Slot-Swap Deploys

  • Staging slot (prgjsmes-prod-staging.azurewebsites.net) added for zero-downtime deploys
  • .github/workflows/deploy-production.yml now deploys to staging, smoke-tests, then performs an atomic slot swap to production
  • Instant rollback — re-running the swap command restores the previous code in under 30 seconds
  • skip_swap=true workflow_dispatch option lets a manual tester verify in staging before promoting

Database DR Hardening (2026-04-20)

  • PITR bumped to 35 days (GP tier max); LTR policy set at 4W / 12M / 5Y to cover ISO 9001 / AS9100 audit retention
  • Backup storage flipped to GRS — async replication to South Central US covers region-loss scenarios
  • 7 alert rules live on ag-prgjsmes-oncall (health-check fail, 5xx storm, plan CPU/memory, DB CPU/storage/failed-connection)
  • Subscription budget at $6,500/mo with alerts at 50/80/100% actual + 110% forecast

App Service Plan Migration

  • Migrated from ps-mes-apps-plan (Basic B3) → psi-asp-windows (Premium v3 / P1v3) on 2026-04-20
  • Unlocks deployment slots, auto-scale, AutoHeal, scheduled backups, and 99.95% SLA
  • Shared-tenant plan now hosts future PSI Windows webapps

Recent Developments (March 2026)

Network Scale Integration (SSE Streaming)

  • Dual-mode Mettler Toledo scales — operators toggle between USB (Web Serial API) and network TCP (SSE streaming) per terminal
  • Backend ScaleService holds persistent TCP connections to scale IPs, streams weight/piece readings to browsers via Server-Sent Events with <100ms latency
  • ScaleController manages scale CRUD, terminal assignment, and SSE endpoints
  • Scale Setup page (/scale-setup) — operator self-service to select their network scale (like Printer Setup)
  • ScaleReadings audit log tracks every reading with weight, pieces, command type, and errors
  • useScale meta-hook — React hook that switches between USB (useMettlerToledoScale) and network (useNetworkScale) modes; both hooks always instantiated (React rules) but only the active one enabled

QA Dashboard — Scale Integration for Audit & Create Box

  • 3-mode checkweigher display added to both Audit modal and Create Box modal, mirroring the Argo correction modal pattern
  • Scale mode: big colored weight/pieces display with green/yellow/red piece status, auto-populated from scale readings
  • Simple mode: standard editable weight + pieces inputs when no scale connected
  • Tare/insert/total weight row with editable tare weight (pre-filled from Argo box tare), read-only cardboard insert weight from part catalog, and computed total weight in lbs
  • Inline piece count warnings — immediate validation feedback: Good label must equal outgoingBoxQty, Partial Box must be less, shipping labels cannot exceed
  • Scale recipe push on modal open — pushes article reference weight to scale when audit or create box modal opens
  • BoxTareWeight persisted per audit for weight accuracy tracking
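
The inline piece count warnings reduce to three comparisons against outgoingBoxQty. A minimal sketch, assuming hypothetical label-kind names ("good", "partial", "otherShipping") and hypothetical warning wording; the real label types come from the database:

```typescript
type LabelKind = "good" | "partial" | "otherShipping";

// Returns a warning message, or null when the count is acceptable.
function pieceCountWarning(kind: LabelKind, pieces: number, outgoingBoxQty: number): string | null {
  switch (kind) {
    case "good":
      // Good label must equal the outgoing box quantity.
      return pieces === outgoingBoxQty ? null : `Good box must contain exactly ${outgoingBoxQty} pieces`;
    case "partial":
      // Partial Box must be less than a full box.
      return pieces < outgoingBoxQty ? null : `Partial box must contain fewer than ${outgoingBoxQty} pieces`;
    case "otherShipping":
      // Shipping labels cannot exceed the outgoing box quantity.
      return pieces <= outgoingBoxQty ? null : `Shipping box cannot exceed ${outgoingBoxQty} pieces`;
  }
}
```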

ZPL Label System Overhaul

  • Dynamic font sizing via FitFont helper — all variable-length text fields (BE PN, MFG PN, PRG LOT#, PO#, label type names) dynamically shrink to fit their column width, with a 12pt minimum floor. Applied to 42 field instances across all 13 templates.
  • Centered barcodes with accurate Code 128 B↔C switch width estimation — barcodes are perfectly centered regardless of data content
  • 3.65” content area within 4x4 labels (36-dot ContentInset margins via ^LS/^LT) for uniform label margins
  • Font bump — all fonts increased ~20-25% for readability on Zebra printers
  • Template cleanup — removed 7 unused templates (Scrap Tally, Rework Tally, Skid Ticket, SkidScan Crate, Defect Tally, Rework Tag, Hold Tag), reducing from 20 to 13 active templates
  • 2 generic templates (shipping and non-shipping) automatically handle any new label types added to the DB
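
The shrink-to-fit idea behind FitFont can be sketched as follows. This is not the FitFont helper itself: the function name, the step size, and especially the width model (average glyph width as a fixed ratio of font size) are assumptions made for illustration. Only the 12pt floor comes from the description above.

```typescript
const MIN_FONT_SIZE = 12; // documented minimum floor

// Assumed width model: each character is about widthRatio times the font size wide.
function estimateTextWidth(text: string, fontSize: number, widthRatio: number): number {
  return text.length * fontSize * widthRatio;
}

// Start at the preferred size and shrink one step at a time until the text
// fits the column or the floor is reached.
function fitFontSize(
  text: string,
  columnWidth: number,
  preferredSize: number,
  widthRatio = 0.6,
): number {
  let size = preferredSize;
  while (size > MIN_FONT_SIZE && estimateTextWidth(text, size, widthRatio) > columnWidth) {
    size -= 1;
  }
  return Math.max(size, MIN_FONT_SIZE);
}
```

Short values keep the preferred size, long values step down, and very long values stop at the floor instead of becoming unreadable, at which point the text may still overflow its column.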

Dynamic Labels and Defects from Database

  • Label templates auto-generated from LabelTypes DB table — adding a new label type in Admin automatically creates its label template and Argo selector entry
  • Defect checkboxes dynamic from DefectTypes DB table — defect entry forms on all labels pull from the database, no code changes needed for new defect types

Argo Correction Modal — Scale Integration

  • Box correction modal now mirrors the main screen’s weight/pieces behavior exactly
  • Scale mode: big colored weight/pieces display with Argo green theme, auto-populated from Mettler Toledo scale
  • Manual mode: large weight input matching main screen layout
  • Tare weight (editable), insert weight (read-only from part catalog), and total weight (computed) info row added to correction modal

Mettler Toledo Counting Scale Integration

  • Web Serial API integration for 8 Mettler Toledo counting scales (6 Argo stations, 1 QA, 1 DP Bagging)
  • MT-SICS protocol: S\r\n (stable) or SI\r\n (immediate), 9600 baud 8N1
  • Article programmed per part with reference weight calculated from thermal spray pickup weights
  • Scale auto-sends weight + piece count; operator just places box and waits for stable reading
  • Color-coded status display (green=exact, yellow=under, red=over) on Argo, QA, and DP Bagging pages
  • Tare workflow: prompt on scale connect, persisted per-lot in localStorage
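
As a rough illustration of the scale handling above, the sketch below parses one MT-SICS weight response line and maps a piece count to the status colors listed. In MT-SICS, a reply to `S` or `SI` has the form `S S <value> <unit>` for a stable weight and `S D <value> <unit>` for a dynamic one. The function and type names are hypothetical, and the Web Serial read loop is omitted.

```typescript
interface ScaleReading {
  stable: boolean; // true for "S S ...", false for "S D ..."
  weight: number;
  unit: string;
}

// Parses a single MT-SICS weight response line.
// Returns null for error replies such as "S I" or anything unrecognized.
function parseSicsWeight(line: string): ScaleReading | null {
  const match = /^S\s+([SD])\s+(-?\d+(?:\.\d+)?)\s+(\S+)$/.exec(line.trim());
  if (match === null) {
    return null;
  }
  return {
    stable: match[1] === "S",
    weight: Number(match[2]),
    unit: match[3] ?? "",
  };
}

type CountColor = "green" | "yellow" | "red";

// Color rule from the list above: green = exact, yellow = under, red = over.
function countColor(pieces: number, target: number): CountColor {
  if (pieces === target) {
    return "green";
  }
  return pieces < target ? "yellow" : "red";
}
```

For example, `parseSicsWeight("S S     100.00 g\r\n")` yields a stable reading of 100 g, while an error reply such as `S I` yields `null`.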

Admin: Label Templates Page

  • Visual card gallery showing all 13 ZPL label templates with rendered previews
  • ZPL code viewer, Labelary API rendering, and one-click test print to any configured printer
  • Auto-generates sample data for each template type

ISO 9001 SOP Management

  • 59 SOPs (SOP-PRGJSMES-001 through 059 — 001-045 core PRGJSMES, 046-059 Job Shop) with all ISO 9001:2015 sections
  • Work instruction HTML content served per-page via GET /api/sops/by-route
  • Every page has a WorkInstructionModal with context-sensitive help
  • Production Flowchart tab showing full production pipeline with interactive stage details
  • Revision history with automatic snapshots on edit

Rework Module

  • Rework cart creation for defective parts returning to production
  • Session tracking with defect breakdown per rework round
  • Linked to Argo defect categories and cart status flow

Customer Management

  • Customer master data with BC-compatible fields (CustomerNo, BillTo, etc.)
  • Customer addresses (ship-to and bill-to) per customer
  • Designed for future BC integration via link keys

Warehouse Module

  • Rack storage with Building-Bay-Column-Row-Slot location codes
  • Putaway workflow with barcode scanning
  • Weight-balanced pick lists for truck loading
  • Activity audit trail for all movements
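
A Building-Bay-Column-Row-Slot location code could be split into its parts along these lines. This is only a sketch: the hyphen separator, the example code, and the names `RackLocation` and `parseRackLocation` are assumptions, since the README does not show the actual code format.

```typescript
interface RackLocation {
  building: string;
  bay: string;
  column: string;
  row: string;
  slot: string;
}

// Splits a code such as "B1-03-C-2-04" (hypothetical format) into five segments.
// Returns null when the code does not have exactly five non-empty segments.
function parseRackLocation(code: string): RackLocation | null {
  const parts = code.trim().split("-");
  if (parts.length !== 5 || parts.some((p) => p.length === 0)) {
    return null;
  }
  const [building, bay, column, row, slot] = parts as [string, string, string, string, string];
  return { building, bay, column, row, slot };
}
```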

Recent Developments (February–March 2026)

QA Dashboard — Create Box, Inline Validation, and Modal Overhaul

  • Create Box modal — QA specialists can now create boxes directly from the QA Dashboard with full validation (label type, weight, pieces, defects, Good-NT initials) and auto-print
  • Inline modal error messages — replaced all alert(), window.prompt(), and window.confirm() calls with inline red error banners inside modals for better UX on touch screens
  • Delete Crate confirmation modal — replaced window.confirm() with a proper confirmation dialog showing crate details
  • Dashboard Release Hold modal — replaced window.prompt() chain with a structured modal for release reason and initials
  • Expandable defect breakdown — non-shipping defect totals in ArgoInput now expand on click to show per-box defect counts (Box 1 (3), Box 2 (1), etc.)
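
The expanded per-box text in the last item could be produced by a small formatter like the one below. The `BoxDefectCount` shape and the choice to skip boxes with zero defects are assumptions made for this sketch, not details taken from ArgoInput.

```typescript
interface BoxDefectCount {
  boxNumber: number;
  defects: number;
}

// Builds the "Box 1 (3), Box 2 (1)" style string shown when a defect total is expanded.
// Boxes without defects are left out (an assumption for this sketch).
function formatDefectBreakdown(boxes: BoxDefectCount[]): string {
  return boxes
    .filter((box) => box.defects > 0)
    .map((box) => `Box ${box.boxNumber} (${box.defects})`)
    .join(", ");
}
```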

Server-Side Piece Count Validation

  • OutGoingBoxQty enforcement on box audit create and update endpoints:
    • Good/Good-NT pieces must exactly equal OutGoingBoxQty
    • Partial Box pieces must be less than OutGoingBoxQty
    • Shipping labels cannot exceed OutGoingBoxQty
  • Good-NT initials (AIN) — GoodNTInitials field added to CreateBoxAuditRequest and persisted on ArgoBox.GoodNTInitials
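
The three OutGoingBoxQty rules can be written down compactly. The server itself is .NET, so the TypeScript below is only an illustration of the rule set, not the endpoint code. The label names `OtherShipping` and `NonShipping` are placeholders, and treating non-shipping labels as unconstrained is an assumption.

```typescript
// Label categories used for this sketch. "OtherShipping" and "NonShipping" are
// placeholder names; the real label types come from the LabelTypes table.
type LabelKind = "Good" | "Good-NT" | "Partial Box" | "OtherShipping" | "NonShipping";

interface PieceCountResult {
  ok: boolean;
  message?: string;
}

function checkPieceCount(label: LabelKind, pieces: number, outGoingBoxQty: number): PieceCountResult {
  if (!Number.isInteger(pieces) || pieces < 0) {
    return { ok: false, message: "Piece count must be a non-negative whole number." };
  }
  switch (label) {
    case "Good":
    case "Good-NT":
      // Full boxes must match the configured quantity exactly.
      return pieces === outGoingBoxQty
        ? { ok: true }
        : { ok: false, message: `${label} boxes must contain exactly ${outGoingBoxQty} pieces.` };
    case "Partial Box":
      // Partial boxes must be strictly below the configured quantity.
      return pieces < outGoingBoxQty
        ? { ok: true }
        : { ok: false, message: `Partial Box must contain fewer than ${outGoingBoxQty} pieces.` };
    case "OtherShipping":
      // Any other shipping label may not exceed the configured quantity.
      return pieces <= outGoingBoxQty
        ? { ok: true }
        : { ok: false, message: `Shipping labels cannot exceed ${outGoingBoxQty} pieces.` };
    case "NonShipping":
      // Assumed unconstrained in this sketch.
      return { ok: true };
  }
}
```

Keeping the same rule set on the client (for inline warnings) and on the server (for enforcement) means a bypassed or stale UI cannot save an invalid box.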

Argo Route Migration

  • Argo routes changed from /thermal/line/:lineId/argo to /argo/line/:lineId (and /argo/machine/:machineId)
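
If old bookmarks or links need to be translated to the new scheme, the mapping for line routes is mechanical. The helper below is a hypothetical sketch of that rewrite; the README does not say whether the application performs such a redirect, and the function name is illustrative.

```typescript
// Rewrites a legacy Argo line path to the new route.
// Paths that do not match the legacy pattern are returned unchanged.
function migrateArgoPath(path: string): string {
  const legacy = /^\/thermal\/line\/([^/]+)\/argo\/?$/.exec(path);
  if (legacy !== null) {
    return `/argo/line/${legacy[1]}`;
  }
  return path;
}
```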

PWA and Mobile Support

  • Service worker registration for PWA installability (production only)
  • Manifest updated — app name corrected to “Progressive Surface MES”, maskable icon added, start URL set to /mobile, portrait orientation enforced
  • Mobile CSS stylesheet added for responsive touch-screen layouts
  • Targets Zebra TC52 shared Android devices for warehouse and receiving workflows

Machine-Driven Thermal Operations (Phases 1-3)

  • Machine entity — new Machines table with 60 records (10 machine types x 6 lines), managed via Admin Machines page
  • Routing MachineId FKs — 11 nullable FK columns on Routings + 4 on Carts link to specific machines; Admin Routings uses machine dropdowns
  • Dynamic dashboards — Thermal Dashboard and main Dashboard load machines from DB instead of hardcoded line arrays; adding a machine auto-updates the UI
  • Dual-route navigation — both /thermal/line/:lineId and /thermal/machine/:machineId routes work side-by-side during transition
  • Machine-aware API — all 9 thermal endpoints accept optional ?machineId= query param, resolved to lineId via ResolveLineId helper
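
The idea behind resolving an optional machineId to a lineId can be sketched as below. The real ResolveLineId helper lives in the .NET API and is not shown here, so the `MachineRecord` shape, the precedence of machineId over lineId, and the null result for an unknown machine are all assumptions.

```typescript
interface MachineRecord {
  machineId: number;
  lineId: number;
}

// Resolves the line a request refers to.
// Assumption: when a machineId is supplied it takes precedence over lineId.
function resolveLineId(
  machines: MachineRecord[],
  lineId?: number,
  machineId?: number
): number | null {
  if (machineId !== undefined) {
    const machine = machines.find((m) => m.machineId === machineId);
    return machine !== undefined ? machine.lineId : null;
  }
  return lineId !== undefined ? lineId : null;
}
```

Resolving to lineId at the edge lets existing line-based queries stay unchanged while both route styles coexist during the transition.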

QA Dashboard — Full Argo Parity

  • QA Lot Audit tab now matches ArgoInput.tsx exactly — QA specialists see the same data and functions as Argo operators
  • Expanded left column with BE PN, BE PO#, Good Box QTY, A2 Argo Value, Label A2 Calc, Coinstack fields
  • Box history table below tile grid with per-row Reprint/Audit/Delete actions
  • Parts Left to Run section with Starting/Ran/Remaining counts and color coding
  • ArgoLabelPrint component integration for actual Zebra label reprints with CODE128 barcodes

Pre-Grit Input Enhancements

  • Reweight button on completed carts — loads existing weights back into form with auto-checked reweight flag for audit trail
  • Last-cart exception enforcement — when completing the final cart, if parts ran don’t balance with parts received, operator must categorize the shortfall before saving
  • Parts balance tracking — stats section shows Parts Ran, Parts Received, Exceptions total, and Balance indicator
  • Lot close validation — Complete Lot checks parts balance; if unaccounted, lot-level exceptions modal requires exact categorization before supervisor approval
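
The balance check behind these rules amounts to simple arithmetic. The sketch below assumes the balance is Parts Received minus Parts Ran minus categorized exceptions, and that a lot is balanced when the result is zero; the type names and exception categories are hypothetical.

```typescript
interface ExceptionEntry {
  category: string; // hypothetical category name, e.g. "Scrap"
  quantity: number;
}

interface PartsBalance {
  partsReceived: number;
  partsRan: number;
  exceptionsTotal: number;
  balance: number; // parts still unaccounted for
  balanced: boolean;
}

// Assumed formula: balance = received - ran - sum of categorized exceptions.
function computePartsBalance(
  partsReceived: number,
  partsRan: number,
  exceptions: ExceptionEntry[]
): PartsBalance {
  const exceptionsTotal = exceptions.reduce((sum, e) => sum + e.quantity, 0);
  const balance = partsReceived - partsRan - exceptionsTotal;
  return { partsReceived, partsRan, exceptionsTotal, balance, balanced: balance === 0 };
}
```

Under this reading, a lot that received 1000 parts and ran 990 needs exactly 10 parts categorized as exceptions before it can close.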

Production Travelers Overhaul

  • Six traveler types: IC, FEP, AEP, TP, ASP, DP Bagging
  • TP↔DP cross-reference skid numbers with correct skid colors on both traveler types
  • DP bagging travelers pack multi-per-page (5 groups/page) with scannable barcodes
  • ASP Blasting travelers pack 7 groups/page with scannable barcodes
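
Packing a fixed number of groups per page is a plain chunking step. The sketch below shows one way to do it; `paginateGroups` and `GROUPS_PER_PAGE` are illustrative names, with the 5 and 7 per-page values taken from the list above.

```typescript
// Groups per printed page, from the traveler descriptions above.
const GROUPS_PER_PAGE = { dpBagging: 5, aspBlasting: 7 } as const;

// Splits a list of traveler groups into pages of at most perPage groups each.
function paginateGroups<T>(groups: T[], perPage: number): T[][] {
  if (!Number.isInteger(perPage) || perPage < 1) {
    throw new RangeError("perPage must be a positive whole number");
  }
  const pages: T[][] = [];
  for (let i = 0; i < groups.length; i += perPage) {
    pages.push(groups.slice(i, i + perPage));
  }
  return pages;
}
```

With 12 DP bagging groups this yields three pages holding 5, 5, and 2 groups.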

Admin Configuration Pages

Self-service configuration pages cover label types (with bucket routing), defect types, bucket types, routings, parts, and machines, all with per-part assignment. They eliminate direct database edits for shop floor configuration.

ASP/TP Simplified Order Entry

Order Entry now has IC/ASP/TP tab navigation. The ASP and TP tabs provide simplified 5-field forms with automatic pallet splitting, lot ID generation (MT-20YYMMDD-ProcessCode-PartNumber), and auto-receive (status=10). TP orders automatically create matching DP orders paired through LinkedOrderId.
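
The lot ID pattern MT-20YYMMDD-ProcessCode-PartNumber can be assembled as in the sketch below. The function names are illustrative, the use of the local calendar date is an assumption, and the process code values (such as "TP") are assumed to match the tab names.

```typescript
// Left-pads a month or day to two digits.
function twoDigits(n: number): string {
  return ("0" + n).slice(-2);
}

// Builds a lot ID in the MT-20YYMMDD-ProcessCode-PartNumber pattern,
// using the local calendar date of the given Date (an assumption).
function buildLotId(date: Date, processCode: string, partNumber: string): string {
  const yyyymmdd =
    String(date.getFullYear()) + twoDigits(date.getMonth() + 1) + twoDigits(date.getDate());
  return `MT-${yyyymmdd}-${processCode}-${partNumber}`;
}
```

For a TP order for part 76711 entered on 5 March 2026, this produces `MT-20260305-TP-76711`.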


Part Types and Ring Sizes

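| Part Number | PN | Product Line | Ring Size |
|---|---|---|---|
| 114747 | S6 IC (4518) | IC | 65 |
| 131298 | Thin FEP (4520) | FEP | 64 |
| 131299 | Thin AEP (4519) | AEP | 65 |
| 76711 | TP | TP | 36 |
| 141315 | TP | TP | 36 |
| 142455 | DP | DP | 32 (OutgoingBoxQty, no carts) |
| 155606 | Split ASP(606) | ASP | 18 |
| 155607 | Split ASP(607) | ASP | 18 |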

Documentation

| Location | Contents |
|---|---|
| CLAUDE.md | Comprehensive technical documentation |
| docs/data-flow.md | Production lifecycle Mermaid diagrams |
| docs/migration/ | 14 module migration specifications |
| docs/work-instructions/ | 4 standalone + 15 embedded operator work instructions |
| docs/api/swagger.json | OpenAPI specification |
| docs/frontend-interactive-elements-guide.md | UI patterns catalog |
| docs/argo-label-schema-inventory.md | Legacy Access database schema inventory |
| docs/production-flowchart.md | Production flow documentation |

Legacy System

Replaces 161+ Microsoft Access databases:

  • Line-specific thermal spray databases (6 lines)
  • Argo label databases (6 lines)
  • QA Label Dashboard database
  • Skid Scan Master database
  • Shipping database
  • ASP Blasting database
  • Shared backend databases on S:\ network drive

Key legacy forms replaced:

| Legacy | PRGJSMES Replacement |
|---|---|
| frmLabelMaking | ArgoInput.tsx |
| 2-ArgoLabel.accdb | ArgoController + ArgoInput.tsx |
| QALabelDashBoard.accdb | QADashboard.tsx |
| SkidScanMaster.accdb | SkidScanDashboard.tsx |
| Shipping.accdb | ShippingDashboard.tsx |
| ProductionControlDashboard.accdb | OrderEntry.tsx (ASP/TP tabs) |
| ASPBlasting.accdb | PreGritInput.tsx |
| Line-specific thermal DBs | ThermalInput.tsx |


Last updated: 2026-06-05 — Workcenter/Station/Devices consolidation documented; retired /admin/printers, /printer-setup, /scale-setup references replaced; PR #302 devices-admin UX