PSI DataSync
PSI DataSync synchronizes Intego measurement data (geometry CSVs and daily statistics) from six shop-floor ARGO PCs to a centralized Azure Files share. It replaces GoodSync Control Center with a custom-built system designed to eliminate Azure Storage transaction costs.
Architecture
┌───────────────────────────────────────────────────────┐
│  PSI Shop Floor (On-Prem)                             │
│                                                       │
│  ┌────────┐ ┌────────┐ ┌────────┐                     │
│  │ ARGO1  │ │ ARGO2  │ │ ARGO3  │ ... ARGO4-6         │
│  │ Agent  │ │ Agent  │ │ Agent  │                     │
│  └───┬────┘ └───┬────┘ └───┬────┘                     │
│      └──────────┼──────────┘                          │
│                 │ VPN                                 │
└─────────────────┼─────────────────────────────────────┘
                  │
┌─────────────────┼─────────────────────────────────────┐
│  Azure VNet (PS-VNMAIN)                               │
│                 │                                     │
│         ┌───────▼────────┐    ┌──────────────────┐    │
│         │ DataSync       │    │ Azure Files      │    │
│         │ Server         │    │ (psargostorage)  │    │
│         │ (App Service)  │◄──►│ argodatastore    │    │
│         │                │    │                  │    │
│         │ • REST API     │    │ /Argo-Export/    │    │
│         │ • Dashboard    │    │ /Daily-Stats/    │    │
│         │ • Alert Engine │    │                  │    │
│         └────────────────┘    └──────────────────┘    │
│                 │                                     │
│         ┌───────▼────────┐                            │
│         │ Azure SQL      │                            │
│         │ (DataSync DB)  │                            │
│         └────────────────┘                            │
└───────────────────────────────────────────────────────┘
Components
| Component | Technology | Hosting |
|---|---|---|
| Server API | Node.js / Express | Azure App Service (psi-datasync) |
| Dashboard | React / TypeScript / TailwindCSS | Served by Express (same App Service) |
| State Database | Azure SQL | DataSync on procserv-proddata |
| Agent | .NET 8 Windows Service | Each ARGO PC |
| File Storage | Azure Files | \\psargostorage.file.core.windows.net\argodatastore |
How It Works
The system avoids expensive Azure Files list/read transactions by maintaining a server-side state index in Azure SQL. Agents never query Azure Files to determine what needs syncing — they ask the server.
- Agent scans local source folders for new/changed files
- Agent sends a file manifest to the Server
- Server compares the manifest against its state index database
- Server returns only the delta (new/changed files)
- Agent copies delta files directly to Azure Files via SMB
- Agent reports completion; Server updates its index
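
The comparison in steps 3 and 4 can be sketched as follows. This is a minimal illustration, not the server's actual code: the `FileEntry` shape, the choice of size plus last-write time as the change signal, and the `computeDelta` name are all assumptions, since the real manifest and index schemas are not described here.

```typescript
// Hypothetical shape: the real manifest and index schemas are not shown here.
interface FileEntry {
  path: string;        // path relative to the job's source folder
  size: number;        // size in bytes
  modifiedUtc: string; // last-write time, ISO 8601
}

// Return the manifest entries that are absent from the index, or whose
// size or last-write time differs from the indexed record.
function computeDelta(
  manifest: FileEntry[],
  index: Map<string, FileEntry>,
): FileEntry[] {
  return manifest.filter((file) => {
    const known = index.get(file.path);
    return (
      known === undefined ||
      known.size !== file.size ||
      known.modifiedUtc !== file.modifiedUtc
    );
  });
}

// Example: one unchanged file, one that grew, one that is new.
const fileA: FileEntry = { path: "2024-05-01/a.csv", size: 100, modifiedUtc: "2024-05-01T10:00:00Z" };
const fileB: FileEntry = { path: "2024-05-01/b.csv", size: 200, modifiedUtc: "2024-05-01T11:00:00Z" };
const stateIndex = new Map<string, FileEntry>([
  [fileA.path, fileA],
  [fileB.path, fileB],
]);
const manifest: FileEntry[] = [
  fileA,
  { path: "2024-05-01/b.csv", size: 260, modifiedUtc: "2024-05-01T12:30:00Z" },
  { path: "2024-05-01/c.csv", size: 50, modifiedUtc: "2024-05-01T12:45:00Z" },
];
const delta = computeDelta(manifest, stateIndex);
// delta holds b.csv (changed) and c.csv (new); a.csv is skipped.
```

Because the lookup runs against the SQL-backed index rather than the share itself, no Azure Files list or read transaction is needed to decide what to copy.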
Folder sealing prevents re-scanning completed date folders. Once a geometry date folder has been inactive for 24 hours, it’s sealed — the agent skips it entirely on future cycles.
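
A rough sketch of the sealing rule, assuming "inactive" means no file in the date folder has changed since `lastActivity`. The `DateFolder` shape and both function names are hypothetical, and the sketch does not say whether the agent or the server makes the sealing decision.

```typescript
const SEAL_AFTER_MS = 24 * 60 * 60 * 1000; // 24 hours of inactivity

// Hypothetical record for one geometry date folder.
interface DateFolder {
  folder: string;
  lastActivity: Date; // assumed: most recent file change seen in the folder
  sealed: boolean;    // already sealed on an earlier cycle (or manually)
}

// A folder qualifies for sealing once it has been inactive for 24 hours.
function shouldSeal(lastActivity: Date, now: Date): boolean {
  return now.getTime() - lastActivity.getTime() >= SEAL_AFTER_MS;
}

// Folders worth scanning this cycle: not sealed and not yet due for sealing.
function foldersToScan(folders: DateFolder[], now: Date): DateFolder[] {
  return folders.filter((f) => !f.sealed && !shouldSeal(f.lastActivity, now));
}

const now = new Date("2024-05-03T12:00:00Z");
const toScan = foldersToScan(
  [
    { folder: "2024-05-01", lastActivity: new Date("2024-05-01T18:00:00Z"), sealed: true },
    { folder: "2024-05-02", lastActivity: new Date("2024-05-02T11:00:00Z"), sealed: false },
    { folder: "2024-05-03", lastActivity: new Date("2024-05-03T08:00:00Z"), sealed: false },
  ],
  now,
);
// Only 2024-05-03 is scanned: 2024-05-01 is sealed, 2024-05-02 has been idle for 25 hours.
```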
URLs and Access
| Resource | URL | Access |
|---|---|---|
| Dashboard | https://psi-datasync.azurewebsites.net | Azure AD SSO (internal network via private endpoint) |
| API Health | https://psi-datasync.azurewebsites.net/api/health | Unauthenticated |
| GHE Repository | https://progressivesurface.ghe.com/ProgressiveSurface/PSISync | GHE access |
Jobs
DataSync manages 20 sync jobs across 6 ARGO PCs:
| Category | Per ARGO | Total | Schedule | Type |
|---|---|---|---|---|
| Geometry (ARGO1-2, Pattern A) | 3 | 6 | Every 4 hours | Append-only CSVs |
| Geometry (ARGO3-6, Pattern B) | 2 | 8 | Every 4 hours | Append-only CSVs |
| Daily Stats (all) | 1 | 6 | Every 15 min | Mutable .txt files |
Dashboard
The dashboard provides:
- Health Overview — All runners with traffic-light status, sync volume, active alerts
- Runner Detail — Per-runner status, heartbeat, disk space, assigned jobs
- Job Detail — Sync history, folder seal status, manual seal/unseal
- Sync Run Log — Per-file drill-down for any sync run
- Alert Management — Active alerts, configurable rules, suppression windows
- Credential Management — Azure Files credential tracking with expiry alerts
- Cost — Per-job Azure Files transaction cost (see below)
Cost Accounting
The Cost dashboard tab attributes Azure Files spend to individual jobs.
Phase 1 is self-attributed: each completed sync_run records the
billable write operations (1 CreateFile + ceil(bytes/4MiB) PutRange per OK
file) and bytes written; each triangulation_run records the
ListFilesAndDirectories pages consumed by the walk. Costs apply the Azure
Files Standard LRS Hot North Central US pricing card (verify on contract
renewal).
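
The Phase 1 write-operation count can be worked through in a few lines. This is a sketch under assumptions: the function names are hypothetical, and the price is passed in as a parameter (expressed per 10,000 operations) rather than hard-coded, because the actual price values are not stated here.

```typescript
const MiB = 1024 * 1024;
const PUT_RANGE_BYTES = 4 * MiB; // each PutRange covers at most 4 MiB

// Billable writes for one successfully synced file:
// 1 CreateFile + ceil(bytes / 4 MiB) PutRange calls.
function billableWriteOps(fileBytes: number): number {
  return 1 + Math.ceil(fileBytes / PUT_RANGE_BYTES);
}

// Total billable writes for a sync run, given the sizes of its OK files.
function syncRunWriteOps(okFileSizes: number[]): number {
  return okFileSizes.reduce((total, bytes) => total + billableWriteOps(bytes), 0);
}

// Convert an operation count to cost. pricePer10kOps is an assumed input;
// take the real value from the pricing card.
function estimateCost(ops: number, pricePer10kOps: number): number {
  return (ops / 10_000) * pricePer10kOps;
}

// 1 MiB -> 1 + 1 = 2, 4 MiB -> 1 + 1 = 2, 10 MiB -> 1 + 3 = 4
const runOps = syncRunWriteOps([1 * MiB, 4 * MiB, 10 * MiB]); // 8
```

Note that small files are dominated by the fixed CreateFile operation, so a run with many tiny CSVs costs roughly two operations per file regardless of size.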
Phase 2 will pipe storage diagnostic logs to a Log Analytics workspace and reconcile per-URI requests back to job ids; variance >5% between Phase 1 estimate and Phase 2 truth raises an alert.
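
One way the planned 5% variance check could look, assuming variance is measured relative to the Phase 2 figure. The function name and the zero-handling behaviour are illustrative choices, not part of the design described above.

```typescript
const VARIANCE_THRESHOLD = 0.05; // alert when estimate and truth differ by more than 5%

// Compare a Phase 1 estimate with the Phase 2 reconciled value.
// Assumption: the Phase 2 value is the denominator.
function exceedsVariance(phase1Estimate: number, phase2Actual: number): boolean {
  if (phase2Actual === 0) {
    // With no reconciled activity, any non-zero estimate counts as a mismatch.
    return phase1Estimate !== 0;
  }
  const variance = Math.abs(phase1Estimate - phase2Actual) / phase2Actual;
  return variance > VARIANCE_THRESHOLD;
}

const withinTolerance = exceedsVariance(100, 104); // false: about 3.8% off
const outOfTolerance = exceedsVariance(100, 110);  // true: about 9.1% off
```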
Endpoints (dashboard auth or X-CI-Secret):
- GET /api/admin/cost?since=&until= (per-job rollup)
- GET /api/admin/cost/jobs/:jobId?since=&until= (daily series for one job)
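
A small helper for building these requests from a script might look like the sketch below. The accepted format for `since` and `until` is not specified here, so the ISO dates are an assumption, as are the helper names and the example job id; the base URL is taken from the URLs and Access table.

```typescript
const BASE_URL = "https://psi-datasync.azurewebsites.net";

// Build the per-job rollup URL, or the daily-series URL when jobId is given.
function costUrl(since: string, until: string, jobId?: string): string {
  const path =
    jobId === undefined
      ? "/api/admin/cost"
      : `/api/admin/cost/jobs/${encodeURIComponent(jobId)}`;
  const query = `since=${encodeURIComponent(since)}&until=${encodeURIComponent(until)}`;
  return `${BASE_URL}${path}?${query}`;
}

// Headers for non-interactive callers that authenticate with the CI secret.
function ciHeaders(secret: string): Record<string, string> {
  return { "X-CI-Secret": secret };
}

const rollupUrl = costUrl("2024-05-01", "2024-05-31");
const seriesUrl = costUrl("2024-05-01", "2024-05-31", "42"); // "42" is a made-up job id
```

The resulting URL and headers can be passed to any HTTP client, for example `fetch(rollupUrl, { headers: ciHeaders(secret) })`.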
Alerting
| Alert | Trigger | Severity |
|---|---|---|
| Consecutive Failures | 3 failed sync runs | Critical (Teams + Email) |
| Runner Offline | No heartbeat for 6 hours | Critical (Teams + Email) |
| Stale Sync | Last success > 2× schedule | Warning (Teams) |
| Partial Sync | >10% file failures | Warning (Teams) |
| Disk Space Low | <5 GB free | Warning (Teams) |
| Credential Expiring | <30 days to expiry | Warning (Teams) |
| Credential Critical | <7 days to expiry | Critical (Teams + Email) |
Alerts auto-resolve when conditions clear. Suppression windows can be configured for planned maintenance.
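
The three job-level rules from the table can be sketched as a single evaluation function. The `JobHealth` shape and field names are hypothetical, and "3 failed sync runs" is read here as three or more consecutive failures; the runner, disk and credential rules would follow the same pattern.

```typescript
type Severity = "Critical" | "Warning";

interface Alert {
  rule: string;
  severity: Severity;
}

// Hypothetical per-job snapshot; the real state schema is not shown here.
interface JobHealth {
  consecutiveFailures: number;     // failed sync runs in a row
  minutesSinceLastSuccess: number;
  scheduleMinutes: number;         // 240 for geometry jobs, 15 for daily stats
  filesAttempted: number;          // files in the most recent run
  filesFailed: number;
}

function evaluateJobAlerts(job: JobHealth): Alert[] {
  const alerts: Alert[] = [];
  if (job.consecutiveFailures >= 3) {
    alerts.push({ rule: "Consecutive Failures", severity: "Critical" });
  }
  if (job.minutesSinceLastSuccess > 2 * job.scheduleMinutes) {
    alerts.push({ rule: "Stale Sync", severity: "Warning" });
  }
  // ">10% file failures", written with integers to avoid rounding surprises.
  if (job.filesFailed * 10 > job.filesAttempted) {
    alerts.push({ rule: "Partial Sync", severity: "Warning" });
  }
  return alerts;
}

// A daily-stats job that last succeeded 31 minutes ago with 11 of 100 files failing.
const raised = evaluateJobAlerts({
  consecutiveFailures: 1,
  minutesSinceLastSuccess: 31,
  scheduleMinutes: 15,
  filesAttempted: 100,
  filesFailed: 11,
});
// raised contains "Stale Sync" and "Partial Sync", both warnings.
```

Re-running the same evaluation on each cycle is one simple way to get auto-resolution: an alert whose rule no longer appears in the result can be closed.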
Related
- Deploy to Azure — App Service infrastructure
- PRD and architecture docs in the PSISync repository under docs/