ARGO Analytics Dashboard
Interactive analytics dashboard for PSI’s ARGO inspection machines — yield monitoring, SPC control charts, defect analysis, surface heatmaps, predictive analytics, and shift correlation across 22M+ inspected SOFC interconnect plates from 3 machines.
Overview
The ARGO Analytics Dashboard provides real-time visibility into the quality inspection process for solid oxide fuel cell (SOFC) interconnect plates. Three ARGO machines (ARGO4, ARGO5, ARGO6) collectively inspect ~38,000 plates per day across 382 measurement dimensions including surface curvature, thickness, defect detection, and coating quality. The dashboard replaces manual CSV spot-checking with interactive visualizations, statistical process control, and machine learning-driven insights.
| Property | Value |
|---|---|
| Production URL | https://argo.progressivesurface.com |
| Backend | Python FastAPI + DuckDB on Azure Linux VM |
| Authentication | Azure AD (PSI credentials) — planned |
| Hosting | Azure VM ps-argo-etl (B2as_v2, Ubuntu 24.04) behind nginx with SSL |
| Repository | ProgressiveSurface/argo-analytics |
| Data Source | Azure File Share psargostorage/argodatastore (ARGO4, ARGO5, ARGO6) |
| Data Volume | 22M parts across 3 machines, 2,500 daily stat files, May 2023–present |
| Access | VPN/onsite only (private IP 10.160.0.20) |
Features
Dashboard Pages
| Page | Route | Description |
|---|---|---|
| Executive Overview | / | KPI cards (yield, throughput, errors), yield trend chart, defect Pareto, hourly yield heatmap |
| Time-Series & SPC | /spc | X-bar control charts with Western Electric rules, Cp/Cpk capability, parameter selector |
| Batch Analysis | /batch | Batch comparison table (color-coded yield), recipe timeline |
| Defect Deep Dive | /defects | Failure mode breakdown, defects by hour (shift patterns), defects by batch |
| Part Lookup | /parts | Search by part/batch, full measurement profile, 57x57 surface heatmaps (canvas-rendered) |
| Predictive Analytics | /predictive | Anomaly detection (Isolation Forest), feature importance (Random Forest), yield forecast |
| Shift & Labor | /shifts | Day vs Night yield comparison, 24-hour yield profile with shift boundary markers |
| Multi-Machine | /machines | Cross-machine yield comparison, defect comparison, per-machine KPI cards |
All pages include a machine picker (ARGO4, ARGO5, ARGO6, or All) and date range picker (presets: 7d, 30d, 90d, or custom).
Machines
| Machine | Parquet Files | Date Range | Parts (total) | Notes |
|---|---|---|---|---|
| ARGO4 | 872 | Oct 2022–present | ~8M | Legacy Daily stats dump/ path, some schema errors in old files |
| ARGO5 | 854 | Feb 2023–present | ~7.6M | Primary machine, cleanest data |
| ARGO6 | 773 | Jul 2023–present | ~6.5M | Lowest yield (~86%) — under investigation |
ETL Pipeline
- Parses 382-column DailyStats CSVs with latin-1 encoding and schema version detection (v1–v4)
- Writes compressed Parquet files (5:1 compression ratio) partitioned by machine and date
- Builds DuckDB materialized aggregation tables for fast dashboard queries
- Reads 57x57 surface grid files on-demand for Part Lookup
- Runs every 4 hours via cron on the ETL VM
- union_by_name handles schema mismatches across machines and time periods
Analytics
- SPC: X-bar/R control charts with Western Electric sensitizing rules (all 4 rules), Cp/Cpk process capability
- PCA: Principal Component Analysis on failed parts for multivariate defect clustering
- Anomaly Detection: Isolation Forest on 6 core measurement columns; flags unusual parts even when they pass every individual check
- Feature Importance: Random Forest classifier identifies which measurements most predict pass/fail (curvature = 55% of predictive power)
- Yield Prediction: Trend-based forecast using weighted recent hourly data
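As a rough illustration of the SPC item above, the following sketch computes X-bar/R control limits and Cp/Cpk from subgroups of five measurements using only the Python standard library. The function names, sample readings, and specification limits are hypothetical and are not taken from analytics/spc.py; A2, D3, D4, and d2 are the standard tabulated constants for a subgroup size of 5.

```python
from statistics import mean

# Standard control-chart constants for subgroup size n = 5.
A2, D3, D4, D2 = 0.577, 0.0, 2.114, 2.326


def xbar_r_limits(subgroups):
    """Return centre lines and control limits for the X-bar and R charts."""
    xbars = [mean(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    xbarbar, rbar = mean(xbars), mean(ranges)
    return {
        "xbar_cl": xbarbar,
        "xbar_ucl": xbarbar + A2 * rbar,
        "xbar_lcl": xbarbar - A2 * rbar,
        "r_cl": rbar,
        "r_ucl": D4 * rbar,
        "r_lcl": D3 * rbar,
    }


def cp_cpk(subgroups, usl, lsl):
    """Estimate Cp and Cpk from the within-subgroup sigma (R-bar / d2)."""
    limits = xbar_r_limits(subgroups)
    sigma = limits["r_cl"] / D2
    mu = limits["xbar_cl"]
    cp = (usl - lsl) / (6 * sigma)
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)
    return cp, cpk


# Hypothetical readings: three subgroups of five consecutive parts.
samples = [
    [10.0, 10.2, 9.9, 10.1, 9.8],
    [10.1, 10.0, 10.3, 9.9, 10.2],
    [9.9, 10.0, 10.1, 10.2, 9.8],
]
limits = xbar_r_limits(samples)
cp, cpk = cp_cpk(samples, usl=11.0, lsl=9.0)
print(limits, cp, cpk)
```

Cpk is lower than Cp whenever the process mean is off-centre between the specification limits, which is why both values are usually reported together.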
Architecture
┌─────────────────────────────────────────────────────────────────────────┐
│ AZURE CLOUD │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────┐ │
│ │ Azure File Share (psargostorage) │ │
│ │ ├── ARGO4/Daily stats dump/*.txt │ Raw CSVs from machines │
│ │ ├── ARGO5/DailyStats/*.txt │ 382 cols, ~33MB/day each │
│ │ ├── ARGO5/Curvature57/**/*.csv │ 57x57 grids, ~39KB each │
│ │ ├── ARGO5/Thickness57/**/*.csv │ 57x57 grids, ~29KB each │
│ │ ├── ARGO6/DailyStats/*.txt │ │
│ │ └── argo-analytics-data/parquet/ │ Processed Parquet files │
│ │ ├── argo4/daily_stats/date=*/ │ 872 files │
│ │ ├── argo5/daily_stats/date=*/ │ 854 files │
│ │ └── argo6/daily_stats/date=*/ │ 773 files │
│ └──────────────────┬──────────────────────┘ │
│ │ CIFS mount (/mnt/argodatastore) │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ VM: ps-argo-etl (10.160.0.20) │ │
│ │ Ubuntu 24.04, B2as_v2 (2 vCPU, 8GB) │ │
│ │ │ │
│ │ ┌──────────────────────────────────┐ │ │
│ │ │ ETL Pipeline (cron, every 4 hrs) │ │ │
│ │ │ Python 3.12 + Polars + PyArrow │ │ │
│ │ │ Reads CSVs → Writes Parquet │ │ │
│ │ │ Rebuilds DuckDB cache │ │ │
│ │ └──────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌──────────────────────────────────┐ │ │
│ │ │ DuckDB (local disk, ~25MB) │ │ │
│ │ │ 22M rows, 16 columns │ │ │
│ │ │ Materialized: daily_yield, │ │ │
│ │ │ hourly_yield, defect_pareto, │ │ │
│ │ │ batch_summary │ │ │
│ │ └──────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌──────────────────────────────────┐ │ │
│ │ │ FastAPI (uvicorn, port 8000) │ │ │
│ │ │ 28 REST API endpoints │ │ │
│ │ │ DuckDB read-only mode │ │ │
│ │ │ scikit-learn analytics modules │ │ │
│ │ │ Serves React frontend (dist/) │ │ │
│ │ └──────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌──────────────────────────────────┐ │ │
│ │ │ nginx (port 80/443) │ │ │
│ │ │ SSL: *.progressivesurface.com │ │ │
│ │ │ Reverse proxy → localhost:8000 │ │ │
│ │ │ HTTP → HTTPS redirect │ │ │
│ │ └──────────────────────────────────┘ │ │
│ └──────────────────┬──────────────────────┘ │
│ │ │
│ ┌──────────────────┴──────────────────────┐ │
│ │ DNS: argo.progressivesurface.com │ │
│ │ A record → 10.160.0.20 │ │
│ │ SSL: Wildcard cert from ps-certificates-kv │
│ └─────────────────────────────────────────┘ │
│ │
└──────────────────────────────┬──────────────────────────────────────────┘
│ VPN / Site-to-Site
┌──────────────────────────────┴──────────────────────────────────────────┐
│ PSI On-Premises / VPN Users │
│ Browser → https://argo.progressivesurface.com │
└─────────────────────────────────────────────────────────────────────────┘
Technology Stack
| Component | Technology |
|---|---|
| ETL | Python 3.12, Polars, PyArrow |
| Data Lake | Apache Parquet (Snappy compression) on Azure File Share |
| Query Engine | DuckDB 1.5 (embedded, columnar, read-only for API) |
| Backend | FastAPI 0.115, Pydantic, uvicorn |
| Analytics | scikit-learn (Random Forest, Isolation Forest, PCA), scipy |
| Frontend | React 18, TypeScript, Vite 5, TailwindCSS 3 |
| Charts | Recharts 2.12 (SVG), HTML Canvas (57x57 heatmaps) |
| State | Zustand 4 (filters/machine), TanStack React Query 5 (data) |
| Auth | Azure AD / MSAL (planned) |
| Hosting | Azure VM ps-argo-etl (B2as_v2, Ubuntu 24.04) |
| Reverse Proxy | nginx 1.24 with wildcard SSL |
| CI/CD | GitHub Actions on [self-hosted, psi-internal] runner |
Key Architecture Decisions
| Decision | Rationale |
|---|---|
| VM instead of App Service | VM has CIFS mount access to Azure File Share; App Service cannot mount file shares for ETL reads |
| DuckDB on local disk | CIFS-mounted DuckDB files have locking issues; local disk eliminates lock conflicts |
| DuckDB read-only for API | Prevents lock conflicts between ETL writes and API reads |
| Parquet on file share | Persistent storage survives VM restarts; shareable across services |
| union_by_name in DuckDB | ARGO4’s older Parquet files have different schemas; DuckDB reconciles automatically |
| Single uvicorn worker | DuckDB’s single-writer model conflicts with gunicorn’s multi-process workers on network storage |
| Symlinks for Parquet | /opt/argo-analytics/data/parquet/{machine} → file share; keeps config simple |
Data Sources
DailyStats (382 Columns)
Each row is one inspected part. The 382 columns are organized into 17 groups:
| Group | Columns | Description |
|---|---|---|
| Identity | 1–6 | Part number, timestamp, result, barcode |
| Pass/Fail Flags | 7–21 | Individual quality gate results (+/- for 15 defect types) |
| Plenum Depth | 22–24 | Plenum depth results |
| IC Geometry | 25–33 | Interconnect dimensional checks |
| Inspection Status | 34–40 | System health (timing, image acquisition, belt stop) |
| Defect Counts | 41–60 | Counts and dimensions of detected defects |
| Curvature Top | 61–112 | 50 measurement points + summary (µm) |
| Curvature Bottom | 113–164 | 50 measurement points + summary (µm) |
| S7 Topology | 165–182 | Regional flatness metrics (Air/Fuel sides) |
| Thickness | 183–255 | 50 measurement points + averages (µm) |
| A² / Inertia | 256–296 | Structural stiffness measurements |
| Plenum Depth Stats | 297–302 | Per-area average, min, max |
| IC Dimensions | 303–325 | Hole positions, diameters, symmetry (mm) |
| Production Meta | 326–329 | Recipe, batch, cage, powder lot |
| Coating Quality | 330–350 | Stddev/offset of coating registration per region |
| System Validation | 351–372 | 3D rotation, temperatures (7 sensors), eval times |
| Edge/Calibration | 373–382 | Calibration offsets, z-score alarms |
Schema evolved from 375 columns (v1, 2023) to 382 (v4, 2026). The ETL auto-detects version by column count and normalizes to the canonical v4 schema.
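Version detection by column count could look roughly like the sketch below. Only the two column counts stated here (375 for v1, 382 for v4) are included; the counts for v2 and v3 are not given in this document, so they are left out on purpose. The function names are hypothetical, and padding at the end of the row is an assumption that may not match how etl/schema.py aligns columns.

```python
# Column counts stated in this document. The counts for v2 and v3 are not
# given, so they are deliberately absent from the mapping.
KNOWN_COLUMN_COUNTS = {375: "v1", 382: "v4"}
CANONICAL_WIDTH = 382


def detect_schema_version(row):
    """Map a parsed CSV row to a schema version by its column count."""
    try:
        return KNOWN_COLUMN_COUNTS[len(row)]
    except KeyError:
        raise ValueError(f"unrecognised column count: {len(row)}") from None


def pad_to_canonical(row):
    """Pad a shorter row with None so every row has the canonical width.

    Assumption: newer versions only append columns at the end.
    """
    return list(row) + [None] * (CANONICAL_WIDTH - len(row))


old_row = ["x"] * 375
print(detect_schema_version(old_row), len(pad_to_canonical(old_row)))
```

Raising on an unknown width, rather than guessing, keeps a malformed file from being silently written to Parquet with shifted columns.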
Data Locations
| Machine | Raw Data Path | Parquet Output |
|---|---|---|
| ARGO4 | /mnt/argodatastore/ARGO4/Daily stats dump/ | parquet/argo4/daily_stats/date=*/ |
| ARGO5 | /mnt/argodatastore/ARGO5/DailyStats/ | parquet/argo5/daily_stats/date=*/ |
| ARGO6 | /mnt/argodatastore/ARGO6/DailyStats/ | parquet/argo6/daily_stats/date=*/ |
Note: ARGO4 uses the legacy directory name Daily stats dump/ (with spaces).
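Because the ARGO4 directory name contains spaces, it helps to resolve raw paths through a single lookup and to quote them before passing them to a shell. The sketch below shows one way to do that; the helper names are hypothetical and the mapping simply restates the table above.

```python
import shlex
from pathlib import PurePosixPath

MOUNT_ROOT = PurePosixPath("/mnt/argodatastore")

# ARGO4 keeps its legacy directory name; the other machines use DailyStats.
RAW_SUBDIRS = {
    "argo4": "ARGO4/Daily stats dump",
    "argo5": "ARGO5/DailyStats",
    "argo6": "ARGO6/DailyStats",
}


def raw_data_dir(machine):
    """Return the raw DailyStats directory for a machine identifier."""
    return MOUNT_ROOT / RAW_SUBDIRS[machine.lower()]


def shell_quoted(path):
    """Quote a path so embedded spaces survive in a shell command."""
    return shlex.quote(str(path))


print(shell_quoted(raw_data_dir("ARGO4")))
```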
Per-Part Surface Grids (57x57)
Each part has three high-resolution surface scan files (ARGO5 only currently):
| Grid Type | Location | Values | Range |
|---|---|---|---|
| Curvature Top | Curvature57/{date}/CurvatureTop/ | Surface height (µm) | ~4800–5100 |
| Curvature Bottom | Curvature57/{date}/CurvatureBottom/ | Surface height (µm) | ~4800–5100 |
| Thickness | Thickness57/{date}/Thickness/ | Plate thickness (mm) | ~1.3–2.1 |
Filename pattern: N000{PartNo}BID{BatchID}CID{CageID}D{YYYYMMDD}U{HHMMSS}.csv
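The filename pattern can be parsed with a regular expression, as in the sketch below. It assumes the part, batch, and cage identifiers are purely numeric, which this document does not confirm; the example filename and the function name are made up for illustration.

```python
import re
from datetime import datetime

# Assumption: part, batch, and cage IDs are digits only. Adjust the
# character classes if the real identifiers contain letters.
GRID_NAME = re.compile(
    r"^N000(?P<part_no>\d+)BID(?P<batch_id>\d+)CID(?P<cage_id>\d+)"
    r"D(?P<date>\d{8})U(?P<time>\d{6})\.csv$"
)


def parse_grid_filename(name):
    """Split a surface-grid filename into its identifiers and scan time."""
    m = GRID_NAME.match(name)
    if m is None:
        raise ValueError(f"not a grid filename: {name!r}")
    return {
        "part_no": m["part_no"],
        "batch_id": m["batch_id"],
        "cage_id": m["cage_id"],
        "scanned_at": datetime.strptime(m["date"] + m["time"], "%Y%m%d%H%M%S"),
    }


# Hypothetical filename following the documented pattern.
info = parse_grid_filename("N000123456BID789CID12D20260301U142530.csv")
print(info)
```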
External Data
| Source | Table | Description | Status |
|---|---|---|---|
| Azure SQL PSI_Analytics | labor_detail | Timesheet data (1.37M rows) | Planned |
| Azure SQL PSI_Analytics | tslabor2 | All timesheets (5.25M rows) | Planned |
Until these tables are connected, shift analysis uses simulated shift windows (Day 06:00–17:59, Night 18:00–05:59) assigned from ARGO inspection timestamps rather than from actual timesheets.
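The shift windows above reduce to a check on the hour of the inspection timestamp. The following sketch shows that rule and a simple yield-by-shift aggregation; the function names and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def shift_for(ts):
    """Day covers 06:00-17:59; all other hours count as Night."""
    return "Day" if 6 <= ts.hour < 18 else "Night"


def yield_by_shift(inspections):
    """inspections: iterable of (timestamp, passed) pairs -> {shift: yield %}."""
    totals = defaultdict(lambda: [0, 0])  # shift -> [passed, inspected]
    for ts, passed in inspections:
        counts = totals[shift_for(ts)]
        counts[0] += int(passed)
        counts[1] += 1
    return {shift: 100.0 * p / n for shift, (p, n) in totals.items()}


# Hypothetical inspection records at the window boundaries.
records = [
    (datetime(2026, 3, 1, 6, 0), True),
    (datetime(2026, 3, 1, 17, 59), False),
    (datetime(2026, 3, 1, 18, 0), True),
    (datetime(2026, 3, 2, 5, 59), True),
]
shift_yield = yield_by_shift(records)
print(shift_yield)
```

Note that a Night window spans two calendar dates, so grouping by date and shift together would split one night across two rows unless the date is shifted for the hours after midnight.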
API Endpoints
All endpoints accept an optional machine query parameter (argo4, argo5, argo6, or all; default: all).
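A request URL for any of the endpoints listed in this section can be assembled as in the sketch below. The helper name is hypothetical, and the ISO date format (YYYY-MM-DD) for date_start and date_end is an assumption, since the expected format is not stated here.

```python
from urllib.parse import urlencode

BASE_URL = "https://argo.progressivesurface.com"
VALID_MACHINES = ("argo4", "argo5", "argo6", "all")


def build_url(path, date_start=None, date_end=None, machine="all"):
    """Build a dashboard API URL with the common filter parameters."""
    if machine not in VALID_MACHINES:
        raise ValueError(f"unknown machine: {machine!r}")
    params = {"machine": machine}
    # Assumption: dates are passed as ISO strings such as 2026-03-01.
    if date_start:
        params["date_start"] = date_start
    if date_end:
        params["date_end"] = date_end
    return f"{BASE_URL}{path}?{urlencode(params)}"


url = build_url("/api/overview/kpi", "2026-03-01", "2026-03-07", machine="argo6")
print(url)
```

The URL is reachable only from the PSI VPN or on site, so the sketch stops at building the string rather than sending the request.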
Overview (/api/overview)
| Method | Endpoint | Description |
|---|---|---|
| GET | /kpi?date_start&date_end&machine | Aggregate KPIs (yield, throughput, errors) |
| GET | /yield-trend?date_start&date_end&machine | Daily yield time series |
| GET | /defect-pareto?date_start&date_end&machine | Defect type counts ranked |
| GET | /hourly-heatmap?date_start&date_end&machine | Yield by date x hour grid |
| GET | /throughput?date_start&date_end&machine | Parts per hour by date |
SPC (/api/spc)
| Method | Endpoint | Description |
|---|---|---|
| GET | /parameters | Available measurement parameters for SPC |
| GET | /control-chart?parameter&date_start&date_end&subgroup_size&machine | X-bar/R chart with WE rules |
| GET | /cpk?parameter&date_start&date_end&usl&lsl&machine | Process capability indices |
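For readers unfamiliar with the Western Electric rules applied by the control-chart endpoint, the sketch below checks the four classic zone rules against a series of subgroup means. It is an illustration only: the function name is hypothetical, flagging the point that completes each pattern is one common convention, and the actual implementation in analytics/spc.py may differ. For an X-bar chart, sigma here is the standard deviation of the subgroup means, that is (UCL - centre line) / 3.

```python
def western_electric_violations(points, center, sigma):
    """Return {index: [rule numbers]} for points that complete a rule pattern."""
    z = [(p - center) / sigma for p in points]
    hits = {}

    def flag(i, rule):
        hits.setdefault(i, []).append(rule)

    for i in range(len(z)):
        # Rule 1: a single point beyond 3 sigma.
        if abs(z[i]) > 3:
            flag(i, 1)
        for sign in (1, -1):
            # Rule 2: 2 of 3 consecutive points beyond 2 sigma, same side.
            if i >= 2 and sign * z[i] > 2 and sum(sign * v > 2 for v in z[i - 2:i + 1]) >= 2:
                flag(i, 2)
            # Rule 3: 4 of 5 consecutive points beyond 1 sigma, same side.
            if i >= 4 and sign * z[i] > 1 and sum(sign * v > 1 for v in z[i - 4:i + 1]) >= 4:
                flag(i, 3)
            # Rule 4: 8 consecutive points on the same side of the centre line.
            if i >= 7 and all(sign * v > 0 for v in z[i - 7:i + 1]):
                flag(i, 4)
    return hits


# Hypothetical standardized subgroup means (centre 0, sigma 1).
xbars = [0.1, -0.4, 0.3, 2.4, 0.2, 2.6, -0.2, 3.4]
violations = western_electric_violations(xbars, center=0.0, sigma=1.0)
print(violations)
```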
Batch (/api/batch)
| Method | Endpoint | Description |
|---|---|---|
| GET | /list?date_start&date_end&machine | Batch summaries with yield and measurements |
| GET | /{batch_no}/detail | Single batch with percentile distributions |
| POST | /compare | Side-by-side batch comparison |
| GET | /recipe-timeline?date_start&date_end&machine | Recipe changes over time |
Defect (/api/defect)
| Method | Endpoint | Description |
|---|---|---|
| GET | /failure-modes?date_start&date_end&machine | Failure type stats with trends |
| GET | /by-hour?date_start&date_end&machine | Defect counts by hour of day |
| GET | /by-batch?date_start&date_end&machine | Defect counts by batch |
Part Lookup (/api/part)
| Method | Endpoint | Description |
|---|---|---|
| GET | /search?q&limit&machine | Search by part number or batch ID |
| GET | /{part_no} | Full measurement profile |
| GET | /{part_no}/grid/{grid_type} | 57x57 surface grid data (on-demand read) |
| GET | /{part_no}/neighbors?n | Adjacent parts (±n sequential) |
Predictive (/api/predictive)
| Method | Endpoint | Description |
|---|---|---|
| GET | /anomalies?date_start&date_end&contamination&machine | Isolation Forest anomaly detection |
| GET | /feature-importance?date_start&date_end&machine | Random Forest feature importance |
| GET | /yield-forecast?date_start&date_end&machine | Trend-based yield prediction |
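The exact forecasting method behind /yield-forecast is not described here beyond being trend-based and weighted toward recent data. One simple technique that fits that description is a weighted least-squares line with linearly increasing weights, extrapolated one step ahead, as sketched below. This is an assumed stand-in and not necessarily what analytics/prediction.py does; the function name and sample values are hypothetical.

```python
def forecast_next(yields):
    """Extrapolate the next yield (%) from a recency-weighted linear trend."""
    n = len(yields)
    if n < 2:
        raise ValueError("need at least two observations")
    xs = range(n)
    ws = [i + 1 for i in xs]  # the newest point gets the largest weight
    sw = sum(ws)
    xm = sum(w * x for w, x in zip(ws, xs)) / sw
    ym = sum(w * y for w, y in zip(ws, yields)) / sw
    sxx = sum(w * (x - xm) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xm) * (y - ym) for w, x, y in zip(ws, xs, yields))
    slope = sxy / sxx
    prediction = ym + slope * (n - xm)
    # Yield is a percentage, so clamp the extrapolation to 0-100.
    return min(100.0, max(0.0, prediction))


# Hypothetical hourly yields, oldest first.
print(forecast_next([88.0, 89.0, 90.0, 91.0]))
```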
Labor/Shift (/api/labor)
| Method | Endpoint | Description |
|---|---|---|
| GET | /shift-yield?date_start&date_end&machine | Yield by Day/Night shift |
| GET | /hourly-profile?date_start&date_end&machine | Average yield by hour (0–23) |
| GET | /day-vs-night?date_start&date_end&machine | Day vs Night aggregate comparison |
Cross-Machine Comparison (/api/compare)
| Method | Endpoint | Description |
|---|---|---|
| GET | /summary?date_start&date_end | One-row aggregate per machine |
| GET | /yield?date_start&date_end | Yield by machine by date |
| GET | /defects?date_start&date_end | Top defects per machine |
| GET | /throughput?date_start&date_end | Parts per hour per machine |
Meta (/api)
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | System health, DuckDB status, row counts |
| GET | /meta/machines | List of available machine identifiers |
VM Infrastructure
VM: ps-argo-etl
| Property | Value |
|---|---|
| Name | ps-argo-etl |
| Resource Group | PS-RG-01 |
| Size | Standard_B2as_v2 (2 vCPU, 8GB RAM) |
| OS | Ubuntu 24.04 LTS |
| Private IP | 10.160.0.20 (PS-SERVERS subnet) |
| Disk | 64GB OS disk |
File System Layout
/opt/argo-analytics/ # Application root (git repo)
├── backend/ # Python backend + ETL
│ ├── .venv/ # Python virtual environment
│ ├── .env # Environment config
│ ├── etl/ # ETL pipeline modules
│ ├── api/ # FastAPI application
│ └── analytics/ # ML/SPC modules
├── frontend/dist/ # Built React frontend
├── data/
│ ├── duckdb/argo.duckdb # DuckDB cache (~25MB, LOCAL disk)
│ └── parquet/ # Symlinks to file share:
│ ├── argo4 → /mnt/argodatastore/argo-analytics-data/parquet/argo4
│ ├── argo5 → /mnt/argodatastore/argo-analytics-data/parquet/argo5
│ └── argo6 → /mnt/argodatastore/argo-analytics-data/parquet/argo6
├── run-etl.sh # ETL execution script (cron)
└── backfill-a4a6.py # Multi-machine backfill script
/mnt/argodatastore/ # Azure File Share (CIFS mount, persistent)
├── ARGO4/Daily stats dump/ # Raw CSVs from ARGO4
├── ARGO5/DailyStats/ # Raw CSVs from ARGO5
├── ARGO5/Curvature57/ # 57x57 surface grids
├── ARGO5/Thickness57/ # 57x57 thickness grids
├── ARGO6/DailyStats/ # Raw CSVs from ARGO6
└── argo-analytics-data/parquet/ # Processed Parquet output
Services
| Service | Type | Config |
|---|---|---|
| argo-api | systemd | /etc/systemd/system/argo-api.service (uvicorn on port 8000, auto-restart) |
| nginx | systemd | Reverse proxy 80/443 → 8000, SSL with wildcard cert |
| ETL cron | crontab | 0 */4 * * * — runs /opt/argo-analytics/run-etl.sh |
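The cron expression 0 */4 * * * fires at minute 0 of hours 0, 4, 8, 12, 16, and 20. The sketch below computes the next scheduled run and checks whether data is older than one cycle. The function names are hypothetical, and the times are assumed to be in the VM's local time zone, which is what cron uses.

```python
from datetime import datetime, timedelta


def next_etl_run(now):
    """Next time strictly after `now` that matches 0 */4 * * *."""
    candidate = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    while candidate.hour % 4 != 0:
        candidate += timedelta(hours=1)
    return candidate


def is_stale(last_refresh, now, max_age=timedelta(hours=4)):
    """True when the last refresh is older than one ETL cycle."""
    return now - last_refresh > max_age


print(next_etl_run(datetime(2026, 3, 1, 22, 15)))
```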
ETL Cron Script (run-etl.sh)
#!/bin/bash
cd /opt/argo-analytics/backend
export PATH="/opt/argo-analytics/backend/.venv/bin:$PATH"
# Backfill last 2 days (catches today + late-arriving yesterday data)
python3 -m etl.backfill --days 2
# Rebuild DuckDB cache from all machine Parquet files
python3 -m etl.duckdb_cache
# Restart API to pick up fresh DuckDB (read_only mode needs reconnect)
systemctl restart argo-api
Azure File Share Mount
Credentials stored at /etc/smbcredentials/psargostorage.cred. Persistent mount via /etc/fstab:
//psargostorage.file.core.windows.net/argodatastore /mnt/argodatastore cifs nofail,credentials=/etc/smbcredentials/psargostorage.cred,dir_mode=0755,file_mode=0644,serverino,nosharesock,actimeo=30
Project Structure
argo-analytics/
├── CLAUDE.md # Claude Code instructions
├── .env.example # Environment variable template
├── startup.sh # Azure App Service startup (legacy)
├── requirements.txt # Root requirements (→ backend)
├── deploy.sh # Azure resource creation script
├── run-etl.sh # ETL cron execution script
├── .github/workflows/deploy.yml # CI/CD via self-hosted runner
│
├── backend/
│ ├── pyproject.toml # Python project metadata + deps
│ ├── requirements.txt # Pinned production dependencies
│ ├── etl/ # ETL Pipeline
│ │ ├── config.py # Paths, constants, env loading
│ │ ├── schema.py # 382-column canonical schema (v1–v4)
│ │ ├── daily_stats_parser.py # CSV → normalized DataFrame
│ │ ├── parquet_writer.py # DataFrame → partitioned Parquet
│ │ ├── duckdb_cache.py # Multi-machine DuckDB aggregation
│ │ ├── grid_parser.py # 57x57 surface grid CSV reader
│ │ ├── file_scanner.py # New file discovery
│ │ ├── backfill.py # Historical ingest CLI
│ │ └── logging_config.py # Structured logging setup
│ ├── api/ # FastAPI Application
│ │ ├── main.py # App entry, lifespan, SPA serving
│ │ ├── deps.py # DuckDB + machine validation helpers
│ │ ├── models.py # Pydantic response models
│ │ └── routers/
│ │ ├── health.py # /api/health + /api/meta/machines
│ │ ├── overview.py # /api/overview/* (5 endpoints)
│ │ ├── timeseries.py # /api/spc/* (3 endpoints)
│ │ ├── batch.py # /api/batch/* (4 endpoints)
│ │ ├── defect.py # /api/defect/* (3 endpoints)
│ │ ├── part_lookup.py # /api/part/* (4 endpoints)
│ │ ├── predictive.py # /api/predictive/* (3 endpoints)
│ │ ├── labor.py # /api/labor/* (3 endpoints)
│ │ ├── compare.py # /api/compare/* (4 endpoints)
│ │ └── analytics.py # /api/analytics/pca
│ └── analytics/ # ML/Statistics Modules
│ ├── spc.py # Control charts, Cp/Cpk, WE rules
│ ├── pca.py # PCA dimensionality reduction
│ ├── anomaly.py # Isolation Forest
│ ├── prediction.py # Yield forecasting
│ └── feature_importance.py # Random Forest importances
│
├── frontend/
│ ├── package.json
│ ├── vite.config.ts # Dev proxy /api → :8000
│ ├── tailwind.config.js # PSI brand colors
│ ├── playwright.config.ts # E2E test config
│ ├── e2e/ # 12 Playwright E2E tests
│ └── src/
│ ├── main.tsx # React root + QueryClient
│ ├── App.tsx # Routes (8 pages)
│ ├── api/client.ts # fetchApi wrapper
│ ├── stores/filterStore.ts # Zustand: date range + machine
│ ├── hooks/ # 8 TanStack React Query hook files
│ ├── components/
│ │ ├── layout/ # AppLayout, Sidebar, PageHeader
│ │ ├── shared/ # KpiCard, DateRangePicker, MachinePicker
│ │ └── charts/ # 15 chart components (Recharts + Canvas)
│ ├── pages/ # 8 route pages
│ └── types/ # TypeScript interfaces
│
└── data/ # Data lake (gitignored, on VM)
├── parquet/argo{4,5,6} → symlinks # → Azure File Share
└── duckdb/argo.duckdb # ~25MB local aggregation cache
Development
Prerequisites
- Python 3.11+
- Node.js 20+
- Access to the W:\ARGO5\ network share (maps to psargostorage/argodatastore)
- Azure CLI (for deployment)
Local Development
# Clone
git clone https://progressivesurface.ghe.com/ProgressiveSurface/argo-analytics.git
cd argo-analytics
# Backend
cd backend
python -m venv .venv
.venv/Scripts/activate # Windows
pip install -e ".[dev]"
# Initial data backfill (90 days, ARGO5 only)
python -m etl.backfill --days 90
# Refresh DuckDB cache
python -m etl.duckdb_cache --machines argo5
# Start API server
uvicorn api.main:app --reload --port 8000
# Frontend (separate terminal)
cd frontend
npm install
npm run dev # Vite dev server on :5173, proxies /api → :8000
Environment Variables
# Data paths (local dev)
ARGO5_DATA_ROOT=W:/ARGO5
DATA_LAKE_ROOT=C:/GIT/argo-analytics/data
# Data paths (VM production)
# ARGO5_DATA_ROOT=/mnt/argodatastore/ARGO5
# DATA_LAKE_ROOT=/opt/argo-analytics/data
# Logging
LOG_LEVEL=INFO
Running Tests
# Frontend E2E (requires both servers running)
cd frontend
npx playwright install chromium
npm run test:e2e
Deployment
VM Deployment (Production)
The app runs on ps-argo-etl (10.160.0.20). To update:
# Run on the VM via az vm run-command (no SSH configured)
cd /opt/argo-analytics
git pull origin main
# Rebuild frontend
cd frontend && npm run build
# Restart API
sudo systemctl restart argo-api
CI/CD Pipeline
GitHub Actions workflow (.github/workflows/deploy.yml):
- Triggered on push to main (path-filtered)
- Runs on [self-hosted, psi-internal] runner
- Builds React frontend
- Packages backend + frontend dist into zip
- Deploys via az webapp deployment source config-zip (App Service) or direct git pull (VM)
Azure Resources
| Resource | Name | IP / Config |
|---|---|---|
| VM | ps-argo-etl | 10.160.0.20, B2as_v2, Ubuntu 24.04 |
| Storage | psargostorage | File share argodatastore |
| DNS | argo.progressivesurface.com | A record → 10.160.0.20 |
| SSL | Wildcard cert | From ps-certificates-kv |
| App Service | ps-argo-analytics | Created but unused (VM serves instead) |
| Private Endpoint | ps-argo-analytics-pe | 10.160.140.16 (App Service, unused) |
How to Use the Dashboard
Getting Started
- Connect to the PSI VPN
- Navigate to https://argo.progressivesurface.com
- The Executive Overview page loads by default showing the last 30 days across all machines
Navigating
The left sidebar provides access to all 8 pages. Every page has:
- Machine picker (top right) — filter to ARGO4, ARGO5, ARGO6, or All Machines
- Date range picker — presets (7d, 30d, 90d) or custom start/end dates
- Changing either filter instantly refreshes all charts on the page
Common Tasks
“What’s our yield today?”
Go to Executive Overview (/). The KPI cards at the top show total parts, yield %, parts/hour, and inspection errors for the selected date range and machine.
“Which defect is causing the most rejects?”
Go to Executive Overview → Defect Pareto chart (bottom left). Bars are ranked by frequency. Highpoints is typically the dominant defect. For more detail, go to Defect Deep Dive (/defects) to see defects by hour and by batch.
“Is one machine worse than the others?”
Go to Multi-Machine (/machines). The summary cards show yield per machine. The yield timeline chart shows daily trends side-by-side. ARGO6 currently has the lowest yield (~86%).
“Did something change on a specific shift?”
Go to Shift & Labor (/shifts). The hourly profile chart shows average yield for each hour (0–23) with shift boundaries marked at 06:00 and 18:00. The Day vs Night KPI cards show the aggregate comparison.
“What does a specific part look like?”
Go to Part Lookup (/parts). Type a part number (minimum 3 digits) in the search bar. Click a result to see the full measurement profile and three 57x57 surface heatmaps (curvature top, curvature bottom, thickness). The neighbors table shows surrounding parts — if many neighbors also failed, it suggests a batch-level issue.
“Is the process in control?”
Go to Time-Series & SPC (/spc). Select a measurement parameter (Curvature Top, Thickness, etc.) from the dropdown. The X-bar control chart shows subgroup means with UCL/LCL limits. Red dots indicate Western Electric rule violations. Enter USL and LSL values to compute Cp/Cpk process capability.
“Which batches are problematic?”
Go to Batch Analysis (/batch). The table shows all batches sorted by yield (worst first). Red badges indicate yield below 85%, yellow for 85–92%, green for 92%+. Click column headers to sort.
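The badge thresholds described above can be written as a small function. This is a sketch of the stated rule only; the dashboard's own logic lives in the TypeScript frontend and may treat the boundary values differently, and the function name is hypothetical.

```python
def yield_badge(yield_pct):
    """Map a batch yield percentage to the badge colour described above."""
    if yield_pct < 85:
        return "red"
    if yield_pct < 92:
        return "yellow"
    return "green"


for value in (84.9, 85.0, 91.9, 92.0):
    print(value, yield_badge(value))
```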
“What predicts whether a part will pass or fail?”
Go to Predictive Analytics (/predictive). The Feature Importance chart shows which measurements matter most (curvature = 55% of predictive power). The Anomaly Timeline shows parts flagged by the Isolation Forest model. The yield forecast card shows the predicted next-period yield.
Operations Guide
Access Control
Operational access is managed through the ARGO Analytics Admins Entra ID security group. Members get:
- VM Contributor on ps-argo-etl (start/stop, run commands)
- Reader on psargostorage storage account
- Reader on PS-RG-01 resource group
To add a new team member: add them to the ARGO Analytics Admins group in Entra ID. They also need collaborator access on the GHE repo.
| Current Members | UPN |
|---|---|
| Adam Devereaux | ADevereaux@progressivesurface.com |
| Dakota Cooper | DCooper@progressivesurface.com |
Managing the VM
All management is done via az vm run-command (no SSH configured):
# Run a command on the VM
az vm run-command invoke \
--resource-group PS-RG-01 \
--name ps-argo-etl \
--command-id RunShellScript \
--scripts "your-command-here"
# Check service status
az vm run-command invoke --resource-group PS-RG-01 --name ps-argo-etl \
--command-id RunShellScript \
--scripts "systemctl status argo-api --no-pager && systemctl status nginx --no-pager"
# View recent API logs
az vm run-command invoke --resource-group PS-RG-01 --name ps-argo-etl \
--command-id RunShellScript \
--scripts "journalctl -u argo-api --no-pager -n 30"
# View ETL cron logs
az vm run-command invoke --resource-group PS-RG-01 --name ps-argo-etl \
--command-id RunShellScript \
--scripts "journalctl -t argo-etl --no-pager -n 20"
# Check data status
az vm run-command invoke --resource-group PS-RG-01 --name ps-argo-etl \
--command-id RunShellScript \
--scripts "curl -s http://localhost:8000/api/health"
Important: Only one az vm run-command can execute at a time per VM. If you get a “Conflict” error, a previous command is still running; wait and retry.
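When scripting these calls, the wait-and-retry advice can be automated. The sketch below wraps az vm run-command invoke in a retry loop. It assumes a busy VM is reported with the word "Conflict" on stderr and a non-zero exit code, which should be verified against real output; the function name, retry count, and wait time are hypothetical.

```python
import subprocess
import time

AZ_BASE = [
    "az", "vm", "run-command", "invoke",
    "--resource-group", "PS-RG-01",
    "--name", "ps-argo-etl",
    "--command-id", "RunShellScript",
    "--scripts",
]


def run_on_vm(script, attempts=5, wait_seconds=30, runner=subprocess.run, sleep=time.sleep):
    """Run a shell script on the VM, retrying while another command is running."""
    for attempt in range(1, attempts + 1):
        result = runner(AZ_BASE + [script], capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout
        # Assumption: a busy VM shows up as "Conflict" in the error output.
        if "Conflict" not in result.stderr or attempt == attempts:
            raise RuntimeError(result.stderr.strip())
        sleep(wait_seconds)
```

A call such as run_on_vm("curl -s http://localhost:8000/api/health") would then return the command output once the VM is free. The runner and sleep parameters exist so the loop can be exercised without the Azure CLI.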
Deploying Code Updates
# Pull latest code and rebuild (via run-command)
az vm run-command invoke --resource-group PS-RG-01 --name ps-argo-etl \
--command-id RunShellScript \
--scripts '#!/bin/bash
export PATH="/opt/argo-analytics/backend/.venv/bin:$PATH"
cd /opt/argo-analytics && git pull origin main
cd frontend && npm run build
sudo systemctl restart argo-api
sleep 3 && curl -s http://localhost:8000/api/health'
Running a Manual ETL
az vm run-command invoke --resource-group PS-RG-01 --name ps-argo-etl \
--command-id RunShellScript \
--scripts '#!/bin/bash
export PATH="/opt/argo-analytics/backend/.venv/bin:$PATH"
cd /opt/argo-analytics/backend
python3 -m etl.backfill --days 2
systemctl stop argo-api
python3 -m etl.duckdb_cache
systemctl start argo-api'
Adding a New ARGO Machine
- Verify the machine’s data exists on the Azure file share (/mnt/argodatastore/ARGO{N}/)
- Create symlink on the VM: ln -sfn /mnt/argodatastore/argo-analytics-data/parquet/argo{n} /opt/argo-analytics/data/parquet/argo{n}
- Run backfill: ARGO5_DATA_ROOT=/mnt/argodatastore/ARGO{N} python3 -m etl.backfill --days 9999 --machine argo{n}
- Add argo{n} to VALID_MACHINES in backend/api/deps.py
- Rebuild DuckDB: python3 -m etl.duckdb_cache --machines argo4 argo5 argo6 argo{n}
- Update the ETL cron script to include the new machine
- Restart the API
Troubleshooting
| Symptom | Likely Cause | Fix |
|---|---|---|
| Dashboard shows no data | DuckDB cache empty or API not running | Check systemctl status argo-api and curl localhost:8000/api/health |
| “500 Internal Server Error” on page load | DuckDB lock conflict during ETL | Wait for ETL to finish, or restart API: systemctl restart argo-api |
| ETL cron not running | Cron job lost after VM restart | Verify: crontab -l, re-add if missing |
| File share not mounted | VM restarted, mount failed | Check: mount \| grep argodatastore, re-mount: mount /mnt/argodatastore |
| Yield shows 0% for a machine | No Parquet files for that machine | Check: find /mnt/argodatastore/argo-analytics-data/parquet/{machine} -name '*.parquet' \| wc -l |
| “machine not found” error | Machine not in VALID_MACHINES | Add to backend/api/deps.py and redeploy |
| Old data (>4 hours stale) | Cron job failed or stuck | Check ETL logs: journalctl -t argo-etl -n 20 |
| VM unresponsive | Out of memory (DuckDB rebuild with too many machines) | Restart VM: az vm restart --resource-group PS-RG-01 --name ps-argo-etl |
Key Findings
Analysis of 22M parts across 3 machines (May 2023–March 2026):
| Finding | Detail |
|---|---|
| Total parts | 22,047,056 across ARGO4/5/6 |
| Best machine | ARGO4 (90.17% yield) |
| Worst machine | ARGO6 (86.08% yield) — 4 points below ARGO4 |
| Top defect | Highpoints (dominant across all machines) |
| Predictive power | Curvature measurements = 55% of pass/fail prediction |
| Shift effect | Day shift ~1% better than Night shift |
| Worst hour | Hour 23 (86.0%), during the night shift |
| Best hour | Hour 13 (93.1%) — mid-day shift |
| Schema evolution | 375 cols (v1, 2023) → 382 cols (v4, 2026) |
Related Pages
- PSI Data Brain — Master data reference including labor_detail tables
- Terminology — PSI glossary (Redbook, AFTEC, BOM, etc.)
- Deploy to Azure — PSI deployment guide and checklist
- Azure Resources — VM, storage, DNS inventory
Last updated: March 2026