PSI Data Brain

The master reference for all PSI data — what exists, where it lives, how to access it, what questions it answers, and how everything connects. This is the canonical first-stop for any data question at PSI.

Machine-readable companion: data-brain.yaml — structured version for AI agents and MCP tools.


How to Use This Page

This is PSI’s data knowledge graph — a context-friendly map designed so that any person or AI agent can quickly understand:

  1. What data exists across all PSI systems
  2. Where each source lives (file path, API endpoint, database)
  3. How to access it (query method, authentication, tools)
  4. What questions it answers (use cases)
  5. How sources connect to each other (joins, foreign keys)

Maintenance Protocol: Every time analysis is performed using PSI data, this page gets updated with any new sources, joins, schema details, or gotchas discovered. Each section includes a “Last Verified” date to make staleness visible.


Data Universe Overview

PSI Data Pipeline Architecture

Full diagram: sources (left), 6 independent pipeline jobs (center), Azure SQL destination (right), consumers (far right). Gaps shown in red dashed boxes. Editable source: psi-data-pipeline/docs/data-flow-diagram.excalidraw

Azure SQL PSI_Analytics is the primary data store. All applications and reports should connect here. Ten independent pipeline jobs ingest data from AFTEC ERP exports, scraper-owned sources, and the knowledge graph — each on its own schedule, each responsible for its own tables.


1. PRIMARY DATA SOURCES

1.1 PSI Analytics Database (Azure SQL — Primary)

Last Verified: 2026-04-20

PropertyValue
WhatPSI’s unified data store — 24.4M rows across 50 tables. The preferred way to access all PSI data.
Whereprocserv-proddata.database.windows.netPSI_Analytics
AuthAzure AD (user or service principal). Private endpoint at 10.160.140.4 for internal traffic via VPN.
Repopsi-data-pipeline (10 workflows in .github/workflows/)
Refresh10 independent pipeline jobs on GHE Actions (PS-PROXY self-hosted runner) — see schedule below
ConsumersRedbook Dashboard (Streamlit), Power BI reports, PSI Explorer, Document Manager (via PSILocalService/WCF), direct SQL queries

This is the single source of truth for PSI analytics and operational data. It consolidates AFTEC ERP exports, AI-classified quality data, workforce features, BOM hierarchies, lessons learned, and more into one queryable database. All new applications and reports should connect here — not to CSV files or other intermediate stores.

How to connect:

  • Power BI: Sql.Database("procserv-proddata.database.windows.net", "PSI_Analytics") with Azure AD auth
  • Python: pyodbc with MSAL token auth (see Scripts/data/loader.py in redbook-dashboard)
  • Direct query: Azure Data Studio, SSMS, or az sql query with your PSI Azure AD account

Pipeline job schedule:

JobWorkflowScheduleTables OwnedDuration
AFTEC Raw to Azure SQLsync-aftec-raw.ymlDaily 1:00 AM ET31 raw AFTEC tables (partmaster, tslabor2, etc.)~2 hours
Build Analytics DBnightly-data-build.ymlDaily 1:30 AM ET17 analytics tables (projects, labor_detail, bom_parts, etc.)~30 min
Sync BOM Datasync-bom-data.ymlWeekdays 3:00 AM ETbom_parts~2 hours
Sync KG Datasync-kg-data.ymlDaily 2:00 AM ETkg_* tables~1 min
PRIMS PC Fleetsync-prims.ymlWeekdays midnight ETprims_installations~2 min
Dept Schedulessync-dept-schedules.ymlWeekdays 9:00 AM ETdept_schedules, floor_space~2 min
Component Lifecyclesync-rockwell.ymlMondays 5:00 AM ETcomponent_lifecycle, project_components~5 min
Product Taxonomysync-product-taxonomy.ymlMondays 2:00 AM ETproduct_classes, product_catalog~1 min
Dell Warranty Enrichmentenrich-dell-warranty.ymlSaturdays 7:00 AM ETdell_warranty (Azure SQL direct)~5 min
Setup Env Varssetup-env-vars.ymlManual trigger only(System environment)<1 min

Each job is independent — failures don’t cascade. The Component Lifecycle job now extracts both Rockwell/Allen-Bradley and GE/Emerson catalog numbers from BOM descriptions (488+ Rockwell parts via public API, 38+ GE parts via static reference — no GE lifecycle API exists). The knowledge graph tables (kg_*) are imported by the analytics build from psi_ingest.db. Full architecture: psi-data-pipeline/docs/pipeline-architecture.md.

Two categories of tables:

1. Analytics tables (17 tables, 6.3M rows) — enriched/derived data not available elsewhere:

TableRowsDescription
bom_parts4,362,913BOM hierarchy (19 columns)
labor_detail1,379,326Timesheets filtered to project jobs
work_orders435,958Work orders filtered to project jobs
redbooks40,040AI-classified quality issues (73 columns, with severity/category/root cause)
ecn_detail26,983ECNs with Code field (Engineering/Customer/Manufacturing/etc.)
ncn_detail17,789Non-conformance notices
department_summary15,978Per-(job, dept) aggregates with P2/P98 dates
purchase_orders13,467PO data matched to projects
change_orders8,395K: drive change order extractions
machine_dna_features3,084Quality prediction feature matrix (82 features)
projects2,569Core project identity (job, customer, dates, lead times, budget)
workforce_features1,875Per-project workforce metrics (tenure, HHI, corrective hrs)
lessons_learned1,166OneNote lessons learned extraction
handoff_extractions551Handoff .docx parsing (comp machine refs, risks)
plc_complexity467PLC program complexity scores
rd_classification105R&D project flags
product_classes437AFTEC product class taxonomy (ClassID, ClassName)
product_catalog20,654Products mapped to classes (ProductNumber, ClassID, Phys)
dept_schedules~3,000Department schedule entries from PMO Excel files
floor_space~200Floor space capacity bays
comprehensive_view2,569SQL view joining all analytics tables

These tables are also published through two upstream SQLite pipeline stages on PS-PROXY:

  • \\ad.ptihome.com\DFS\Schedule\SS123\LEADTIME\psi_ingest.dbdurable ingest cache for scraper-owned tables that are expensive or slow to rebuild (prims_installations, component_lifecycle, project_components, product_classes, product_catalog, dept_schedules, floor_space, kg_*, cpq_*)
  • \\ad.ptihome.com\DFS\Schedule\SS123\LEADTIME\psi_analytics.dbrebuildable analytics artifact used as the local publish source before data is synced to Azure SQL

Applications and reports should still treat Azure SQL PSI_Analytics as the primary data store. The SQLite files are pipeline stages, not the preferred app/report query target.

2. Raw AFTEC tables (32 tables, 18.1M rows) — full ERP exports, same schema as the CSV files:

TableRowsDomain
inventoryhistory7,903,102Inventory transactions
tslabor25,255,126All timesheets (unfiltered)
wiprouteline1,559,452Manufacturing routing
openwo741,342All work orders
partmaster592,914Parts master
pohist488,624Purchase history
wiplabor358,576WIP labor tracking
invoiceline224,870Invoice lines
jobparts204,548Job-to-parts
purchaseorders108,466Purchase orders
+ 22 more tablesSee Fabric ProgressiveDataSet for full list

Column names match the CSV headers exactly — existing Power BI DAX measures work unchanged when re-pointed to Azure SQL.

Key advantages over CSV/Fabric:

  • Multi-row data is queryable (e.g., SELECT code, COUNT(*) FROM ecn_detail WHERE project = '2422' GROUP BY code)
  • Concurrent access from multiple applications without file locking
  • Proper indexing for fast lookups
  • Power BI DirectQuery support (no Import mode refresh delay)
  • Private endpoint access via VPN — no data leaves PSI’s network

Backwards compatibility: The nightly build also exports comprehensive_dataset.csv to the LEADTIME share for spreadsheet consumers.


1.2 AFTEC ERP Database (UniData — Upstream Source)

Last Verified: 2026-02-26

PropertyValue
WhatCore ERP system — orders, jobs, parts, labor, inventory, purchasing, costing
WhereMRP-PROD server (on-premises UniData)
AccessVia PSI.UniData.API (REST), PSILocalService (WCF), or U2 Toolkit for .NET (direct)
Key TablesPROJECT.1287, MACHTYPE.1287, OPENWO, INVHIST, REDBOOK.1287, PRODUCT.1287, PRODUCT.CLASS.1287, PRODUCT.LDS.1287, INDSECTOR.1287, INDCLASS.1287, PART.PHOTOS.1287
RefreshReal-time (live queries)
Answers”What is the current status of job X?”, “What parts are on this BOM?”

AFTEC is the origin of most PSI data. Its 32 CSV exports feed into the nightly pipeline that builds Azure SQL PSI_Analytics. For analytics and reporting, query Azure SQL instead of AFTEC directly — it has the same data plus enrichments (AI classification, workforce features, etc.).

370 VB_ subroutines* provide structured access to ERP operations.

Key schema detail (PO promise dates): AFTEC stores up to 4 promise date/quantity pairs per PO line via VB_PO.LINE.REV1 subroutine, but the PSI.UniData.API currently only exposes 1 promise date. The CSV export (purchaseorders.csv) also only contains the current promise date, not the original. This means promise date drift cannot be tracked from existing exports. See Known Gaps.


1.3 AFTEC CSV Exports (\\ad.ptihome.com\DFS\DATA\unixshare\dgwreportfiles\)

Last Verified: 2026-03-12

PropertyValue
What32 CSV files exported daily from AFTEC — ingested nightly into Azure SQL
Where\\ad.ptihome.com\DFS\DATA\unixshare\dgwreportfiles\ (network share, DFS path)
FormatPipe-delimited (`
AccessDirect file read (VPN required), or query Azure SQL (preferred)
RefreshUpdated by AFTEC export jobs at 5AM; loaded into Azure SQL at 1AM ET
Total Records~9.6 million rows across 32 tables

For analytics and reporting, use Azure SQL PSI_Analytics instead of reading these CSV files directly. The raw AFTEC tables in Azure SQL have the exact same columns and data, with the benefits of SQL indexing and concurrent access. The CSVs remain as the upstream source that the pipeline reads from.

Critical Files for Lead Time Analysis

FileRowsKey FieldsAnswers
ganttJOB.csv903Job, Dept, Start, End, % Complete”What’s the schedule for each dept on each project?” ⚠️ Active projects only (~83 jobs). Historical projects are NOT retained — use LDS Gantt extraction instead.
ganttWO.csv7,776Job, WO, Part, Op, Start, End, % Comp”What’s the WO-level schedule?“
openwo.csv519,286WO, Job, Make/Release/Start/Due/Close Date, Status”How long do work orders take? Which are late?“
tslabor2.csv2,773,043Date, Dept, Emp, Job, WO, Op, Labor hrs”Where are labor hours being spent?“
wiprouteline.csv927,344Job, WO, Op, PlanStart/Comp, ActStart/Comp, % Comp”Planned vs actual at operation level”
wiplabor.csv253,395Job, WO, Part, Routed vs Actual hrs by dept”Are we over/under on hours by dept?“
wc-capacity.csv31Work center capacity hours”What’s our weekly capacity by work center?“
capacity.csv23,834Detailed capacity planning data”What’s the load vs capacity outlook?“
redbook.csv50,057RFC entries with dept flags, dates, days open”What quality issues are impacting schedules?“
ecn.csv28,266ECN entries with change type code, dept, dates, drawing”How many engineering changes per project? What type?“
purchaseorders.csv108,768PO, Promise Date, Lead Time, Vendor”Are vendors delivering on time?“
pohist.csv268,238PO receipt history”Actual vendor delivery performance”
jobparts.csv203,277Job-to-part BOM linkage”What parts are on each job?“
salesorder.csv351Order date, amount, customer”When was the order placed?“
budgetjob.csv1,058Job budget allocations”What’s the budget vs actual?“
partmaster.csv420,950Mfg LT, Pur LT, Cum LT per part”What are standard lead times?“

tslabor2.csv — Detailed Schema

Last Verified: 2026-02-26 — Schema confirmed via direct file read and Fabric table inspection.

ColumnDescriptionExample
IDRecord identifier”12345”
DateLabor date (MM/DD/YY)“02/13/23”
WE DateWeek-ending date”02/18/23”
DeptDepartment code”110E”
EmpEmployee number”651”
Employee NameFull name (LAST, FIRST MI)“SMITH, ANDREW L”
LineLine number”1”
JobJob/project number”2247”
WOWork order number”2247”
PartPart number”PROJECT WORK ORDERS”
Part DescriptionPart description text”PROJECT WORK ORDERS”
OpOperation code”80”
Op DescOperation description”CONTROLS DESIGN”
QtyQuantity”1”
SetupSetup hours”0.0”
LaborLabor hours”8.0”
P/CPart/component flag”C”

Department codes — HR/employment vs timesheet entry:

There are two numbering systems. HR/employment uses simplified codes (110, 111, 120). Timesheets use subdivided codes (110M, 110E, 122). Timesheet codes have been stable since ~2000.

HR DeptOfficial NameTS CodeTimesheet Notes
100Maintenance100Facility maintenance, janitorial
101Administration101
102Machining102VF4, ST30, prototyping. 7,213 project-dept pairs (most common)
104Welding104Bench weld, saw, bend, punch. 6,784 project-dept pairs. ⚠️ NOT Controls Engineering — frequently confused. CE is 110E.
106Mechanical Assembly106Assembly + machine shipping. 4,587 project-dept pairs
108Electrical Assembly108Panel building, wiring. 1,955 project-dept pairs
110Mechanical Engineering110MMech design. 600K+ total hrs, 112 unique employees. 1,790 project-dept pairs. ⚠️ Legacy code 110 has 0 hrs since 2010 — all migrated to 110M. Note: Dan Haveman (Emp 394) = PMO, logs scheduling here early — use P2 date for real design start.
111Controls & Electrical Eng110ECE software/hardware design. 531K+ total hrs, 73 unique employees. 1,842 project-dept pairs. ⚠️ Code 111 never used in timesheets (<2 hrs ever) — always 110E.
115Proposal Engineering115359 project-dept pairs
120MVI120Installation, travel, field service, meetings. 2,220 project-dept pairs.
(sub)Test/Startup/Runoff122Subdivision of MVI. Always separate in timesheets (96K hrs 2020+). Test/startup, runoff, laser alignment, troubleshooting, customer training. Most time on op 4800-TEST/STARTUP. 1,651 project-dept pairs.
125Process Services125Separate job shop division. 622+ employees, ~977K hrs (2020+). Exclude from machine-build workforce analysis.
130Process Development130Testing, sample processing, engineering studies. 526 project-dept pairs
135Customer Service / Field Svc135Post-handoff support, calibration. 62 project-dept pairs
140Sales140Low volume in timesheets
150Supply Chain150Receiving, stock
160Materials Flow160Warehouse, shipping
170Operations170
180Admin180

Controls Engineering operation codes (Op Desc):

Op CodeOp DescCategory
2055CONTROL ADMINISTRATIAdmin/coordination
2120CONTROLS DESIGNCore hardware design
2130CONTROL DOCUMENTATIODrawing/documentation
2140SOFTWARE DESIGNPLC/HMI programming
2150SOFTWARE TESTTesting/validation
2151CONTROL TRAVELTravel for CE work
2162FLOOR ENGINEERCE on shop floor (rework indicator)
2020MECH DESIGN PRODUCTProduct-level mechanical
2100MECH DESIGN PROJECTProject-level mechanical
2110MECH DOCUM PROJECTMechanical documentation

Analysis insight (Feb 2026): FLOOR ENGINEER hours (op 2162) are a strong rework indicator. Good projects average 0.8 hrs, bad projects average 22 hrs (28x difference). SOFTWARE DESIGN (op 2140) is the single largest driver of CE overruns — 26x more hours on bad projects.

ecn.csv — Detailed Schema

Last Verified: 2026-03-06 — Schema confirmed via direct file read.

ColumnDescriptionExample
ECN NoECN identifier”7890”
RevRevision number”A”
ProjectJob number”2422”
DrawingPart/drawing number affected”92345-1”
CodeChange type classification”Engineering”
DescriptionFree text description”Revised bracket dim”
Chg DeptDepartment initiating change”110M”
DateECN date”05/15/23”
Department flag columnsA102, A104, A106, A108, A110E, A110M, A120”Y” / ""
RFCNumberLink to RFC/Redbook if applicable”12345”

Code field distribution (28,266 records):

CodeCount%Meaning
Engineering22,41479.3%Engineer-initiated design changes
Customer2,3208.2%Customer-requested scope changes
Product Improvement9823.5%Proactive design improvements
Manufacturing9163.2%Shop floor process changes
Purchasing5862.1%Supply chain substitutions
Process Change1780.6%Manufacturing process changes
(blank/other)~8703.1%Unclassified

Key finding (2026-03-06): Customer-requested ECNs (8.2%) are uncorrelated with corrective labor hours (P/C=‘C’ in tslabor2) at project level (r=0.089). This means the corrective hours signal in workforce features genuinely captures rework, not customer scope changes.

ECN-to-RFC linkage: 1,928 ECNs link to 1,511 RFCs via the RFCNumber field, enabling join from ECN back to Redbook quality issues.

Dept code mapping (ganttjob.csv vs tslabor2): ganttjob uses compound codes like 110E and 110M which match tslabor2 exactly. The dept codes are the SAME between ganttjob and tslabor2. Verified Feb 2026.

Dept milestones (computed inline): The nightly pipeline computes FirstDate/LastDate/TotalHrs/UniqueEmps per (Job, Dept) directly from the tslabor2 stream during stream_tslabor2(). Output: department_summary.csv (~16K rows). The old fabric_dept_milestones.json has been eliminated — no Fabric dependency needed.

Comprehensive Analytics Dataset (CSV — Flat Export)

Last Verified: 2026-03-09

PropertyValue
WhatMaster per-project analytics dataset — 2,569 projects × 249 columns
Where\\ad.ptihome.com\DFS\Schedule\SS123\LEADTIME\comprehensive_dataset.csv
Repopsi-data-pipeline/pipeline/build_comprehensive_dataset.py (exports CSV from same pipeline that builds the DB)
FormatPipe-delimited CSV
RefreshNightly at 1 AM ET via GHE Actions on PS-PROXY (nightly-data-build.yml). Also manual: python pipeline/build_comprehensive_dataset.py
Sources13 data sources: otd_dataset, tslabor2 (dept milestones computed inline), ganttjob (LDS + PMO), ganttjob_lds_revisions, jobparts, purchaseorders, pohist, ecn, openwo, budgetjob, wiplabor, redbook_coq.db
Answers”Which factors correlate with DaysLate?”, “How do metrics differ by era?”, “Which hypothesis explains the OTD collapse?”

Key metric columns: ActiveProjectsAtShip, TS_Avg_Concurrent, TS_Avg_Weekly_Switches, DM_110E_Span, DM_CE_MVI_Overlap, RB_RFC_Count, ECN_Count, JP_Part_Count, PO_Days_From_Order, TS_Op2162_Hrs, WO_Avg_Cycle_Days, GJ_Revision_Count, GJ_Total_Resched_Days

Operation-level columns: Per-department category rollups (TS_{dept}_Productive_Hrs, TS_{dept}_Overhead_Hrs, TS_{dept}_Rework_Hrs × 8 depts) + 20 individual operation columns (TS_Op{code}_Hrs for key ops across all departments). Rework columns are key analysis signals — troubleshooting hours correlate strongly with delays.

P2/P98 substantial engagement columns: DM_{dept}_P2 (substantial start), DM_{dept}_P98 (substantial end), DM_{dept}_P2P98_Span (substantial duration) × 8 departments = 24 columns. These compute the dates at which 2% and 98% of cumulative labor hours were booked per department, trimming early check-in and late punch-list tails. Key finding: Machine Shop full span overstates engagement by 29 days median; Weld by 51 days. Using P2/P98 strengthens all causal correlations (e.g., Controls Eng finish → Elec Assembly start: r=0.825 with P2/P98 vs r=0.743 with First/Last).

Companion outputs: causation_analysis.txt, department_scorecard.csv, vendor_scorecard.csv

Full-fidelity detail outputs (all source columns preserved, in LEADTIME\detail\):

  • labor_detail.csv — 1.38M rows, all 17 tslabor2 columns for project jobs
  • department_summary.csv — 15,966 rows, per-(job, dept) aggregates with P2/P98 dates
  • jobparts_project.csv — 202K rows, all 10 columns
  • purchaseorders_project.csv — 13.6K rows, all 19 columns
  • wiplabor_project.csv — 204K rows, all 16 columns
  • budgetjob_project.csv — 1,059 rows, all 27 columns
  • operation_summary.csv — every (Job, Dept, Op) combination with hours

purchaseorders.csv — Schema Notes

Last Verified: 2026-02-26

Promise Date field is the CURRENT (possibly revised) promise date, not the original. AFTEC supports 4 promise date/qty revisions per line (via VB_PO.LINE.REV1), but only the latest is exported.

Schedule Date is 38.6% empty across all records. When present, 68% of non-empty Schedule dates are pushouts averaging +12 days from promise.

Confirm Date is the vendor order acknowledgment date, NOT the delivery confirmation.


1.4 Microsoft Fabric / Power BI — ProgressiveDataSet (Migrating to Azure SQL)

Last Verified: 2026-02-26

PropertyValue
What”Golden Set” semantic model — same 33 tables from CSV exports
WhereFabric workspace (John Tymes personal)
Workspace ID4879a6ff-c0c4-4f1a-87b5-638a21e04f5f
Dataset ID51d7822c-8d17-4fb5-b28f-50a4c6fdd19e
AccessDAX queries via Azure CLI (az rest) — must be logged in
RefreshDaily at 5:00 AM ET
AnswersSame as CSV files but queryable with DAX aggregations

Query Pattern:

az rest --method POST \
  --url "https://api.powerbi.com/v1.0/myorg/groups/{workspace}/datasets/{dataset}/executeQueries" \
  --resource "https://analysis.windows.net/powerbi/api" \
  --body '{"queries": [{"query": "EVALUATE ..."}], "serializerSettings": {"includeNulls": true}}'

CRITICAL LIMITATION — No Versioning or Change History:

Fabric’s ProgressiveDataSet does NOT retain versioning, changelogs, or historical snapshots. Each daily refresh at 5AM is a full overwrite of the previous data:

  • No snapshots of what data looked like yesterday vs today
  • No diff tracking on changed records (e.g., PO promise date changes)
  • No deleted record retention — if a row is removed from the CSV, it’s gone
  • No audit trail for field-level changes over time

Impact: This means we cannot answer questions like “What was the original promise date?” or “When did this WO status change?” from Fabric or CSV data alone. Only the current state is available.

Recommended fix: Implement daily snapshot archiving — copy CSVs to a dated folder before Fabric refreshes. This is a 10-minute PowerShell scheduled task on the file server. See Known Gaps.


1.5 Schedule Excel Files (\\ad.ptihome.com\DFS\Schedule\)

Last Verified: 2026-02-26

PropertyValue
WhatPMO scheduling workbooks — department schedules, project Gantt charts
Where\\ad.ptihome.com\DFS\Schedule\SS123\SCHEDULE\ (ENG/ and PROD/)
AccessVia PSI.UniData.API /api/schedule/dev/* endpoints
RefreshManual (updated by PMO team)

Department Schedule Files

DepartmentFilePath
Mech EngineeringMecheng.xlsm...\ENG\Mecheng.xlsm
Controls EngineeringCONTROLS.xlsm...\ENG\CONTROLS.xlsm
Machine ShopAMACH.xlsm...\PROD\AMACH.xlsm
WeldAWELD.xlsm...\PROD\AWELD.xlsm
Mech AssemblyAMECH.xlsm...\PROD\AMECH.xlsm
Elec AssemblyAELECT.xlsm...\PROD\AELECT.xlsm
MVI/StartupAMVI.xlsm...\PROD\AMVI.xlsm
Floor SpaceFloor Space Capacity Chart.xlsm...\PROD\Floor Space...

Each file contains: Priority, Project#, Customer, Hours Estimated, Deviation, Hours-to-Go, Weekly hour allocations

Per-Project LDS Gantt Files

PropertyValue
Path Pattern\\ad.ptihome.com\DFS\LDS\PROJECT\{JobNo}\lds\{JobNo}lds.xlsm (also K:\PROJECT\{JobNo}\lds\{JobNo}lds.xlsm via mapped K: drive)
Coverage~3,092 project directories; LDS files exist from approximately job 1900+ (projects from ~late 1990s onward)
Sheets ParsedProject Info, Project Release (milestones), Gantt (weekly bars)
Milestone FieldsDepartment, Description, PlannedDate, ActualDate, IsComplete
Gantt FieldsDepartment, WeekDate, Hours, Color (from cell fill)
Extraction Toolpsi-data-pipeline/pipeline/extract_lds_gantt.py — batch-extracts planned dept start/end dates from Gantt sheets into CSV format compatible with load_ganttjob(). Runs nightly via GHE Actions. Uses openpyxl to parse colored cells and date headers.

Gantt sheet format (per LDS file): Row layout varies by era but typically: issue date row (with month headers), day-of-month row, week number row. Department sections (MECHANICAL ENGINEERING, CONTROLS ENGINEERING, PURCHASING, MACHINE, WELD, etc.) have colored cells indicating planned weekly work. The extraction script auto-detects the header row layout and handles 2-digit/4-digit years, placeholder dates, and year rollovers.

Why ganttJOB.csv is insufficient: ganttJOB.csv only exports from the PMO scheduling system for currently active projects (~83 jobs). When a project ships, it rotates out. For historical Gantt data (planned dept start/end), the per-project LDS files are the only source.

API Access:

  • GET /api/schedule/dev/summary → PMO priorities, HTG, deviation, bay
  • GET /api/schedule/dev/departments → All 7 department weekly grids
  • GET /api/schedule/dev/lds/{jobNumber} → Project milestones + Gantt

1.6 Redbook Quality Database

Last Verified: 2026-02-26

PropertyValue
WhatQuality issue tracking — 50,000+ records since 1999
SourcesREDBOOK.1287 (UniData), redbook.csv (CSV export), redbook_coq.db (SQLite)
AccessPSI.UniData.API /api/redbook/*, CSV files, Streamlit dashboard
Answers”How many quality issues per project?”, “Resolution times?”, “Cost of quality?”

Key Metrics:

  • Days Open, Days to Close, Days to First Action
  • Department assignment/completion tracking (16 departments)
  • Problem type, root cause, severity (AI-classified)
  • Ship timing category (Early/Mid/Late/At-Ship/Post-Ship)
  • Cost of quality (labor + material with stage multipliers)

1.6b SolidWorks PDM (Engineering Drawings & BOMs)

Last Verified: 2026-03-18

PropertyValue
WhatProduct Data Management — engineering drawings, 3D models, Bills of Materials
WhereSolidWorks PDM vault (on-premises)
AccessSolidWorks client, PDM search, BOM exports to CSV
Key DataDrawings (PDF/DWG), BOMs (RawData/BOM_Exports/*.csv), part revisions
Answers”What’s the BOM for this assembly?”, “Where is this part used?”, “What revision is current?”

BOM hierarchy follows the GTCode structure: FACILITY → ASSEMBLY → MODULE → PHYS. (Physical Part). PDM provides “where used” queries to find all assemblies containing a specific part — critical for design reuse analysis. BOM exports feed the nightly pipeline via jobparts.csv and per-project BOM CSV files.


1.7 Project Explorer Dataset

Last Verified: 2026-02-26

PropertyValue
What25+ years of project history with timeline visualization
WhereC:\git\project-explorer\public\RawData\
Key FileProject1287List.xml (4.3 MB, 3,084 projects)
AccessParsed client-side in React app, or read XML directly
Answers”Lead time trends over 25 years”, “On-time delivery rate”

Lead Time Fields:

  • OrderDateShipDate = Actual Lead Time (days)
  • OrderDatePlannedShipDate = Planned Lead Time (days)
  • SlipDays = Actual - Planned (positive = late)

Supporting Data in same directory:

  • quote_revenue.json (16 MB) — Quote and revenue data
  • redbook_data.json (4.8 MB) — RFC data for all projects
  • control_components.json (7.3 MB) — Controls component data

1.8 PLC Complexity Scores

Last Verified: 2026-03-03

PropertyValue
WhatPer-project PLC program complexity scores for 487 Allen-Bradley ACD projects
WhereC:\git\PLC\fleet_complexity_scores.csv (raw scores), fleet_complexity_trend.csv (yearly summary), fleet_complexity_joined.csv (joined to lead time data)
RepoC:\git\ra-logix-designer-vcs-custom-tools\scripts\batch-complexity-scan.py + build-complexity-trend.js
FormatCSV
RefreshManual batch scan: C:/Python311/python.exe batch-complexity-scan.py --resume (~4-8 hrs full fleet). Supports incremental resume.
Answers”How complex is the PLC program for job X?”, “Has machine complexity trended up over time?”, “Which machine types are most complex?”

CSV columns: Job, Complexity, TotalRungs, TagCount, UdtDepth, RoutineCount, ProgramCount, AvgRungs, JsrDepth, TimerDensity, TonCount, AcdFile, LastModified, Error

Scoring algorithm — 7 weighted sub-metrics (0–100 scale):

Sub-metricWeightRaw ScaleMeasures
rungCount20%0–3500 rungsProgram size
avgRungs20%0–100 rungs/routineRoutine density
tagCount15%0–200 tagsTag usage
routineCount15%0–80 routinesProgram structure
udtDepth10%0–4 levelsData type sophistication
jsrDepth10%0–3 levelsCall depth / nesting
timerDensity10%0–15 timers per 100 rungsSequencing complexity

Extraction paths:

  • v29+ ACDs (109 projects): Full extraction via acd-tools ExportL5x — all 7 sub-metrics including tags and UDTs parsed from TagInfo.XML
  • v12–v20 ACDs (378 projects): Fallback path reads partial sqlite DB directly — 5 sub-metrics (tagCount=0, udtDepth=0 due to no TagInfo.XML)

Key finding (2026-03-03): Median PLC complexity was stable at 32–35 from 2010–2022, then accelerated sharply. Median: 37 (2022) → 41 (2024) → 48 (2025). P75 jumped from ~37 (2010–2021) to 55 (2025). Maximum complexity nearly doubled (47→73). The top quarter of recent projects are 50% more complex than anything built pre-2022. See §8 for full analysis record.

Join key: Jobcomprehensive_dataset.csv Job column. fleet_complexity_joined.csv has the pre-built result (451 of 487 matched = 92%).

1.9 Machine DNA Quality Prediction System

Last Verified: 2026-03-06

PropertyValue
WhatPer-project quality risk prediction — 6 dimensions, 82 features, 3,088 projects
WhereProcessed/redbook_coq.db (SQLite tables) + Processed/*.json
RepoC:\git\redbook-dashboard
RefreshManual: python Scripts/build_machine_dna.py then python Scripts/train_machine_dna_model.py
Answers”Which new projects are at risk?”, “What predicts quality problems?”, “Are R&D projects skewing our data?”

SQLite tables:

TableRowsDescription
machine_dna_features3,088Per-project feature matrix (config, novelty, complexity, comp history, workforce)
project_rd_classification~100R&D project flags with confidence, signals, override status

JSON outputs:

FileDescription
Processed/machine_dna_model_results.jsonAUC per tier, SHAP importance, hypothesis verdicts, per-process-type results
Processed/machine_type_profiles.jsonMachine type quality profiles

Feature dimensions:

DimensionPrefixFeaturesAvailable At
1. Configuration Identityconfig_Machine type, process type, customer history, value tierOrder
2. Physical Noveltynovelty_First-time parts %, new module %, reuse ratioBOM Release
3. BOM Complexitycomplexity_Part count, depth, unique partsBOM Release
4. Comp Machine Historycomp_Reference machine quality, cost propagationOrder
5. Controls/PLCplc_PLC complexity score (from §1.7)BOM Release
6. Workforcewf_Team tenure, HHI concentration, lead experience, corrective hoursStaffing/Build

Model performance (time-aware expanding-window CV, honest):

  • Tier 1 (At Order): AUC 0.507
  • Tier 1.5 (At Staffing): AUC 0.631
  • Tier 2 (BOM Release): AUC 0.639
  • Tier 3 (Early Build): AUC 0.630
  • Field Escape: AUC 0.614

Top SHAP features: wf_corrective_hrs_pct (0.709), wf_ce_top_person_pct (0.330), novelty_phys_first_time_pct (0.245), comp_avg_quality_cost (0.198)

R&D classification signals: XML keyword scan (Description/SPNName), statistical outliers (>3x process-type median COQ%), machine type heuristics (Tooling/Fixture/Test), extreme novelty (95%+ first-time parts on first build).


1.10 Additional Document Sources

Last Verified: 2026-03-06

Handoff Notes (.docx)

PropertyValue
WhatProject handoff documents — comp/reference machines, risks, specs, team assignments
WhereI:\QUOTES\{SPN}\PO-Contract\{SPN}ho*.docx (e.g., S9507ho - Project 2422.docx)
LinkageSPNNo field in Project1287List.xml → SPN folder on I: drive
Coverage~2,274 projects have SPN links (88% of shipped projects)
FormatWord .docx — parseable with python-docx
Answers”What was the reference machine?”, “What risks were identified at handoff?”, “Who was assigned?”

Key extractable fields: Machine-Same-As-Except (comp references), project risks, customer specs, team assignments, special requirements. The comp machine references are particularly valuable — they supplement the 80 projects currently extracted from SLDS Index headers.

OneNote Lessons Learned

PropertyValue
WhatPost-project lessons learned notes organized by project
WhereS:\OneNote\Lessons Learned\Projects.one (90 MB)
FormatBinary .one format — NOT directly parseable without COM API or Microsoft Graph
AccessRequires OneNote COM automation (Windows) or Graph API (cloud). HTML export possible per-section.
Answers”What went wrong on project X?”, “What lessons were captured?”

Limitation: Binary format prevents batch extraction. Options: (1) COM API automation on Windows, (2) Graph API if notebook is in SharePoint/OneDrive, (3) manual HTML/MHT export per section.

K: Drive Change Order Files (AFTEC Binary)

PropertyValue
WhatEngineering change order documents with structured BOM change data
WhereK:\PROJECT\{job}\chg\chg.NNN (e.g., K:\PROJECT\2422\chg\chg.001)
FormatAFTEC binary format (NOT WordPerfect) — contains structured fields extractable via raw byte string parsing
CoverageMost projects from ~1999 onward
Answers”What BOM changes were made?”, “Which parts were added/removed/modified?”, “Change order timeline?”

K: drive directory structure per project (K:\PROJECT\{job}\):

SubfolderContents
chg\Change order files (AFTEC binary, chg.NNN)
lds\LDS Gantt file ({job}lds.xlsm)
PO\Purchase order documents
bom\BOM export files
dwg\Engineering drawings

Note: K: is a mapped drive to \\ad.ptihome.com\DFS\LDS\PROJECT\.


1.11 Document Manager File Sources (DFS Shares)

Last Verified: 2026-03-16

The Document Manager desktop application (PSI.DocumentManager) is a WPF document viewer that provides access to project files, product documentation, industry data, and multimedia. It reads from several DFS network shares that are not covered by the analytics pipeline or REST API.

Sales Quotes

PropertyValue
WhatSales project folders — quotes, proposals, and related documents
Where\\ad.ptihome.com\dfs\SalesQuotes\QUOTES\{SPN}
AccessDirect file read (VPN required)
ConsumerDocument Manager (search with “S” prefix, e.g., “S9507”)
Answers”What documents exist for sales project X?”

Product Documentation (LDS)

PropertyValue
WhatProduct-level technical documents — organized by document type and product number
Where\\ad.ptihome.com\dfs\LDS\Product\{code}\{productNumber}.{extension}
FormatVarious document types per product
ConsumerDocument Manager (Products tab)
Answers”What documentation exists for product X?”

Document type codes:

Code / SubfolderDocument Type
ATT\{classId}.ATTProduct Attributes
STD\{classId}.STDProduct Standards
GGeneral Description
DODescription of Operation
MCManual Controls
FLTFaults and Messages
SPSpare Parts
MSMaintenance Schedule
MPMaintenance Procedures
VMVendor Manuals
TPTest Plan
RACRunoff Acceptance Criteria
SHPShipping Instructions
SWSoftware
OOutline

Industry Sectors (LDS)

PropertyValue
WhatIndustry sector strategy documents
Where\\ad.ptihome.com\dfs\LDS\Industry Sectors\{IDNumber}\{IDNumber}_strategy.doc
AFTEC TablesINDSECTOR.1287 (sectors), INDCLASS.1287 (classes)
ConsumerDocument Manager (Industries tab)
Answers”What is PSI’s strategy for industry sector X?”, “Which industry classes exist?”

AFTEC schema:

TableKey FieldsVB Subroutine
INDSECTOR.1287SECTOR (ID), NAME, IND.CLASS, SALES.PERSONVia PSILocalService
INDCLASS.1287IND.CLS (ID), DESC, IND (Industry)Via PSILocalService

MultiMedia — Part Photos & Videos

PropertyValue
WhatPart-level photos and videos, organized by industry application and process type
Where\\ad.ptihome.com\dfs\Data\MultiMedia\ (photos and videos); \\ad.ptihome.com\dfs\Data\MultiMedia\Photo\Lo Res Project Files\{projectId} (lo-res project photos)
AFTEC TablePART.PHOTOS.1287 — metadata index with file paths, titles, and project associations
ConsumerDocument Manager (Part Photos tab, Part Videos tab, and project lo-res photos)
Answers”What part photos/videos exist for this industry application and process type?”, “What does this project’s machine look like?”

AFTEC schema — PART.PHOTOS.1287:

FieldDescription
IDRecord ID
FILE PATHFile location on DFS
PART.TYPE.DESCPart type description
DESCTitle / description
PROJECTAssociated project number
IND.CLASSIndustry classification code
PROCESS.TYPEProcess type code (0001=ShotPeen, 0002=Blast, 0003=Plasma, 0004=WaterJet, 0005=Other)
MEDIA.TYPE1 = Photo, 2 = Video

VB subroutine: VB_PARTPHOTOLIST.REV1 — query: SSELECT PART.PHOTOS.1287 BY PART.TYPE.DESC BY DESC BY PROJECT WITH IND.CLASS = '...' WITH PROCESS.TYPE = '...' WITH MEDIA.TYPE = '...'

Application type hierarchy (both photos and videos):

CodeApplication Type
AEROAerospace
AUTOAutomotive
LGHTLight Industry
HEQPHeavy Equipment
GENLGeneral
MILTMilitary
ENGYEnergy
MDCLMedical
OTHEROther

Product Categories & Products (AFTEC)

PropertyValue
WhatProduct classification hierarchy — categories contain products with BOM and documentation links
AFTEC TablesPRODUCT.CLASS.1287 (categories), PRODUCT.LDS.1287 (product listing), PRODUCT.1287 (product details)
ConsumerDocument Manager (Products tab)
Answers”What product categories exist?”, “What products are in category X?”, “What is the BOM for product Y?”

AFTEC schema:

TableKey FieldsVB Subroutine
PRODUCT.CLASS.1287CLASS (ID), CLASS.NAMEVB_PRODUCTCLASSLIST.REV1
PRODUCT.LDS.1287Product ID, PHYS.NO, PROD.CLASS, INACT.OBSVB_PRODUCTLIST.REV1
PRODUCT.1287PARTNO, DESC, ENGSTAT, PHYS, GTCODE, PRODDWG + 20 more fieldsVia PSILocalService

Loading strategy: Virtualized — categories load first, then products load on-demand per category expansion. BOM loaded via GetBOMDetailsAsync(partNumber) when product node is expanded.

SolidWorks Master Drawings

PropertyValue
WhatEngineering drawings cache — used to check if SolidWorks drawings exist for products
WhereX:\MasterDwgs\*.slddrw
ConsumerDocument Manager (Products tab — drawing existence check)
CacheFile list cached in memory as a HashSet, refreshed every 5 minutes

Electrical Drawing PDFs (DWG Conversions)

PropertyValue
WhatAutoCAD electrical/pneumatic schematics converted from DWG to PDF for web viewing
Where\\ad.ptihome.com\DFS\CAD\PDF\dwg_pdf\{job}\{drawingNumber}.pdf
Source DWGs\\ad.ptihome.com\DFS\CAD\DWGFiles\{drawingNumber}.DWG
Drawing Index\\ad.ptihome.com\DFS\CAD\Controls\{job}\{job}.wdp (text file listing 6-digit drawing numbers)
Conversion ToolDwgToPdf.exe (eDrawings 2025 ActiveX Print5 method)
PipelineNightly via psi-plc-runner/scripts/batch-dwg-to-pdf.ps1 on PS-PLCRunner
ConsumerPSI Explorer Machine Explorer (Drawings tab → ViewerPage), MCP server (GET /files/drawing/:pn/pdf)
Current Coverage234 PDFs across 4 projects (2337, 2386, 2399, 2415)
Answers”Show me the E-stop schematic for machine 2399”, “What’s on drawing 359920?”

Note: These are separate from the mechanical PDFs at \\ad.ptihome.com\DFS\CAD\PDF\{partNumber}.pdf which are SolidWorks-generated part drawings keyed by part number. Electrical PDFs are keyed by 6-digit drawing number and organized by project.

PropertyValue
WhatUser-specific report files printed from AFTEC
Where\\ad.ptihome.com\dfs\Data\UnixShare\PrintQue\PrintToFile
ConsumerDocument Manager (My Reports tab)
FilterFiles filtered by logged-in user’s Windows username (case-insensitive)

1.12 Project Minutes (PROJMIN) — Weekly Status Narrative

Last Verified: 2026-06-18

PropertyValue
WhatWeekly engineering project-status meeting minutes (“Project Minutes” memo, issued by PMO). One file per week; each is a full snapshot of every active project (~105 jobs/week). The only narrative, week-by-week timeline of each project’s lifecycle.
Where\\ad.ptihome.com\DFS\DATA\Dept\Eng\GENERAL\STATINFO\PROJMIN\ (mapped S:\Dept\Eng\GENERAL\STATINFO\PROJMIN), in per-year folders MIN1997Min2026
Format.docx (≈2013→present, 628 files) and legacy WordPerfect .WPD (1997→2012, 797 files)
Coverage1,422 weekly files → 71,963 job-week records over 1,797 jobs, continuous 1997–2026
RefreshManual, ~weekly (Mondays)
Answers”When did job X actually ship / run off / install / start up?”, “How long did it sit built-and-accepted before shipping?”, “Why did it slip?”, “What was the new-build↔retrofit lineage?”, “Who were the PM/engineer?”

Document structure (uniform across all 29 years): header (Date / To = distribution initials / From / Subject: Project Minutes), then two sections — MACHINE ORDERS (new builds, 4-digit jobs) and WORK ORDERS (retrofits/upgrades, 5-digit 95xxx). Each project is a header line job ⟶ PM,Eng initials ⟶ Customer – Machine description [$] [(Closed – use 95931)] [(was 95855)] followed by status lines (modern: tab-delimited phase columns; pre-2013: free-text bullets, where a leading ? marks a change-since-last-week).

Extraction method (reproducible: projmin_extract.py)

  1. WPD → text. The 797 WordPerfect files are converted with LibreOffice headless (its WordPerfect import filter is libwpd) installed on PS-PROXY: soffice --headless --norestore --convert-to txt:Text --outdir OUT <files>, batched ~80 files/invocation (Windows command-line length limit). 796/797 convert; 1 is corrupt. Note: reading the DFS share from inside a remote WinRM session fails (Kerberos double-hop) — stage files over the session with Copy-Item -ToSession instead. The .WPD archive is static, so this conversion is one-time.
  2. Filename → meeting date (4 conventions): 1997 MIN+MDDYY (5 digits); 1998–2000 MN+MMDDYY; 2001–2015 Mn+MMDDYY (2-digit year); 2016–2026 Mn+MMDDYYYY. Clamp year to [1997, now+1] (one mis-named file otherwise yields “9021”).
  3. Parse each file into (meeting_date, job, section, pm/eng, status_text) rows — header regex ^(\d{4,5})(?:/\d+)?\t+([A-Z]{2,3}(?:,…)*)\t+(.*), accumulating following lines as the status block until the next job/section.
  4. Per-job milestones: first week matching \bshipped\b; first runoff (complete|done|passed|signed); terminal-milestone = last week matching handoff/startup/runoff/manufacturing/install complete. Lineage from (Closed – use N), (was N), Upgrades?/Update to N.
  5. Join to Project1287List.xml on IDNumberDisplay (= job number); compare to AFTEC ShipDate.

Timeline anchor (coalesce method)

anchor_date = AFTEC ShipDate → minutes "Shipped" week → terminal-milestone proxy, recording anchor_source. This resolves projects that have no ship date in either source: of the 1,722 minutes-jobs matched to AFTEC, the “neither” cases split into in-progress (Active — correctly date-less), cancelled (correct), and closed-needs-anchor (old jobs that rotated off the active list at startup/handoff before a “Shipped” line was ever logged, and whose ERP ShipDate was never populated). The terminal-milestone proxy recovers those. Result: 96% of matched jobs (1,666/1,722) get a usable anchor (1,574 AFTEC ShipDate + 36 minutes-ship-fills-ERP-blank + 56 terminal proxy); the residual 19 old jobs fall back to labor-cessation / DateClose (see 7. KNOWN GAPS & LIMITATIONS, and internal-project completion-dating).

Verification results (vs AFTEC ShipDate)

  • Of 1,116 jobs dated in both sources: median difference +5 days, 82% within ±10 days — independent validation of AFTEC ShipDate.
  • 79 jobs disagree by >30 days (reconciliation list); 36 jobs shipped per minutes but have a blank AFTEC ShipDate (ERP gaps); 75 minutes-jobs have no Project1287List.xml record.
  • Runoff-complete → ship hold: median 28 days, p90 77; 43% of jobs hold >30 days built-and-accepted before shipping — a legitimate on-time-metric refinement.

Join key: job (minutes) ↔ IDNumberDisplay (Project1287List.xml) ↔ Job/project (analytics tables). Caveat: milestone dates are weekly-resolution (±7 days). Status: POC validated; not yet a pipeline job (planned load_project_minutes.pyproject_minutes_weekly / project_milestones_minutes / job_lineage).


2. API & TOOL ACCESS

2.1 PSI.UniData.API (REST Gateway)

Last Verified: 2026-02-26

PropertyValue
URLhttps://api.progressivesurface.com/api
AuthAzure AD (production) or dev endpoints (no auth)
RepoC:\git\PSI.UniData.API

Endpoint Categories

CategoryEndpointsWhat They Access
BOM/bom/dev/job/{job}, /bom/dev/part/{part}BOM explosion with full hierarchy
Parts/parts/dev/{part}, /parts/searchPart details, search
Work Orders/wo/dev/{wo}WO details (status, dates, completion)
Projects/project/dev/{job}/info, /project/dev/{job}/siblingsProject metadata, similar machines
Schedule/schedule/dev/summary, /schedule/dev/departmentsExcel schedule data
Redbook/redbook/rfc/{rfc}, /redbook/by-project/{job}Quality issues
Data/data/dev/{tableName}, /data/dev/{table}/{id}Raw table queries
Dictionary/dictionary/tables, /dictionary/searchSchema discovery
Hours/hours/dev/burn/{job}Hours burndown data

2.2 PSI Machine Intelligence MCP Server

Last Verified: 2026-02-26. For current tool count and full tool list, see MCP Server.

PropertyValue
WhereC:\git\PLC\mcp-server\mcp-server.js
Used ByPSI Explorer “Ask the Fleet” AI chat, Claude Code, Claude Desktop
AnswersComplex multi-source questions about machines

Tool Categories

CategoryToolsKey Capabilities
Projectget_project_info, find_similar_machines, get_project_lineage, list_machine_typesProject metadata, machine type analysis
BOM/Supplyget_bom, get_part_details, get_part_manufacturer, get_part_vendors, search_partsFull BOM and supply chain data
PLC/Controlsget_plc_analysis, analyze_acd, lookup_fault, get_io_map, lookup_plc_tag, lookup_pc_read_write, get_robot_interfacePLC program analysis, I/O mapping
Engineeringlist_project_files, find_drawing, get_electrical_drawing_index, read_drawing_pdf, read_bom_file, get_drawing_metadata, check_commissioning_readinessDrawing and document access
Qualityget_rfcs, get_vendor_historyRedbook and vendor tracking
Work Ordersget_work_ordersWO list by project
Analysisdiagnose_fault, compare_projects, assess_obsolescence_riskCross-system diagnostics
Inventoryget_inventory_status, get_inventory_transactions, get_inventory_history, analyze_inventory_discrepancyInventory analytics
Costanalyze_part_cost_leakage, get_job_cost_attributionCost attribution
Utilityget_digital_thread, find_acd_filesPre-built composite reports

2.3 PSI Explorer Web

Last Verified: 2026-03-02

PropertyValue
URLhttps://explorer.progressivesurface.com
RepoC:\git\psi-explorer-web
WhatUnified manufacturing intelligence portal — projects, BOM, schedule, lead time analysis, PLC scorecard, AI chat
Data JoinsPROJECT.1287 + PROJECT.HRS.1287 + MACHTYPE.1287 + Schedule Excel + Accounts + comprehensive_dataset.csv + department_scorecard.csv

Key Views:

  • Project Dashboard — Ships next 30/60/90, behind schedule count, OTD rate
  • Capacity View — Department schedules, weekly hour grids
  • Project Detail — LDS milestones, Gantt, hours burndown, RFCs, WOs
  • Lead Time Analysis — 8-tab deep dive: Dashboard, Explorer, Era Comparison, Dept Heatmap, Capacity & OTD, Hypotheses, Timeline (2,569 projects, 138 columns, data back to 2000)
  • PLC Scorecard — Automated complexity/safety/reuse scoring for PLC programs
  • PSI Explorer — Hierarchical BOM viewer with search and drill-down

2.4 Redbook Analysis Dashboard (v7.3)

Last Verified: 2026-03-09

PropertyValue
URLhttps://quality.progressivesurface.com (Entra ID auth)
RepoC:\git\redbook-dashboard
DatabaseProcessed/redbook_coq.db (SQLite)
What10-tab quality analytics dashboard (Streamlit)

Tabs:

#TabWhat It Shows
1OverviewSummary metrics, KPIs, top-level quality trends
2By ProjectProject rankings by RFC count, cost, severity
3Root CausesAI-classified root cause breakdown and patterns
4TrendsTime series of quality issues, costs, categories
5Deep DivePreventability analysis of top design issues
6ProductsProduct-level quality tracking and recurring escapes
7ExplorerGeneral data explorer with flexible filtering
8Lead TimeLead time analysis and schedule performance
9Machine DNAMachine DNA quality profiling, model results, SHAP features
10Project ExplorerPer-project quality deep dive — synthesizes all data sources for a single project

Tab 10 — Project Explorer (new in v7.3):

Single-project deep dive that brings together every available data source into one view:

  • Project selector — searchable by job number or customer, shows RFC count and ship year
  • DNA radar chart — project’s 6-dimension feature profile vs machine type average
  • RFC timeline — scatter plot of issues by severity with ship date marker
  • Workforce profile — median tenure, junior %, CE concentration (HHI), corrective hours %
  • Risk recommendations — risk score (0–100), category, and generated recommendations
  • Comp machine comparison — reference machines with side-by-side quality and DNA metrics
  • Lessons learned — post-project review notes from OneNote extraction
  • ECN/NCN detail — expandable table of engineering changes and non-conformances

5 new data loaders (added in v7.3):

  • load_lessons_learned() — OneNote-extracted lessons by project
  • load_ecn_details(project) — ECN records filtered by project
  • load_ncn_details(project) — NCN records filtered by project
  • load_workforce_features() — 15 workforce features per project (tenure, concentration, corrective hrs)
  • load_comp_machine_references() — comp/reference machine mappings from SLDS Index

2.5 Document Manager (WPF Desktop — PSI.DocumentManager)

Last Verified: 2026-03-16

PropertyValue
WhatWPF desktop document viewer — project files, product documentation, industry data, multimedia
RepoC:\git\PSI.All\PSI.DocumentManager\Trunk\PSI.DocumentManager\
Data AccessPSILocalService (WCF) → U2 Toolkit → AFTEC UniData
AuthWindows domain authentication

Primary consumer of DFS file shares and several AFTEC tables not used by other applications. Document Manager is the main way users browse project folders, product documentation, industry sector data, and part photos/videos.

Data Access Layer — PSILocalService (WCF)

Document Manager accesses AFTEC via PSILocalService, a WCF service hosted by PSI.Service (PSI.Shared/trunk/PSI.Windows.Services/PSI.Service/). This is a separate access path from the PSI.UniData.API REST gateway — it uses the same U2 Toolkit for .NET but through WCF instead of HTTP.

Key service methods:

MethodAFTEC TableVB SubroutineReturns
GetOnePTIProjectAsync(job)PROJECT.1287Project metadata (PTIProjectModel)
GetProductCategoryList()PRODUCT.CLASS.1287VB_PRODUCTCLASSLIST.REV1Product categories
GetProductByClassListAsync(classId)PRODUCT.LDS.1287VB_PRODUCTLIST.REV1Products per category
GetBOMDetailsAsync(partNumber)BOM tablesBill of Materials hierarchy
LoadIndustrySectors()INDSECTOR.1287Industry sectors
LoadIndustryClass()INDCLASS.1287Industry classes
GetPartMedia(indClass, processType, mediaType)PART.PHOTOS.1287VB_PARTPHOTOLIST.REV1Part photos (type=1) or videos (type=2)

Tabs & Data Sources

TabAFTEC SourceDFS Source
ProjectsPROJECT.1287 (metadata)\\ad.ptihome.com\dfs\LDS\PROJECT\{job} (files)
Sales Projects\\ad.ptihome.com\dfs\SalesQuotes\QUOTES\{SPN}
ProductsPRODUCT.CLASS.1287, PRODUCT.LDS.1287, PRODUCT.1287, BOM\\ad.ptihome.com\dfs\LDS\Product\ (docs), X:\MasterDwgs\ (SolidWorks)
IndustriesINDSECTOR.1287, INDCLASS.1287\\ad.ptihome.com\dfs\LDS\Industry Sectors\
Part PhotosPART.PHOTOS.1287 (mediaType=1)\\ad.ptihome.com\dfs\Data\MultiMedia\
Part VideosPART.PHOTOS.1287 (mediaType=2)\\ad.ptihome.com\dfs\Data\MultiMedia\
My Reports\\ad.ptihome.com\dfs\Data\UnixShare\PrintQue\PrintToFile
My ComputerLocal/network drives

3. DATA RELATIONSHIPS (Join Keys)

Last Verified: 2026-02-26

PROJECT.1287 ←── JobNumber ──→ openwo (Job field)
     │                              │
     │                              ├──→ tslabor2 (Job field)
     │                              ├──→ wiplabor (Job No field)
     │                              ├──→ wiprouteline (Job field)
     │                              └──→ ganttWO (Job field)
     │
     ├── JobNumber ──→ ganttJOB (Job field)
     ├── JobNumber ──→ redbook (Project field)
     ├── JobNumber ──→ ecn (Project field)
     ├── JobNumber ──→ jobparts (Job field)
     ├── JobNumber ──→ budgetjob (Job field)
     ├── AccountNo ──→ customer (Customer No)
     ├── JobNumber ──→ LDS Excel (file path pattern)
     ├── JobNumber ──→ Schedule Excel (project row)
     └── JobNumber ──→ PART.PHOTOS.1287 (PROJECT field)

PRODUCT.CLASS.1287 ←── ClassID ──→ PRODUCT.LDS.1287 (PROD.CLASS)
                                          │
                                          └──→ PRODUCT.1287 (product details)

INDSECTOR.1287 ←── IND.CLASS ──→ INDCLASS.1287 (IND.CLS)
               └── IND.CLASS ──→ PART.PHOTOS.1287 (IND.CLASS)

openwo ←── PartNo ──→ partmaster (Part No)
                           │
                           ├──→ purchaseorders (Part No)
                           ├──→ inventory (Part No)
                           └──→ inventoryhistory (Part No)

employee ←── EmpNo ──→ tslabor2 (Emp field)
                   └──→ redbook (Ent By field)

tslabor2 ←── Job+Dept ──→ ganttJOB (Job+Dept)     [schedule vs actual labor]
         ←── Job+Emp  ──→ budgetjob (Job)          [budget vs actual hours]
         ←── Dept+Date ──→ (timeline heatmaps)     [when labor occurs in project lifecycle]

machine_dna_features ←── project ──→ PROJECT.1287 (JobNumber)
                     ←── project ──→ redbooks (Project)
                     ←── project ──→ comprehensive_dataset (Job)

project_rd_classification ←── project ──→ machine_dna_features (project)

ecn.csv ←── RFCNumber ──→ redbook.csv (ID)        [ECN-to-RFC linkage]
        ←── Project ──→ PROJECT.1287 (JobNumber)

Project1287List.xml ←── SPNNo ──→ I:\QUOTES\{SPN}\  [handoff notes]

Key Join Fields

FromToJoin KeyNotes
Any project dataWork ordersJobNumber = openwo.JobCore link
Work ordersLaborWO = tslabor2.WOTrack hours per WO
Work ordersRoutingWO = wiprouteline.WOPlanned vs actual by operation
ProjectsQualityJobNumber = redbook.ProjectRFCs per project
ProjectsScheduleJobNumber in Excel filesPMO schedule data
ProjectsCustomersAccountNo = customer.Customer NoCustomer name
ProjectsBudgetJobNumber = budgetjob.JobBudget vs actual by dept
PartsVendorsPartNo = purchaseorders.Part NoSupply chain
PartsLead TimesPartNo = partmaster.Part NoMfg/Pur/Cum LT
LaborTimelineJob + Date vs OrderDateNormalize labor to project timeline
LaborEngineersEmp = Employee NameCross-project engineer loading

Cross-Source Analysis Joins (Proven in Feb 2026 Investigation)

These joins have been validated through actual analysis:

AnalysisSources JoinedJoin Strategy
Lead time by stageProject1287 + ganttJOB + openwoJobNumber, then aggregate by Dept
Good vs bad projectsbudgetjob + tslabor2 + redbook + purchaseordersJobNumber across all tables
CE labor timeline heatmaptslabor2 (Dept=110E) + Project1287 (OrderDate)Job, then normalize dates by OrderDate
Vendor OTDpurchaseorders + pohistPO number + line, Promise vs Rcv Date
Department capacitytslabor2 (hours) + wc-capacity (capacity)Department code
Engineer cross-project loadingtslabor2 grouped by Emp + JobEmployee name, sum hours per project
Comprehensive causal datasetotd_dataset + 12 sourcesJobNumber across all; PO linkage via regex on Job/WO No field
CE/MVI timing overlapfabric_dept_milestones (110E LastDate vs 120 FirstDate)(Job, Dept) key; CE last date vs MVI first date
Employee concurrent loadtslabor2 grouped by (Emp, YYYY-MM)Count unique Jobs per employee per month
Part complexity trendjobparts grouped by Job NoCount parts per project, track by ship year
PLC complexity trendfleet_complexity_scores.csv + comprehensive_dataset.csvJob = Job; 451 matched (92%); fleet_complexity_joined.csv has pre-built result
Machine DNA quality predictionmachine_dna_features + labor_detail + comp_machine_references + project_complexity + redbook_coq.dbproject key across all; 6 dimensions, 82 features, 3,088 projects
Workforce feature engineeringlabor_detail.csv (1.38M rows) + department_summary.csvJob+Emp for tenure; Job+Dept for concentration; comp_machine_references for team overlap
R&D project classificationProject1287List.xml + redbook_coq.db + machine_dna_featuresXML keyword scan + statistical outlier detection + novelty extremes
ECN type analysisecn.csv (Code field) + redbook.csv (via RFCNumber)ECN→RFC linkage for 1,928 ECNs; Code breakdown per project

4. QUESTIONS THIS DATA CAN ANSWER

Last Verified: 2026-02-26

Lead Time & Schedule Analysis

QuestionPrimary SourceSupporting Sources
”What’s the average lead time trend by year?”Project1287List.xml (OrderDate, ShipDate)project-explorer LeadTimeView
”Which projects are behind schedule?”Schedule Excel (Deviation column)ganttJOB.csv (% Complete)
“Which department is the bottleneck?“ganttJOB.csv (Start/End per dept)tslabor2.csv (hours by dept)
“How long do work orders take?“openwo.csv (Make→Close dates)wiprouteline.csv (planned vs actual)
“Are we over on hours by department?”PROJECT.HRS.1287 (budget vs actual)wiplabor.csv (routed vs actual)
“What’s the hours-to-go for active projects?”Schedule summary APIDepartment Excel files
”How do quality issues impact schedule?“redbook.csv (Days Open, dept flags)ecn.csv (change volume)
“Are vendors delivering on time?“purchaseorders.csv (Promise vs Rcv)pohist.csv (receipt dates)
“What’s our capacity vs load?“wc-capacity.csvcapacity.csv
”What’s the on-time delivery rate?”Project1287List.xml (Planned vs Actual ship)project-explorer OTD calc
”Where do CE hours land on the project timeline?“tslabor2.csv (Dept=110E, Date, Job)Project1287 (OrderDate for normalization)
“Which engineers are overloaded?“tslabor2.csv (Emp, Job, Hours)budgetjob (CE budget vs actual)
“Is this a good or bad project?“budgetjob (budget vs actual %), tslabor2, redbookCross-source: budget %, RFC count, CE hours
”Are machines getting more complex over time?“fleet_complexity_trend.csvfleet_complexity_joined.csv for per-project detail
”What’s the PLC complexity of job X?“fleet_complexity_scores.csv (Job column)fleet_complexity_joined.csv for ship year / machine type context
”Which machine types have the most complex PLC programs?“fleet_complexity_joined.csvJoin Complexity + MachineType from comprehensive_dataset
”Which new projects are at risk for quality problems?“machine_dna_features + model_results.jsonTier 1 (at order) through Tier 3 (early build) predictions
”What predicts quality problems?“machine_dna_model_results.json (SHAP values)Corrective hours, CE concentration, novelty, comp history
”Is this project R&D or production?“project_rd_classificationKeyword, statistical, novelty signal detection
”What type of ECN changes drive rework?“ecn.csv (Code field)Engineering (79%), Customer (8%), Manufacturing (3%)
“Does this project’s team have enough experience?“machine_dna_features (wf_ columns)Median tenure, junior %, lead CE/ME tenure, HHI

Department-Level Analysis

DepartmentCodeSchedule SourceHours SourceQuality Source
Mech Eng110MMecheng.xlsmtslabor2 (Dept=110M)redbook A110M flags
Controls Eng110ECONTROLS.xlsmtslabor2 (Dept=110E)redbook A110E flags
Machine Shop102AMACH.xlsmtslabor2 (Dept=102)redbook A102 flags
Weld104AWELD.xlsmtslabor2 (Dept=104)redbook A102 flags
Mech Assembly106AMECH.xlsmtslabor2 (Dept=106)redbook A106 flags
Elec Assembly108AELECT.xlsmtslabor2 (Dept=108)redbook A108 flags
MVI/Startup120AMVI.xlsmtslabor2 (Dept=120)redbook A120 flags
Field Service135tslabor2 (Dept=135)service tickets

5. ACCESS PATTERNS (Quick Reference)

Last Verified: 2026-02-26

”I need to query data about…”

TopicFastest MethodCommand/Endpoint
Current project statusAPIGET /api/project/dev/{job}/info
Full BOM for a jobAPIGET /api/bom/dev/job/{job}
Department scheduleAPIGET /api/schedule/dev/department/{dept}
Project milestonesAPIGET /api/schedule/dev/lds/{job}
Work order detailsAPIGET /api/wo/dev/{wo}
Quality issues by projectAzure SQLSELECT * FROM redbooks WHERE project = '{job}'
Aggregate WO statisticsAzure SQLSELECT job, COUNT(*) FROM work_orders GROUP BY job
Labor hours by dept/yearAzure SQLSELECT dept, SUM(labor) FROM tslabor2 WHERE ...
CE labor by engineer/projectAzure SQLSELECT * FROM labor_detail WHERE dept = '110E'
ECN breakdown by typeAzure SQLSELECT code, COUNT(*) FROM ecn_detail WHERE project = '{job}' GROUP BY code
Workforce risk metricsAzure SQLSELECT * FROM workforce_features WHERE job = '{job}'
Lead time trendsProject ExplorerLeadTimeView component
Historical project dataAzure SQLSELECT * FROM projects WHERE ship_year >= 2020
Vendor lead timesAzure SQLSELECT * FROM partmaster WHERE [Part No] = '{part}'
Work center capacityAzure SQLSELECT * FROM [wc-capacity]
Budget vs actual by deptAzure SQLSELECT * FROM budgetjob WHERE Job = '{job}'
Engineer workload across projectsAzure SQLSELECT [Employee Name], Job, SUM(Labor) FROM tslabor2 GROUP BY ...

6. DATA FRESHNESS & RELIABILITY

Last Verified: 2026-03-12

SourceFreshnessReliabilityVersioningNotes
Azure SQL PSI_AnalyticsNightly (1 AM)HighGit (psi-data-pipeline)Primary data store — app/report query target
UniData API (live)Real-timeHighNoneDirect ERP queries
AFTEC CSVs (DFS share)~Daily (5 AM)HighDaily snapshotsAFTEC export jobs; archived by archive_csv_snapshots.py
psi_ingest.db (SQLite)Per-source refresh / workflow-drivenHighGit (psi-data-pipeline)Durable staging DB for scraper-owned tables; survives analytics rebuilds
psi_analytics.db (SQLite)Nightly (1 AM)HighGit (psi-data-pipeline)Rebuildable analytics artifact and local publish source before Azure SQL sync
Fabric ProgressiveDataSetDaily (5 AM)HighNone — full overwriteBeing migrated to Azure SQL
Schedule ExcelManualMediumNoneDepends on PMO updating
LDS Gantt ExcelManualMediumNoneDepends on PM updating per project
comprehensive_dataset.csvNightly (1 AM)HighGit (psi-data-pipeline)Flat export from Azure SQL pipeline
Redbook SQLite DBPipeline runHighNoneRun on demand
Project1287List.xmlExport-timeHighNoneSnapshot at export

No PSI data source currently has full change-data-capture. Most sources still reflect current state only. Daily CSV snapshots (archive_csv_snapshots.py) provide basic change tracking for AFTEC exports, and the ingest/publish workflows now verify that ingest-owned tables make it into both psi_analytics.db and Azure SQL before the workflow succeeds. Historical analysis still relies primarily on dates already embedded in records (e.g., OrderDate, ShipDate, MakeDate) rather than on system-level change logs.


7. KNOWN GAPS & LIMITATIONS

Last Verified: 2026-02-26

GapImpactWorkaroundPriority
No single “lead time by stage” datasetCan’t directly query stage durationsJoin ganttJOB + openwo + wiproutelineLow (join works)
No versioning in Fabric/CSVCan’t track field changes over time (PO promise dates, WO status)Implement daily CSV snapshot archivingHigh
PO promise date is current-onlyCan’t measure original vs revised promise date driftExpose VB_PO.LINE.REV1 (4 date/qty pairs) via APIHigh
Schedule Excel is manually maintainedMay lag realityCross-check with WO completion datesMedium
LDS milestones often unfilledMany projects have null planned/actual datesUse ganttJOB dept start/end insteadLow
Gantt planned schedule data 99.9% emptyFIXED (2026-03-02): LDS Gantt extraction now runs nightly. 457 LDS files parsed, 4,213 dept rows + 9,824 revision rows. GJ_* columns populated for 418+ projects. Pipeline moved to psi-data-pipeline/pipeline/.N/AResolved
Redbook detection timing is estimatedBased on entry date vs ship dateReasonable proxyLow
No automated root cause for delaysMust infer from data patternsCombine hours overrun + WO lateness + RFC volumeMedium
Capacity data is limitedOnly 31 rows in wc-capacitySupplement with PMO floor space dataMedium
No CE software complexity metricFIXED (2026-03-03): PLC complexity pipeline scored 487 ACDs — fleet_complexity_scores.csv has per-project scores (7 sub-metrics). Median complexity 48 in 2025 vs 32–35 baseline (2010–2022). See §1.7.N/AResolved
Schedule Date field 38.6% emptyInconsistent PO trackingUse Promise Date + pohist Rcv Date as primary OTD measureLow
pohist PO Date is placeholderPO Date in pohist.csv is almost always 01/01/01 (Jan 1 2001) — cannot compute vendor lead time from pohist aloneJoin pohist receipt dates to purchaseorders.csv Promise Date via PO number; or use purchaseorders.csv only for vendor LT analysisMedium
purchaseorders Job/WO linkage is weakJob/WO No field requires regex parsing (format varies: F58637-2322, etc.); only 34% of PO lines match to known project numbersBuild WO→Job mapping from openwo.csv, or parse with known job set. Coverage limited to recent active projects.Low
jobparts coverage is 51%Only 498 of 2,569 shipped projects have jobparts.csv data (BOM part list)May only cover recent or active projects. For historical BOM complexity, use redbook_coq.db project_boms table (4.3M rows, much better coverage)Low
comprehensive_dataset has ECN count but not typeFIXED (2026-03-11): ecn_detail table in Azure SQL includes Code field. Query: SELECT code, COUNT(*) FROM ecn_detail WHERE project = '{job}' GROUP BY codeN/AResolved
Handoff notes not parsed at scaleFIXED (2026-03-11): 551 handoff extractions loaded into handoff_extractions table in Azure SQL. Comp machine refs, risk notes extracted from I: drive .docx files.N/AResolved
OneNote Lessons Learned inaccessibleFIXED (2026-03-11): 1,166 lessons from 263+ projects extracted via UTF-16LE binary scan and loaded into lessons_learned table in Azure SQL.N/AResolved
K: drive change orders unexploredFIXED (2026-03-11): 8,395 change orders extracted from K: drive chg/ binaries and loaded into change_orders table in Azure SQL. Part numbers and drawing refs parsed.N/AResolved
Honest model AUC is modestTime-aware CV dropped from shuffled 0.749-0.806 to honest 0.507-0.639; Tier 1 barely above chanceExpected — temporal leakage inflated old scores. Tier 1.5+ useful for relative risk ranking, not absolute predictionLow

8. ANALYSIS HISTORY

Analyses performed against PSI data, with links to findings. This tracks what questions have been answered and what data combinations were used.

DateAnalysisSources UsedKey FindingReport
2026-02-26Lead time root cause11 sources, 9.6M recordsLead time doubled to 330d; Purchasing, CE, EA top contributorsLead Time Report
2026-02-26Good vs bad project comparisonbudgetjob, tslabor2, redbook, purchaseordersCE is #1 differentiator (25% vs 190% budget); 4x more RFCs on bad projectsInvestigation Log
2026-02-26CE labor timeline heatmaptslabor2 (110E), Project1287, budgetjobBad projects have 12x CE hours; SOFTWARE DESIGN is 26x overrun; two-wave patternC:\git\schedule\PSI CE Labor Timeline Analysis.docx
2026-02-26Vendor OTD deep divepurchaseorders, pohistOTD dropped 81%→69%; Kendall Electric 73% late; promise date tracking gap foundC:\git\schedule\PSI Lead Time Deep Dive - Good vs Bad Projects.docx
2026-02-27Comprehensive causal analysis13 sources, 2,569 projects, 138 columnsBuilt re-runnable dataset; tested 10 causal hypotheses. Key: CE span (r=+0.203) is strongest single correlator with DaysLate; concurrent projects jumped 10.8→14.2 in overload; CE finishes before MVI only 33-43% of the time; RFC count stable (38-42) but ECNs rose 11→15.C:\git\schedule\build_comprehensive_dataset.pycomprehensive_dataset.csv + causation_analysis.txt
2026-03-02Pipeline automation & Data Brain alignment13 sources, 2,569 projects, 152 columnsMoved pipeline to psi-data-pipeline/pipeline/. Eliminated Fabric JSON dependency (dept milestones computed inline from tslabor2 stream). Added full-fidelity detail outputs (6 files, all source columns preserved). Added LDS Gantt revision/reschedule data (14 new columns). Nightly automated builds on PS-PROXY via GHE Actions.psi-data-pipeline/pipeline/\\ad.ptihome.com\DFS\Schedule\SS123\LEADTIME\
2026-03-03PLC complexity trend analysis487 ACD projects (all versions) + comprehensive_dataset.csv (2,569 projects)Batch-scored entire ACD fleet (batch-complexity-scan.py using acd-tools). Two extraction paths: full (v29+, 109 projects, 7 sub-metrics) and fallback (v12–v20, 378 projects, 5 sub-metrics). 487 scored, 11 errors. Joined to lead time data: 451 matched (92%). Key finding: median complexity stable 32–35 from 2010–2022, then surged to 37→41→48 (2022→2024→2025). P75 jumped from ~37 to 55. Max complexity nearly doubled (47→73). Closes the “no SW complexity metric” gap.C:\git\PLC\fleet_complexity_scores.csv, fleet_complexity_trend.csv, fleet_complexity_joined.csv
2026-03-06Machine DNA quality prediction (honest rebuild)redbook_coq.db + labor_detail.csv + comp_machine_references + project_complexity + Project1287List.xmlBuilt 6-dimension, 82-feature quality prediction model across 3,088 projects. Three workstreams: (A) Workforce features — 15 wf_ columns from labor_detail (tenure, HHI concentration, lead experience, corrective hours). (B) R&D classification — flagged ~100 projects using keyword/statistical/novelty signals. (C) Honest model rebuild with time-aware expanding-window CV, COQ% target within process type, SHAP explainability. Results: Tier 1 AUC 0.507, Tier 1.5 AUC 0.631, Tier 2 AUC 0.639. Top SHAP: corrective_hrs_pct (0.709), ce_top_person_pct (0.330), novelty (0.245), comp_quality (0.198). Old shuffled baselines (0.749-0.806) were inflated by temporal leakage.C:\git\redbook-dashboard\Processed\machine_dna_model_results.json, Scripts/train_machine_dna_model.py
2026-03-06ECN change code analysisecn.csv (28,266 records) + tslabor2 corrective hoursDiscovered ecn.csv Code field classifies changes: Engineering (79.3%), Customer (8.2%), Product Improvement (3.5%), Manufacturing (3.2%). Customer ECN rate is UNCORRELATED with corrective labor (r=0.089) — corrective hours genuinely capture rework, not customer scope changes. ECN-to-RFC linkage: 1,928 ECNs link to 1,511 RFCs via RFCNumber field. 46% of quality cost is ECN-driven but mostly engineering-initiated.C:\git\redbook-dashboard\Documentation\DATA_CATALOG.md


Last updated: March 18, 2026 Maintenance: Update this page whenever new data sources, joins, or gotchas are discovered during analysis.