# 📊 Endobest Clinical Research Dashboard - Architecture Summary

**Last Updated:** 2025-11-08  
**Project Status:** Production Ready with Excel Export Feature  
**Language:** Python 3.x

---

## 🎯 Executive Summary

The **Endobest Clinical Research Dashboard** is a sophisticated, production-grade automated data collection and reporting system designed to aggregate patient inclusion data from the Endobest clinical research protocol across multiple healthcare organizations. The system combines high-performance multithreading, comprehensive quality assurance, and fully externalized configuration to enable non-technical users to manage complex data extraction workflows without code modifications.

### Core Value Propositions

✅ **100% Externalized Configuration** - All field definitions, quality rules, and export logic defined in Excel  
✅ **High-Performance Architecture** - 4-5x faster via optimized API calls and parallel processing  
✅ **Robust Resilience** - Automatic token refresh, retries, graceful degradation  
✅ **Comprehensive Quality Assurance** - Coherence checks + config-driven regression testing  
✅ **Multi-Format Export** - JSON + configurable Excel workbooks with data transformation  
✅ **User-Friendly Interface** - Interactive prompts, progress tracking, clear error messages

---

## 📁 Project Structure

```
Endobest Dashboard/
├── 📜 MAIN SCRIPT
│   └── eb_dashboard.py (57.5 KB, 1,021 lines)
│       Core orchestrator for data collection, processing, and export
│
├── 🔧 UTILITY MODULES
│   ├── eb_dashboard_utils.py (6.4 KB, 184 lines)
│   │   Thread-safe HTTP clients, nested data navigation, config resolution
│   │
│   ├── eb_dashboard_quality_checks.py (58.5 KB, 1,266 lines)
│   │   Coherence checks, non-regression testing, data validation
│   │
│   └── eb_dashboard_excel_export.py (32 KB, ~1,000 lines)
│       Configuration-driven Excel workbook generation
│
├── 📚 DOCUMENTATION
│   ├── DOCUMENTATION_10_ARCHITECTURE.md (43.7 KB)
│   │   System design, data flow, API integration, multithreading
│   │
│   ├── DOCUMENTATION_11_FIELD_MAPPING.md (56.3 KB)
│   │   Field extraction logic, custom functions, transformations
│   │
│   ├── DOCUMENTATION_12_QUALITY_CHECKS.md (60.2 KB)
│   │   Quality assurance framework, regression rules, validation logic
│   │
│   ├── DOCUMENTATION_13_EXCEL_EXPORT.md (29.6 KB)
│   │   Excel generation architecture, data transformation pipeline
│   │
│   ├── DOCUMENTATION_98_USER_GUIDE.md (8.4 KB)
│   │   End-user instructions, quick start, troubleshooting
│   │
│   └── DOCUMENTATION_99_CONFIG_GUIDE.md (24.8 KB)
│       Administrator configuration reference
│
├── ⚙️ CONFIGURATION
│   └── config/
│       ├── Endobest_Dashboard_Config.xlsx (Configuration file)
│       │   Inclusions_Mapping
│       │   Organizations_Mapping
│       │   Excel_Workbooks
│       │   Excel_Sheets
│       │   Regression_Check
│       │
│       ├── eb_org_center_mapping.xlsx (Organization enrichment)
│       │
│       └── templates/
│           ├── Endobest_Template.xlsx
│           ├── Statistics_Template.xlsx
│           └── (Other Excel templates)
│
├── 📊 OUTPUT FILES
│   ├── endobest_inclusions.json (~6-7 MB, patient data)
│   ├── endobest_inclusions_old.json (backup)
│   ├── endobest_organizations.json (~17-20 KB, stats)
│   ├── endobest_organizations_old.json (backup)
│   ├── [Excel outputs] (*.xlsx, configurable)
│   └── dashboard.log (Execution log)
│
└── 🔨 EXECUTABLES
    ├── eb_dashboard.exe (16.5 MB, PyInstaller build)
    └── [Various .bat launch scripts]
```

---

## 🏗️ System Architecture Overview

### High-Level Component Diagram

```
┌─────────────────────────────────────────────────────────────────────┐
│                   ENDOBEST DASHBOARD MAIN PROCESS                   │
│                           eb_dashboard.py                           │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │ PHASE 1: INITIALIZATION & AUTHENTICATION                      │  │
│  │ ├─ User Login (IAM API)                                       │  │
│  │ ├─ Token Exchange (RC-specific)                               │  │
│  │ ├─ Config Loading (Excel parsing & validation)                │  │
│  │ └─ Thread Pool Setup (20 workers main, 40 subtasks)           │  │
│  └───────────────────────────────────────────────────────────────┘  │
│                                  ↓                                  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │ PHASE 2: ORGANIZATION & COUNTERS RETRIEVAL                    │  │
│  │ ├─ Get All Organizations (getAllOrganizations API)            │  │
│  │ ├─ Fetch Counters Parallelized (20 workers)                   │  │
│  │ ├─ Enrich with Center Mapping (optional)                      │  │
│  │ └─ Calculate Totals & Sort                                    │  │
│  └───────────────────────────────────────────────────────────────┘  │
│                                  ↓                                  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │ PHASE 3: PATIENT INCLUSION DATA COLLECTION                    │  │
│  │ Outer Loop: Organizations (20 parallel workers)               │  │
│  │ ├─ For Each Organization:                                     │  │
│  │ │  ├─ Get Inclusions List (POST /api/inclusions/search)       │  │
│  │ │  └─ For Each Patient (Sequential):                          │  │
│  │ │     ├─ Fetch Clinical Record (API)                          │  │
│  │ │     ├─ Fetch All Questionnaires (Optimized: 1 call)         │  │
│  │ │     ├─ Fetch Lab Requests (Async pool)                      │  │
│  │ │     ├─ Process Field Mappings (extraction + transform)      │  │
│  │ │     └─ Update Progress Bars (thread-safe)                   │  │
│  │ │                                                             │  │
│  │ │  Inner Async: Lab/Questionnaire Fetches (40 workers)        │  │
│  │ │  (Non-blocking I/O during main processing)                  │  │
│  │ └─ Combine Inclusions from All Orgs                           │  │
│  └───────────────────────────────────────────────────────────────┘  │
│                                  ↓                                  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │ PHASE 4: QUALITY ASSURANCE & VALIDATION                       │  │
│  │ ├─ Coherence Check (API stats vs actual data)                 │  │
│  │ │  └─ Compares counters with detailed records                 │  │
│  │ ├─ Non-Regression Check (config-driven)                       │  │
│  │ │  └─ Detects changes with severity levels                    │  │
│  │ └─ Critical Issue Handling (user confirmation if needed)      │  │
│  └───────────────────────────────────────────────────────────────┘  │
│                                  ↓                                  │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │ PHASE 5: EXPORT & PERSISTENCE                                 │  │
│  │ ├─ Backup Old Files (if quality passed)                       │  │
│  │ ├─ Write JSON Outputs (endobest_inclusions.json, etc.)        │  │
│  │ ├─ Export to Excel (if configured)                            │  │
│  │ │  ├─ Load Templates                                          │  │
│  │ │  ├─ Apply Filters & Sorts                                   │  │
│  │ │  ├─ Fill Data into Sheets                                   │  │
│  │ │  ├─ Replace Values                                          │  │
│  │ │  └─ Recalculate Formulas (win32com)                         │  │
│  │ └─ Display Summary & Elapsed Time                             │  │
│  └───────────────────────────────────────────────────────────────┘  │
│                                  ↓                                  │
│                                EXIT                                 │
└─────────────────────────────────────────────────────────────────────┘

                       ↓ EXTERNAL DEPENDENCIES ↓

┌─────────────────────────────────────────────────────────────────────┐
│                            EXTERNAL APIS                            │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  🔐 AUTHENTICATION (IAM)                                            │
│  └─ api-auth.ziwig-connect.com                                      │
│     ├─ POST /api/auth/ziwig-pro/login                               │
│     └─ POST /api/auth/refreshToken                                  │
│                                                                     │
│  🏥 RESEARCH CLINIC (RC)                                            │
│  └─ api-hcp.ziwig-connect.com                                       │
│     ├─ POST /api/auth/config-token                                  │
│     ├─ GET /api/inclusions/getAllOrganizations                      │
│     ├─ POST /api/inclusions/inclusion-statistics                    │
│     ├─ POST /api/inclusions/search                                  │
│     ├─ POST /api/records/byPatient                                  │
│     └─ POST /api/surveys/filter/with-answers (optimized!)           │
│                                                                     │
│  🧪 LAB / DIAGNOSTICS (GDD)                                         │
│  └─ api-lab.ziwig-connect.com                                       │
│     └─ GET /api/requests/by-tube-id/{tubeId}                        │
│                                                                     │
│  📁 EXCEL TEMPLATES                                                 │
│  └─ config/templates/                                               │
│     ├─ Endobest_Template.xlsx                                       │
│     ├─ Statistics_Template.xlsx                                     │
│     └─ (Custom templates)                                           │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
```

---

## 🔌 Module Descriptions

### 1. **eb_dashboard.py** - Main Orchestrator (57.5 KB)

**Responsibility:** Complete data collection workflow, API coordination, multithreaded execution

**Structure (9 blocks, plus sub-blocks 3B and 7b):**

| Block | Purpose | Key Functions |
|-------|---------|---------------|
| **1** | Configuration & Infrastructure | Constants, global vars, progress bar setup |
| **2** | Decorators & Resilience | `@api_call_with_retry`, retry logic |
| **3** | Authentication | `login()`, token exchange, IAM integration |
| **3B** | File Utilities | `load_json_file()` |
| **4** | Inclusions Mapping Config | `load_inclusions_mapping_config()`, validation |
| **5** | Data Search & Extraction | Questionnaire finding, field retrieval |
| **6** | Custom Functions | Business logic, calculated fields |
| **7** | Business API Calls | RC, GDD, organization endpoints |
| **7b** | Organization Center Mapping | `load_org_center_mapping()` |
| **8** | Processing Orchestration | `process_organization_patients()`, patient data processing |
| **9** | Main Execution | Entry point, quality checks, export |

**Key Technologies:**
- `httpx` - HTTP client (with thread-local instances)
- `openpyxl` - Excel parsing
- `concurrent.futures.ThreadPoolExecutor` - Parallel execution
- `tqdm` - Progress tracking
- `questionary` - Interactive prompts

---

### 2. **eb_dashboard_utils.py** - Utility Functions (6.4 KB)

**Responsibility:** Generic, reusable utility functions shared across modules

**Core Functions:**

```python
get_httpx_client()       # Thread-local HTTP client management
get_thread_position()    # Progress bar positioning
get_nested_value()       # JSON path navigation with wildcard support (*)
get_config_path()        # Config folder resolution (script vs PyInstaller)
get_old_filename()       # Backup filename generation
```

**Key Features:**
- Thread-safe HTTP client pooling
- Wildcard support in nested JSON paths (e.g., `["items", "*", "value"]`)
- Cross-platform path resolution

---

### 3. **eb_dashboard_quality_checks.py** - QA & Validation (58.5 KB)

**Responsibility:** Quality assurance, data validation, regression checking

**Core Functions:**

| Function | Purpose |
|----------|---------|
| `load_regression_check_config()` | Load regression rules from Excel |
| `run_quality_checks()` | Orchestrate all QA checks |
| `coherence_check()` | Verify stats vs detailed data consistency |
| `non_regression_check()` | Config-driven change validation |
| `run_check_only_mode()` | Standalone validation mode |
| `backup_output_files()` | Create versioned backups |

**Quality Check Types:**

1. **Coherence Check**
   - Compares API-provided organization statistics vs. actual inclusion counts
   - Severity: Warning/Critical
   - Example: Total API count (145) vs. actual inclusions (143)

2. **Non-Regression Check**
   - Compares current vs. previous run data
   - Applies config-driven rules with transition patterns
   - Detects: new inclusions, deletions, field changes
   - Severity: Warning/Critical with exceptions

---

### 4. **eb_dashboard_excel_export.py** - Excel Generation & Orchestration (38 KB, v1.1+)

**Responsibility:** Configuration-driven Excel workbook generation with data transformation + high-level orchestration

**Core Functions (Low-Level):**

| Function | Purpose |
|----------|---------|
| `load_excel_export_config()` | Load Excel_Workbooks + Excel_Sheets config |
| `validate_excel_config()` | Validate templates and named ranges |
| `export_to_excel()` | Main export orchestration (openpyxl + win32com) |
| `_apply_filter()` | AND-condition filtering |
| `_apply_sort()` | Multi-key sorting with datetime support |
| `_apply_value_replacement()` | Strict type matching value transformation |
| `_handle_output_exists()` | File conflict resolution |
| `_recalculate_workbook()` | Formula recalculation via win32com |
| `_process_sheet()` | Sheet-specific data filling |

**High-Level Orchestration Functions (v1.1+):**

| Function | Purpose | Called From |
|----------|---------|-------------|
| `export_excel_only()` | Complete --excel-only mode | main() CLI detection |
| `run_normal_mode_export()` | Normal mode export phase | main() after JSON write |
| `prepare_excel_export()` | Preparation + validation | Both orchestration functions |
| `execute_excel_export()` | Execution with error handling | Both orchestration functions |
| `_load_json_file_internal()` | Safe JSON loading | run_normal_mode_export() |

**Data Transformation Pipeline:**

```
1. Load Configuration (Excel_Workbooks + Excel_Sheets)
2. For each workbook:
   a. Load template (openpyxl)
   b. For each sheet:
      - Apply filter (AND conditions)
      - Apply sort (multi-key)
      - Apply value replacement (strict type matching)
      - Fill data into cells/named ranges
   c. Handle file conflicts (Overwrite/Increment/Backup)
   d. Save workbook (openpyxl)
   e. Recalculate formulas (win32com - optional)
```

**Orchestration Pattern (v1.1+):**

As of v1.1, the system delegates all export orchestration to dedicated functions, following the pattern established by `run_check_only_mode()` in the quality checks module:

1. **--excel-only mode:** The main script calls a single function → `export_excel_only()` handles everything
2. **Normal mode export:** The main script calls a single function → `run_normal_mode_export()` handles everything

This keeps the main script focused on business logic while all export mechanics are encapsulated in the module.

---

## 🔄 Complete Data Collection Workflow

### Phase 1: Initialization (2-3 seconds)

1. User provides credentials (with defaults)
2. IAM Login: `POST /api/auth/ziwig-pro/login`
3. Token Exchange: `POST /api/auth/config-token`
4. Load configuration from `Endobest_Dashboard_Config.xlsx`
5. Validate field mappings and quality check rules
6. Set up thread pools (main: 20 workers, subtasks: 40 workers)

### Phase 2: Organization Retrieval (5-8 seconds)

1. Get all organizations: `GET /api/inclusions/getAllOrganizations`
2. Filter excluded centers (config-driven)
3. Fetch counters in parallel (20 workers):
   - For each org: `POST /api/inclusions/inclusion-statistics`
   - Store: patients_count, preincluded_count, included_count, prematurely_terminated_count
4. Optional: Enrich with center mapping (from `eb_org_center_mapping.xlsx`)
5. Calculate totals and sort

### Phase 3: Patient Data Collection (2-4 minutes)

**Nested Parallel Architecture:**

**Outer Loop (20 workers):** For each organization
- `POST /api/inclusions/search?limit=1000&page=1` → Get up to 1000 inclusions

**Middle Loop (Sequential):** For each patient
- Fetch clinical record: `POST /api/records/byPatient`
- Fetch questionnaires: `POST /api/surveys/filter/with-answers` (**optimized: 1 call**)
- Submit async lab request: `GET /api/requests/by-tube-id/{tubeId}` (in subtasks pool)

**Inner Loop (40 async workers):** Non-blocking lab/questionnaire processing
- Parallel fetches of lab requests while main thread processes fields

**Field Processing (per patient):**
- For each field in configuration:
  1. Determine source (questionnaire, record, inclusion, request, calculated)
  2. Extract raw value (supports JSON paths with wildcards)
  3. Check field condition (optional)
  4. Apply post-processing transformations
  5. Format score dictionaries
  6. Store in nested output structure

### Phase 4: Quality Assurance (10-15 seconds)

1. **Coherence Check:** Compare API counters vs. actual data
2. **Non-Regression Check:** Compare current vs. previous run with config rules
3. **Critical Issue Handling:** User confirmation if issues detected
4. If no critical issues are found → continue to export
5. If critical issues are found → prompt the user for an override

### Phase 5: Export & Persistence (3-5 seconds)

**Step 1: Backup & JSON Write**

1. Backup old files (if quality checks passed)
2. Write JSON outputs:
   - `endobest_inclusions.json` (6-7 MB)
   - `endobest_organizations.json` (17-20 KB)

**Step 2: Excel Export (if configured)**

Delegated to the `run_normal_mode_export()` function, which handles:

1. Load JSONs from filesystem (ensures consistency)
2. Load Excel configuration
3. Validate templates and named ranges
4. For each configured workbook:
   - Load template file
   - Apply filter conditions (AND logic)
   - Apply multi-key sort
   - Apply value replacements (strict type matching)
   - Fill data into cells/named ranges
   - Handle file conflicts (Overwrite/Increment/Backup)
   - Save workbook
   - Recalculate formulas (optional, via win32com)
5. Display results and return status

**Step 3: Summary**

1. Display elapsed time
2. Report file locations
3. Note any warnings/errors during export

---

## ⚙️ Configuration System

### Three-Layer Configuration Architecture

#### Layer 1: Excel Configuration (`Endobest_Dashboard_Config.xlsx`)

**Sheet 1: Inclusions_Mapping** (Field Extraction)
- Define which patient fields to extract
- Specify sources (questionnaire, record, inclusion, request, calculated)
- Configure transformations (value labels, templates, conditions)
- ~50+ fields typically configured

**Sheet 2: Organizations_Mapping** (Organization Fields)
- Define which organization fields to export
- Rarely modified

**Sheet 3: Excel_Workbooks** (Excel Export Metadata)
- Workbook names
- Template paths
- Output filenames (with template variables)
- File conflict handling strategy (Overwrite/Increment/Backup)

**Sheet 4: Excel_Sheets** (Sheet Configurations)
- Workbook name (reference to Excel_Workbooks)
- Sheet name (in template)
- Source type (Inclusions/Organizations/Variable)
- Target (cell or named range)
- Column mapping (JSON)
- Filter conditions (JSON with AND logic)
- Sort keys (JSON, multi-key with datetime support)
- Value replacements (JSON, strict type matching)

**Sheet 5: Regression_Check** (Quality Rules)
- Rule names
- Field selection pipeline (include/exclude patterns)
- Scope (all organizations or specific org list)
- Transition patterns (expected state changes)
- Severity levels (Warning/Critical)

#### Layer 2: Organization Mapping (`eb_org_center_mapping.xlsx`)

- Optional mapping file
- Sheet: `Org_Center_Mapping`
- Maps organization names to center identifiers
- Degrades gracefully if the file is missing

#### Layer 3: Excel Templates (`config/templates/`)

- Excel workbook templates with:
  - Sheet definitions
  - Named ranges (for data fill targets)
  - Formula structures
  - Formatting and styles

### Configuration Constants (in code)

```python
# API Configuration
IAM_URL = "https://api-auth.ziwig-connect.com"
RC_URL = "https://api-hcp.ziwig-connect.com"
GDD_URL = "https://api-lab.ziwig-connect.com"
RC_APP_ID = "602aea51-cdb2-4f73-ac99-fd84050dc393"
RC_ENDOBEST_PROTOCOL_ID = "3c7bcb4d-91ed-4e9f-b93f-99d8447a276e"

# Threading & Performance
MAX_THREADS = 20          # Main thread pool workers
ASYNC_THREADS = 40        # Subtasks thread pool workers
ERROR_MAX_RETRY = 10      # Maximum retry attempts
WAIT_BEFORE_RETRY = 0.5   # Seconds between retries

# Excluded Organizations
RC_ENDOBEST_EXCLUDED_CENTERS = ["e18e7487-...", "5582bd75-...", "e053512f-..."]
```

---

## 🔐 API Integration

### Authentication Flow

```
1. IAM Login
   POST https://api-auth.ziwig-connect.com/api/auth/ziwig-pro/login
   Request:  {"username": "...", "password": "..."}
   Response: {"access_token": "jwt_master", "userId": "uuid"}

2. Token Exchange (RC-specific)
   POST https://api-hcp.ziwig-connect.com/api/auth/config-token
   Headers:  Authorization: Bearer {master_token}
   Request:  {"userId": "...", "clientId": "...", "userAgent": "..."}
   Response: {"access_token": "jwt_rc", "refresh_token": "refresh_token"}

3. Automatic Token Refresh (on 401)
   POST https://api-hcp.ziwig-connect.com/api/auth/refreshToken
   Headers:  Authorization: Bearer {current_token}
   Request:  {"refresh_token": "..."}
   Response: {"access_token": "jwt_new", "refresh_token": "new_refresh"}
```

### Key API Endpoints

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/api/inclusions/getAllOrganizations` | GET | List all organizations |
| `/api/inclusions/inclusion-statistics` | POST | Get patient counts per org |
| `/api/inclusions/search` | POST | Get inclusions list for org (paginated) |
| `/api/records/byPatient` | POST | Get clinical record for patient |
| `/api/surveys/filter/with-answers` | POST | **OPTIMIZED:** Get all questionnaires for patient |
| `/api/requests/by-tube-id/{tubeId}` | GET | Get lab test results |

### Performance Optimization: Questionnaire Batching

**Problem:** Multiple API calls per patient (1 call per questionnaire × N patients = slow)

**Solution:** Single optimized call retrieves all questionnaires with answers

```
BEFORE (inefficient):
  for qcm_id in questionnaire_ids:
    GET /api/surveys/{qcm_id}/answers?subject={patient_id}
  # Result: N API calls per patient

AFTER (optimized):
  POST /api/surveys/filter/with-answers
  {
    "context": "clinic_research",
    "subject": patient_id
  }
  # Result: 1 API call per patient
  # Impact: 4-5x performance improvement
```

---

## ⚡ Multithreading & Performance Optimization

### Thread Pool Architecture

```
                Main Application Thread
                           ↓
┌─ Phase 1: Counter Fetching ──────────────────────────┐
│  ThreadPoolExecutor(max_workers=user_input, cap=20)  │
│  ├─ Task 1: Get counters for Org 1                   │
│  ├─ Task 2: Get counters for Org 2                   │
│  └─ Task N: Get counters for Org N                   │
│  [Sequential wait: tqdm.as_completed]                │
└──────────────────────────────────────────────────────┘
                           ↓
┌─ Phase 2: Inclusion Data Collection (Nested) ────────┐
│  Outer: ThreadPoolExecutor(max_workers=user_input)   │
│                                                      │
│  For Org 1:                                          │
│  │ Inner: ThreadPoolExecutor(max_workers=40)         │
│  │ ├─ Patient 1: Async lab/questionnaire fetch       │
│  │ ├─ Patient 2: Async lab/questionnaire fetch       │
│  │ └─ Patient N: Async lab/questionnaire fetch       │
│  │ [Sequential outer wait: as_completed]             │
│  │                                                   │
│  For Org 2:                                          │
│  │ [Similar parallel processing]                     │
│  │                                                   │
│  For Org N:                                          │
│  │ [Similar parallel processing]                     │
└──────────────────────────────────────────────────────┘
```

### Performance Optimizations

1. **Thread-Local HTTP Clients**
   - Each thread maintains its own `httpx.Client`
   - Avoids connection conflicts
   - Implementation via `get_httpx_client()`

2. **Nested Parallelization**
   - Main pool: Organizations (20 workers)
   - Subtasks pool: Lab requests (40 workers)
   - Non-blocking I/O during processing

3. **Questionnaire Batching** (4-5x improvement)
   - Single call retrieves all questionnaires + answers
   - Eliminates N filtered calls per patient

4. **Configurable Worker Threads**
   - User input selection (1-20 workers)
   - Tunable for network bandwidth and API rate limits

### Progress Tracking (Multi-Level)

```
Overall Progress [████████████░░░░░░░░░░░░] 847/1200
1/15 - Center 1  [██████████░░░░░░░░░░░░░░░] 73/95
2/15 - Center 2  [██████░░░░░░░░░░░░░░░░░░░] 42/110
3/15 - Center 3  [████░░░░░░░░░░░░░░░░░░░░░] 28/85
```

**Thread-Safe Updates:**

```python
with _global_pbar_lock:
    if global_pbar:
        global_pbar.update(1)
```

---

## 🛡️ Error Handling & Resilience

### Token Management Strategy

1. **Automatic Token Refresh on 401**
   - Triggered by `@api_call_with_retry` decorator
   - Thread-safe via `_token_refresh_lock`

2. **Retry Mechanism**
   - Max retries: 10 attempts
   - Delay between retries: 0.5 seconds
   - Decorators: `@api_call_with_retry`

3. **Thread-Safe Token Refresh**

```python
def new_token():
    global access_token, refresh_token
    with _token_refresh_lock:  # Only one thread refreshes at a time
        for attempt in range(ERROR_MAX_RETRY):
            try:
                ...  # POST /api/auth/refreshToken, then update the global tokens
                return  # Stop retrying once the refresh succeeds
            except Exception:
                sleep(WAIT_BEFORE_RETRY)
```

### Exception Handling Categories

| Category | Examples | Handling |
|----------|----------|----------|
| **API Errors** | Network timeouts, HTTP errors | Retry up to `ERROR_MAX_RETRY` times, waiting `WAIT_BEFORE_RETRY` seconds between attempts |
| **File I/O Errors** | Missing config, permission denied | Graceful error + exit |
| **Validation Errors** | Invalid config, incoherent data | Log warning + prompt user |
| **Thread Errors** | Worker thread failures | Shut down gracefully + propagate |

### Graceful Degradation

1. **Missing Organization Mapping:** Skip silently, use fallback (org name)
2. **Critical Quality Issues:** Prompt user for confirmation before export
3. **Thread Failure:** Shut down all workers gracefully, preserve partial results
4. **Invalid Configuration:** Clear error messages with remediation suggestions

---

## 📊 Data Output Structure

### JSON Output: `endobest_inclusions.json`

```json
[
  {
    "Patient_Identification": {
      "Organisation_Id": "uuid",
      "Organisation_Name": "Hospital Name",
      "Center_Name": "HOSP-A",
      "Patient_Id": "internal_id",
      "Pseudo": "ENDO-001",
      "Patient_Name": "Doe, John",
      "Patient_Birthday": "1975-05-15",
      "Patient_Age": 49
    },
    "Inclusion": {
      "Consent_Signed": true,
      "Inclusion_Date": "15/10/2024",
      "Inclusion_Status": "incluse",
      "isPrematurelyTerminated": false
    },
    "Extended_Fields": {
      "Custom_Field_1": "value",
      "Custom_Field_2": 42,
      "Composite_Score": "8/10"
    },
    "Endotest": {
      "Request_Sent": true,
      "Diagnostic_Status": "Completed"
    }
  }
]
```

### JSON Output: `endobest_organizations.json`

```json
[
  {
    "id": "org-uuid",
    "name": "Hospital A",
    "Center_Name": "HOSP-A",
    "patients_count": 45,
    "preincluded_count": 8,
    "included_count": 35,
    "prematurely_terminated_count": 2
  }
]
```

---

## 🚀 Execution Modes

### Mode 1: Normal (Full Collection)

```bash
python eb_dashboard.py
```

- Authenticates
- Collects from APIs
- Runs quality checks
- Exports JSON + Excel
- Duration: 2.5-5 minutes (typical)

### Mode 2: Excel-Only (Fast Export)

```bash
python eb_dashboard.py --excel-only
```

- Skips data collection
- Uses existing JSON files
- Regenerates Excel workbooks
- Duration: 5-15 seconds
- Use case: Reconfigure reports, test templates

### Mode 3: Check-Only (Validation Only)

```bash
python eb_dashboard.py --check-only
```

- Loads existing JSON
- Runs quality checks
- No export
- Duration: 5-10 seconds
- Use case: Verify data before distribution

### Mode 4: Debug (Verbose Output)

```bash
python eb_dashboard.py --debug
```

- Executes normal mode
- Enables detailed logging
- Shows field-by-field changes
- Check `dashboard.log` for details

---

## 📈 Performance Metrics & Benchmarks

### Typical Execution Times (Full Dataset: 1,200+ patients, 15+ organizations)

| Phase | Duration | Notes |
|-------|----------|-------| | **Login & Config** | 2-3 sec | Sequential, network-dependent | | **Fetch Counters** | 5-8 sec | 20 workers, parallelized | | **Collect Inclusions** | 2-4 min | Includes API calls + field processing | | **Quality Checks** | 10-15 sec | File loads, data comparison | | **Export to JSON** | 3-5 sec | File I/O | | **Export to Excel** | 5-15 sec | Template processing + fill | | **TOTAL** | **~2.5-5 min** | Depends on network, API perf | ### Network Optimization Impact **With old questionnaire approach (N filtered calls per patient):** - 1,200 patients ร— 15 questionnaires = 18,000 API calls - Estimated: 15-30 minutes **With optimized single-call questionnaire:** - 1,200 patients ร— 1 call = 1,200 API calls - Estimated: 2-5 minutes - **Improvement: 3-6x faster** โœ… --- ## ๐Ÿ” Field Extraction & Processing Logic ### Complete Field Processing Pipeline ``` For each field in INCLUSIONS_MAPPING_CONFIG: โ”‚ โ”œโ”€ Step 1: Determine Source Type โ”‚ โ”œโ”€ q_id / q_name / q_category โ†’ Find questionnaire โ”‚ โ”œโ”€ record โ†’ Use clinical record โ”‚ โ”œโ”€ inclusion โ†’ Use patient inclusion data โ”‚ โ”œโ”€ request โ†’ Use lab request data โ”‚ โ””โ”€ calculated โ†’ Execute custom function โ”‚ โ”œโ”€ Step 2: Extract Raw Value โ”‚ โ”œโ”€ Navigate JSON using field_path โ”‚ โ”œโ”€ Supports wildcard (*) for list traversal โ”‚ โ””โ”€ Return value or "undefined" โ”‚ โ”œโ”€ Step 3: Check Field Condition (optional) โ”‚ โ”œโ”€ If condition undefined โ†’ Set to "undefined" โ”‚ โ”œโ”€ If condition not boolean โ†’ Error flag โ”‚ โ”œโ”€ If condition false โ†’ Set to "N/A" โ”‚ โ””โ”€ If condition true โ†’ Continue โ”‚ โ”œโ”€ Step 4: Apply Post-Processing Transformations โ”‚ โ”œโ”€ true_if_any: Convert to boolean โ”‚ โ”œโ”€ value_labels: Map to localized text โ”‚ โ”œโ”€ field_template: Apply formatting โ”‚ โ””โ”€ List joining: Flatten arrays with pipe delimiter โ”‚ โ”œโ”€ Step 5: Format Score Dictionaries โ”‚ โ”œโ”€ If {total, max} โ†’ Format as "total/max" โ”‚ 
โ””โ”€ Otherwise โ†’ Keep as-is โ”‚ โ””โ”€ Store: output_inclusion[field_group][field_name] = final_value ``` ### Custom Functions for Calculated Fields | Function | Purpose | Syntax | |----------|---------|--------| | `search_in_fields_using_regex` | Search multiple fields for pattern | `["search_in_fields_using_regex", "pattern", "field1", "field2"]` | | `extract_parentheses_content` | Extract text within parentheses | `["extract_parentheses_content", "field_name"]` | | `append_terminated_suffix` | Add suffix if patient terminated | `["append_terminated_suffix", "status_field", "is_terminated_field"]` | | `if_then_else` | Unified conditional with 8 operators | `["if_then_else", "operator", arg1, arg2_optional, true_result, false_result]` | **if_then_else Operators:** - `is_true` / `is_false` - Boolean field test - `is_defined` / `is_undefined` - Existence test - `all_true` / `all_defined` - Multiple field test - `==` / `!=` - Value comparison --- ## โœ… Quality Assurance Framework ### Coherence Check **Purpose:** Verify API-provided statistics match actual collected data **Logic:** ``` For each organization: API_Count = statistic.total Actual_Count = count of inclusion records if API_Count != Actual_Count: Report discrepancy with severity โ”œโ”€ ยฑ10%: Warning โ””โ”€ >ยฑ10%: Critical ``` ### Non-Regression Check **Purpose:** Detect unexpected changes between data runs **Configuration-Driven Rules:** - Field selection pipeline (include/exclude patterns) - Transition patterns (expected state changes) - Severity levels (Warning/Critical) - Exception handling (exclude specific organizations) **Logic:** ``` Load previous inclusion data (_old file) For each rule: โ”œโ”€ Build candidate fields via pipeline โ”œโ”€ Determine key field for matching โ””โ”€ For each inclusion: โ”œโ”€ Find matching old inclusion by key โ”œโ”€ Check for unexpected transitions โ”œโ”€ Apply exceptions โ””โ”€ Report violations ``` --- ## ๐Ÿ“‹ Documentation Structure The system includes 
comprehensive documentation:

| Document | Size | Content |
|----------|------|---------|
| **DOCUMENTATION_10_ARCHITECTURE.md** | 43.7 KB | System design, workflow, APIs, multithreading |
| **DOCUMENTATION_11_FIELD_MAPPING.md** | 56.3 KB | Field extraction logic, custom functions, examples |
| **DOCUMENTATION_12_QUALITY_CHECKS.md** | 60.2 KB | QA framework, regression rules, configuration |
| **DOCUMENTATION_13_EXCEL_EXPORT.md** | 29.6 KB | Excel generation, data transformation, config |
| **DOCUMENTATION_98_USER_GUIDE.md** | 8.4 KB | End-user instructions, troubleshooting, FAQ |
| **DOCUMENTATION_99_CONFIG_GUIDE.md** | 24.8 KB | Administrator reference, Excel tables, examples |

---

## 🔧 Key Technical Features

### Thread Safety
- Per-thread HTTP clients (no connection conflicts)
- Synchronized access to global state via locks
- Thread-safe progress bar updates

### Error Recovery
- Automatic token refresh on 401 errors
- Exponential backoff retry logic (configurable)
- Graceful degradation for optional features
- User confirmation on critical issues

### Configuration Flexibility
- 100% externalized to Excel (zero code changes)
- Supports multiple data sources
- Custom business logic functions
- Field dependencies and conditions
- Value transformations and templates

### Performance
- Optimized API calls (4-5x improvement)
- Parallel processing (20+ workers)
- Async I/O operations
- Configurable thread pools

### Data Quality
- Coherence checking (stats vs actual data)
- Non-regression testing (config-driven)
- Comprehensive validation
- Audit trail logging

---

## 📦 Dependencies

### Core Libraries
- **httpx** - HTTP client with connection pooling
- **openpyxl** - Excel file reading/writing
- **questionary** - Interactive CLI prompts
- **tqdm** - Progress bars
- **rich** - Rich text formatting
- **pywin32** - Windows COM automation (optional, for formula recalculation)
- **pytz** - Timezone support (optional)

### Python Version
- Python 3.7+

### External Services
- Ziwig IAM API
- Ziwig Research Clinic (RC) API
- Ziwig Lab (GDD) API

---

## 🎓 Usage Patterns

### For End Users
1. Configure fields in Excel (no code needed)
2. Run: `python eb_dashboard.py`
3. Review results in JSON or Excel

### For Administrators
1. Add new fields to `Inclusions_Mapping`
2. Define quality rules in `Regression_Check`
3. Configure Excel export in `Excel_Workbooks` + `Excel_Sheets`
4. Restart: script picks up config automatically

### For Developers
1. Add custom function to Block 6 (eb_dashboard.py)
2. Register in field config (Inclusions_Mapping)
3. Use via: `"source_id": "function_name"`
4. No code recompile needed for other changes

---

## 🎯 Summary

The **Endobest Clinical Research Dashboard** represents a mature, production-ready system that successfully combines:

✅ **Architectural Excellence** - Clean modular design with separation of concerns
✅ **User-Centric Configuration** - 100% externalized, no code changes needed
✅ **Performance Optimization** - 4-5x faster via API and threading improvements
✅ **Robust Resilience** - Comprehensive error handling, automatic recovery, graceful degradation
✅ **Quality Assurance** - Multi-level validation, coherence checks, regression testing
✅ **Comprehensive Documentation** - 250+ KB of technical and user guides
✅ **Maintainability** - Clear code structure, extensive logging, audit trails

The system successfully enables non-technical users to configure complex data extraction and reporting workflows while maintaining enterprise-grade reliability and performance standards.

---

**Document Version:** 1.0
**Last Updated:** 2025-11-08
**Status:** ✅ Complete & Production Ready
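
The coherence check described under the Quality Assurance Framework can be sketched in a few lines of Python. This is a minimal illustration, not the code in `eb_dashboard_quality_checks.py`: the function names `classify_discrepancy` and `coherence_report`, the `organization_id` key, and the choice to measure the ±10% tolerance relative to the API-reported count are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def classify_discrepancy(api_count: int, actual_count: int) -> Optional[str]:
    """Return None when the counts match, otherwise 'Warning' or 'Critical'.

    Assumption: the tolerance is relative to the API-reported count.
    Integer arithmetic (10 * gap <= api_count) avoids float rounding at
    exactly 10%, and also makes any gap against an API count of 0 Critical.
    """
    if api_count == actual_count:
        return None
    gap = abs(actual_count - api_count)
    return "Warning" if 10 * gap <= api_count else "Critical"


def coherence_report(
    api_totals: Dict[str, int],
    inclusions: List[dict],
) -> List[Tuple[str, int, int, str]]:
    """Compare API statistics with the number of collected inclusion records.

    api_totals maps a (hypothetical) organization id to statistic.total;
    inclusions is the list of collected records, each carrying the
    hypothetical key 'organization_id'.
    """
    actual = Counter(inc["organization_id"] for inc in inclusions)
    report = []
    for org_id, api_count in api_totals.items():
        actual_count = actual[org_id]  # Counter yields 0 for unseen ids
        severity = classify_discrepancy(api_count, actual_count)
        if severity is not None:
            report.append((org_id, api_count, actual_count, severity))
    return report


# Example data: org_a matches, org_b is 6% off, org_c is 50% off.
api_totals = {"org_a": 100, "org_b": 50, "org_c": 20}
inclusions = (
    [{"organization_id": "org_a"}] * 100
    + [{"organization_id": "org_b"}] * 47
    + [{"organization_id": "org_c"}] * 10
)
for row in coherence_report(api_totals, inclusions):
    print(row)
```

Keeping the severity rule in its own function means the threshold can be changed or unit-tested without touching the counting logic.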