Working version

2025-12-12 23:07:26 +01:00
commit cb8b5d9a12
42 changed files with 465285 additions and 0 deletions
--- a/DOCUMENTATION/DOCUMENTATION_30_ARCHITECTURE_SUMMARY.md
+++ b/DOCUMENTATION/DOCUMENTATION_30_ARCHITECTURE_SUMMARY.md
@@ -0,0 +1,990 @@
+# 📊 Endobest Clinical Research Dashboard - Architecture Summary
+
+**Last Updated:** 2025-11-08
+**Project Status:** Production Ready with Excel Export Feature
+**Language:** Python 3.x
+
+---
+
+## 🎯 Executive Summary
+
+The **Endobest Clinical Research Dashboard** is an automated data collection and reporting system that aggregates patient inclusion data from the Endobest clinical research protocol across multiple healthcare organizations. It combines multithreaded data collection, quality assurance checks, and fully externalized configuration so that non-technical users can manage data extraction workflows without modifying code.
+
+### Core Value Propositions
+
+✅ **100% Externalized Configuration** - All field definitions, quality rules, and export logic defined in Excel
+✅ **High-Performance Architecture** - 4-5x faster via optimized API calls and parallel processing
+✅ **Robust Resilience** - Automatic token refresh, retries, graceful degradation
+✅ **Comprehensive Quality Assurance** - Coherence checks + config-driven regression testing
+✅ **Multi-Format Export** - JSON + configurable Excel workbooks with data transformation
+✅ **User-Friendly Interface** - Interactive prompts, progress tracking, clear error messages
+
+---
+
+## 📁 Project Structure
+
+```
+Endobest Dashboard/
+├── 📜 MAIN SCRIPT
+│   └── eb_dashboard.py                      (57.5 KB, 1,021 lines)
+│       Core orchestrator for data collection, processing, and export
+│
+├── 🔧 UTILITY MODULES
+│   ├── eb_dashboard_utils.py                (6.4 KB, 184 lines)
+│   │   Thread-safe HTTP clients, nested data navigation, config resolution
+│   │
+│   ├── eb_dashboard_quality_checks.py       (58.5 KB, 1,266 lines)
+│   │   Coherence checks, non-regression testing, data validation
+│   │
+│   └── eb_dashboard_excel_export.py         (32 KB, ~1,000 lines)
+│       Configuration-driven Excel workbook generation
+│
+├── 📚 DOCUMENTATION
+│   ├── DOCUMENTATION_10_ARCHITECTURE.md     (43.7 KB)
+│   │   System design, data flow, API integration, multithreading
+│   │
+│   ├── DOCUMENTATION_11_FIELD_MAPPING.md    (56.3 KB)
+│   │   Field extraction logic, custom functions, transformations
+│   │
+│   ├── DOCUMENTATION_12_QUALITY_CHECKS.md   (60.2 KB)
+│   │   Quality assurance framework, regression rules, validation logic
+│   │
+│   ├── DOCUMENTATION_13_EXCEL_EXPORT.md     (29.6 KB)
+│   │   Excel generation architecture, data transformation pipeline
+│   │
+│   ├── DOCUMENTATION_98_USER_GUIDE.md       (8.4 KB)
+│   │   End-user instructions, quick start, troubleshooting
+│   │
+│   └── DOCUMENTATION_99_CONFIG_GUIDE.md     (24.8 KB)
+│       Administrator configuration reference
+│
+├── ⚙️  CONFIGURATION
+│   └── config/
+│       ├── Endobest_Dashboard_Config.xlsx   (Configuration file)
+│       │   Inclusions_Mapping
+│       │   Organizations_Mapping
+│       │   Excel_Workbooks
+│       │   Excel_Sheets
+│       │   Regression_Check
+│       │
+│       ├── eb_org_center_mapping.xlsx       (Organization enrichment)
+│       │
+│       └── templates/
+│           ├── Endobest_Template.xlsx
+│           ├── Statistics_Template.xlsx
+│           └── (Other Excel templates)
+│
+├── 📊 OUTPUT FILES
+│   ├── endobest_inclusions.json             (~6-7 MB, patient data)
+│   ├── endobest_inclusions_old.json         (backup)
+│   ├── endobest_organizations.json          (~17-20 KB, stats)
+│   ├── endobest_organizations_old.json      (backup)
+│   ├── [Excel outputs]                      (*.xlsx, configurable)
+│   └── dashboard.log                        (Execution log)
+│
+└── 🔨 EXECUTABLES
+    ├── eb_dashboard.exe                     (16.5 MB, PyInstaller build)
+    └── [Various .bat launch scripts]
+```
+
+---
+
+## 🏗️ System Architecture Overview
+
+### High-Level Component Diagram
+
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│                   ENDOBEST DASHBOARD MAIN PROCESS                   │
+│                        eb_dashboard.py                              │
+├─────────────────────────────────────────────────────────────────────┤
+│                                                                     │
+│  ┌──────────────────────────────────────────────────────────────┐  │
+│  │  PHASE 1: INITIALIZATION & AUTHENTICATION                   │  │
+│  │  ├─ User Login (IAM API)                                    │  │
+│  │  ├─ Token Exchange (RC-specific)                            │  │
+│  │  ├─ Config Loading (Excel parsing & validation)            │  │
+│  │  └─ Thread Pool Setup (20 workers main, 40 subtasks)       │  │
+│  └──────────────────────────────────────────────────────────────┘  │
+│                              ↓                                      │
+│  ┌──────────────────────────────────────────────────────────────┐  │
+│  │  PHASE 2: ORGANIZATION & COUNTERS RETRIEVAL                │  │
+│  │  ├─ Get All Organizations (getAllOrganizations API)        │  │
+│  │  ├─ Fetch Counters Parallelized (20 workers)               │  │
+│  │  ├─ Enrich with Center Mapping (optional)                  │  │
+│  │  └─ Calculate Totals & Sort                                │  │
+│  └──────────────────────────────────────────────────────────────┘  │
+│                              ↓                                      │
+│  ┌──────────────────────────────────────────────────────────────┐  │
+│  │  PHASE 3: PATIENT INCLUSION DATA COLLECTION                │  │
+│  │  Outer Loop: Organizations (20 parallel workers)           │  │
+│  │  ├─ For Each Organization:                                 │  │
+│  │  │  ├─ Get Inclusions List (POST /api/inclusions/search)  │  │
+│  │  │  └─ For Each Patient (Sequential):                      │  │
+│  │  │     ├─ Fetch Clinical Record (API)                      │  │
+│  │  │     ├─ Fetch All Questionnaires (Optimized: 1 call)    │  │
+│  │  │     ├─ Fetch Lab Requests (Async pool)                  │  │
+│  │  │     ├─ Process Field Mappings (extraction + transform)  │  │
+│  │  │     └─ Update Progress Bars (thread-safe)               │  │
+│  │  │                                                         │  │
+│  │  │  Inner Async: Lab/Questionnaire Fetches (40 workers)   │  │
+│  │  │     (Non-blocking I/O during main processing)           │  │
+│  │  └─ Combine Inclusions from All Orgs                       │  │
+│  └──────────────────────────────────────────────────────────────┘  │
+│                              ↓                                      │
+│  ┌──────────────────────────────────────────────────────────────┐  │
+│  │  PHASE 4: QUALITY ASSURANCE & VALIDATION                   │  │
+│  │  ├─ Coherence Check (API stats vs actual data)             │  │
+│  │  │  └─ Compares counters with detailed records             │  │
+│  │  ├─ Non-Regression Check (config-driven)                   │  │
+│  │  │  └─ Detects changes with severity levels                │  │
+│  │  └─ Critical Issue Handling (user confirmation if needed)  │  │
+│  └──────────────────────────────────────────────────────────────┘  │
+│                              ↓                                      │
+│  ┌──────────────────────────────────────────────────────────────┐  │
+│  │  PHASE 5: EXPORT & PERSISTENCE                             │  │
+│  │  ├─ Backup Old Files (if quality passed)                   │  │
+│  │  ├─ Write JSON Outputs (endobest_inclusions.json, etc.)   │  │
+│  │  ├─ Export to Excel (if configured)                        │  │
+│  │  │  ├─ Load Templates                                      │  │
+│  │  │  ├─ Apply Filters & Sorts                               │  │
+│  │  │  ├─ Fill Data into Sheets                               │  │
+│  │  │  ├─ Replace Values                                      │  │
+│  │  │  └─ Recalculate Formulas (win32com)                     │  │
+│  │  └─ Display Summary & Elapsed Time                         │  │
+│  └──────────────────────────────────────────────────────────────┘  │
+│                              ↓                                      │
+│                           EXIT                                      │
+└─────────────────────────────────────────────────────────────────────┘
+
+                    ↓ EXTERNAL DEPENDENCIES ↓
+
+┌─────────────────────────────────────────────────────────────────────┐
+│                        EXTERNAL APIS                                │
+├─────────────────────────────────────────────────────────────────────┤
+│                                                                     │
+│  🔐 AUTHENTICATION (IAM)                                           │
+│     └─ api-auth.ziwig-connect.com                                  │
+│        ├─ POST /api/auth/ziwig-pro/login                           │
+│        └─ POST /api/auth/refreshToken                              │
+│                                                                     │
+│  🏥 RESEARCH CLINIC (RC)                                           │
+│     └─ api-hcp.ziwig-connect.com                                   │
+│        ├─ POST /api/auth/config-token                              │
+│        ├─ GET /api/inclusions/getAllOrganizations                  │
+│        ├─ POST /api/inclusions/inclusion-statistics                │
+│        ├─ POST /api/inclusions/search                              │
+│        ├─ POST /api/records/byPatient                              │
+│        └─ POST /api/surveys/filter/with-answers (optimized!)      │
+│                                                                     │
+│  🧪 LAB / DIAGNOSTICS (GDD)                                        │
+│     └─ api-lab.ziwig-connect.com                                   │
+│        └─ GET /api/requests/by-tube-id/{tubeId}                    │
+│                                                                     │
+│  📝 EXCEL TEMPLATES                                                │
+│     └─ config/templates/                                           │
+│        ├─ Endobest_Template.xlsx                                   │
+│        ├─ Statistics_Template.xlsx                                 │
+│        └─ (Custom templates)                                       │
+│                                                                     │
+└─────────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## 🔌 Module Descriptions
+
+### 1. **eb_dashboard.py** - Main Orchestrator (57.5 KB)
+
+**Responsibility:** Complete data collection workflow, API coordination, multithreaded execution
+
+**Structure (9 numbered blocks, plus sub-blocks 3B and 7b):**
+
+| Block | Purpose | Key Functions |
+|-------|---------|---|
+| **1** | Configuration & Infrastructure | Constants, global vars, progress bar setup |
+| **2** | Decorators & Resilience | `@api_call_with_retry`, retry logic |
+| **3** | Authentication | `login()`, token exchange, IAM integration |
+| **3B** | File Utilities | `load_json_file()` |
+| **4** | Inclusions Mapping Config | `load_inclusions_mapping_config()`, validation |
+| **5** | Data Search & Extraction | Questionnaire finding, field retrieval |
+| **6** | Custom Functions | Business logic, calculated fields |
+| **7** | Business API Calls | RC, GDD, organization endpoints |
+| **7b** | Organization Center Mapping | `load_org_center_mapping()` |
+| **8** | Processing Orchestration | `process_organization_patients()`, patient data processing |
+| **9** | Main Execution | Entry point, quality checks, export |
+
+**Key Technologies:**
+- `httpx` - HTTP client (with thread-local instances)
+- `openpyxl` - Excel parsing
+- `concurrent.futures.ThreadPoolExecutor` - Parallel execution
+- `tqdm` - Progress tracking
+- `questionary` - Interactive prompts
+
+---
+
+### 2. **eb_dashboard_utils.py** - Utility Functions (6.4 KB)
+
+**Responsibility:** Generic, reusable utility functions shared across modules
+
+**Core Functions:**
+
+```python
+get_httpx_client()          # Thread-local HTTP client management
+get_thread_position()       # Progress bar positioning
+get_nested_value()          # JSON path navigation with wildcard support (*)
+get_config_path()           # Config folder resolution (script vs PyInstaller)
+get_old_filename()          # Backup filename generation
+```
+
+**Key Features:**
+- Thread-safe HTTP client pooling
+- Wildcard support in nested JSON paths (e.g., `["items", "*", "value"]`)
+- Cross-platform path resolution
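
To make the wildcard behaviour concrete, here is a minimal sketch of how a `get_nested_value()`-style helper could traverse a path such as `["items", "*", "value"]`. The real signature is not shown in this summary, so the `default` parameter, the `_walk` helper, and the choice to skip list elements that lack the remaining path are all assumptions.

```python
_MISSING = object()  # internal marker, never returned to callers


def _walk(data, path):
    """Follow `path` through nested dicts and lists (hypothetical helper)."""
    if not path:
        return data
    step, rest = path[0], path[1:]
    if step == "*":
        if not isinstance(data, list):
            return _MISSING
        # Fan out over every element and drop those where the path is absent.
        found = [_walk(item, rest) for item in data]
        return [value for value in found if value is not _MISSING]
    if isinstance(data, dict) and step in data:
        return _walk(data[step], rest)
    if isinstance(data, list) and isinstance(step, int) and -len(data) <= step < len(data):
        return _walk(data[step], rest)
    return _MISSING


def get_nested_value(data, path, default="undefined"):
    """Return the value at `path`, or `default` when the path does not exist."""
    value = _walk(data, path)
    return default if value is _MISSING else value


record = {"items": [{"value": 1}, {"other": 2}, {"value": 3}]}
values = get_nested_value(record, ["items", "*", "value"])
absent = get_nested_value(record, ["unknown", "key"])
```

In this sketch, `values` collects one entry per list element that has a `value` key, and `absent` falls back to the `"undefined"` default mentioned in the field processing pipeline.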
+
+---
+
+### 3. **eb_dashboard_quality_checks.py** - QA & Validation (58.5 KB)
+
+**Responsibility:** Quality assurance, data validation, regression checking
+
+**Core Functions:**
+
+| Function | Purpose |
+|----------|---------|
+| `load_regression_check_config()` | Load regression rules from Excel |
+| `run_quality_checks()` | Orchestrate all QA checks |
+| `coherence_check()` | Verify stats vs detailed data consistency |
+| `non_regression_check()` | Config-driven change validation |
+| `run_check_only_mode()` | Standalone validation mode |
+| `backup_output_files()` | Create versioned backups |
+
+**Quality Check Types:**
+
+1. **Coherence Check**
+   - Compares API-provided organization statistics vs. actual inclusion counts
+   - Severity: Warning/Critical
+   - Example: Total API count (145) vs. actual inclusions (143)
+
+2. **Non-Regression Check**
+   - Compares current vs. previous run data
+   - Applies config-driven rules with transition patterns
+   - Detects: new inclusions, deletions, field changes
+   - Severity: Warning/Critical with exceptions
+
+---
+
+### 4. **eb_dashboard_excel_export.py** - Excel Generation & Orchestration (38 KB, v1.1+)
+
+**Responsibility:** Configuration-driven Excel workbook generation with data transformation + high-level orchestration
+
+**Core Functions (Low-Level):**
+
+| Function | Purpose |
+|----------|---------|
+| `load_excel_export_config()` | Load Excel_Workbooks + Excel_Sheets config |
+| `validate_excel_config()` | Validate templates and named ranges |
+| `export_to_excel()` | Main export orchestration (openpyxl + win32com) |
+| `_apply_filter()` | AND-condition filtering |
+| `_apply_sort()` | Multi-key sorting with datetime support |
+| `_apply_value_replacement()` | Strict type matching value transformation |
+| `_handle_output_exists()` | File conflict resolution |
+| `_recalculate_workbook()` | Formula recalculation via win32com |
+| `_process_sheet()` | Sheet-specific data filling |
+
+**High-Level Orchestration Functions (v1.1+):**
+
+| Function | Purpose | Called From |
+|----------|---------|-------------|
+| `export_excel_only()` | Complete --excel-only mode | main() CLI detection |
+| `run_normal_mode_export()` | Normal mode export phase | main() after JSON write |
+| `prepare_excel_export()` | Preparation + validation | Both orchestration functions |
+| `execute_excel_export()` | Execution with error handling | Both orchestration functions |
+| `_load_json_file_internal()` | Safe JSON loading | run_normal_mode_export() |
+
+**Data Transformation Pipeline:**
+```
+1. Load Configuration (Excel_Workbooks + Excel_Sheets)
+2. For each workbook:
+   a. Load template (openpyxl)
+   b. For each sheet:
+      - Apply filter (AND conditions)
+      - Apply sort (multi-key)
+      - Apply value replacement (strict type matching)
+      - Fill data into cells/named ranges
+   c. Handle file conflicts (Overwrite/Increment/Backup)
+   d. Save workbook (openpyxl)
+   e. Recalculate formulas (win32com - optional)
+```
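
The three per-sheet transformations in step 2b can be sketched as small pure functions. This is an illustration under assumptions, not the module's code: the function names mirror `_apply_filter()`, `_apply_sort()`, and `_apply_value_replacement()`, but the argument shapes (a dict of conditions, a list of `(field, descending, date_format)` tuples, a list of `(old, new)` pairs) are hypothetical, since the JSON formats of the `Excel_Sheets` columns are not reproduced here.

```python
from datetime import datetime


def apply_filter(rows, conditions):
    """Keep rows matching every field/value pair (AND logic)."""
    return [row for row in rows
            if all(row.get(field) == expected for field, expected in conditions.items())]


def apply_sort(rows, keys):
    """Multi-key sort; each key is (field, descending, date_format_or_None)."""
    ordered = list(rows)
    # Sort by the least significant key first; list.sort is stable,
    # so earlier keys end up taking precedence.
    for field, descending, date_format in reversed(keys):
        def sort_key(row, field=field, date_format=date_format):
            raw = row[field]
            return datetime.strptime(raw, date_format) if date_format else raw
        ordered.sort(key=sort_key, reverse=descending)
    return ordered


def apply_value_replacement(value, replacements):
    """Replace only on an exact type and value match, so 1 never matches True."""
    for old, new in replacements:
        if type(value) is type(old) and value == old:
            return new
    return value


rows = [
    {"Pseudo": "ENDO-001", "Inclusion_Status": "incluse", "Inclusion_Date": "02/11/2024"},
    {"Pseudo": "ENDO-002", "Inclusion_Status": "incluse", "Inclusion_Date": "15/10/2024"},
    {"Pseudo": "ENDO-003", "Inclusion_Status": "other", "Inclusion_Date": "01/01/2024"},
]
included = apply_filter(rows, {"Inclusion_Status": "incluse"})
by_date = apply_sort(included, [("Inclusion_Date", False, "%d/%m/%Y")])
```

Parsing the `dd/mm/yyyy` strings before sorting matters here: compared as plain text, `"02/11/2024"` would sort before `"15/10/2024"` even though it is the later date.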
+
+**Orchestration Pattern (v1.1+):**
+
+As of v1.1, the system delegates all export orchestration to dedicated functions, following the pattern established by `run_check_only_mode()` in `eb_dashboard_quality_checks.py`:
+
+1. **--excel-only mode:** Main script calls single function → `export_excel_only()` handles everything
+2. **Normal mode export:** Main script calls single function → `run_normal_mode_export()` handles everything
+
+This keeps the main script focused on business logic while all export mechanics are encapsulated in the module.
+
+---
+
+## 🔄 Complete Data Collection Workflow
+
+### Phase 1: Initialization (2-3 seconds)
+1. User provides credentials (with defaults)
+2. IAM Login: `POST /api/auth/ziwig-pro/login`
+3. Token Exchange: `POST /api/auth/config-token`
+4. Load configuration from `Endobest_Dashboard_Config.xlsx`
+5. Validate field mappings and quality check rules
+6. Setup thread pools (main: 20 workers, subtasks: 40 workers)
+
+### Phase 2: Organization Retrieval (5-8 seconds)
+1. Get all organizations: `GET /api/inclusions/getAllOrganizations`
+2. Filter excluded centers (config-driven)
+3. Fetch counters in parallel (20 workers):
+   - For each org: `POST /api/inclusions/inclusion-statistics`
+   - Store: patients_count, preincluded_count, included_count, prematurely_terminated_count
+4. Optional: Enrich with center mapping (from `eb_org_center_mapping.xlsx`)
+5. Calculate totals and sort
+
+### Phase 3: Patient Data Collection (2-4 minutes)
+**Nested Parallel Architecture:**
+
+**Outer Loop (20 workers):** For each organization
+- `POST /api/inclusions/search?limit=1000&page=1` → Get up to 1000 inclusions
+
+**Middle Loop (Sequential):** For each patient
+- Fetch clinical record: `POST /api/records/byPatient`
+- Fetch questionnaires: `POST /api/surveys/filter/with-answers` (**optimized: 1 call**)
+- Submit async lab request: `GET /api/requests/by-tube-id/{tubeId}` (in subtasks pool)
+
+**Inner Loop (40 async workers):** Non-blocking lab/questionnaire processing
+- Parallel fetches of lab requests while main thread processes fields
+
+**Field Processing (per patient):**
+- For each field in configuration:
+  1. Determine source (questionnaire, record, inclusion, request, calculated)
+  2. Extract raw value (supports JSON paths with wildcards)
+  3. Check field condition (optional)
+  4. Apply post-processing transformations
+  5. Format score dictionaries
+  6. Store in nested output structure
+
+### Phase 4: Quality Assurance (10-15 seconds)
+1. **Coherence Check:** Compare API counters vs. actual data
+2. **Non-Regression Check:** Compare current vs. previous run with config rules
+3. **Critical Issue Handling:**
+   - If no critical issues are found → continue to export
+   - If critical issues are found → prompt the user for confirmation before continuing
+
+### Phase 5: Export & Persistence (3-5 seconds)
+
+**Step 1: Backup & JSON Write**
+1. Backup old files (if quality checks passed)
+2. Write JSON outputs:
+   - `endobest_inclusions.json` (6-7 MB)
+   - `endobest_organizations.json` (17-20 KB)
+
+**Step 2: Excel Export (if configured)**
+Delegated to `run_normal_mode_export()` function which handles:
+1. Load JSONs from filesystem (ensures consistency)
+2. Load Excel configuration
+3. Validate templates and named ranges
+4. For each configured workbook:
+   - Load template file
+   - Apply filter conditions (AND logic)
+   - Apply multi-key sort
+   - Apply value replacements (strict type matching)
+   - Fill data into cells/named ranges
+   - Handle file conflicts (Overwrite/Increment/Backup)
+   - Save workbook
+   - Recalculate formulas (optional, via win32com)
+5. Display results and return status
+
+**Step 3: Summary**
+1. Display elapsed time
+2. Report file locations
+3. Note any warnings/errors during export
+
+---
+
+## ⚙️ Configuration System
+
+### Three-Layer Configuration Architecture
+
+#### Layer 1: Excel Configuration (`Endobest_Dashboard_Config.xlsx`)
+
+**Sheet 1: Inclusions_Mapping** (Field Extraction)
+- Define which patient fields to extract
+- Specify sources (questionnaire, record, inclusion, request, calculated)
+- Configure transformations (value labels, templates, conditions)
+- ~50+ fields typically configured
+
+**Sheet 2: Organizations_Mapping** (Organization Fields)
+- Define which organization fields to export
+- Rarely modified
+
+**Sheet 3: Excel_Workbooks** (Excel Export Metadata)
+- Workbook names
+- Template paths
+- Output filenames (with template variables)
+- File conflict handling strategy (Overwrite/Increment/Backup)
+
+**Sheet 4: Excel_Sheets** (Sheet Configurations)
+- Workbook name (reference to Excel_Workbooks)
+- Sheet name (in template)
+- Source type (Inclusions/Organizations/Variable)
+- Target (cell or named range)
+- Column mapping (JSON)
+- Filter conditions (JSON with AND logic)
+- Sort keys (JSON, multi-key with datetime support)
+- Value replacements (JSON, strict type matching)
+
+**Sheet 5: Regression_Check** (Quality Rules)
+- Rule names
+- Field selection pipeline (include/exclude patterns)
+- Scope (all organizations or specific org list)
+- Transition patterns (expected state changes)
+- Severity levels (Warning/Critical)
+
+#### Layer 2: Organization Mapping (`eb_org_center_mapping.xlsx`)
+- Optional mapping file
+- Sheet: `Org_Center_Mapping`
+- Maps organization names to center identifiers
+- If the file is missing, the system degrades gracefully (falls back to the organization name)
+
+#### Layer 3: Excel Templates (`config/templates/`)
+- Excel workbook templates with:
+  - Sheet definitions
+  - Named ranges (for data fill targets)
+  - Formula structures
+  - Formatting and styles
+
+### Configuration Constants (in code)
+
+```python
+# API Configuration
+IAM_URL = "https://api-auth.ziwig-connect.com"
+RC_URL = "https://api-hcp.ziwig-connect.com"
+GDD_URL = "https://api-lab.ziwig-connect.com"
+RC_APP_ID = "602aea51-cdb2-4f73-ac99-fd84050dc393"
+RC_ENDOBEST_PROTOCOL_ID = "3c7bcb4d-91ed-4e9f-b93f-99d8447a276e"
+
+# Threading & Performance
+MAX_THREADS = 20                # Main thread pool workers
+ASYNC_THREADS = 40              # Subtasks thread pool workers
+ERROR_MAX_RETRY = 10            # Maximum retry attempts
+WAIT_BEFORE_RETRY = 0.5         # Seconds between retries
+
+# Excluded Organizations
+RC_ENDOBEST_EXCLUDED_CENTERS = ["e18e7487-...", "5582bd75-...", "e053512f-..."]
+```
+
+---
+
+## 🔐 API Integration
+
+### Authentication Flow
+
+```
+1. IAM Login
+   POST https://api-auth.ziwig-connect.com/api/auth/ziwig-pro/login
+   Request: {"username": "...", "password": "..."}
+   Response: {"access_token": "jwt_master", "userId": "uuid"}
+
+2. Token Exchange (RC-specific)
+   POST https://api-hcp.ziwig-connect.com/api/auth/config-token
+   Headers: Authorization: Bearer {master_token}
+   Request: {"userId": "...", "clientId": "...", "userAgent": "..."}
+   Response: {"access_token": "jwt_rc", "refresh_token": "refresh_token"}
+
+3. Automatic Token Refresh (on 401)
+   POST https://api-hcp.ziwig-connect.com/api/auth/refreshToken
+   Headers: Authorization: Bearer {current_token}
+   Request: {"refresh_token": "..."}
+   Response: {"access_token": "jwt_new", "refresh_token": "new_refresh"}
+```
+
+### Key API Endpoints
+
+| Endpoint | Method | Purpose |
+|----------|--------|---------|
+| `/api/inclusions/getAllOrganizations` | GET | List all organizations |
+| `/api/inclusions/inclusion-statistics` | POST | Get patient counts per org |
+| `/api/inclusions/search` | POST | Get inclusions list for org (paginated) |
+| `/api/records/byPatient` | POST | Get clinical record for patient |
+| `/api/surveys/filter/with-answers` | POST | **OPTIMIZED:** Get all questionnaires for patient |
+| `/api/requests/by-tube-id/{tubeId}` | GET | Get lab test results |
+
+### Performance Optimization: Questionnaire Batching
+
+**Problem:** Multiple API calls per patient (1 call per questionnaire × N patients = slow)
+
+**Solution:** Single optimized call retrieves all questionnaires with answers
+
+```
+BEFORE (inefficient):
+for qcm_id in questionnaire_ids:
+    GET /api/surveys/{qcm_id}/answers?subject={patient_id}
+    # Result: N API calls per patient
+
+AFTER (optimized):
+POST /api/surveys/filter/with-answers
+{
+  "context": "clinic_research",
+  "subject": patient_id
+}
+# Result: 1 API call per patient
+# Impact: 4-5x performance improvement
+```
+
+---
+
+## ⚡ Multithreading & Performance Optimization
+
+### Thread Pool Architecture
+
+```
+Main Application Thread
+    ↓
+┌─ Phase 2: Counter Fetching ──────────────────────────┐
+│ ThreadPoolExecutor(max_workers=user_input, cap=20)   │
+│ ├─ Task 1: Get counters for Org 1                     │
+│ ├─ Task 2: Get counters for Org 2                     │
+│ └─ Task N: Get counters for Org N                     │
+│ [Sequential wait: tqdm.as_completed]                  │
+└──────────────────────────────────────────────────────┘
+    ↓
+┌─ Phase 3: Inclusion Data Collection (Nested) ────────┐
+│ Outer: ThreadPoolExecutor(max_workers=user_input)    │
+│                                                       │
+│ For Org 1:                                            │
+│ │   Inner: ThreadPoolExecutor(max_workers=40)        │
+│ │   ├─ Patient 1: Async lab/questionnaire fetch      │
+│ │   ├─ Patient 2: Async lab/questionnaire fetch      │
+│ │   └─ Patient N: Async lab/questionnaire fetch      │
+│ │   [Sequential outer wait: as_completed]            │
+│ │                                                     │
+│ For Org 2:                                            │
+│ │   [Similar parallel processing]                    │
+│ │                                                     │
+│ For Org N:                                            │
+│ │   [Similar parallel processing]                    │
+└──────────────────────────────────────────────────────┘
+```
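
The nested layout in the diagram can be sketched with two `ThreadPoolExecutor` instances: a main pool with one task per organization, and a subtasks pool that runs lab fetches in the background while each organization walks through its patients sequentially. This is a simplified sketch with no network access; `fetch_lab_request`, `process_organization`, `collect`, and the `progress` counter are stand-in names, not the functions in `eb_dashboard.py`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_THREADS = 20    # main pool (organizations)
ASYNC_THREADS = 40  # subtasks pool (lab requests)

progress = {"done": 0}
_progress_lock = threading.Lock()


def fetch_lab_request(patient_id):
    """Stand-in for the lab API call; returns canned data."""
    return {"Request_Sent": True, "patient": patient_id}


def process_organization(org, subtasks_pool):
    results = []
    for patient_id in org["patients"]:  # patients are handled sequentially
        lab_future = subtasks_pool.submit(fetch_lab_request, patient_id)
        record = {"Patient_Id": patient_id}  # field processing would happen here
        record["Endotest"] = lab_future.result()  # join the background fetch
        results.append(record)
        with _progress_lock:  # thread-safe progress update
            progress["done"] += 1
    return results


def collect(orgs, workers=MAX_THREADS):
    inclusions = []
    with ThreadPoolExecutor(max_workers=ASYNC_THREADS) as subtasks_pool, \
         ThreadPoolExecutor(max_workers=workers) as main_pool:
        futures = [main_pool.submit(process_organization, org, subtasks_pool) for org in orgs]
        for future in as_completed(futures):
            inclusions.extend(future.result())
    return inclusions


orgs = [{"name": "Center 1", "patients": [1, 2, 3]}, {"name": "Center 2", "patients": [4, 5]}]
all_inclusions = collect(orgs, workers=2)
```

Keeping the subtasks in a separate pool means a main-pool worker that waits on `lab_future.result()` can never starve the tasks it is waiting for.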
+
+### Performance Optimizations
+
+1. **Thread-Local HTTP Clients**
+   - Each thread maintains its own `httpx.Client`
+   - Avoids connection conflicts
+   - Implementation via `get_httpx_client()`
+
+2. **Nested Parallelization**
+   - Main pool: Organizations (20 workers)
+   - Subtasks pool: Lab requests (40 workers)
+   - Non-blocking I/O during processing
+
+3. **Questionnaire Batching** (4-5x improvement)
+   - Single call retrieves all questionnaires + answers
+   - Eliminates N filtered calls per patient
+
+4. **Configurable Worker Threads**
+   - User input selection (1-20 workers)
+   - Tunable for network bandwidth and API rate limits
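
The thread-local client idea from item 1 can be illustrated with `threading.local`. The sketch below uses a `FakeClient` stand-in so that it runs without third-party packages; in the project the object would presumably be an `httpx.Client`. The names `get_client` and `FakeClient` are hypothetical.

```python
import threading

_thread_local = threading.local()


class FakeClient:
    """Stand-in for an HTTP client; records which thread created it."""

    def __init__(self):
        self.owner = threading.get_ident()


def get_client(factory=FakeClient):
    """Return the calling thread's client, creating it on first use."""
    client = getattr(_thread_local, "client", None)
    if client is None:
        client = factory()
        _thread_local.client = client
    return client


main_client = get_client()
same_client = get_client()

seen_in_worker = []
worker = threading.Thread(target=lambda: seen_in_worker.append(get_client()))
worker.start()
worker.join()
```

Repeated calls on one thread reuse the same object, while a different thread gets its own instance, which is what keeps connections from being shared across workers.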
+
+### Progress Tracking (Multi-Level)
+
+```
+Overall Progress [████████████░░░░░░░░░░░░] 847/1200
+  1/15 - Center 1 [██████████░░░░░░░░░░░░░░░]  73/95
+  2/15 - Center 2 [██████░░░░░░░░░░░░░░░░░░░]  42/110
+  3/15 - Center 3 [████░░░░░░░░░░░░░░░░░░░░░]  28/85
+```
+
+**Thread-Safe Updates:**
+```python
+with _global_pbar_lock:
+    if global_pbar:
+        global_pbar.update(1)
+```
+
+---
+
+## 🛡️ Error Handling & Resilience
+
+### Token Management Strategy
+
+1. **Automatic Token Refresh on 401**
+   - Triggered by `@api_call_with_retry` decorator
+   - Thread-safe via `_token_refresh_lock`
+
+2. **Retry Mechanism**
+   - Max retries: 10 attempts
+   - Delay between retries: 0.5 seconds
+   - Decorator: `@api_call_with_retry`
+
+3. **Thread-Safe Token Refresh**
+   ```python
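+   import httpx
+   from time import sleep
+   def new_token():
+       global access_token, refresh_token
+       with _token_refresh_lock:  # Only one thread refreshes at a time
+           for attempt in range(ERROR_MAX_RETRY):
+               try:
+                   # Sketch: request/response shape as listed under Authentication Flow
+                   response = get_httpx_client().post(
+                       f"{RC_URL}/api/auth/refreshToken",
+                       headers={"Authorization": f"Bearer {access_token}"},
+                       json={"refresh_token": refresh_token},
+                   )
+                   response.raise_for_status()
+                   tokens = response.json()
+                   access_token, refresh_token = tokens["access_token"], tokens["refresh_token"]  # Update global tokens
+                   return
+               except httpx.HTTPError:
+                   sleep(WAIT_BEFORE_RETRY)  # Fixed delay before the next attempt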
+   ```
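
A retry decorator along the lines described above might look like the following sketch. It is not the project's `@api_call_with_retry`: that decorator is applied without arguments, whereas this version takes the refresh callback and retry settings as parameters so the example is self-contained. `UnauthorizedError` is a hypothetical marker for an HTTP 401 response.

```python
import functools
import time

ERROR_MAX_RETRY = 10     # maximum attempts, as in the constants section
WAIT_BEFORE_RETRY = 0.5  # seconds between attempts


class UnauthorizedError(Exception):
    """Hypothetical marker for an HTTP 401 response."""


def api_call_with_retry(refresh_token_fn, max_retry=ERROR_MAX_RETRY, wait=WAIT_BEFORE_RETRY):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(max_retry):
                try:
                    return func(*args, **kwargs)
                except UnauthorizedError as exc:
                    last_error = exc
                    refresh_token_fn()  # token expired: refresh, then retry
                except Exception as exc:  # any other failure: wait, then retry
                    last_error = exc
                    time.sleep(wait)
            raise RuntimeError(f"{func.__name__} failed after {max_retry} attempts") from last_error
        return wrapper
    return decorator


state = {"calls": 0, "refreshes": 0}


def fake_refresh():
    state["refreshes"] += 1


@api_call_with_retry(fake_refresh, max_retry=3, wait=0)
def fetch_organizations():
    state["calls"] += 1
    if state["calls"] == 1:
        raise UnauthorizedError("token expired")
    return ["Hospital A"]


result = fetch_organizations()
```

The first call fails with the simulated 401, the refresh callback runs once, and the second attempt succeeds.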
+
+### Exception Handling Categories
+
+| Category | Examples | Handling |
+|----------|----------|----------|
+| **API Errors** | Network timeouts, HTTP errors | Retry with a fixed 0.5 s delay (up to 10 attempts) |
+| **File I/O Errors** | Missing config, permission denied | Graceful error + exit |
+| **Validation Errors** | Invalid config, incoherent data | Log warning + prompt user |
+| **Thread Errors** | Worker thread failures | Shut down gracefully + propagate |
+
+### Graceful Degradation
+
+1. **Missing Organization Mapping:** Skip silently, use fallback (org name)
+2. **Critical Quality Issues:** Prompt user for confirmation before export
+3. **Thread Failure:** Shut down all workers gracefully, preserve partial results
+4. **Invalid Configuration:** Clear error messages with remediation suggestions
+
+---
+
+## 📊 Data Output Structure
+
+### JSON Output: `endobest_inclusions.json`
+
+```json
+[
+  {
+    "Patient_Identification": {
+      "Organisation_Id": "uuid",
+      "Organisation_Name": "Hospital Name",
+      "Center_Name": "HOSP-A",
+      "Patient_Id": "internal_id",
+      "Pseudo": "ENDO-001",
+      "Patient_Name": "Doe, John",
+      "Patient_Birthday": "1975-05-15",
+      "Patient_Age": 49
+    },
+    "Inclusion": {
+      "Consent_Signed": true,
+      "Inclusion_Date": "15/10/2024",
+      "Inclusion_Status": "incluse",
+      "isPrematurelyTerminated": false
+    },
+    "Extended_Fields": {
+      "Custom_Field_1": "value",
+      "Custom_Field_2": 42,
+      "Composite_Score": "8/10"
+    },
+    "Endotest": {
+      "Request_Sent": true,
+      "Diagnostic_Status": "Completed"
+    }
+  }
+]
+```
+
+### JSON Output: `endobest_organizations.json`
+
+```json
+[
+  {
+    "id": "org-uuid",
+    "name": "Hospital A",
+    "Center_Name": "HOSP-A",
+    "patients_count": 45,
+    "preincluded_count": 8,
+    "included_count": 35,
+    "prematurely_terminated_count": 2
+  }
+]
+```
+
+---
+
+## 🚀 Execution Modes
+
+### Mode 1: Normal (Full Collection)
+```bash
+python eb_dashboard.py
+```
+- Authenticates
+- Collects from APIs
+- Runs quality checks
+- Exports JSON + Excel
+- Duration: 2.5-5 minutes (typical)
+
+### Mode 2: Excel-Only (Fast Export)
+```bash
+python eb_dashboard.py --excel-only
+```
+- Skips data collection
+- Uses existing JSON files
+- Regenerates Excel workbooks
+- Duration: 5-15 seconds
+- Use case: Reconfigure reports, test templates
+
+### Mode 3: Check-Only (Validation Only)
+```bash
+python eb_dashboard.py --check-only
+```
+- Loads existing JSON
+- Runs quality checks
+- No export
+- Duration: 5-10 seconds
+- Use case: Verify data before distribution
+
+### Mode 4: Debug (Verbose Output)
+```bash
+python eb_dashboard.py --debug
+```
+- Executes normal mode
+- Enables detailed logging
+- Shows field-by-field changes
+- Check `dashboard.log` for details
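
One way the mode flags above could be parsed is with `argparse`. This summary only says that `main()` performs "CLI detection", so the use of `argparse`, the helper names `parse_args` and `select_mode`, and the decision to make `--excel-only` and `--check-only` mutually exclusive are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(prog="eb_dashboard.py")
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--excel-only", action="store_true",
                      help="regenerate Excel workbooks from existing JSON files")
    mode.add_argument("--check-only", action="store_true",
                      help="run quality checks on existing JSON files")
    parser.add_argument("--debug", action="store_true",
                        help="enable detailed logging")
    return parser.parse_args(argv)


def select_mode(args):
    """Map parsed flags to one of the execution modes."""
    if args.excel_only:
        return "excel-only"
    if args.check_only:
        return "check-only"
    return "normal"


excel_mode = select_mode(parse_args(["--excel-only"]))
debug_args = parse_args(["--debug"])
```

In this sketch `--debug` is an independent switch layered on top of whichever mode is selected, which matches the description of Mode 4 as normal mode plus verbose logging.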
+
+---
+
+## 📈 Performance Metrics & Benchmarks
+
+### Typical Execution Times (Full Dataset: 1,200+ patients, 15+ organizations)
+
+| Phase | Duration | Notes |
+|-------|----------|-------|
+| **Login & Config** | 2-3 sec | Sequential, network-dependent |
+| **Fetch Counters** | 5-8 sec | 20 workers, parallelized |
+| **Collect Inclusions** | 2-4 min | Includes API calls + field processing |
+| **Quality Checks** | 10-15 sec | File loads, data comparison |
+| **Export to JSON** | 3-5 sec | File I/O |
+| **Export to Excel** | 5-15 sec | Template processing + fill |
+| **TOTAL** | **~2.5-5 min** | Depends on network, API perf |
+
+### Network Optimization Impact
+
+**With old questionnaire approach (N filtered calls per patient):**
+- 1,200 patients × 15 questionnaires = 18,000 API calls
+- Estimated: 15-30 minutes
+
+**With optimized single-call questionnaire:**
+- 1,200 patients × 1 call = 1,200 API calls
+- Estimated: 2-5 minutes
+- **Improvement: 3-6x faster** ✅
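
The arithmetic behind these estimates can be checked directly. The inputs below are the figures quoted above; the variable names are illustrative. The 3-6x range corresponds to comparing both old-approach estimates with the 5-minute end of the new range.

```python
patients = 1_200
questionnaires_per_patient = 15

calls_before = patients * questionnaires_per_patient  # one call per questionnaire
calls_after = patients * 1                            # one batched call per patient
call_reduction = calls_before / calls_after

before_minutes = (15, 30)  # estimated range, old approach
after_minutes = (2, 5)     # estimated range, batched approach

# Speedup against the slow end of the new range (5 minutes).
speedup_low = before_minutes[0] / after_minutes[1]
speedup_high = before_minutes[1] / after_minutes[1]

# Best case: slowest old run against the fastest new run.
speedup_best_case = before_minutes[1] / after_minutes[0]
```

The number of questionnaire calls drops by a factor of 15, while the wall-clock gain is smaller because the other per-patient calls and the field processing are unchanged.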
+
+---
+
+## 🔍 Field Extraction & Processing Logic
+
+### Complete Field Processing Pipeline
+
+```
+For each field in INCLUSIONS_MAPPING_CONFIG:
+  │
+  ├─ Step 1: Determine Source Type
+  │  ├─ q_id / q_name / q_category → Find questionnaire
+  │  ├─ record → Use clinical record
+  │  ├─ inclusion → Use patient inclusion data
+  │  ├─ request → Use lab request data
+  │  └─ calculated → Execute custom function
+  │
+  ├─ Step 2: Extract Raw Value
+  │  ├─ Navigate JSON using field_path
+  │  ├─ Supports wildcard (*) for list traversal
+  │  └─ Return value or "undefined"
+  │
+  ├─ Step 3: Check Field Condition (optional)
+  │  ├─ If condition undefined → Set to "undefined"
+  │  ├─ If condition not boolean → Error flag
+  │  ├─ If condition false → Set to "N/A"
+  │  └─ If condition true → Continue
+  │
+  ├─ Step 4: Apply Post-Processing Transformations
+  │  ├─ true_if_any: Convert to boolean
+  │  ├─ value_labels: Map to localized text
+  │  ├─ field_template: Apply formatting
+  │  └─ List joining: Flatten arrays with pipe delimiter
+  │
+  ├─ Step 5: Format Score Dictionaries
+  │  ├─ If {total, max} → Format as "total/max"
+  │  └─ Otherwise → Keep as-is
+  │
+  └─ Store: output_inclusion[field_group][field_name] = final_value
+```
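
Steps 2 to 5 of the pipeline can be sketched as follows. This is a minimal illustration, not the project's code: it assumes `field_path` is a list of keys, that `"undefined"` and `"N/A"` are plain string sentinels, and that the condition has already been resolved to a value. The function names and the error marker are hypothetical.

```python
from typing import Any, List

UNDEFINED = "undefined"


def get_by_path(data: Any, path: List[str]) -> Any:
    """Walk nested dicts; a '*' segment fans out over every item of a list."""
    if not path:
        return data
    key, rest = path[0], path[1:]
    if key == "*":
        if not isinstance(data, list):
            return UNDEFINED
        found = [get_by_path(item, rest) for item in data]
        found = [value for value in found if value != UNDEFINED]
        return found if found else UNDEFINED
    if isinstance(data, dict) and key in data:
        return get_by_path(data[key], rest)
    return UNDEFINED


def apply_condition(value: Any, condition: Any) -> Any:
    """Step 3: gate the value on an already-resolved condition."""
    if condition == UNDEFINED:
        return UNDEFINED
    if not isinstance(condition, bool):
        return "CONDITION_ERROR"  # hypothetical error marker
    return value if condition else "N/A"


def post_process(value: Any) -> Any:
    """Steps 4 and 5: join lists with a pipe, format score dictionaries."""
    if isinstance(value, dict) and {"total", "max"} <= set(value):
        return f"{value['total']}/{value['max']}"
    if isinstance(value, list):
        return "|".join(str(item) for item in value)
    return value


record = {
    "consent": True,
    "visits": [{"score": {"total": 3, "max": 10}}, {"score": {"total": 7, "max": 10}}],
}
totals = get_by_path(record, ["visits", "*", "score", "total"])
joined = post_process(apply_condition(totals, record["consent"]))
first_score = post_process(get_by_path(record, ["visits", "*", "score"])[0])
```

Keeping extraction, condition handling, and formatting in separate functions mirrors the step boundaries in the diagram and makes each step testable on its own.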
+
+### Custom Functions for Calculated Fields
+
+| Function | Purpose | Syntax |
+|----------|---------|--------|
+| `search_in_fields_using_regex` | Search multiple fields for pattern | `["search_in_fields_using_regex", "pattern", "field1", "field2"]` |
+| `extract_parentheses_content` | Extract text within parentheses | `["extract_parentheses_content", "field_name"]` |
+| `append_terminated_suffix` | Add suffix if patient terminated | `["append_terminated_suffix", "status_field", "is_terminated_field"]` |
+| `if_then_else` | Unified conditional with 8 operators | `["if_then_else", "operator", arg1, arg2_optional, true_result, false_result]` |
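
To give a feel for the first two functions in the table, here is a hedged sketch. The function names match the table, but the signatures, the `fields` dictionary argument, the case-insensitive matching, and the return values are assumptions made for illustration.

```python
import re

UNDEFINED = "undefined"


def extract_parentheses_content(fields: dict, field_name: str):
    """Return the text inside the first pair of parentheses, if any."""
    value = fields.get(field_name, UNDEFINED)
    if not isinstance(value, str):
        return UNDEFINED
    match = re.search(r"\(([^()]*)\)", value)
    return match.group(1) if match else UNDEFINED


def search_in_fields_using_regex(fields: dict, pattern: str, *field_names: str) -> bool:
    """Return True if the pattern matches any of the named string fields."""
    regex = re.compile(pattern, re.IGNORECASE)
    return any(
        isinstance(fields.get(name), str) and regex.search(fields[name]) is not None
        for name in field_names
    )


sample = {"center": "Hospital A (Paris)", "note": "Patient withdrew consent"}
city = extract_parentheses_content(sample, "center")
has_withdrawal = search_in_fields_using_regex(sample, r"withdr", "center", "note")
```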
+
+**if_then_else Operators:**
+- `is_true` / `is_false` - Boolean field test
+- `is_defined` / `is_undefined` - Existence test
+- `all_true` / `all_defined` - Multiple field test
+- `==` / `!=` - Value comparison
+
+---
+
+## ✅ Quality Assurance Framework
+
+### Coherence Check
+
+**Purpose:** Verify API-provided statistics match actual collected data
+
+**Logic:**
+```
+For each organization:
+  API_Count = statistic.total
+  Actual_Count = count of inclusion records
+
+  if API_Count != Actual_Count:
+    Report discrepancy with severity
+    ├─ deviation within ±10%: Warning
+    └─ deviation beyond ±10%: Critical
+```
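
The coherence logic above might look like this in Python. It is a sketch only: the `organization_id` key, the shape of `api_counts`, and the choice to measure deviation relative to the API count are assumptions, not details confirmed by the source.

```python
from collections import Counter


def coherence_check(api_counts: dict, inclusions: list, org_key: str = "organization_id",
                    tolerance: float = 0.10) -> list:
    """Compare API-reported totals with the number of collected inclusions."""
    actual = Counter(inclusion[org_key] for inclusion in inclusions)
    issues = []
    for org, api_count in api_counts.items():
        actual_count = actual.get(org, 0)
        if api_count == actual_count:
            continue
        # Deviation relative to the API count (assumed); guard against zero.
        deviation = abs(api_count - actual_count) / max(api_count, 1)
        severity = "Warning" if deviation <= tolerance else "Critical"
        issues.append({"organization": org, "api": api_count,
                       "actual": actual_count, "severity": severity})
    return issues


stats = {"A": 100, "B": 10, "C": 5}
collected = ([{"organization_id": "A"}] * 95
             + [{"organization_id": "B"}] * 5
             + [{"organization_id": "C"}] * 5)
issues = coherence_check(stats, collected)
severities = {issue["organization"]: issue["severity"] for issue in issues}
```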
+
+### Non-Regression Check
+
+**Purpose:** Detect unexpected changes between data runs
+
+**Configuration-Driven Rules:**
+- Field selection pipeline (include/exclude patterns)
+- Transition patterns (expected state changes)
+- Severity levels (Warning/Critical)
+- Exception handling (exclude specific organizations)
+
+**Logic:**
+```
+Load previous inclusion data (_old file)
+
+For each rule:
+  ├─ Build candidate fields via pipeline
+  ├─ Determine key field for matching
+  └─ For each inclusion:
+     ├─ Find matching old inclusion by key
+     ├─ Check for unexpected transitions
+     ├─ Apply exceptions
+     └─ Report violations
+```
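
A simplified version of the regression loop is sketched below. The rule dictionary layout (`key_field`, `fields`, `allowed_transitions`, `excluded_organizations`, `severity`) is a hypothetical stand-in for the Excel-driven configuration, and the field selection pipeline is reduced to an explicit list of field names.

```python
def regression_check(old: list, new: list, rule: dict) -> list:
    """Report field changes that are not listed as expected transitions."""
    key = rule["key_field"]
    old_by_key = {inclusion[key]: inclusion for inclusion in old}
    allowed = set(rule.get("allowed_transitions", []))
    excluded = set(rule.get("excluded_organizations", []))
    violations = []
    for inclusion in new:
        if inclusion.get("organization") in excluded:
            continue  # exception: organization exempt from this rule
        previous = old_by_key.get(inclusion[key])
        if previous is None:
            continue  # new inclusion, nothing to compare against
        for field in rule["fields"]:
            before, after = previous.get(field), inclusion.get(field)
            if before == after or (before, after) in allowed:
                continue
            violations.append({"key": inclusion[key], "field": field, "old": before,
                               "new": after, "severity": rule["severity"]})
    return violations


rule = {
    "key_field": "patient_id",
    "fields": ["status"],
    "allowed_transitions": [("screened", "included")],
    "excluded_organizations": ["Org-Test"],
    "severity": "Critical",
}
old_run = [
    {"patient_id": 1, "organization": "Org-A", "status": "screened"},
    {"patient_id": 2, "organization": "Org-A", "status": "included"},
    {"patient_id": 3, "organization": "Org-Test", "status": "included"},
]
new_run = [
    {"patient_id": 1, "organization": "Org-A", "status": "included"},
    {"patient_id": 2, "organization": "Org-A", "status": "screened"},
    {"patient_id": 3, "organization": "Org-Test", "status": "screened"},
    {"patient_id": 4, "organization": "Org-A", "status": "screened"},
]
violations = regression_check(old_run, new_run, rule)
```

In this example only patient 2 is reported: patient 1 follows an expected transition, patient 3 belongs to an excluded organization, and patient 4 has no previous record.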
+
+---
+
+## 📋 Documentation Structure
+
+The system includes comprehensive documentation:
+
+| Document | Size | Content |
+|----------|------|---------|
+| **DOCUMENTATION_10_ARCHITECTURE.md** | 43.7 KB | System design, workflow, APIs, multithreading |
+| **DOCUMENTATION_11_FIELD_MAPPING.md** | 56.3 KB | Field extraction logic, custom functions, examples |
+| **DOCUMENTATION_12_QUALITY_CHECKS.md** | 60.2 KB | QA framework, regression rules, configuration |
+| **DOCUMENTATION_13_EXCEL_EXPORT.md** | 29.6 KB | Excel generation, data transformation, config |
+| **DOCUMENTATION_98_USER_GUIDE.md** | 8.4 KB | End-user instructions, troubleshooting, FAQ |
+| **DOCUMENTATION_99_CONFIG_GUIDE.md** | 24.8 KB | Administrator reference, Excel tables, examples |
+
+---
+
+## 🔧 Key Technical Features
+
+### Thread Safety
+- Per-thread HTTP clients (no connection conflicts)
+- Synchronized access to global state via locks
+- Thread-safe progress bar updates
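
The per-thread client and locked shared state can be illustrated with the standard library alone. In this sketch `FakeClient` is a stand-in for a real HTTP client such as `httpx.Client`, and `get_client`, `process`, and `processed_count` are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_thread_state = threading.local()   # one attribute namespace per thread
_counter_lock = threading.Lock()    # guards the shared counter below
processed_count = 0


class FakeClient:
    """Stand-in for an HTTP client; remembers which thread created it."""

    def __init__(self) -> None:
        self.owner = threading.get_ident()


def get_client() -> FakeClient:
    # Lazily create one client per worker thread and reuse it afterwards.
    client = getattr(_thread_state, "client", None)
    if client is None:
        client = FakeClient()
        _thread_state.client = client
    return client


def process(item: int) -> bool:
    global processed_count
    client = get_client()
    with _counter_lock:
        processed_count += 1
    # True when the client in use belongs to the current thread.
    return client.owner == threading.get_ident()


with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(process, range(50)))
```

Because each worker only ever touches its own client, no lock is needed around requests; the lock is reserved for state that is genuinely shared.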
+
+### Error Recovery
+- Automatic token refresh on 401 errors
+- Exponential backoff retry logic (configurable)
+- Graceful degradation for optional features
+- User confirmation on critical issues
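
The combination of token refresh and exponential backoff could be structured roughly as follows. The exception classes, the `call_with_retry` helper, and its default values are hypothetical; the real script's retry parameters are configurable and not shown in this section.

```python
import time


class Unauthorized(Exception):
    """Stands in for an HTTP 401 response."""


class TransientError(Exception):
    """Stands in for a timeout or 5xx response."""


def call_with_retry(request, refresh_token, max_attempts=4, base_delay=0.5,
                    sleep=time.sleep):
    """Call request(); refresh the token on 401, back off on transient errors."""
    for attempt in range(max_attempts):
        try:
            return request()
        except Unauthorized:
            refresh_token()  # retry immediately with a fresh token
        except TransientError:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise RuntimeError("retries exhausted")


# Demo with a scripted sequence of outcomes and a recorded (not real) sleep.
outcomes = [Unauthorized(), TransientError(), "ok"]
refreshes = []
delays = []


def fake_request():
    outcome = outcomes.pop(0)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


result = call_with_retry(fake_request, lambda: refreshes.append(1), sleep=delays.append)
```

Injecting `sleep` as a parameter keeps the helper testable without real waiting.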
+
+### Configuration Flexibility
+- 100% externalized to Excel (zero code changes)
+- Supports multiple data sources
+- Custom business logic functions
+- Field dependencies and conditions
+- Value transformations and templates
+
+### Performance
+- Optimized API calls (4-5x improvement)
+- Parallel processing (20+ workers)
+- Async I/O operations
+- Configurable thread pools
+
+### Data Quality
+- Coherence checking (stats vs actual data)
+- Non-regression testing (config-driven)
+- Comprehensive validation
+- Audit trail logging
+
+---
+
+## 📦 Dependencies
+
+### Core Libraries
+- **httpx** - HTTP client with connection pooling
+- **openpyxl** - Excel file reading/writing
+- **questionary** - Interactive CLI prompts
+- **tqdm** - Progress bars
+- **rich** - Rich text formatting
+- **pywin32** - Windows COM automation (optional, for formula recalculation)
+- **pytz** - Timezone support (optional)
+
+### Python Version
+- Python 3.7+
+
+### External Services
+- Ziwig IAM API
+- Ziwig Research Clinic (RC) API
+- Ziwig Lab (GDD) API
+
+---
+
+## 🎓 Usage Patterns
+
+### For End Users
+1. Configure fields in Excel (no code needed)
+2. Run: `python eb_dashboard.py`
+3. Review results in JSON or Excel
+
+### For Administrators
+1. Add new fields to `Inclusions_Mapping`
+2. Define quality rules in `Regression_Check`
+3. Configure Excel export in `Excel_Workbooks` + `Excel_Sheets`
+4. Rerun the script: it picks up the updated configuration automatically
+
+### For Developers
+1. Add custom function to Block 6 (eb_dashboard.py)
+2. Register in field config (Inclusions_Mapping)
+3. Use via: `"source_id": "function_name"`
+4. All other changes are configuration-only and require no code edits
+
+---
+
+## 🎯 Summary
+
+The **Endobest Clinical Research Dashboard** represents a mature, production-ready system that successfully combines:
+
+✅ **Architectural Excellence** - Clean modular design with separation of concerns
+✅ **User-Centric Configuration** - 100% externalized, no code changes needed
+✅ **Performance Optimization** - 4-5x faster via API and threading improvements
+✅ **Robust Resilience** - Comprehensive error handling, automatic recovery, graceful degradation
+✅ **Quality Assurance** - Multi-level validation, coherence checks, regression testing
+✅ **Comprehensive Documentation** - 250+ KB of technical and user guides
+✅ **Maintainability** - Clear code structure, extensive logging, audit trails
+
+The system successfully enables non-technical users to configure complex data extraction and reporting workflows while maintaining enterprise-grade reliability and performance standards.
+
+---
+
+**Document Version:** 1.0
+**Last Updated:** 2025-11-08
+**Status:** ✅ Complete & Production Ready