# 📊 Endobest Dashboard - Visual Flowcharts & Diagrams
**Visual Reference for Understanding System Flow**
---
## 1⃣ Main Execution Flow
```
START
├─→ [USER INPUT]
│ ├─ Credentials (login)
│ ├─ Number of threads (1-20)
│ └─ Execution mode (normal/excel_only/check_only)
├─→ [AUTHENTICATION]
│ ├─ IAM Login: POST /api/auth/ziwig-pro/login
│ ├─ Token Exchange: POST /api/auth/config-token
│ └─ ✅ Get access_token + refresh_token
├─→ [CONFIGURATION LOADING]
│ ├─ Load Endobest_Dashboard_Config.xlsx
│ ├─ Parse 5 sheets:
│ │ ├─ Inclusions_Mapping
│ │ ├─ Organizations_Mapping
│ │ ├─ Excel_Workbooks
│ │ ├─ Excel_Sheets
│ │ └─ Regression_Check
│ └─ ✅ Validate all configs
├─→ [PHASE 2: ORG + COUNTERS] (5-8 sec)
│ ├─ GET /api/inclusions/getAllOrganizations
│ ├─ POST /api/inclusions/inclusion-statistics (20 workers)
│ ├─ Optional: Load eb_org_center_mapping.xlsx
│ └─ ✅ Organizations with counts ready
├─→ [PHASE 3: DATA COLLECTION] (2-4 min)
│ ├─ Outer Loop: 20 workers per organization
│ │ ├─ GET /api/inclusions/search
│ │ └─ For each patient (sequential):
│ │ ├─ POST /api/records/byPatient
│ │ ├─ POST /api/surveys/filter/with-answers
│ │ ├─ Submit: GET /api/requests/by-tube-id (async)
│ │ └─ Process: Field mapping + transformation
│ └─ ✅ All inclusion data collected
├─→ [PHASE 4: QUALITY CHECKS] (10-15 sec)
│ ├─ Coherence Check:
│ │ └─ Compare API stats vs actual data
│ ├─ Non-Regression Check:
│ │ └─ Compare current vs previous run
│ └─ Decision Point:
│ ├─ ✅ No critical issues → Continue
│ └─ ❌ Critical issues → Prompt user
├─→ [USER CONFIRMATION]
│ └─ "Critical issues detected. Write anyway? [Y/N]"
│ ├─ YES → Continue to export
│ └─ NO → Exit (files NOT modified)
├─→ [PHASE 5: EXPORT]
│ ├─ Backup old files
│ ├─ Write JSON outputs
│ │ ├─ endobest_inclusions.json
│ │ └─ endobest_organizations.json
│ ├─ Generate Excel (if configured)
│ │ ├─ Load templates
│ │ ├─ Apply filters/sorts
│ │ ├─ Fill data
│ │ └─ Recalculate formulas
│ └─ ✅ All files written
├─→ [COMPLETION]
│ ├─ Display elapsed time
│ ├─ Log summary
│ └─ Press Enter to exit
END
```
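
The authentication step at the top of this flow fits in a few lines. The sketch below is a minimal illustration assuming JSON request/response bodies; the host placeholder and the response field names (`token`, `access_token`, `refresh_token`) are assumptions, not confirmed API contracts.

```python
import requests

BASE_URL = "https://example.invalid"  # placeholder host (assumption)

def authenticate(email: str, password: str) -> dict:
    """Sketch of the IAM login + config-token exchange shown above."""
    # IAM Login: POST /api/auth/ziwig-pro/login
    r = requests.post(f"{BASE_URL}/api/auth/ziwig-pro/login",
                      json={"email": email, "password": password}, timeout=30)
    r.raise_for_status()
    iam_token = r.json()["token"]  # response field name is an assumption
    # Token Exchange: POST /api/auth/config-token
    r = requests.post(f"{BASE_URL}/api/auth/config-token",
                      headers={"Authorization": f"Bearer {iam_token}"}, timeout=30)
    r.raise_for_status()
    body = r.json()
    return {"access_token": body["access_token"],
            "refresh_token": body["refresh_token"]}
```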
---
## 2⃣ Data Collection Detail (Phase 3)
```
[PHASE 3: DATA COLLECTION]
├─ Setup: Thread Pools
│ ├─ Main Pool: ThreadPoolExecutor(N workers, max 20)
│ └─ Async Pool: ThreadPoolExecutor(40 workers)
├─ Outer Loop: FOR each Organization (20 workers)
│ │
│ ├─ [1] GET /api/inclusions/search
│ │ └─ Response: list of patients for this org
│ │
│ ├─ For Each Patient (Sequential):
│ │ │
│ │ ├─ [2] POST /api/records/byPatient
│ │ │ └─ Response: clinical_record
│ │ │
│ │ ├─ [3] POST /api/surveys/filter/with-answers ⚡ OPTIMIZED
│ │ │ └─ Response: all questionnaires + answers in 1 call
│ │ │
│ │ ├─ [4] Submit async task (Async Pool):
│ │ │ │ GET /api/requests/by-tube-id/{tubeId}
│ │ │ │ └─ Fetched in background while main thread processes
│ │ │ │
│ │ │ └─ [CONTINUE] Process Field Mappings:
│ │ │ │
│ │ │ ├─ For Each Field in Config:
│ │ │ │ ├─ Determine Source
│ │ │ │ │ ├─ q_id/q_name/q_category → Find questionnaire
│ │ │ │ │ ├─ record → Use clinical record
│ │ │ │ │ ├─ inclusion → Use patient data
│ │ │ │ │ ├─ request → Use lab request
│ │ │ │ │ └─ calculated → Execute function
│ │ │ │ │
│ │ │ │ ├─ Extract Value (JSON path + wildcard support)
│ │ │ │ ├─ Check Condition (if defined)
│ │ │ │ ├─ Apply Transformations:
│ │ │ │ │ ├─ true_if_any (convert to boolean)
│ │ │ │ │ ├─ value_labels (map to text)
│ │ │ │ │ ├─ field_template (format)
│ │ │ │ │ └─ List joining (flatten arrays)
│ │ │ │ │
│ │ │ │ └─ Store: output_inclusion[group][field] = value
│ │ │ │
│ │ │ ├─ Wait: Lab request from async pool
│ │ │ └─ Final: Complete inclusion record ready
│ │ │
│ │ ├─ [5] Update Progress Bars
│ │ │ ├─ Organization progress
│ │ │ ├─ Overall progress
│ │ │ └─ (Thread-safe with lock)
│ │ │
│ │ └─ [NEXT PATIENT]
│ │
│ └─ [NEXT ORGANIZATION]
└─ Combine All Inclusions
└─ ✅ All data collected
```
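
The per-patient overlap in step [4] — submit the lab-request fetch to the inner pool, process field mappings on the outer worker, then join — looks roughly like this. All `fetch_*` and `process_*` helpers are hypothetical stand-ins for the dashboard's API wrappers.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical stand-ins for the dashboard's API wrappers:
def fetch_clinical_record(patient_id: str) -> dict: ...       # [2]
def fetch_surveys_with_answers(patient_id: str) -> list: ...  # [3]
def fetch_lab_request(tube_id: str) -> dict: ...              # [4]
def process_field_mappings(patient, record, surveys) -> dict: ...

ASYNC_POOL = ThreadPoolExecutor(max_workers=40)  # inner pool from the diagram

def process_patient(patient: dict) -> dict:
    # [4] Submit the lab-request fetch first so it runs in the background
    lab_future: Future = ASYNC_POOL.submit(fetch_lab_request, patient["tubeId"])
    record = fetch_clinical_record(patient["id"])              # [2]
    surveys = fetch_surveys_with_answers(patient["id"])        # [3]
    output = process_field_mappings(patient, record, surveys)
    output["lab_request"] = lab_future.result()  # wait only at the very end
    return output
```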
---
## 3⃣ Field Processing Pipeline
```
[INPUT: Configuration field entry]
├─ Field Name: "Patient_Age"
├─ Source Type: "calculated"
├─ Field Path: ["extract_age_from_birthday", "Patient_Birthday"]
├─ Condition: undefined
├─ Value Labels: undefined
├─ Field Template: undefined
└─ True If Any: undefined
[STEP 1: Determine Source]
├─ Is source "calculated"?
│ └─ YES → Call custom function
│ execute_calculated_field(
│ "extract_age_from_birthday",
│ ["Patient_Birthday"]
│ )
[STEP 2: Extract Raw Value]
├─ Get function result
├─ raw_value = 49 (calculated)
[STEP 3: Check Condition]
├─ Condition field: undefined
├─ Action: Skip condition check
[STEP 4: Apply Transformations]
├─ true_if_any: undefined (skip)
├─ value_labels: undefined (skip)
├─ field_template: undefined (skip)
├─ List joining: N/A (not a list)
[STEP 5: Format Score Dictionary]
├─ Is value dict with {total, max}?
│ └─ NO → Keep as 49
[OUTPUT: final_value = 49]
└─ Store: output_inclusion["Patient_Identification"]["Patient_Age"] = 49
```
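
A condensed sketch of the five steps for this `Patient_Age` example. The year-difference age calculation and the `total/max` score formatting in STEP 5 are simplifications labeled as such, not the exact production logic.

```python
from datetime import date

# Registry of "calculated" functions. The age computation is simplified to a
# plain year difference for brevity; the real function may be more precise.
CALCULATED_FUNCTIONS = {
    "extract_age_from_birthday":
        lambda birthday: date.today().year - date.fromisoformat(birthday).year,
}

def process_calculated_field(entry: dict, inclusion: dict, output: dict) -> None:
    func_name, *arg_fields = entry["field_path"]                 # STEP 1
    raw_value = CALCULATED_FUNCTIONS[func_name](                 # STEP 2
        *(inclusion[f] for f in arg_fields))
    # STEP 3: no condition defined for this field, so nothing to check.
    final_value = raw_value                                      # STEP 4: no transforms
    if isinstance(final_value, dict) and {"total", "max"} <= final_value.keys():
        final_value = f"{final_value['total']}/{final_value['max']}"  # STEP 5 (assumed format)
    output.setdefault(entry["field_group"], {})[entry["field_name"]] = final_value

# The "Patient_Age" example from the diagram:
out: dict = {}
process_calculated_field(
    {"field_group": "Patient_Identification", "field_name": "Patient_Age",
     "field_path": ["extract_age_from_birthday", "Patient_Birthday"]},
    {"Patient_Birthday": "1976-01-01"},
    out,
)
print(out)  # {'Patient_Identification': {'Patient_Age': 49}} (as of 2025)
```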
---
## 4⃣ Questionnaire Optimization Impact
```
SCENARIO: Patient with 15 questionnaires
❌ OLD APPROACH (SLOW):
├─ Loop: for each questionnaire ID
│ ├─ API Call 1: GET /api/surveys/qcm_1/answers?subject=patient_id
│ ├─ API Call 2: GET /api/surveys/qcm_2/answers?subject=patient_id
│ ├─ API Call 3: GET /api/surveys/qcm_3/answers?subject=patient_id
│ └─ ... (15 calls total)
├─ Per Patient: 15 API calls
├─ Total (1200 patients): 18,000 API calls
├─ Estimated Time: 15-30 minutes
└─ Result: SLOW ⏳
✅ NEW APPROACH (FAST - OPTIMIZED):
├─ Single API Call:
│ │
│ └─ POST /api/surveys/filter/with-answers
│ {
│ "context": "clinic_research",
│ "subject": patient_id
│ }
│ ↓
│ Response: [
│ {questionnaire: {...}, answers: {...}},
│ {questionnaire: {...}, answers: {...}},
│ ... (all 15 questionnaires in 1 response)
│ ]
├─ Per Patient: 1 API call
├─ Total (1200 patients): 1,200 API calls
├─ Estimated Time: 2-5 minutes
└─ Result: FAST ⚡ (4-5x improvement)
```
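
The optimized path is a single POST per patient. A minimal sketch using `requests`, with the payload shape taken directly from the diagram:

```python
import requests

def fetch_all_questionnaires(base_url: str, token: str, patient_id: str) -> list:
    """One POST returns every questionnaire with its answers for a patient,
    replacing the per-questionnaire GETs of the old approach."""
    resp = requests.post(
        f"{base_url}/api/surveys/filter/with-answers",
        headers={"Authorization": f"Bearer {token}"},
        json={"context": "clinic_research", "subject": patient_id},
        timeout=30,
    )
    resp.raise_for_status()
    # Shape per the diagram: [{"questionnaire": {...}, "answers": {...}}, ...]
    return resp.json()
```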
---
## 5⃣ Multithreading Architecture
```
MAIN THREAD
├────────────────────────────────────────────────┐
│ PHASE 2: Counter Fetching (Sequential Wait)    │
│                                                │
│   Thread Pool: 20 workers                      │
│   ├─ Worker 1: Fetch counters for Org 1        │
│   ├─ Worker 2: Fetch counters for Org 2        │
│   ├─ Worker 3: Fetch counters for Org 3        │
│   └─ ...                                       │
│       │                                        │
│       └─ Wait: tqdm.as_completed() [All done]  │
│                                                │
└────────────────────────────────────────────────┘
         │ (Sequential barrier)
┌────────────────────────────────────────────────┐
│ PHASE 3: Inclusion Collection (Nested)         │
│                                                │
│   Outer Thread Pool: 20 workers (Orgs)         │
│   │                                            │
│   ├─ Worker 1: Process Org 1                   │
│   │  ├─ GET /api/inclusions/search             │
│   │  └─ For each patient:                      │
│   │     ├─ Process clinical record             │
│   │     ├─ Process questionnaires              │
│   │     └─ [ASYNC] Submit lab fetch:           │
│   │        └─ Inner Pool (40 workers)          │
│   │           ├─ Worker A: Fetch lab for Pat1  │
│   │           ├─ Worker B: Fetch lab for Pat2  │
│   │           └─ ...                           │
│   │                                            │
│   ├─ Worker 2: Process Org 2 [same pattern]    │
│   ├─ Worker 3: Process Org 3 [same pattern]    │
│   └─ ...                                       │
│       │                                        │
│       └─ Wait: tqdm.as_completed() [All done]  │
│                                                │
└────────────────────────────────────────────────┘
         │ (Sequential barrier)
┌────────────────────────────────────────────────┐
│ PHASE 4: Quality Checks (Sequential)           │
│   ├─ Coherence check (single-threaded)         │
│   ├─ Non-regression check (single-threaded)    │
│   └─ Results ready                             │
│                                                │
└────────────────────────────────────────────────┘
┌────────────────────────────────────────────────┐
│ PHASE 5: Export (Sequential)                   │
│   ├─ Write JSON files                          │
│   ├─ Generate Excel (single-threaded)          │
│   └─ Done                                      │
│                                                │
└────────────────────────────────────────────────┘
```
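
The fan-out/barrier pattern in Phases 2 and 3 corresponds to draining `concurrent.futures.as_completed()` under a `tqdm` progress bar. A sketch, with `fetch_counters` as a hypothetical stand-in for the per-organization work:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from tqdm import tqdm

def fetch_counters(org: dict) -> dict:
    """Hypothetical stand-in for the per-organization statistics call."""
    ...

def run_counter_phase(organizations: list, n_workers: int = 20) -> list:
    results = []
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(fetch_counters, org) for org in organizations]
        # Draining as_completed() under a tqdm bar IS the sequential barrier:
        # the function cannot return until every worker has finished.
        for future in tqdm(as_completed(futures), total=len(futures)):
            results.append(future.result())
    return results
```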
---
## 6⃣ Quality Checks Logic
```
[INPUT: Current data, Previous data, Config rules]
┌──────────────────────────────────────────┐
│ COHERENCE CHECK                          │
├──────────────────────────────────────────┤
│                                          │
│   For each organization:                 │
│   ├─ API Count = from statistics         │
│   ├─ Actual Count = from inclusions      │
│   └─ Compare:                            │
│       ├─ Δ ≤ 10% → Warning               │
│       └─ Δ > 10% → CRITICAL ⚠️           │
│                                          │
│   Result: has_coherence_critical         │
│                                          │
└──────────────────────────────────────────┘
┌──────────────────────────────────────────┐
│ NON-REGRESSION CHECK                     │
├──────────────────────────────────────────┤
│                                          │
│   For each regression rule:              │
│   ├─ Load previous data                  │
│   ├─ Build field selection (pipeline)    │
│   ├─ Find key field for matching         │
│   └─ For each inclusion:                 │
│       ├─ Match with previous record      │
│       ├─ Check transitions               │
│       ├─ Apply exceptions                │
│       └─ Report violations:              │
│           ├─ ⚠️ Warning                  │
│           └─ 🔴 CRITICAL                 │
│                                          │
│   Result: has_regression_critical        │
│                                          │
└──────────────────────────────────────────┘
┌──────────────────────────────────────────┐
│ DECISION POINT                           │
├──────────────────────────────────────────┤
│                                          │
│   If has_coherence_critical OR           │
│      has_regression_critical:            │
│      │                                   │
│      └─→ [PROMPT USER]                   │
│          "Critical issues detected!      │
│           Write results anyway? [Y/N]"   │
│          ├─ YES → Proceed to export      │
│          └─ NO → Cancel, exit            │
│   Else:                                  │
│      └─→ [PROCEED] to export             │
│                                          │
└──────────────────────────────────────────┘
[EXPORT PHASE]
```
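
A sketch of the coherence rule (Δ ≤ 10% → warning, Δ > 10% → critical). Field names follow the JSON structures shown at the end of this document; the exact reporting format is an assumption.

```python
def coherence_check(organizations: list, inclusions: list) -> bool:
    """Compare the API's statistics count against the inclusions actually
    collected for each organization; return True if any delta is critical."""
    has_critical = False
    for org in organizations:
        api_count = org["patients_count"]              # from statistics API
        actual = sum(1 for inc in inclusions           # from collected data
                     if inc["Patient_Identification"]["Organisation_Id"] == org["id"])
        if api_count == 0 and actual == 0:
            continue
        delta = abs(api_count - actual) / max(api_count, 1)
        if delta > 0.10:
            print(f"CRITICAL  {org['name']}: api={api_count} actual={actual}")
            has_critical = True
        elif delta > 0:
            print(f"warning   {org['name']}: api={api_count} actual={actual}")
    return has_critical
```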
---
## 7⃣ Excel Export Pipeline
```
[INPUT: Inclusions data, Organizations data, Config]
┌──────────────────────────────────────────────────┐
│ LOAD CONFIGURATION                               │
├──────────────────────────────────────────────────┤
│                                                  │
│   Excel_Workbooks table:                         │
│   ├─ workbook_name: "Endobest_Output"            │
│   ├─ template_path: "templates/Endobest.xlsx"    │
│   ├─ output_filename: "{name}_{date_time}.xlsx"  │
│   └─ output_exists_action: "Increment"           │
│                                                  │
│   Excel_Sheets table:                            │
│   ├─ workbook_name: "Endobest_Output"            │
│   ├─ sheet_name: "Inclusions"                    │
│   ├─ source_type: "Inclusions"                   │
│   ├─ target: "DataTable"                         │
│   ├─ column_mapping: JSON                        │
│   ├─ filter_condition: JSON                      │
│   ├─ sort_keys: JSON                             │
│   └─ value_replacement: JSON                     │
│                                                  │
└──────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────┐
│ FOR EACH WORKBOOK                                │
├──────────────────────────────────────────────────┤
│                                                  │
│ [1] Load template from Excel file                │
│  │                                               │
│ [2] FOR EACH SHEET IN WORKBOOK                   │
│  │  │                                            │
│  │  ├─ Determine data source                     │
│  │  │   ├─ Inclusions → Use patient data         │
│  │  │   ├─ Organizations → Use org stats         │
│  │  │   └─ Variable → Use template variables     │
│  │  │                                            │
│  │  ├─ [FILTER] Apply AND conditions             │
│  │  │   └─ Keep only rows matching all filters   │
│  │  │                                            │
│  │  ├─ [SORT] Apply multi-key sorting            │
│  │  │   ├─ Primary: Organization Name (desc)     │
│  │  │   ├─ Secondary: Date (asc)                 │
│  │  │   └─ Tertiary: Patient Name (asc)          │
│  │  │                                            │
│  │  ├─ [REPLACE] Apply value transformations     │
│  │  │   ├─ Boolean: true→"Yes", false→"No"       │
│  │  │   ├─ Status codes: 1→"Active", 0→"Inactive"│
│  │  │   └─ Enum values: Mapped to display text   │
│  │  │                                            │
│  │  ├─ [FILL] Write data to sheet                │
│  │  │   ├─ Load column mapping                   │
│  │  │   ├─ For each row in filtered+sorted data  │
│  │  │   │   └─ Write to row in Excel             │
│  │  │   └─ Fill target (cell or named range)     │
│  │  │                                            │
│  │  └─ [NEXT SHEET]                              │
│  │                                               │
│ [3] Handle file conflicts                        │
│  │  ├─ Overwrite: Replace existing file          │
│  │  ├─ Increment: Create _1, _2, _3 variants     │
│  │  └─ Backup: Rename existing to _backup        │
│  │                                               │
│ [4] Save workbook (openpyxl)                     │
│  │                                               │
│ [5] Recalculate formulas (win32com) [OPTIONAL]   │
│  │  ├─ If win32com available                     │
│  │  ├─ Open in Excel                             │
│  │  ├─ Force recalculation                       │
│  │  ├─ Save                                      │
│  │  └─ Close                                     │
│  │                                               │
│ └─ [NEXT WORKBOOK]                               │
│                                                  │
└──────────────────────────────────────────────────┘
[OUTPUT: Excel files created ✅]
```
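
The filter → sort → fill sequence for one sheet can be sketched with `openpyxl` alone (the win32com recalculation step is omitted). The shapes assumed here for `column_mapping` (column letter → field name), `filters` (field → required value), and `sort_keys` (list of `(field, descending)` pairs) are illustrative, not the config's exact JSON schema.

```python
from openpyxl import load_workbook

def fill_sheet(template_path: str, output_path: str, rows: list, sheet_name: str,
               column_mapping: dict, filters: dict | None = None,
               sort_keys: list | None = None) -> None:
    """Filter → sort → fill for one sheet, per the pipeline above."""
    if filters:  # [FILTER] AND semantics: a row must match every condition
        rows = [r for r in rows if all(r.get(k) == v for k, v in filters.items())]
    # [SORT] multi-key: apply stable sorts from least to most significant key.
    for key, descending in reversed(sort_keys or []):
        rows = sorted(rows, key=lambda r: r.get(key) or "", reverse=descending)
    wb = load_workbook(template_path)              # [1] load the template
    ws = wb[sheet_name]
    for row_idx, row in enumerate(rows, start=2):  # [FILL] row 1 holds headers
        for column_letter, field_name in column_mapping.items():
            ws[f"{column_letter}{row_idx}"] = row.get(field_name)
    wb.save(output_path)                           # [4] save with openpyxl

# Hypothetical call matching the config sketched above:
# fill_sheet("templates/Endobest.xlsx", "Endobest_Output.xlsx", inclusion_rows,
#            "Inclusions", {"A": "Patient_Id", "B": "Organisation_Name"},
#            filters={"Consent_Signed": True},
#            sort_keys=[("Organisation_Name", True), ("Inclusion_Date", False)])
```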
---
## 8⃣ Error Handling & Recovery
```
[API CALL]
[TRY] Execute HTTP request
├─ SUCCESS (200-299)
│ └─→ Return response
└─ FAILURE
├─ HTTP 401 (Unauthorized)
│ │
│ ├─→ Lock acquired (token_refresh_lock)
│ ├─→ new_token(): Refresh token
│ │ POST /api/auth/refreshToken
│ │ └─ Update global tokens
│ ├─→ Lock released
│ └─→ RETRY the original request
├─ Network Error (timeout, connection refused, etc.)
│ │
│ ├─→ Log warning with attempt number
│ ├─→ RETRY after WAIT_BEFORE_RETRY seconds
│ └─→ Increment attempt counter
└─ Other HTTP Error (4xx, 5xx)
├─→ Log warning with attempt number
├─→ RETRY after WAIT_BEFORE_RETRY seconds
└─→ Increment attempt counter
[LOOP: Repeat up to ERROR_MAX_RETRY times (10 attempts)]
├─ Attempt 1, 2, 3, ..., 9
│ └─→ If any succeeds: Return response
└─ Attempt 10
├─ Still failing?
│ │
│ └─→ Log CRITICAL error
│ └─→ Raise exception (propagate to main)
└─ Main catches exception
├─→ Display error message
├─→ Log to dashboard.log
└─→ Exit gracefully
```
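
A sketch of the retry wrapper described above. `ERROR_MAX_RETRY`, the `token_refresh_lock`, and `new_token()` come from the diagram; the `WAIT_BEFORE_RETRY` value and the `new_token()` body here are placeholders.

```python
import logging
import threading
import time

import requests

ERROR_MAX_RETRY = 10          # attempts, per the diagram
WAIT_BEFORE_RETRY = 5         # seconds; the actual value is an assumption
token_refresh_lock = threading.Lock()

def new_token() -> None:
    """Stand-in for the dashboard's POST /api/auth/refreshToken routine."""
    ...

def call_api(session: requests.Session, method: str, url: str, **kwargs):
    for attempt in range(1, ERROR_MAX_RETRY + 1):
        try:
            resp = session.request(method, url, timeout=30, **kwargs)
            if resp.status_code == 401:
                with token_refresh_lock:      # only one thread refreshes
                    new_token()
                continue                      # retry the original request
            resp.raise_for_status()
            return resp                       # SUCCESS (2xx)
        except requests.RequestException as exc:
            logging.warning("attempt %d/%d failed for %s: %s",
                            attempt, ERROR_MAX_RETRY, url, exc)
            time.sleep(WAIT_BEFORE_RETRY)
    logging.critical("giving up on %s after %d attempts", url, ERROR_MAX_RETRY)
    raise RuntimeError(f"API call failed: {url}")
```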
---
## 9⃣ Configuration-Driven Execution
```
┌─────────────────────────────────────────────┐
│ ENDOBEST_DASHBOARD_CONFIG.XLSX              │
├─────────────────────────────────────────────┤
│                                             │
│ Sheet 1: Inclusions_Mapping                 │
│ ├─ Row 1: Headers                           │
│ └─ Row 2+: Field definitions                │
│     ├─ field_group: "Patient_Identification"│
│     ├─ field_name: "Patient_Id"             │
│     ├─ source_id: "q_name"                  │
│     ├─ source_value: "Demographics"         │
│     ├─ field_path: ["patient_id"]           │
│     └─ ... (more transformations)           │
│                                             │
│ Sheet 2: Organizations_Mapping              │
│ ├─ Defines org fields                       │
│ └─ Rarely modified                          │
│                                             │
│ Sheet 3: Excel_Workbooks                    │
│ ├─ Workbook metadata                        │
│ ├─ Template references                      │
│ └─ Output filename templates                │
│                                             │
│ Sheet 4: Excel_Sheets                       │
│ ├─ Sheet configurations                     │
│ ├─ Data transformation rules                │
│ └─ Filters, sorts, replacements             │
│                                             │
│ Sheet 5: Regression_Check                   │
│ ├─ Quality check rules                      │
│ ├─ Field selection pipelines                │
│ ├─ Transition patterns                      │
│ └─ Severity levels                          │
│                                             │
└─────────────────────────────────────────────┘
          │ [APPLICATION LOADS]
┌─────────────────────────────────────────┐
│ LOADED IN MEMORY                        │
├─────────────────────────────────────────┤
│                                         │
│ INCLUSIONS_MAPPING_CONFIG = [...]       │
│ REGRESSION_CHECK_CONFIG = [...]         │
│ EXCEL_EXPORT_CONFIG = {...}             │
│                                         │
│ [USED BY]                               │
│ ├─ Field extraction (all fields)        │
│ ├─ Quality checks (regression rules)    │
│ └─ Excel generation (workbook config)   │
│                                         │
└─────────────────────────────────────────┘
          │ [CHANGES = AUTOMATIC ON NEXT RUN]
          ├─→ NO CODE RECOMPILATION NEEDED ✅
          └─→ NO RESTART NEEDED ✅
```
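
One plausible way to load the five sheets into the in-memory structures named above is a single `pandas.read_excel` call; the real loader may differ.

```python
import pandas as pd

CONFIG_SHEETS = ["Inclusions_Mapping", "Organizations_Mapping",
                 "Excel_Workbooks", "Excel_Sheets", "Regression_Check"]

def load_config(path: str = "Endobest_Dashboard_Config.xlsx") -> dict:
    # sheet_name as a list loads all five sheets into {sheet_name: DataFrame}
    frames = pd.read_excel(path, sheet_name=CONFIG_SHEETS)
    return {name: df.to_dict(orient="records") for name, df in frames.items()}

config = load_config()
INCLUSIONS_MAPPING_CONFIG = config["Inclusions_Mapping"]   # list of row dicts
REGRESSION_CHECK_CONFIG = config["Regression_Check"]
EXCEL_EXPORT_CONFIG = {"workbooks": config["Excel_Workbooks"],
                       "sheets": config["Excel_Sheets"]}
```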
---
## 🔟 File I/O & Backup Strategy
```
[INITIAL STATE]
├─ endobest_inclusions.json (today's data)
├─ endobest_inclusions_old.json (yesterday's data)
├─ endobest_organizations.json (today's stats)
└─ endobest_organizations_old.json (yesterday's stats)
[EXECUTION STARTS]
├─ [Collect new data]
├─ [Run quality checks]
└─ [If quality checks passed]
└─→ [Backup: current files → _old]
├─ endobest_inclusions.json → endobest_inclusions_old.json
└─ endobest_organizations.json → endobest_organizations_old.json
└─→ [Write new files]
├─ NEW endobest_inclusions.json (written)
└─ NEW endobest_organizations.json (written)
└─→ [COMPLETION]
[FINAL STATE]
├─ endobest_inclusions.json (NEW data)
├─ endobest_inclusions_old.json (TODAY's data, now old)
├─ endobest_organizations.json (NEW stats)
└─ endobest_organizations_old.json (TODAY's stats, now old)
[IF CRITICAL ISSUES]
├─ [Skip backup & write]
├─ [Old files preserved]
├─ [User prompted: write anyway?]
│ ├─ YES → Write new files (overwrite)
│ └─ NO → Abort, keep old files
└─ [No data loss]
```
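
The backup-then-write step reduces to a small rotation helper; a sketch assuming `os.replace` semantics (atomic on the same filesystem) and the JSON formatting shown below:

```python
import json
import os

def backup_and_write(path: str, new_data) -> None:
    """Rotate the current file to *_old.json, then write the new data.
    Called only after quality checks pass (or the user confirms)."""
    base, ext = os.path.splitext(path)
    if os.path.exists(path):
        os.replace(path, f"{base}_old{ext}")   # today's file → _old
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(new_data, fh, ensure_ascii=False, indent=2)

# Hypothetical usage with the collected data:
# backup_and_write("endobest_inclusions.json", inclusions)
# backup_and_write("endobest_organizations.json", organizations)
```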
---
## File Format Examples
### endobest_inclusions.json Structure
```
[
{
"Patient_Identification": {
"Organisation_Id": "...",
"Organisation_Name": "...",
"Center_Name": "..." (from mapping),
"Patient_Id": "...",
"Pseudo": "...",
"Patient_Name": "...",
"Patient_Birthday": "...",
"Patient_Age": ...
},
"Inclusion": {
"Consent_Signed": true/false,
"Inclusion_Date": "...",
"Inclusion_Status": "...",
...
},
"Extended_Fields": {
"Custom_Field_1": "...",
"Custom_Field_2": ...,
...
},
"Endotest": {
"Request_Sent": true/false,
"Diagnostic_Status": "...",
...
}
}
]
```
### endobest_organizations.json Structure
```
[
{
"id": "org-uuid",
"name": "Hospital Name",
"Center_Name": "HOSP-A" (from mapping),
"patients_count": 45,
"preincluded_count": 8,
"included_count": 35,
"prematurely_terminated_count": 2
}
]
```
---
**All diagrams above show the complete system flow from start to finish.**
**For detailed implementation, see technical documentation files.**