# Endobest Excel Export Feature & Architecture

## Part 4: Configuration-Driven Excel Workbook Generation

**Document Version:** 1.1
**Last Updated:** 2025-11-11
**Audience:** Developers, Business Analysts, System Architects
**Language:** English

---

## Table of Contents

1. [Overview](#overview)
2. [Architecture & Design](#architecture--design)
3. [Core Components](#core-components)
4. [High-Level Orchestration Functions (v1.1+)](#high-level-orchestration-functions-v11)
5. [Configuration System](#configuration-system)
6. [Data Flow & Processing Pipeline](#data-flow--processing-pipeline)
7. [Excel Export Functions](#excel-export-functions)
8. [Filter, Sort & Replacement Logic](#filter-sort--replacement-logic)
9. [Template Variables](#template-variables)
10. [File Conflict Handling](#file-conflict-handling)
11. [Integration with Main Dashboard](#integration-with-main-dashboard)
12. [Error Handling & Validation](#error-handling--validation)
13. [Configuration Examples](#configuration-examples)
14. [Troubleshooting & Debugging](#troubleshooting--debugging)

---

## Overview

The **Excel Export Feature** generates configurable Excel workbooks from patient inclusion data and organization statistics. The system is entirely configuration-driven: non-technical users define export behavior through Excel configuration tables, without code modifications.
### Key Characteristics

**Configuration-Driven Design:**
- All export behavior defined in `Endobest_Dashboard_Config.xlsx`
- Two tables: `Excel_Workbooks` (metadata) and `Excel_Sheets` (sheet definitions)
- No code changes needed to modify export behavior

**Modular Architecture:**
- New module: `eb_dashboard_excel_export.py`
- Separation of concerns: Excel logic isolated from main dashboard
- Dependency injection for testing and flexibility

**Data Transformation:**
- **Filter:** AND conditions with nested field support
- **Sort:** Multi-key sorting with case-insensitive strings, datetime parsing, natural alphanumeric sorting (`*natsort`)
- **Replace:** Strict type matching with first-match-wins logic
- **Fill:** Direct cell or named range targeting

> **Note:** For complete configuration details and up-to-date column specifications, refer to `DOCUMENTATION_99_CONFIG_GUIDE.md`

**Three Operating Modes:**
1. **Normal:** Full collection → Quality checks → JSON export → Excel export
2. **--excel-only:** Load existing JSON → Excel export (fast iteration)
3. **--check-only:** Quality checks only (unchanged, for backward compatibility)

---

## Architecture & Design

### Module Structure

```
eb_dashboard_excel_export.py
├── Imports & Dependencies
├── Constants & Configuration
│   └── EXCEL_RECALC_TIMEOUT = 60
├── Module-Level Variables (Injected)
│   ├── console (Rich Console instance)
│   ├── DASHBOARD_CONFIG_FILE_NAME
│   └── Other global references
│
├── Public API (Called from main)
│   ├── load_excel_export_config(console)
│   ├── validate_excel_config(excel_config, console, inclusions_mapping, organizations_mapping)
│   └── export_to_excel(inclusions_data, organizations_data, excel_config, console)
│
└── Internal Functions (Helpers)
    ├── _prepare_template_variables()
    ├── _apply_filter(item, filter_condition)
    ├── _apply_sort(items, sort_keys)
    ├── _apply_value_replacement(value, replacements)
    ├── _handle_output_exists(output_path, action)
    ├── _get_named_range_dimensions(workbook, range_name)         [openpyxl - validation phase]
    ├── _get_table_dimensions_xlwings(workbook_xw, range_name)    [xlwings - data processing]
    ├── _recalculate_workbook(workbook_path)
    ├── _process_sheet_xlwings(workbook_xw, sheet_config, ...)    [xlwings - data fill]
    └── set_dependencies(...)
```

### Design Principles

1. **Configuration-First:** Behavior determined by config, not code
2. **Pure Functions:** Helper functions are pure (no side effects) except I/O
3. **xlwings-First Architecture:** Data processing uses xlwings exclusively (native Excel COM API)
   - Configuration validation uses openpyxl (read-only, lighter footprint)
   - Data fill & processing uses xlwings (preserves workbook structure, formulas, images)
   - Automatic formula recalculation via xlwings COM API during cell updates
   - No redundant file reloads: metadata is read via COM API without reloading
4. **Early Validation:** Config errors detected at startup, before data collection

---

## Core Components

### 1. Configuration Loading

**Function:** `load_excel_export_config(console)`

Loads Excel export configuration from the `Endobest_Dashboard_Config.xlsx` file.

**Responsibilities:**
- Read `Excel_Workbooks` table
- Read `Excel_Sheets` table
- Parse JSON fields (filter_condition, sort_keys, value_replacement)
- Validate structure and presence of required columns
- Return parsed config and error status

**Return Value:**
```python
(config_dict, has_error: bool)
```

**Config Structure:**
```python
{
    "workbooks": [
        {
            "workbook_name": str,
            "template_path": str,
            "output_filename": str,
            "output_exists_action": "Overwrite" | "Increment" | "Backup"
        },
        ...
    ],
    "sheets": [
        {
            "workbook_name": str,
            "sheet_name": str,
            "source_type": "Variable" | "Inclusions" | "Organizations",
            "target": str,
            "column_mapping": dict | None,
            "filter_condition": dict | None,
            "sort_keys": list | None,
            "value_replacement": list | None
        },
        ...
    ]
}
```

### 2. Configuration Validation

**Function:** `validate_excel_config(excel_config, console, inclusions_mapping, organizations_mapping)`

Validates that all referenced templates exist and have the correct structure.

**Validations Performed:**
- Template files exist in `config/` directory
- Template files are valid Excel (`.xlsx`)
- Named ranges exist in templates
- Named range dimensions correct (height=1 for tables, width≥max index)
- Column mappings reference valid fields
- Source types are valid

**Return Value:**
```python
(has_critical_error: bool, error_messages: list)
```

### 3. Excel Export Orchestration

**Function:** `export_to_excel(inclusions_data, organizations_data, excel_config, console)`

Main orchestration function for Excel export.

**Workflow:**
1. Prepare template variables (timestamp, extract_date_time, etc.)
2. For each workbook in config:
   - Resolve output filename using template variables
   - Handle file conflicts (Overwrite/Increment/Backup)
   - Copy template to output location
   - **XLWINGS PHASE (native Excel COM API):**
     - Load workbook with xlwings
     - For each sheet config:
       - Apply filters, sorts, replacements
       - Read metadata via xlwings COM API (no file reloads)
       - Fill cells/named ranges with data
     - Formulas automatically recalculated by Excel COM API
     - Save workbook
3. Log summary and completion

**Architecture Change (v1.2+):**
- Migration from openpyxl to xlwings eliminated the need for a separate win32com recalculation phase
- xlwings uses the native Excel COM API, which automatically recalculates formulas during cell updates
- Simplified workflow: one Excel session, no hand-off between libraries

---

## High-Level Orchestration Functions (v1.1+)

**New in v1.1:** High-level orchestration functions were added to fully externalize Excel export orchestration from the main script. These functions follow the established pattern from the quality_checks module.

### 1. `export_excel_only(sys_argv, console_instance, inclusions_filename, organizations_filename, inclusions_mapping_config, organizations_mapping_config)`

**Purpose:** Complete orchestration of `--excel-only` CLI mode

**Workflow:**
1. Initialize console and set default filenames
2. Call `prepare_excel_export()` to load and validate
3. Handle critical configuration errors with user confirmation
4. Call `execute_excel_export()` to perform export
5. Display results and return

**Usage in Main Script:**
```python
if excel_only_mode:
    export_excel_only(sys.argv, console, INCLUSIONS_FILE_NAME, ORGANIZATIONS_FILE_NAME,
                      INCLUSIONS_MAPPING_CONFIG, {})
    return
```

**Impact:** Reduces main script from 34 lines to 4 lines (88% reduction)

---

### 2. `run_normal_mode_export(inclusions_data, organizations_data, excel_enabled, excel_config, console_instance, inclusions_mapping_config, organizations_mapping_config)`

**Purpose:** Orchestrates Excel export phase during normal workflow

**Workflow:**
1. Check if export enabled (returns early if not)
2. Load JSONs from filesystem (ensures consistency)
3. Call `execute_excel_export()` to perform export
4. Display results and return status tuple

**Returns:** `(success: bool, error_message: str)`

**Usage in Main Script:**
```python
# After JSONs are written to disk
run_normal_mode_export(output_inclusions, organizations_list, EXCEL_EXPORT_ENABLED,
                       EXCEL_EXPORT_CONFIG, console, INCLUSIONS_MAPPING_CONFIG, {})
```

**Impact:** Reduces main script from 19 lines to 2 lines (89% reduction)

---

### 3. `prepare_excel_export(inclusions_filename, organizations_filename, console_instance, inclusions_mapping_config, organizations_mapping_config)`

**Purpose:** Centralized preparation function: loads JSONs and config, then validates

**Responsibility:**
- Load inclusions JSON from filesystem
- Load organizations JSON from filesystem
- Load Excel export configuration
- Validate configuration against templates
- Aggregate and return all errors

**Returns:** `(prep_success: bool, inclusions_data, organizations_data, excel_config, has_critical_errors: bool, error_messages: list)`

**Used By:** Both `export_excel_only()` and potentially `run_normal_mode_export()`

---

### 4. `execute_excel_export(inclusions_data, organizations_data, excel_config, console_instance, inclusions_mapping_config, organizations_mapping_config)`

**Purpose:** Execute Excel export with comprehensive error handling

**Responsibility:**
- Call core `export_to_excel()` function
- Catch and log all exceptions
- Return success/failure status to caller

**Returns:** `(success: bool, error_message: str)`

**Error Handling:** All exceptions caught and returned as error messages (never raises)

---

### 5. `_load_json_file_internal(filename)`

**Purpose:** Internal helper for safe JSON file loading

**Responsibility:**
- Check file existence
- Load and parse JSON
- Handle errors gracefully
- Return None on failure (instead of raising)

**Used By:** `run_normal_mode_export()` internally

---

### Design Pattern: Consistency with Quality Checks

The orchestration functions follow the same pattern established by `run_check_only_mode()` from the quality_checks module:

| Aspect | Quality Checks | Excel Export |
|--------|---|---|
| Standalone mode orchestration | `run_check_only_mode()` | `export_excel_only()` |
| Config loading in module | ✅ Yes | ✅ Yes |
| User confirmation in module | ✅ Yes | ✅ Yes |
| Error handling in module | ✅ Yes | ✅ Yes |
| Main script integration | 1 line call | 1 line call |

**Result:** Consistent architecture across all major features (quality checks, excel export, etc.)

---

## Configuration System

### Two-Table Configuration

The Excel export is configured through two tables in `Endobest_Dashboard_Config.xlsx`:

#### Table 1: Excel_Workbooks

Defines metadata for each Excel workbook to generate.

| Column | Type | Required | Example | Description |
|--------|------|----------|---------|-------------|
| workbook_name | Text | Yes | "Endobest_Output" | Unique identifier for workbook |
| template_path | Text | Yes | "templates/Endobest_Template.xlsx" | Path relative to config/ folder |
| output_filename | Text | Yes | "{workbook_name}_{extract_date_time}.xlsx" | Template for output filename |
| output_exists_action | Text | Yes | "Increment" | How to handle conflicts (Overwrite/Increment/Backup) |

#### Table 2: Excel_Sheets

Defines how to fill each sheet in the workbooks.

| Column | Type | Required | Example | Description |
|--------|------|----------|---------|-------------|
| workbook_name | Text | Yes | "Endobest_Output" | Must match Excel_Workbooks entry |
| sheet_name | Text | Yes | "Inclusions" | Sheet name in template |
| source_type | Text | Yes | "Inclusions" | Variable / Inclusions / Organizations |
| target | Text | Yes | "DataTable" | Named range or cell reference |
| column_mapping | JSON | Conditional | `{"col_id": "patient_id"}` | For source_type=Inclusions/Organizations only |
| filter_condition | JSON | No | `{"status": "active"}` | AND conditions for filtering |
| sort_keys | JSON | No | `[["date", "asc"], ["id", "asc", "*natsort"]]` | Sort specification with optional datetime/natsort |
| value_replacement | JSON | No | `[{"type": "bool", "true": "Yes", "false": "No"}]` | Value transformations |

---

## Data Flow & Processing Pipeline

### Overview

```
Input Data (inclusions + organizations)
    ↓
Filter (AND conditions)
    ↓
Sort (multi-key with datetime)
    ↓
Value Replacement (strict typing)
    ↓
Fill Excel Cells/Ranges (via xlwings)
    ↓
Save Workbook (xlwings)
    ↓
Formulas Automatically Recalculated (xlwings COM API)
    ↓
Final Excel File
```

### Detailed Processing Steps

#### Step 1: Filter

Applies AND conditions to select matching items.

**Logic:**
- Start with all items
- For each field in filter_condition:
  - Keep only items where field value equals expected value
  - Support nested field paths (dot notation: `patient.status`)
- Return filtered items

**Example:**
```json
{
  "status": "active",
  "visit_type": "inclusion"
}
```

Keeps only items where BOTH conditions are true.

#### Step 2: Sort

Multi-key sort with datetime awareness and missing field handling.

**Logic:**
- Apply sort keys in order (first key is primary, second is secondary, etc.)
- Detect datetime fields automatically (ISO format: YYYY-MM-DD)
- Items with missing fields go to end of sort
- Reverse order for `"desc"` order specification

**Example** (each key is `[field, order]` or `[field, order, option]`, the form accepted by `_apply_sort`):
```json
[
  ["visit_type", "asc"],
  ["date_visit", "desc"]
]
```

#### Step 3: Value Replacement

Transform cell values based on rules (first-match-wins).

**Logic:**
- Evaluate rules in order
- Stop at first matching rule
- Strict type matching (e.g., boolean `True` ≠ string `"true"`)
- Return original value if no match

**Supported Types:**
- `"bool"`: Boolean replacement with `"true"` and `"false"` fields
- `"str"`: String replacement with `"from"` and `"to"` fields
- `"int"`: Integer replacement with `"from"` and `"to"` fields

#### Step 4: Fill Excel

Place transformed data into Excel cells or named ranges.

**Two Modes:**
- **Variable (Single Cell):** Write evaluated template string to target cell
- **Table (Named Range):** Write filtered/sorted/replaced items to target range

##### 4.1 Variable Mode (Template String Substitution)

For `source_type = "Variable"`:

1. Evaluate the source template string using `.format(**template_vars)`
2. Write result to the target named cell
3. Example: `{extract_date_time_french}` → `"2025-01-15 14:30:45+01:00"`

##### 4.2 Table Mode (Data Fill with Column Mapping)

For `source_type = "Inclusions"` or `"Organizations"`:

**Key Concept:** The first row of the table target serves as BOTH TEMPLATE and FIRST DATA ROW. Some columns may contain formulas that should NOT be overwritten (unmapped columns).

**Algorithm:**

1. **Extract Column Mapping**
   - Load mapping from Inclusions_Mapping or Organizations_Mapping table
   - Mapping column name comes from Excel_Sheets.source parameter
   - Mapping contains indices (0, 1, 2...) indicating Excel column positions
   - Example:
     ```
     Inclusions_Mapping:
     | field_name | field_group            | MainReport_PatientsList |
     | Patient_Id | Patient_Identification | 0                       |
     | Status     | Inclusion              | 1                       |
     | Date       | Inclusion              | 3                       |
     (Column 2 not mapped - preserves template formula!)
     ```
   - Result: `{0: "Patient_Identification.Patient_Id", 1: "Inclusion.Status", 3: "Inclusion.Date"}`

2. **Filter and Sort Data**
   - Apply AND filter conditions
   - Apply multi-key sort with datetime parsing
   - Example: 5 items match filter, sorted by Patient_Id ascending

3. **Extend Table Rows**
   - Delete any existing data rows below the template row
   - Keep the first row (template + first data)
   - For each filtered/sorted item:
     1. Create new row (or use template row for first item)
     2. Copy ALL cells from template row (preserves formulas!)
     3. Overwrite ONLY mapped columns with JSON data
     4. Apply value_replacement to mapped values

4. **Preserve Formulas in Unmapped Columns**
   - Unmapped columns (those without an index in the mapping) keep template values
   - If a template column contains a formula, it is preserved and recalculates later
   - Allows mixed rows: some columns from JSON, some from formulas

**Example:**

Template Row (Row 1):
```
| A: P001    | B: Active  | C: =SUM(...) | D: 2025-01 |
| (mapped 0) | (mapped 1) | (formula!)   | (mapped 3) |
```

After processing (3 data items):
```
| A: P001 | B: Active | C: 45 | D: 2025-01 |  ← Template + first data
| A: P002 | B: Active | C: 67 | D: 2025-02 |  ← Data 2 (formula copied)
| A: P003 | B: Active | C: 89 | D: 2025-03 |  ← Data 3 (formula copied)
```

Result:
- Columns A, B, D filled with JSON data and value replacement
- Column C: Formula `=SUM(...)` copied to all rows, will recalculate
- All rows have consistent formatting from template

---

## Excel Export Functions

### Public Functions (3)

#### load_excel_export_config(console=None)

```python
def load_excel_export_config(console_instance=None):
    """Load Excel export configuration from config file.

    Reads Excel_Workbooks and Excel_Sheets tables from
    Endobest_Dashboard_Config.xlsx, parses JSON fields.

    Args:
        console_instance: Optional Rich Console for messages

    Returns:
        (config_dict, has_error: bool)

    Raises:
        None (returns error status instead)
    """
```

#### validate_excel_config(excel_config, console, inclusions_mapping, organizations_mapping)

```python
def validate_excel_config(excel_config, console_instance,
                          inclusions_mapping_config, organizations_mapping_config):
    """Validate Excel configuration against templates.

    Checks that:
    - Template files exist and are valid
    - Named ranges exist in templates
    - Dimensions are correct
    - Mappings reference valid fields

    Args:
        excel_config: Config dict from load_excel_export_config()
        console_instance: Rich Console instance
        inclusions_mapping_config: List of valid inclusions fields
        organizations_mapping_config: Dict of valid organizations fields

    Returns:
        (has_critical_error: bool, error_messages: list)
    """
```

#### export_to_excel(inclusions_data, organizations_data, excel_config, console=None)

```python
def export_to_excel(inclusions_data, organizations_data, excel_config,
                    console_instance=None):
    """Main orchestration: Generate Excel files from data and config.

    xlwings-based processing with automatic formula recalculation:
    - Load template via xlwings
    - Apply data transformations (filter, sort, replace)
    - Fill cells/ranges with data
    - Save workbook (formulas auto-recalculated by Excel COM API)

    Args:
        inclusions_data: List of inclusion dicts
        organizations_data: List of organization dicts
        excel_config: Config dict from load_excel_export_config()
        console_instance: Optional Rich Console

    Returns:
        None (creates files as side effect)

    Raises:
        Catches and logs exceptions, continues with next workbook
    """
```

### Internal Functions (10)

#### _prepare_template_variables()

```python
def _prepare_template_variables():
    """Extract variables for template string substitution.

    Variables:
    - extract_date_time: Full ISO datetime (UTC→Paris TZ)
    - extract_year: Year
    - extract_month: Month (2-digit)
    - extract_day: Day (2-digit)

    Returns:
        dict: Variables for .format(**template_vars)
    """
```

#### _apply_filter(item, filter_condition)

```python
def _apply_filter(item, filter_condition):
    """Apply AND filter to item.

    Returns True only if ALL conditions match.
    Supports nested field paths (dot notation).

    Args:
        item: Dict to filter
        filter_condition: Dict of field:value conditions

    Returns:
        bool: True if matches, False otherwise
    """
```

#### _apply_sort(items, sort_keys)

```python
def _apply_sort(items, sort_keys):
    """Multi-key sort with datetime parsing and natural alphanumeric support.

    Handles:
    - String fields (case-insensitive comparison)
    - Numeric and datetime fields
    - Natural alphanumeric sorting (*natsort option)
    - Missing fields (placed at end)
    - Mixed ascending and descending order

    Args:
        items: List of dicts to sort
        sort_keys: List of [field, order] or [field, order, option]
            where option can be:
            - datetime format string (e.g., "%Y-%m-%d")
            - "*natsort" for natural alphanumeric sorting

    Returns:
        list: Sorted items
    """
```

#### _apply_value_replacement(value, replacements)

```python
def _apply_value_replacement(value, replacements):
    """Transform value using first-matching rule.

    Strict type matching. Returns original if no match.

    Args:
        value: Original value
        replacements: List of replacement rules

    Returns:
        Replaced value or original
    """
```

#### _handle_output_exists(output_path, action)

```python
def _handle_output_exists(output_path, action):
    """Handle file conflicts: Overwrite/Increment/Backup.

    Overwrite: Returns same path (existing file will be overwritten)
    Increment: Returns path with _1, _2, etc. suffix
    Backup: Renames existing to _backup_1, etc.; returns original path

    Args:
        output_path: Target file path
        action: "Overwrite" | "Increment" | "Backup"

    Returns:
        str: Actual path to use
    """
```

#### _get_named_range_dimensions(workbook, range_name)

```python
def _get_named_range_dimensions(workbook, range_name):
    """Extract position and dimensions from named range.

    Uses openpyxl named_ranges to find range definition.

    Args:
        workbook: openpyxl Workbook object
        range_name: Name of the named range

    Returns:
        (sheet_name, start_cell, height, width)

    Raises:
        ValueError if range not found
    """
```

#### _process_sheet_xlwings(workbook_xw, sheet_config, inclusions_data, organizations_data, ...)

```python
def _process_sheet_xlwings(workbook_xw, sheet_config, inclusions_data, organizations_data,
                           inclusions_mapping_config, organizations_mapping_config,
                           template_vars):
    """Fill single sheet using xlwings (native Excel COM API).

    Routes based on source_type:
    - Variable: Evaluate template string, write to cell
    - Inclusions/Organizations: Filter, sort, fill table (bulk operation)

    Automatic formula recalculation occurs via xlwings COM API.

    Args:
        workbook_xw: xlwings Book object (open)
        sheet_config: Single sheet configuration dict
        inclusions_data, organizations_data: Source data
        inclusions_mapping_config, organizations_mapping_config: Field mappings
        template_vars: Variables for template strings

    Returns:
        bool: Success status
    """
```

#### set_dependencies(console_obj, inclusions_file, organizations_file)

```python
def set_dependencies(console_instance, inclusions_filename, organizations_filename, ...):
    """Inject module-level variables (dependency injection).

    Called from main dashboard to provide:
    - console: Rich Console instance
    - File names and configuration

    Args:
        console_instance: Rich Console object
        ... (other global references)

    Returns:
        None
    """
```

---

## Filter, Sort & Replacement Logic

### AND Filter Logic

Conditions combined with AND (all must be true):

```python
filter_condition = {"status": "active", "type": "inclusion"}
# Matches: {"status": "active", "type": "inclusion", "date": "2025-01-15"}
# Does NOT match: {"status": "active", "type": "follow-up"} (type different)
```

**Nested Field Support:**

```python
filter_condition = {"patient.status": "active"}
# Matches: {"patient": {"status": "active"}}
```

### Multi-Key Sort Logic

Sort keys applied in order (first is primary):

```python
sort_keys = [
    ["status", "asc"],                 # Primary sort
    ["date_visit", "desc"],            # Secondary sort
    ["patient_id", "asc", "*natsort"]  # Tertiary sort with natural alphanumeric
]
```

**String Comparison:**
- **Case-insensitive by default:** `"Centre"` comes before `"CHU"` (natural alphabetical order)
- Tiebreaker: Case-sensitive if lowercase versions are equal

**Datetime Handling:**
- Provide strptime format as third parameter: `["date_field", "desc", "%Y-%m-%d"]`
- Custom formats supported: `"%d/%m/%Y"`, `"%Y-%m-%d %H:%M:%S"`, etc.
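
The sort rules in this subsection can be sketched in a few lines. The following is only an illustration under stated assumptions, not the module's actual `_apply_sort()` code: the names `get_nested`, `natural_key`, and `sort_items_sketch` are hypothetical, any third element other than `"*natsort"` is assumed to be a strptime format, and items whose field is missing are always placed last regardless of order.

```python
import re
from datetime import datetime


def get_nested(item, path):
    """Resolve a dot-notation path such as "patient.status"; None if absent."""
    current = item
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def natural_key(text):
    """Split digit runs from text so that "P-2" sorts before "P-10"."""
    parts = re.split(r"(\d+)", str(text))
    # With a capturing group, even positions are text and odd positions are digits.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def sort_items_sketch(items, sort_keys):
    """Hypothetical stand-in for _apply_sort(); assumptions are noted above."""
    result = list(items)
    # Python's sort is stable, so apply the least significant key first.
    for key in reversed(sort_keys):
        field, order = key[0], key[1]
        option = key[2] if len(key) > 2 else None

        def normalize(item):
            value = get_nested(item, field)
            if value is None:
                return None
            if option == "*natsort":
                return natural_key(value)
            if option:  # assumed: any other option is a strptime format
                return datetime.strptime(value, option)
            if isinstance(value, str):
                return (value.lower(), value)  # case-insensitive, then exact
            return value

        present = [i for i in result if normalize(i) is not None]
        missing = [i for i in result if normalize(i) is None]
        present.sort(key=normalize, reverse=(order == "desc"))
        result = present + missing  # missing values always go last
    return result


items = [
    {"id": "a", "status": "active", "date_visit": "2025-01-10"},
    {"id": "b", "status": "Closed", "date_visit": "2025-02-01"},
    {"id": "c", "status": "active", "date_visit": "2025-03-01"},
    {"id": "d", "date_visit": "2025-01-01"},  # no status: sorted last
]
ordered = sort_items_sketch(
    items, [["status", "asc"], ["date_visit", "desc", "%Y-%m-%d"]]
)
print([i["id"] for i in ordered])  # ['c', 'a', 'b', 'd']
```

Applying the keys in reverse order and relying on sort stability keeps each pass simple and lets every key carry its own direction, which a single combined tuple key cannot do for non-numeric values.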

**Natural Alphanumeric Sorting:**
- Use `"*natsort"` as third parameter for proper numeric segment handling
- Correctly sorts: `"ENDOBEST-003-3-BA"` < `"ENDOBEST-003-20-BA"` < `"ENDOBEST-003-100-BA"`
- Also handles: `"v1.2"` < `"v1.10"`, `"file2.txt"` < `"file10.txt"`
- Well suited to patient IDs, version codes, sequential identifiers

**Missing Values:**
- Items with missing/null/undefined field values placed at end

### Value Replacement Rules

First-matching rule wins; strict type matching:

```python
replacements = [
    {"type": "bool", "true": "Yes", "false": "No"},
    {"type": "str", "from": "active", "to": "Active"},
]
# True (boolean) → "Yes"
# "active" (string) → "Active"
# "true" (string) → "true" (no match, unchanged)
```

---

## Template Variables

### Available Variables

Template variables available in `output_filename` and Variable cell content:

| Variable | Type | Example | Notes |
|----------|------|---------|-------|
| `extract_date_time` | ISO datetime | `2025-01-15T14:30:45+01:00` | Full timestamp (UTC→Paris TZ) |
| `extract_year` | Year | `2025` | 4-digit year |
| `extract_month` | Month | `01` | 2-digit month |
| `extract_day` | Day | `15` | 2-digit day |
| `workbook_name` | Text | `"Endobest_Output"` | From config |

### Usage Examples

**Filename Template:**
```
{workbook_name}_{extract_date_time}.xlsx
→ Endobest_Output_2025-01-15T14-30-45.xlsx
```

**Variable Cell Template:**
```
Extracted: {extract_date_time}
→ Extracted: 2025-01-15T14:30:45+01:00
```

---

## File Conflict Handling

### Three Strategies

#### 1. Overwrite
- Deletes existing file
- Writes new file with same name

```
output_path: report.xlsx
result: report.xlsx (new)
```

#### 2. Increment
- Finds next available number
- Appends _1, _2, etc. to filename

```
existing: report.xlsx, report_1.xlsx, report_2.xlsx
output_path: report.xlsx
result: report_3.xlsx
```

#### 3. Backup
- Renames existing to _backup_N
- Writes new file with original name

```
existing: report.xlsx
output_path: report.xlsx
result:
  - report_backup_1.xlsx (renamed)
  - report.xlsx (new)
```

---

## Integration with Main Dashboard

### Integration Points

1. **Startup Validation (before collection):**
   ```python
   EXCEL_EXPORT_CONFIG, error = load_excel_export_config(console)
   if error:
       # Ask user confirmation
       EXCEL_EXPORT_ENABLED = False
   ```

2. **After JSON Export (after collection):**
   ```python
   if EXCEL_EXPORT_ENABLED:
       inclusions = load_json_file(INCLUSIONS_FILE_NAME)
       organizations = load_json_file(ORGANIZATIONS_FILE_NAME)
       export_to_excel(inclusions, organizations, EXCEL_EXPORT_CONFIG, console)
   ```

3. **--excel-only Mode:**
   ```python
   if "--excel-only" in sys.argv:
       inclusions = load_json_file(INCLUSIONS_FILE_NAME)
       organizations = load_json_file(ORGANIZATIONS_FILE_NAME)
       export_to_excel(inclusions, organizations, EXCEL_EXPORT_CONFIG, console)
   ```

### Global Variables

Added to `eb_dashboard.py`:

```python
EXCEL_EXPORT_CONFIG = None      # Loaded config
EXCEL_EXPORT_ENABLED = False    # Flag to enable/disable export

# Constants
EXCEL_WORKBOOKS_TABLE_NAME = "Excel_Workbooks"
EXCEL_SHEETS_TABLE_NAME = "Excel_Sheets"
```

---

## Error Handling & Validation

### Validation Stages

#### Stage 1: Config Loading (Startup)
- File exists and valid Excel format
- Required columns present
- JSON parsing succeeds
- Returns error status

#### Stage 2: Config Validation (Startup)
- Templates exist in `config/` folder
- Templates valid `.xlsx` files
- Named ranges exist
- Dimensions correct
- Returns critical error status

#### Stage 3: User Confirmation (Startup)
- If critical errors found:
  - Display error messages
  - Ask user to continue or abort
  - Set EXCEL_EXPORT_ENABLED flag

#### Stage 4: Runtime Error Handling
- Try/except wraps main export
- Logs detailed errors
- Continues with next workbook
- Displays summary

### Error Messages

**Critical Config Error:**
```
⚠ CRITICAL CONFIGURATION ERROR(S) DETECTED
────────────────────────────────────
Error 1: Template file missing: config/templates/Missing.xlsx
Error 2: Named range not found: MyRange in sheet MySheet
...
Do you want to continue anyway? [y/N]:
```

**Runtime Error:**
```
✗ Excel export failed: [Specific error message]
(See dashboard.log for full traceback)
```

---

## Configuration Examples

### Example 1: Simple Inclusion List

**Excel_Workbooks:**

| workbook_name | template_path | output_filename | output_exists_action |
|---|---|---|---|
| Inclusions_Report | templates/Simple.xlsx | Inclusions_{extract_date_time}.xlsx | Increment |

**Excel_Sheets:**

| workbook_name | sheet_name | source_type | target | column_mapping | filter_condition | sort_keys | value_replacement |
|---|---|---|---|---|---|---|---|
| Inclusions_Report | Data | Inclusions | DataTable | {"col_id": "patient_id", "col_name": "name"} | {"status": "active"} | [["date_inclusion", "asc"]] | null |

### Example 2: Multi-Sheet with Variables

**Excel_Sheets (multiple rows):**

| workbook_name | sheet_name | source_type | target | ... |
|---|---|---|---|---|
| Report | Title | Variable | TitleCell | ... |
| Report | Inclusions | Inclusions | InclusionTable | ... |
| Report | Organizations | Organizations | OrgTable | ... |

### Example 3: Value Replacement

**Excel_Sheets:**
```
value_replacement: [
  { "type": "bool", "true": "Yes", "false": "No" },
  { "type": "str", "from": "active", "to": "Active Status" }
]
```

---

## Troubleshooting & Debugging

### Common Issues

#### "Template file missing"
**Cause:** Template path incorrect or file not in `config/` folder

**Solution:** Verify file exists at `config/{template_path}`

#### "Named range not found"
**Cause:** Range name in config doesn't exist in template

**Solution:** Check range name in Excel (Formulas → Defined Names → Name Manager)

#### "Dimensions mismatch"
**Cause:** Column count in mapping exceeds named range width

**Solution:** Verify named range dimensions and column mapping count match

#### "Formulas not recalculating"
**Cause:** xlwings not installed or Excel not available on system

**Solution:** Ensure xlwings is installed (`pip install xlwings`) and Excel is available. Formulas are automatically recalculated by xlwings via COM API.

### Debug Mode

```bash
python eb_dashboard.py --debug
```

Enables verbose logging with detailed Excel export operations.

### Log File

Check `dashboard.log` for:
- Configuration load/validation results
- Each workbook processing
- Filter/sort/replace operations
- File creation details
- Error details and tracebacks

---

## Notes for Developers

### Adding New Features

1. **New Transformation Step:** Add function to `eb_dashboard_excel_export.py`, call from `_process_sheet_xlwings()`
2. **New Source Type:** Add case to `_process_sheet_xlwings()` router (update SOURCE_TYPES in constants)
3. **New Template Variable:** Add to `_prepare_template_variables()`
4. **Update Constants:** Add new values to `eb_dashboard_constants.py` (single source of truth)

### Testing

- Unit tests: `test_core_logic.py` (26 tests, 100% pass)
- No external dependencies needed (pure function testing)
- Integration tests: Use `--excel-only` mode with real data

### Performance Considerations

- **Data Filtering:** O(n) per filter rule
- **Sorting:** O(n log n)
- **Excel Fill:** O(n) for cells, time depends on file size
- **Typical Duration:** 1-5 seconds per workbook (depends on data volume and template complexity)

---

**End of Excel Export Architecture Documentation**