Files
EB_Dashboard/DOCUMENTATION/DOCUMENTATION_13_EXCEL_EXPORT.md

33 KiB

Endobest Excel Export Feature & Architecture

Part 4: Configuration-Driven Excel Workbook Generation

Document Version: 1.1 Last Updated: 2025-11-11 Audience: Developers, Business Analysts, System Architects Language: English


Table of Contents

  1. Overview
  2. Architecture & Design
  3. Core Components
  4. High-Level Orchestration Functions (v1.1+)
  5. Configuration System
  6. Data Flow & Processing Pipeline
  7. Excel Export Functions
  8. Filter, Sort & Replacement Logic
  9. Template Variables
  10. File Conflict Handling
  11. Integration with Main Dashboard
  12. Error Handling & Validation
  13. Configuration Examples
  14. Troubleshooting & Debugging

Overview

The Excel Export Feature enables generation of configurable Excel workbooks from patient inclusion data and organization statistics. The system is entirely configuration-driven, allowing non-technical users to define export behavior through Excel configuration tables without code modifications.

Key Characteristics

Configuration-Driven Design:

  • All export behavior defined in Endobest_Dashboard_Config.xlsx
  • Two tables: Excel_Workbooks (metadata) and Excel_Sheets (sheet definitions)
  • No code changes needed to modify export behavior

Modular Architecture:

  • New module: eb_dashboard_excel_export.py
  • Separation of concerns: Excel logic isolated from main dashboard
  • Dependency injection for testing and flexibility

Data Transformation:

  • Filter: AND conditions with nested field support
  • Sort: Multi-key sorting with case-insensitive strings, datetime parsing, natural alphanumeric sorting (*natsort)
  • Replace: Strict type matching with first-match-wins logic
  • Fill: Direct cell or named range targeting

Note: For complete configuration details and up-to-date column specifications, refer to DOCUMENTATION_99_CONFIG_GUIDE.md

Three Operating Modes:

  1. Normal: Full collection → Quality checks → JSON export → Excel export
  2. --excel-only: Load existing JSON → Excel export (fast iteration)
  3. --check-only: Quality checks only (unchanged, for backward compatibility)

Architecture & Design

Module Structure

eb_dashboard_excel_export.py
├── Imports & Dependencies
├── Constants & Configuration
│   └── EXCEL_RECALC_TIMEOUT = 60
├── Module-Level Variables (Injected)
│   ├── console (Rich Console instance)
│   ├── DASHBOARD_CONFIG_FILE_NAME
│   └── Other global references
│
├── Public API (Called from main)
│   ├── load_excel_export_config(console)
│   ├── validate_excel_config(excel_config, console, inclusions_mapping, organizations_mapping)
│   └── export_to_excel(inclusions_data, organizations_data, excel_config, console)
│
└── Internal Functions (Helpers)
    ├── _prepare_template_variables()
    ├── _apply_filter(item, filter_condition)
    ├── _apply_sort(items, sort_keys)
    ├── _apply_value_replacement(value, replacements)
    ├── _handle_output_exists(output_path, action)
    ├── _get_named_range_dimensions(workbook, range_name) [openpyxl - validation phase]
    ├── _get_table_dimensions_xlwings(workbook_xw, range_name) [xlwings - data processing]
    ├── _recalculate_workbook(workbook_path)
    ├── _process_sheet_xlwings(workbook_xw, sheet_config, ...) [xlwings - data fill]
    └── set_dependencies(...)

Design Principles

  1. Configuration-First: Behavior determined by config, not code
  2. Pure Functions: Helper functions are pure (no side effects) except I/O
  3. xlwings-First Architecture: Data processing uses xlwings exclusively (native Excel COM API)
    • Configuration validation uses openpyxl (read-only, lighter footprint)
    • Data fill & processing uses xlwings (preserves workbook structure, formulas, images)
    • Automatic formula recalculation via xlwings COM API during cell updates
    • No redundant file reloads - metadata read via COM API without reloading
  4. Early Validation: Config errors detected at startup, before data collection

Core Components

1. Configuration Loading

Function: load_excel_export_config(console)

Loads Excel export configuration from the Endobest_Dashboard_Config.xlsx file.

Responsibilities:

  • Read Excel_Workbooks table
  • Read Excel_Sheets table
  • Parse JSON fields (filter_condition, sort_keys, value_replacement)
  • Validate structure and presence of required columns
  • Return parsed config and error status

Return Value:

(config_dict, has_error: bool)

Config Structure:

{
    "workbooks": [
        {
            "workbook_name": str,
            "template_path": str,
            "output_filename": str,
            "output_exists_action": "Overwrite" | "Increment" | "Backup"
        },
        ...
    ],
    "sheets": [
        {
            "workbook_name": str,
            "sheet_name": str,
            "source_type": "Variable" | "Inclusions" | "Organizations",
            "target": str,
            "column_mapping": dict | None,
            "filter_condition": dict | None,
            "sort_keys": list | None,
            "value_replacement": list | None
        },
        ...
    ]
}

2. Configuration Validation

Function: validate_excel_config(excel_config, console, inclusions_mapping, organizations_mapping)

Validates that all referenced templates exist and have correct structure.

Validations Performed:

  • Template files exist in config/ directory
  • Template files are valid Excel (.xlsx)
  • Named ranges exist in templates
  • Named range dimensions correct (height=1 for tables, width≥max index)
  • Column mappings reference valid fields
  • Source types are valid

Return Value:

(has_critical_error: bool, error_messages: list)

3. Excel Export Orchestration

Function: export_to_excel(inclusions_data, organizations_data, excel_config, console)

Main orchestration function for Excel export.

Workflow:

  1. Prepare template variables (timestamp, extract_date_time, etc.)
  2. For each workbook in config:
    • Resolve output filename using template variables
    • Handle file conflicts (Overwrite/Increment/Backup)
    • Copy template to output location
    • XLWINGS PHASE (native Excel COM API):
      • Load workbook with xlwings
      • For each sheet config:
        • Apply filters, sorts, replacements
        • Read metadata via xlwings COM API (no file reloads)
        • Fill cells/named ranges with data
        • Formulas automatically recalculated by Excel COM API
      • Save workbook
  3. Log summary and completion

Architecture Change (v1.2+):

  • Migration from openpyxl to xlwings eliminated need for separate win32com recalculation phase
  • xlwings uses native Excel COM API, which automatically recalculates formulas during cell updates
  • Simplified workflow: one Excel session, no hand-off between libraries

High-Level Orchestration Functions (v1.1+)

New in v1.1: Three high-level orchestration functions were added to completely externalize Excel export orchestration from the main script. These functions follow the established pattern from the quality_checks module.

1. export_excel_only(sys_argv, console_instance, inclusions_filename, organizations_filename, inclusions_mapping_config, organizations_mapping_config)

Purpose: Complete orchestration of --excel-only CLI mode

Workflow:

  1. Initialize console and set default filenames
  2. Call prepare_excel_export() to load and validate
  3. Handle critical configuration errors with user confirmation
  4. Call execute_excel_export() to perform export
  5. Display results and return

Usage in Main Script:

if excel_only_mode:
    export_excel_only(sys.argv, console, INCLUSIONS_FILE_NAME, ORGANIZATIONS_FILE_NAME,
                     INCLUSIONS_MAPPING_CONFIG, {})
    return

Impact: Reduces main script from 34 lines to 4 lines (87% reduction)


2. run_normal_mode_export(inclusions_data, organizations_data, excel_enabled, excel_config, console_instance, inclusions_mapping_config, organizations_mapping_config)

Purpose: Orchestrates Excel export phase during normal workflow

Workflow:

  1. Check if export enabled (returns early if not)
  2. Load JSONs from filesystem (ensures consistency)
  3. Call execute_excel_export() to perform export
  4. Display results and return status tuple

Returns: (success: bool, error_message: str)

Usage in Main Script:

# After JSONs are written to disk
run_normal_mode_export(output_inclusions, organizations_list, EXCEL_EXPORT_ENABLED,
                      EXCEL_EXPORT_CONFIG, console, INCLUSIONS_MAPPING_CONFIG, {})

Impact: Reduces main script from 19 lines to 2 lines (89% reduction)


3. prepare_excel_export(inclusions_filename, organizations_filename, console_instance, inclusions_mapping_config, organizations_mapping_config)

Purpose: Centralized preparation function - loads JSONs, config, and validates

Responsibility:

  • Load inclusions JSON from filesystem
  • Load organizations JSON from filesystem
  • Load Excel export configuration
  • Validate configuration against templates
  • Aggregate and return all errors

Returns: (prep_success: bool, inclusions_data, organizations_data, excel_config, has_critical_errors: bool, error_messages: list)

Used By: Both export_excel_only() and potentially run_normal_mode_export()


4. execute_excel_export(inclusions_data, organizations_data, excel_config, console_instance, inclusions_mapping_config, organizations_mapping_config)

Purpose: Execute Excel export with comprehensive error handling

Responsibility:

  • Call core export_to_excel() function
  • Catch and log all exceptions
  • Return success/failure status to caller

Returns: (success: bool, error_message: str)

Error Handling: All exceptions caught and returned as error messages (never raises)


5. _load_json_file_internal(filename)

Purpose: Internal helper for safe JSON file loading

Responsibility:

  • Check file existence
  • Load and parse JSON
  • Handle errors gracefully
  • Return None on failure (instead of raising)

Used By: run_normal_mode_export() internally


Design Pattern: Consistency with Quality Checks

The orchestration functions follow the exact pattern established by run_check_only_mode() from the quality_checks module:

Aspect Quality Checks Excel Export
Standalone mode orchestration run_check_only_mode() export_excel_only()
Config loading in module Yes Yes
User confirmation in module Yes Yes
Error handling in module Yes Yes
Main script integration 1 line call 1 line call

Result: Consistent architecture across all major features (quality checks, excel export, etc.)


Configuration System

Two-Table Configuration

The Excel export is configured through two tables in Endobest_Dashboard_Config.xlsx:

Table 1: Excel_Workbooks

Defines metadata for each Excel workbook to generate.

Column Type Required Example Description
workbook_name Text Yes "Endobest_Output" Unique identifier for workbook
template_path Text Yes "templates/Endobest_Template.xlsx" Path relative to config/ folder
output_filename Text Yes "{workbook_name}_{extract_date_time}.xlsx" Template for output filename
output_exists_action Text Yes "Increment" How to handle conflicts (Overwrite/Increment/Backup)

Table 2: Excel_Sheets

Defines how to fill each sheet in the workbooks.

Column Type Required Example Description
workbook_name Text Yes "Endobest_Output" Must match Excel_Workbooks entry
sheet_name Text Yes "Inclusions" Sheet name in template
source_type Text Yes "Inclusions" Variable / Inclusions / Organizations
target Text Yes "DataTable" Named range or cell reference
column_mapping JSON Conditional {"col_id": "patient_id"} For source_type=Inclusions/Organizations only
filter_condition JSON No {"status": "active"} AND conditions for filtering
sort_keys JSON No [["date", "asc"], ["id", "asc", "*natsort"]] Sort specification with optional datetime/natsort
value_replacement JSON No [{"type": "bool", "true": "Yes", "false": "No"}] Value transformations

Data Flow & Processing Pipeline

Overview

Input Data (inclusions + organizations)
    ↓
Filter (AND conditions)
    ↓
Sort (multi-key with datetime)
    ↓
Value Replacement (strict typing)
    ↓
Fill Excel Cells/Ranges (via xlwings)
    ↓
Save Workbook (xlwings)
    ↓
Formulas Automatically Recalculated (xlwings COM API)
    ↓
Final Excel File

Detailed Processing Steps

Step 1: Filter

Applies AND conditions to select matching items.

Logic:

  • Start with all items
  • For each field in filter_condition:
    • Keep only items where field value equals expected value
    • Support nested field paths (dot notation: patient.status)
  • Return filtered items

Example:

{
  "status": "active",
  "visit_type": "inclusion"
}

Keeps only items where BOTH conditions are true.

Step 2: Sort

Multi-key sort with datetime awareness and missing field handling.

Logic:

  • Apply sort keys in order (first key is primary, second is secondary, etc.)
  • Detect datetime fields automatically (ISO format: YYYY-MM-DD)
  • Items with missing fields go to end of sort
  • Reverse order for "desc" order specification

Example:

[
  {"field": "visit_type", "order": "asc"},
  {"field": "date_visit", "order": "desc"}
]

Step 3: Value Replacement

Transform cell values based on rules (first-match-wins).

Logic:

  • Evaluate rules in order
  • Stop at first matching rule
  • Strict type matching (e.g., boolean True ≠ string "true")
  • Return original value if no match

Supported Types:

  • "bool": Boolean replacement with "true" and "false" fields
  • "str": String replacement with "from" and "to" fields
  • "int": Integer replacement with "from" and "to" fields

Step 4: Fill Excel

Place transformed data into Excel cells or named ranges.

Two Modes:

  • Variable (Single Cell): Write evaluated template string to target cell
  • Table (Named Range): Write filtered/sorted/replaced items to target range
4.1 Variable Mode (Template String Substitution)

For source_type = "Variable":

  1. Evaluate the source template string using .format(**template_vars)
  2. Write result to the target named cell
  3. Example: {extract_date_time_french}"2025-01-15 14:30:45+01:00"
4.2 Table Mode (Data Fill with Column Mapping)

For source_type = "Inclusions" or "Organizations":

Key Concept: The first row of the table target serves as BOTH TEMPLATE and FIRST DATA ROW. Some columns may contain formulas that should NOT be overwritten (unmapped columns).

Algorithm:

  1. Extract Column Mapping

    • Load mapping from Inclusions_Mapping or Organizations_Mapping table
    • Mapping column name comes from Excel_Sheets.source parameter
    • Mapping contains indices (0, 1, 2...) indicating Excel column positions
    • Example:
      Inclusions_Mapping:
      | field_name | field_group      | MainReport_PatientsList |
      | Patient_Id | Patient_Identification | 0          |
      | Status     | Inclusion        | 1                       |
      | Date       | Inclusion        | 3                       |
      (Column 2 not mapped - preserves template formula!)
      
    • Result: {0: "Patient_Identification.Patient_Id", 1: "Inclusion.Status", 3: "Inclusion.Date"}
  2. Filter and Sort Data

    • Apply AND filter conditions
    • Apply multi-key sort with datetime parsing
    • Example: 5 items match filter, sorted by Patient_Id ascending
  3. Extend Table Rows

    • Delete any existing data rows below the template row
    • Keep the first row (template + first data)
    • For each filtered/sorted item: a. Create new row (or use template row for first item) b. Copy ALL cells from template row (preserves formulas!) c. Overwrite ONLY mapped columns with JSON data d. Apply value_replacement to mapped values
  4. Preserve Formulas in Unmapped Columns

    • Unmapped columns (those without index in mapping) keep template values
    • If template column contains formula, it's preserved and recalculates later
    • Allows mixed rows: some columns from JSON, some from formulas

Example:

Template Row (Row 1):

| A: P001      | B: Active     | C: =SUM(...) | D: 2025-01 |
| (mapped 0)   | (mapped 1)    | (formula!)   | (mapped 3) |

After processing (3 data items):

| A: P001      | B: Active     | C: 45        | D: 2025-01 | ← Template + first data
| A: P002      | B: Active     | C: 67        | D: 2025-02 | ← Data 2 (formula copied)
| A: P003      | B: Active     | C: 89        | D: 2025-03 | ← Data 3 (formula copied)

Result:

  • Columns A, B, D filled with JSON data and value replacement
  • Column C: Formula =SUM(...) copied to all rows, will recalculate
  • All rows have consistent formatting from template

Excel Export Functions

Public Functions (3)

load_excel_export_config(console=None)

def load_excel_export_config(console_instance=None):
    """Load Excel export configuration from config file.

    Reads Excel_Workbooks and Excel_Sheets tables from
    Endobest_Dashboard_Config.xlsx, parses JSON fields.

    Args:
        console_instance: Optional Rich Console for messages

    Returns:
        (config_dict, has_error: bool)

    Raises:
        None (returns error status instead)
    """

validate_excel_config(excel_config, console, inclusions_mapping, organizations_mapping)

def validate_excel_config(excel_config, console_instance,
                         inclusions_mapping_config,
                         organizations_mapping_config):
    """Validate Excel configuration against templates.

    Checks that:
    - Template files exist and are valid
    - Named ranges exist in templates
    - Dimensions are correct
    - Mappings reference valid fields

    Args:
        excel_config: Config dict from load_excel_export_config()
        console_instance: Rich Console instance
        inclusions_mapping_config: List of valid inclusions fields
        organizations_mapping_config: Dict of valid organizations fields

    Returns:
        (has_critical_error: bool, error_messages: list)
    """

export_to_excel(inclusions_data, organizations_data, excel_config, console=None)

def export_to_excel(inclusions_data, organizations_data, excel_config,
                   console_instance=None):
    """Main orchestration: Generate Excel files from data and config.

    xlwings-based processing with automatic formula recalculation:
    - Load template via xlwings
    - Apply data transformations (filter, sort, replace)
    - Fill cells/ranges with data
    - Save workbook (formulas auto-recalculated by Excel COM API)

    Args:
        inclusions_data: List of inclusion dicts
        organizations_data: List of organization dicts
        excel_config: Config dict from load_excel_export_config()
        console_instance: Optional Rich Console

    Returns:
        None (creates files as side effect)

    Raises:
        Catches and logs exceptions, continues with next workbook
    """

Internal Functions (10)

_prepare_template_variables()

def _prepare_template_variables():
    """Extract variables for template string substitution.

    Variables:
    - extract_date_time: Full ISO datetime (UTC→Paris TZ)
    - extract_year: Year
    - extract_month: Month (2-digit)
    - extract_day: Day (2-digit)

    Returns:
        dict: Variables for .format(**locals())
    """

_apply_filter(item, filter_condition)

def _apply_filter(item, filter_condition):
    """Apply AND filter to item.

    Returns True only if ALL conditions match.
    Supports nested field paths (dot notation).

    Args:
        item: Dict to filter
        filter_condition: Dict of field:value conditions

    Returns:
        bool: True if matches, False otherwise
    """

_apply_sort(items, sort_keys)

def _apply_sort(items, sort_keys):
    """Multi-key sort with datetime parsing and natural alphanumeric support.

    Handles:
    - String fields (case-insensitive comparison)
    - Numeric and datetime fields
    - Natural alphanumeric sorting (*natsort option)
    - Missing fields (placed at end)
    - Mixed ascending and descending order

    Args:
        items: List of dicts to sort
        sort_keys: List of [field, order] or [field, order, option]
                   where option can be:
                   - datetime format string (e.g., "%Y-%m-%d")
                   - "*natsort" for natural alphanumeric sorting

    Returns:
        list: Sorted items
    """

_apply_value_replacement(value, replacements)

def _apply_value_replacement(value, replacements):
    """Transform value using first-matching rule.

    Strict type matching. Returns original if no match.

    Args:
        value: Original value
        replacements: List of replacement rules

    Returns:
        Replaced value or original
    """

_handle_output_exists(output_path, action)

def _handle_output_exists(output_path, action):
    """Handle file conflicts: Overwrite/Increment/Backup.

    Overwrite: Returns same path (existing file will be overwritten)
    Increment: Returns path with _1, _2, etc. suffix
    Backup: Renames existing to _backup_1, etc.; returns original path

    Args:
        output_path: Target file path
        action: "Overwrite" | "Increment" | "Backup"

    Returns:
        str: Actual path to use
    """

_get_named_range_dimensions(workbook, range_name)

def _get_named_range_dimensions(workbook, range_name):
    """Extract position and dimensions from named range.

    Uses openpyxl named_ranges to find range definition.

    Args:
        workbook: openpyxl Workbook object
        range_name: Name of the named range

    Returns:
        (sheet_name, start_cell, height, width)

    Raises:
        ValueError if range not found
    """

_process_sheet_xlwings(workbook_xw, sheet_config, inclusions_data, organizations_data, ...)

def _process_sheet_xlwings(workbook_xw, sheet_config, inclusions_data,
                           organizations_data, inclusions_mapping_config,
                           organizations_mapping_config, template_vars):
    """Fill single sheet using xlwings (native Excel COM API).

    Routes based on source_type:
    - Variable: Evaluate template string, write to cell
    - Inclusions/Organizations: Filter, sort, fill table (bulk operation)

    Automatic formula recalculation occurs via xlwings COM API.

    Args:
        workbook_xw: xlwings Book object (open)
        sheet_config: Single sheet configuration dict
        inclusions_data, organizations_data: Source data
        inclusions_mapping_config, organizations_mapping_config: Field mappings
        template_vars: Variables for template strings

    Returns:
        bool: Success status
    """

set_dependencies(console_obj, inclusions_file, organizations_file)

def set_dependencies(console_instance, inclusions_filename,
                    organizations_filename, ...):
    """Inject module-level variables (dependency injection).

    Called from main dashboard to provide:
    - console: Rich Console instance
    - File names and configuration

    Args:
        console_instance: Rich Console object
        ... (other global references)

    Returns:
        None
    """

Filter, Sort & Replacement Logic

AND Filter Logic

Conditions combined with AND (all must be true):

filter_condition = {"status": "active", "type": "inclusion"}
# Matches: {"status": "active", "type": "inclusion", "date": "2025-01-15"}
# Does NOT match: {"status": "active", "type": "follow-up"}  (type different)

Nested Field Support:

filter_condition = {"patient.status": "active"}
# Matches: {"patient": {"status": "active"}}

Multi-Key Sort Logic

Sort keys applied in order (first is primary):

sort_keys = [
    ["status", "asc"],                    # Primary sort
    ["date_visit", "desc"],               # Secondary sort
    ["patient_id", "asc", "*natsort"]     # Tertiary sort with natural alphanumeric
]

String Comparison:

  • Case-insensitive by default: "Centre" comes before "CHU" (natural alphabetical order)
  • Tiebreaker: Case-sensitive if lowercase versions are equal

Datetime Handling:

  • Provide strptime format as third parameter: ["date_field", "desc", "%Y-%m-%d"]
  • Custom formats supported: "%d/%m/%Y", "%Y-%m-%d %H:%M:%S", etc.

Natural Alphanumeric Sorting:

  • Use "*natsort" as third parameter for proper numeric segment handling
  • Correctly sorts: "ENDOBEST-003-3-BA" < "ENDOBEST-003-20-BA" < "ENDOBEST-003-100-BA"
  • Also handles: "v1.2" < "v1.10", "file2.txt" < "file10.txt"
  • Perfect for patient IDs, version codes, sequential identifiers

Missing Values:

  • Items with missing/null/undefined field values placed at end

Value Replacement Rules

First-matching rule wins; strict type matching:

replacements = [
    {"type": "bool", "true": "Yes", "false": "No"},
    {"type": "str", "from": "active", "to": "Active"},
]

# True (boolean) → "Yes"
# "active" (string) → "Active"
# "true" (string) → "true" (no match, unchanged)

Template Variables

Available Variables

Template variables available in output_filename and Variable cell content:

Variable Type Example Notes
extract_date_time ISO datetime 2025-01-15T14:30:45+01:00 Full timestamp (UTC→Paris TZ)
extract_year Year 2025 4-digit year
extract_month Month 01 2-digit month
extract_day Day 15 2-digit day
workbook_name Text "Endobest_Output" From config

Usage Examples

Filename Template:

{workbook_name}_{extract_date_time}.xlsx
→ Endobest_Output_2025-01-15T14-30-45.xlsx

Variable Cell Template:

Extracted: {extract_date_time}
→ Extracted: 2025-01-15T14:30:45+01:00

File Conflict Handling

Three Strategies

1. Overwrite

  • Deletes existing file
  • Writes new file with same name
output_path: report.xlsx
result: report.xlsx (new)

2. Increment

  • Finds next available number
  • Appends _1, _2, etc. to filename
existing: report.xlsx, report_1.xlsx, report_2.xlsx
output_path: report.xlsx
result: report_3.xlsx

3. Backup

  • Renames existing to _backup_N
  • Writes new file with original name
existing: report.xlsx
output_path: report.xlsx
result:
  - report_backup_1.xlsx (renamed)
  - report.xlsx (new)

Integration with Main Dashboard

Integration Points

  1. Startup Validation (before collection):

    EXCEL_EXPORT_CONFIG, error = load_excel_export_config(console)
    if error:
        # Ask user confirmation
        EXCEL_EXPORT_ENABLED = False
    
  2. After JSON Export (after collection):

    if EXCEL_EXPORT_ENABLED:
        inclusions = load_json_file(INCLUSIONS_FILE_NAME)
        organizations = load_json_file(ORGANIZATIONS_FILE_NAME)
        export_to_excel(inclusions, organizations, EXCEL_EXPORT_CONFIG, console)
    
  3. --excel-only Mode:

    if "--excel-only" in sys.argv:
        inclusions = load_json_file(INCLUSIONS_FILE_NAME)
        organizations = load_json_file(ORGANIZATIONS_FILE_NAME)
        export_to_excel(inclusions, organizations, EXCEL_EXPORT_CONFIG, console)
    

Global Variables

Added to eb_dashboard.py:

EXCEL_EXPORT_CONFIG = None          # Loaded config
EXCEL_EXPORT_ENABLED = False        # Flag to enable/disable export

# Constants
EXCEL_WORKBOOKS_TABLE_NAME = "Excel_Workbooks"
EXCEL_SHEETS_TABLE_NAME = "Excel_Sheets"

Error Handling & Validation

Validation Stages

Stage 1: Config Loading (Startup)

  • File exists and valid Excel format
  • Required columns present
  • JSON parsing succeeds
  • Returns error status

Stage 2: Config Validation (Startup)

  • Templates exist in config/ folder
  • Templates valid .xlsx files
  • Named ranges exist
  • Dimensions correct
  • Returns critical error status

Stage 3: User Confirmation (Startup)

  • If critical errors found:
    • Display error messages
    • Ask user to continue or abort
    • Set EXCEL_EXPORT_ENABLED flag

Stage 4: Runtime Error Handling

  • Try/except wraps main export
  • Logs detailed errors
  • Continues with next workbook
  • Displays summary

Error Messages

Critical Config Error:

⚠ CRITICAL CONFIGURATION ERROR(S) DETECTED
────────────────────────────────────
Error 1: Template file missing: config/templates/Missing.xlsx
Error 2: Named range not found: MyRange in sheet MySheet
...
Do you want to continue anyway? [y/N]:

Runtime Error:

✗ Excel export failed: [Specific error message]
(See dashboard.log for full traceback)

Configuration Examples

Example 1: Simple Inclusion List

Excel_Workbooks:

workbook_name template_path output_filename output_exists_action
Inclusions_Report templates/Simple.xlsx Inclusions_{extract_date_time}.xlsx Increment

Excel_Sheets:

workbook_name sheet_name source_type target column_mapping filter_condition sort_keys value_replacement
Inclusions_Report Data Inclusions DataTable {"col_id": "patient_id", "col_name": "name"} {"status": "active"} [{"field": "date_inclusion", "order": "asc"}] null

Example 2: Multi-Sheet with Variables

Excel_Sheets (multiple rows):

workbook_name sheet_name source_type target ...
Report Title Variable TitleCell ...
Report Inclusions Inclusions InclusionTable ...
Report Organizations Organizations OrgTable ...

Example 3: Value Replacement

Excel_Sheets:

value_replacement: [
    {
        "type": "bool",
        "true": "Yes",
        "false": "No"
    },
    {
        "type": "str",
        "from": "active",
        "to": "Active Status"
    }
]

Troubleshooting & Debugging

Common Issues

"Template file missing"

Cause: Template path incorrect or file not in config/ folder Solution: Verify file exists at config/{template_path}

"Named range not found"

Cause: Range name in config doesn't exist in template Solution: Check range name in Excel (Formulas → Define Names → Name Manager)

"Dimensions mismatch"

Cause: Column count in mapping exceeds named range width Solution: Verify named range dimensions and column mapping count match

"Formulas not recalculating"

Cause: xlwings not installed or Excel not available on system Solution: Ensure xlwings is installed (pip install xlwings) and Excel is available. Formulas are automatically recalculated by xlwings via COM API.

Debug Mode

python eb_dashboard.py --debug

Enables verbose logging with detailed Excel export operations.

Log File

Check dashboard.log for:

  • Configuration load/validation results
  • Each workbook processing
  • Filter/sort/replace operations
  • File creation details
  • Error details and tracebacks

Notes for Developers

Adding New Features

  1. New Transformation Step: Add function to eb_dashboard_excel_export.py, call from _process_sheet_xlwings()
  2. New Source Type: Add case to _process_sheet_xlwings() router (update SOURCE_TYPES in constants)
  3. New Template Variable: Add to _prepare_template_variables()
  4. Update Constants: Add new values to eb_dashboard_constants.py (single source of truth)

Testing

  • Unit tests: test_core_logic.py (26 tests, 100% pass)
  • No external dependencies needed (pure function testing)
  • Integration tests: Use --excel_only mode with real data

Performance Considerations

  • Data Filtering: O(n) per filter rule
  • Sorting: O(n log n)
  • Excel Fill: O(n) for cells, time depends on file size
  • Typical Duration: 1-5 seconds per workbook (depends on data volume and template complexity)

End of Excel Export Architecture Documentation