EB_Dashboard/DOCUMENTATION/DOCUMENTATION_13_EXCEL_EXPORT.md

# Endobest Excel Export Feature & Architecture

## Part 4: Configuration-Driven Excel Workbook Generation

**Document Version:** 1.1
**Last Updated:** 2025-11-11
**Audience:** Developers, Business Analysts, System Architects
**Language:** English

---

## Table of Contents

1. [Overview](#overview)
2. [Architecture & Design](#architecture--design)
3. [Core Components](#core-components)
4. [High-Level Orchestration Functions (v1.1+)](#high-level-orchestration-functions-v11)
5. [Configuration System](#configuration-system)
6. [Data Flow & Processing Pipeline](#data-flow--processing-pipeline)
7. [Excel Export Functions](#excel-export-functions)
8. [Filter, Sort & Replacement Logic](#filter-sort--replacement-logic)
9. [Template Variables](#template-variables)
10. [File Conflict Handling](#file-conflict-handling)
11. [Integration with Main Dashboard](#integration-with-main-dashboard)
12. [Error Handling & Validation](#error-handling--validation)
13. [Configuration Examples](#configuration-examples)
14. [Troubleshooting & Debugging](#troubleshooting--debugging)

---

## Overview

The **Excel Export Feature** enables generation of configurable Excel workbooks from patient inclusion data and organization statistics. The system is entirely configuration-driven, allowing non-technical users to define export behavior through Excel configuration tables without code modifications.

### Key Characteristics

**Configuration-Driven Design:**
- All export behavior defined in `Endobest_Dashboard_Config.xlsx`
- Two tables: `Excel_Workbooks` (metadata) and `Excel_Sheets` (sheet definitions)
- No code changes needed to modify export behavior

**Modular Architecture:**
- New module: `eb_dashboard_excel_export.py`
- Separation of concerns: Excel logic isolated from main dashboard
- Dependency injection for testing and flexibility

**Data Transformation:**
- **Filter:** AND conditions with nested field support
- **Sort:** Multi-key sorting with case-insensitive strings, datetime parsing, natural alphanumeric sorting (`*natsort`)
- **Replace:** Strict type matching with first-match-wins logic
- **Fill:** Direct cell or named range targeting

> **Note:** For complete configuration details and up-to-date column specifications, refer to `DOCUMENTATION_99_CONFIG_GUIDE.md`

**Three Operating Modes:**
1. **Normal:** Full collection → Quality checks → JSON export → Excel export
2. **--excel-only:** Load existing JSON → Excel export (fast iteration)
3. **--check-only:** Quality checks only (unchanged, for backward compatibility)

---

## Architecture & Design

### Module Structure

```
eb_dashboard_excel_export.py
├── Imports & Dependencies
├── Constants & Configuration
│   └── EXCEL_RECALC_TIMEOUT = 60
├── Module-Level Variables (Injected)
│   ├── console (Rich Console instance)
│   ├── DASHBOARD_CONFIG_FILE_NAME
│   └── Other global references
│
├── Public API (Called from main)
│   ├── load_excel_export_config(console)
│   ├── validate_excel_config(excel_config, console, inclusions_mapping, organizations_mapping)
│   └── export_to_excel(inclusions_data, organizations_data, excel_config, console)
│
└── Internal Functions (Helpers)
    ├── _prepare_template_variables()
    ├── _apply_filter(item, filter_condition)
    ├── _apply_sort(items, sort_keys)
    ├── _apply_value_replacement(value, replacements)
    ├── _handle_output_exists(output_path, action)
    ├── _get_named_range_dimensions(workbook, range_name) [openpyxl - validation phase]
    ├── _get_table_dimensions_xlwings(workbook_xw, range_name) [xlwings - data processing]
    ├── _recalculate_workbook(workbook_path)
    ├── _process_sheet_xlwings(workbook_xw, sheet_config, ...) [xlwings - data fill]
    └── set_dependencies(...)
```

### Design Principles

1. **Configuration-First:** Behavior determined by config, not code
2. **Pure Functions:** Helper functions are pure (no side effects), apart from those that perform file or Excel I/O
3. **xlwings-First Architecture:** Data processing uses xlwings exclusively (native Excel COM API)
   - Configuration validation uses openpyxl (read-only, lighter footprint)
   - Data fill & processing uses xlwings (preserves workbook structure, formulas, images)
   - Automatic formula recalculation via xlwings COM API during cell updates
   - No redundant file reloads - metadata read via COM API without reloading
4. **Early Validation:** Config errors detected at startup, before data collection

---

## Core Components

### 1. Configuration Loading
**Function:** `load_excel_export_config(console)`

Loads Excel export configuration from the `Endobest_Dashboard_Config.xlsx` file.

**Responsibilities:**
- Read `Excel_Workbooks` table
- Read `Excel_Sheets` table
- Parse JSON fields (filter_condition, sort_keys, value_replacement)
- Validate structure and presence of required columns
- Return parsed config and error status

**Return Value:**
```python
(config_dict, has_error: bool)
```
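
The JSON parsing responsibility listed above can be illustrated with a minimal sketch. It assumes each `Excel_Sheets` row has already been read into a dict of raw cell values; the helper name `parse_json_fields` and the error message wording are hypothetical, not taken from the module.

```python
import json

# Columns documented as JSON-typed in the Excel_Sheets table
JSON_FIELDS = ("filter_condition", "sort_keys", "value_replacement")


def parse_json_fields(row, json_fields=JSON_FIELDS):
    """Return (parsed_row, errors) for one row given as a dict of raw cell values.

    Hypothetical helper: empty cells become None, invalid JSON is reported
    as an error message instead of raising.
    """
    parsed = dict(row)
    errors = []
    for field in json_fields:
        raw = row.get(field)
        # Treat empty or whitespace-only cells as "not configured"
        if raw is None or (isinstance(raw, str) and not raw.strip()):
            parsed[field] = None
            continue
        try:
            parsed[field] = json.loads(raw)
        except (TypeError, ValueError) as exc:
            # json.JSONDecodeError is a subclass of ValueError
            parsed[field] = None
            errors.append(f"{row.get('sheet_name')}: invalid JSON in {field}: {exc}")
    return parsed, errors
```

A caller could set `has_error` to `bool(errors)` after processing all rows, which matches the "return error status instead of raising" contract described for the loader.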

**Config Structure:**
```python
{
    "workbooks": [
        {
            "workbook_name": str,
            "template_path": str,
            "output_filename": str,
            "output_exists_action": "Overwrite" | "Increment" | "Backup"
        },
        ...
    ],
    "sheets": [
        {
            "workbook_name": str,
            "sheet_name": str,
            "source_type": "Variable" | "Inclusions" | "Organizations",
            "target": str,
            "column_mapping": dict | None,
            "filter_condition": dict | None,
            "sort_keys": list | None,
            "value_replacement": list | None
        },
        ...
    ]
}
```

### 2. Configuration Validation
**Function:** `validate_excel_config(excel_config, console, inclusions_mapping, organizations_mapping)`

Validates that all referenced templates exist and have correct structure.

**Validations Performed:**
- Template files exist in `config/` directory
- Template files are valid Excel (`.xlsx`)
- Named ranges exist in templates
- Named range dimensions correct (height=1 for tables, width≥max index)
- Column mappings reference valid fields
- Source types are valid

**Return Value:**
```python
(has_critical_error: bool, error_messages: list)
```

### 3. Excel Export Orchestration
**Function:** `export_to_excel(inclusions_data, organizations_data, excel_config, console)`

Main orchestration function for Excel export.

**Workflow:**
1. Prepare template variables (timestamp, extract_date_time, etc.)
2. For each workbook in config:
   - Resolve output filename using template variables
   - Handle file conflicts (Overwrite/Increment/Backup)
   - Copy template to output location
   - **XLWINGS PHASE (native Excel COM API):**
     - Load workbook with xlwings
     - For each sheet config:
       - Apply filters, sorts, replacements
       - Read metadata via xlwings COM API (no file reloads)
       - Fill cells/named ranges with data
       - Formulas automatically recalculated by Excel COM API
     - Save workbook
3. Log summary and completion

**Architecture Change (v1.2+):**
- Migration from openpyxl to xlwings eliminated need for separate win32com recalculation phase
- xlwings uses native Excel COM API, which automatically recalculates formulas during cell updates
- Simplified workflow: one Excel session, no hand-off between libraries

---

## High-Level Orchestration Functions (v1.1+)

**New in v1.1:** High-level orchestration functions were added to move Excel export orchestration entirely out of the main script: four public functions and one internal helper, described below. These functions follow the pattern established by the quality_checks module.

### 1. `export_excel_only(sys_argv, console_instance, inclusions_filename, organizations_filename, inclusions_mapping_config, organizations_mapping_config)`

**Purpose:** Complete orchestration of `--excel-only` CLI mode

**Workflow:**
1. Initialize console and set default filenames
2. Call `prepare_excel_export()` to load and validate
3. Handle critical configuration errors with user confirmation
4. Call `execute_excel_export()` to perform export
5. Display results and return

**Usage in Main Script:**
```python
if excel_only_mode:
    export_excel_only(sys.argv, console, INCLUSIONS_FILE_NAME, ORGANIZATIONS_FILE_NAME,
                     INCLUSIONS_MAPPING_CONFIG, {})
    return
```

**Impact:** Reduces main script from 34 lines to 4 lines (88% reduction)

---

### 2. `run_normal_mode_export(inclusions_data, organizations_data, excel_enabled, excel_config, console_instance, inclusions_mapping_config, organizations_mapping_config)`

**Purpose:** Orchestrates Excel export phase during normal workflow

**Workflow:**
1. Check if export enabled (returns early if not)
2. Load JSONs from filesystem (ensures consistency)
3. Call `execute_excel_export()` to perform export
4. Display results and return status tuple

**Returns:** `(success: bool, error_message: str)`

**Usage in Main Script:**
```python
# After JSONs are written to disk
run_normal_mode_export(output_inclusions, organizations_list, EXCEL_EXPORT_ENABLED,
                      EXCEL_EXPORT_CONFIG, console, INCLUSIONS_MAPPING_CONFIG, {})
```

**Impact:** Reduces main script from 19 lines to 2 lines (89% reduction)

---

### 3. `prepare_excel_export(inclusions_filename, organizations_filename, console_instance, inclusions_mapping_config, organizations_mapping_config)`

**Purpose:** Centralized preparation function - loads JSONs, config, and validates

**Responsibility:**
- Load inclusions JSON from filesystem
- Load organizations JSON from filesystem
- Load Excel export configuration
- Validate configuration against templates
- Aggregate and return all errors

**Returns:** `(prep_success: bool, inclusions_data, organizations_data, excel_config, has_critical_errors: bool, error_messages: list)`

**Used By:** `export_excel_only()`, and potentially `run_normal_mode_export()`

---

### 4. `execute_excel_export(inclusions_data, organizations_data, excel_config, console_instance, inclusions_mapping_config, organizations_mapping_config)`

**Purpose:** Execute Excel export with comprehensive error handling

**Responsibility:**
- Call core `export_to_excel()` function
- Catch and log all exceptions
- Return success/failure status to caller

**Returns:** `(success: bool, error_message: str)`

**Error Handling:** All exceptions caught and returned as error messages (never raises)
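
The "never raises" contract can be sketched as a thin wrapper. This is an illustration only: `execute_safely` is a hypothetical name, and the real function also handles logging and console output that are omitted here.

```python
def execute_safely(export_fn, *args, **kwargs):
    """Run export_fn and convert any exception into a (success, error_message) tuple.

    Hypothetical wrapper illustrating the return contract; it does not log.
    """
    try:
        export_fn(*args, **kwargs)
    except Exception as exc:  # deliberately broad: the caller must never see an exception
        return False, f"{type(exc).__name__}: {exc}"
    return True, ""
```

Returning a tuple rather than raising lets the main script decide how to report the failure without wrapping every call in its own `try`/`except`.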

---

### 5. `_load_json_file_internal(filename)`

**Purpose:** Internal helper for safe JSON file loading

**Responsibility:**
- Check file existence
- Load and parse JSON
- Handle errors gracefully
- Return None on failure (instead of raising)

**Used By:** `run_normal_mode_export()` internally
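
A minimal sketch of this behavior, assuming UTF-8 encoded files; the function name `load_json_or_none` is hypothetical and the real helper may additionally log the failure reason.

```python
import json
import os


def load_json_or_none(filename):
    """Return the parsed JSON content of filename, or None on any failure."""
    # Missing file: report failure with None instead of raising
    if not os.path.isfile(filename):
        return None
    try:
        with open(filename, "r", encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, ValueError):
        # OSError covers read problems; ValueError covers invalid JSON and bad encoding
        return None
```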

---

### Design Pattern: Consistency with Quality Checks

The orchestration functions follow the exact pattern established by `run_check_only_mode()` from the quality_checks module:

| Aspect | Quality Checks | Excel Export |
|--------|---|---|
| Standalone mode orchestration | `run_check_only_mode()` | `export_excel_only()` |
| Config loading in module | ✅ Yes | ✅ Yes |
| User confirmation in module | ✅ Yes | ✅ Yes |
| Error handling in module | ✅ Yes | ✅ Yes |
| Main script integration | Single function call | Single function call |

**Result:** Consistent architecture across all major features (quality checks, excel export, etc.)

---

## Configuration System

### Two-Table Configuration

The Excel export is configured through two tables in `Endobest_Dashboard_Config.xlsx`:

#### Table 1: Excel_Workbooks
Defines metadata for each Excel workbook to generate.

| Column | Type | Required | Example | Description |
|--------|------|----------|---------|-------------|
| workbook_name | Text | Yes | "Endobest_Output" | Unique identifier for workbook |
| template_path | Text | Yes | "templates/Endobest_Template.xlsx" | Path relative to config/ folder |
| output_filename | Text | Yes | "{workbook_name}_{extract_date_time}.xlsx" | Template for output filename |
| output_exists_action | Text | Yes | "Increment" | How to handle conflicts (Overwrite/Increment/Backup) |

#### Table 2: Excel_Sheets
Defines how to fill each sheet in the workbooks.

| Column | Type | Required | Example | Description |
|--------|------|----------|---------|-------------|
| workbook_name | Text | Yes | "Endobest_Output" | Must match Excel_Workbooks entry |
| sheet_name | Text | Yes | "Inclusions" | Sheet name in template |
| source_type | Text | Yes | "Inclusions" | Variable / Inclusions / Organizations |
| target | Text | Yes | "DataTable" | Named range or cell reference |
| column_mapping | JSON | Conditional | `{"col_id": "patient_id"}` | For source_type=Inclusions/Organizations only |
| filter_condition | JSON | No | `{"status": "active"}` | AND conditions for filtering |
| sort_keys | JSON | No | `[["date", "asc"], ["id", "asc", "*natsort"]]` | Sort specification with optional datetime/natsort |
| value_replacement | JSON | No | `[{"type": "bool", "true": "Yes", "false": "No"}]` | Value transformations |

---

## Data Flow & Processing Pipeline

### Overview

```
Input Data (inclusions + organizations)
    ↓
Filter (AND conditions)
    ↓
Sort (multi-key with datetime)
    ↓
Value Replacement (strict typing)
    ↓
Fill Excel Cells/Ranges (via xlwings)
    ↓
Formulas Automatically Recalculated (xlwings COM API)
    ↓
Save Workbook (xlwings)
    ↓
Final Excel File
```

### Detailed Processing Steps

#### Step 1: Filter
Applies AND conditions to select matching items.

**Logic:**
- Start with all items
- For each field in filter_condition:
  - Keep only items where field value equals expected value
  - Support nested field paths (dot notation: `patient.status`)
- Return filtered items

**Example:**
```json
{
  "status": "active",
  "visit_type": "inclusion"
}
```
Keeps only items where BOTH conditions are true.
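
The filter logic can be sketched as follows. This is an illustrative reimplementation rather than the module's code: `get_nested` and `matches_filter` are hypothetical names, and a missing field is assumed to never match.

```python
# Sentinel distinguishing "field absent" from a stored None value
_MISSING = object()


def get_nested(item, path):
    """Resolve a dot-notation path such as 'patient.status' inside nested dicts."""
    current = item
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def matches_filter(item, filter_condition):
    """Return True only if every field in filter_condition equals its expected value."""
    if not filter_condition:
        return True  # no filter configured: keep everything
    return all(
        get_nested(item, path) == expected
        for path, expected in filter_condition.items()
    )
```

Filtering a list then reduces to `[item for item in items if matches_filter(item, condition)]`.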

#### Step 2: Sort
Multi-key sort with datetime awareness and missing field handling.

**Logic:**
- Apply sort keys in order (first key is primary, second is secondary, etc.)
- Detect datetime fields automatically (ISO format: YYYY-MM-DD)
- Items with missing fields go to end of sort
- Reverse order for `"desc"` order specification

**Example:**
```json
[
  {"field": "visit_type", "order": "asc"},
  {"field": "date_visit", "order": "desc"}
]
```
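
A minimal sketch of the multi-key logic, assuming sort keys in the `[field, order]` list form described in the `_apply_sort` docstring. It relies on Python's stable sort, applying keys from last to first, and keeps items with a missing field at the end regardless of direction. The name `sort_items` is hypothetical, and the optional third element (datetime format or `*natsort`) and nested field paths are not handled here.

```python
def _normalize(value):
    """Compare strings case-insensitively; leave other types untouched."""
    return value.casefold() if isinstance(value, str) else value


def sort_items(items, sort_keys):
    """Return a new list sorted by each [field, order] pair, first pair being primary."""
    result = list(items)
    # Stable sorts applied from the least significant key to the most significant one
    for key_spec in reversed(sort_keys or []):
        field, order = key_spec[0], key_spec[1]
        present = [it for it in result if it.get(field) is not None]
        missing = [it for it in result if it.get(field) is None]
        present.sort(
            key=lambda it: _normalize(it[field]),
            reverse=(order.lower() == "desc"),
        )
        # Items without the field always go last, for both asc and desc
        result = present + missing
    return result
```

ISO dates (`YYYY-MM-DD`) order correctly as plain strings, which is why this sketch needs no date parsing for them; other date formats would require the format option.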

#### Step 3: Value Replacement
Transform cell values based on rules (first-match-wins).

**Logic:**
- Evaluate rules in order
- Stop at first matching rule
- Strict type matching (e.g., boolean `True` ≠ string `"true"`)
- Return original value if no match

**Supported Types:**
- `"bool"`: Boolean replacement with `"true"` and `"false"` fields
- `"str"`: String replacement with `"from"` and `"to"` fields
- `"int"`: Integer replacement with `"from"` and `"to"` fields
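
The rule evaluation can be sketched as below, assuming the rule shapes listed above (`"true"`/`"false"` keys for `bool`, `"from"`/`"to"` keys for `str` and `int`). The function name `replace_value` is hypothetical.

```python
def replace_value(value, replacements):
    """Apply the first matching rule; return the value unchanged if none matches."""
    for rule in replacements or []:
        kind = rule.get("type")
        # type(...) is ... enforces strict typing: True is not treated as the int 1
        if kind == "bool" and type(value) is bool:
            return rule["true"] if value else rule["false"]
        if kind == "str" and type(value) is str and value == rule.get("from"):
            return rule["to"]
        if kind == "int" and type(value) is int and value == rule.get("from"):
            return rule["to"]
    return value
```

Using `type(value) is int` instead of `isinstance` matters because `bool` is a subclass of `int` in Python; without it, an `int` rule with `"from": 1` would also capture `True`.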

#### Step 4: Fill Excel
Place transformed data into Excel cells or named ranges.

**Two Modes:**
- **Variable (Single Cell):** Write evaluated template string to target cell
- **Table (Named Range):** Write filtered/sorted/replaced items to target range

##### 4.1 Variable Mode (Template String Substitution)

For `source_type = "Variable"`:
1. Evaluate the source template string using `.format(**template_vars)`
2. Write result to the target named cell
3. Example: `{extract_date_time}` → `2025-01-15T14:30:45+01:00`

##### 4.2 Table Mode (Data Fill with Column Mapping)

For `source_type = "Inclusions"` or `"Organizations"`:

**Key Concept:** The first row of the table target serves as BOTH TEMPLATE and FIRST DATA ROW.
Some columns may contain formulas that should NOT be overwritten (unmapped columns).

**Algorithm:**

1. **Extract Column Mapping**
   - Load mapping from Inclusions_Mapping or Organizations_Mapping table
   - Mapping column name comes from Excel_Sheets.source parameter
   - Mapping contains indices (0, 1, 2...) indicating Excel column positions
   - Example:
     ```
     Inclusions_Mapping:
     | field_name | field_group            | MainReport_PatientsList |
     | Patient_Id | Patient_Identification | 0                       |
     | Status     | Inclusion              | 1                       |
     | Date       | Inclusion              | 3                       |
     (Column 2 not mapped - preserves template formula!)
     ```
   - Result: `{0: "Patient_Identification.Patient_Id", 1: "Inclusion.Status", 3: "Inclusion.Date"}`

2. **Filter and Sort Data**
   - Apply AND filter conditions
   - Apply multi-key sort with datetime parsing
   - Example: 5 items match filter, sorted by Patient_Id ascending

3. **Extend Table Rows**
   - Delete any existing data rows below the template row
   - Keep the first row (template + first data)
   - For each filtered/sorted item:
     a. Create new row (or use template row for first item)
     b. Copy ALL cells from template row (preserves formulas!)
     c. Overwrite ONLY mapped columns with JSON data
     d. Apply value_replacement to mapped values

4. **Preserve Formulas in Unmapped Columns**
   - Unmapped columns (those without index in mapping) keep template values
   - If template column contains formula, it's preserved and recalculates later
   - Allows mixed rows: some columns from JSON, some from formulas

**Example:**

Template Row (Row 1):
```
| A: P001      | B: Active     | C: =SUM(...) | D: 2025-01 |
| (mapped 0)   | (mapped 1)    | (formula!)   | (mapped 3) |
```

After processing (3 data items):
```
| A: P001      | B: Active     | C: 45        | D: 2025-01 | ← Template + first data
| A: P002      | B: Active     | C: 67        | D: 2025-02 | ← Data 2 (formula copied)
| A: P003      | B: Active     | C: 89        | D: 2025-03 | ← Data 3 (formula copied)
```

Result:
- Columns A, B, D filled with JSON data and value replacement
- Column C: Formula `=SUM(...)` copied to all rows, will recalculate
- All rows have consistent formatting from template
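
The row-building idea (copy the template row, then overwrite only mapped columns) can be sketched in plain Python, independent of xlwings. The names `build_rows` and `lookup` are hypothetical, and rows are modeled as lists of cell values. Note one simplification: when Excel copies a formula down a column it adjusts relative references, whereas this sketch copies the formula text verbatim.

```python
def lookup(item, path):
    """Resolve a 'group.field' path in nested dicts; None if any part is missing."""
    current = item
    for part in path.split("."):
        current = current.get(part) if isinstance(current, dict) else None
    return current


def build_rows(template_row, items, mapping):
    """Return one output row per item.

    mapping is {column_index: "group.field"}; columns absent from the mapping
    keep the template value (for example a formula).
    """
    rows = []
    for item in items:
        row = list(template_row)  # copy ALL template cells, formulas included
        for col_index, path in mapping.items():
            row[col_index] = lookup(item, path)  # overwrite mapped columns only
        rows.append(row)
    return rows


template = ["P000", "Unknown", "=LEN(A2)", None]
mapping = {
    0: "Patient_Identification.Patient_Id",
    1: "Inclusion.Status",
    3: "Inclusion.Date",
}
items = [
    {"Patient_Identification": {"Patient_Id": "P001"},
     "Inclusion": {"Status": "Active", "Date": "2025-01"}},
    {"Patient_Identification": {"Patient_Id": "P002"},
     "Inclusion": {"Status": "Active"}},
]
rows = build_rows(template, items, mapping)
```

Column index 2 is absent from the mapping, so every generated row still carries the template formula in that position.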

---

## Excel Export Functions

### Public Functions (3)

#### load_excel_export_config(console=None)
```python
def load_excel_export_config(console_instance=None):
    """Load Excel export configuration from config file.

    Reads Excel_Workbooks and Excel_Sheets tables from
    Endobest_Dashboard_Config.xlsx, parses JSON fields.

    Args:
        console_instance: Optional Rich Console for messages

    Returns:
        (config_dict, has_error: bool)

    Raises:
        None (returns error status instead)
    """
```

#### validate_excel_config(excel_config, console, inclusions_mapping, organizations_mapping)
```python
def validate_excel_config(excel_config, console_instance,
                         inclusions_mapping_config,
                         organizations_mapping_config):
    """Validate Excel configuration against templates.

    Checks that:
    - Template files exist and are valid
    - Named ranges exist in templates
    - Dimensions are correct
    - Mappings reference valid fields

    Args:
        excel_config: Config dict from load_excel_export_config()
        console_instance: Rich Console instance
        inclusions_mapping_config: List of valid inclusions fields
        organizations_mapping_config: Dict of valid organizations fields

    Returns:
        (has_critical_error: bool, error_messages: list)
    """
```

#### export_to_excel(inclusions_data, organizations_data, excel_config, console=None)
```python
def export_to_excel(inclusions_data, organizations_data, excel_config,
                   console_instance=None):
    """Main orchestration: Generate Excel files from data and config.

    xlwings-based processing with automatic formula recalculation:
    - Load template via xlwings
    - Apply data transformations (filter, sort, replace)
    - Fill cells/ranges with data
    - Save workbook (formulas auto-recalculated by Excel COM API)

    Args:
        inclusions_data: List of inclusion dicts
        organizations_data: List of organization dicts
        excel_config: Config dict from load_excel_export_config()
        console_instance: Optional Rich Console

    Returns:
        None (creates files as side effect)

    Raises:
        None (exceptions are caught and logged; processing continues
        with the next workbook)
    """
```

### Internal Functions (10)

#### _prepare_template_variables()
```python
def _prepare_template_variables():
    """Extract variables for template string substitution.

    Variables:
    - extract_date_time: Full ISO datetime (UTC→Paris TZ)
    - extract_year: Year
    - extract_month: Month (2-digit)
    - extract_day: Day (2-digit)

    Returns:
        dict: Variables for .format(**template_vars)
    """
```

#### _apply_filter(item, filter_condition)
```python
def _apply_filter(item, filter_condition):
    """Apply AND filter to item.

    Returns True only if ALL conditions match.
    Supports nested field paths (dot notation).

    Args:
        item: Dict to filter
        filter_condition: Dict of field:value conditions

    Returns:
        bool: True if matches, False otherwise
    """
```

#### _apply_sort(items, sort_keys)
```python
def _apply_sort(items, sort_keys):
    """Multi-key sort with datetime parsing and natural alphanumeric support.

    Handles:
    - String fields (case-insensitive comparison)
    - Numeric and datetime fields
    - Natural alphanumeric sorting (*natsort option)
    - Missing fields (placed at end)
    - Mixed ascending and descending order

    Args:
        items: List of dicts to sort
        sort_keys: List of [field, order] or [field, order, option]
                   where option can be:
                   - datetime format string (e.g., "%Y-%m-%d")
                   - "*natsort" for natural alphanumeric sorting

    Returns:
        list: Sorted items
    """
```

#### _apply_value_replacement(value, replacements)
```python
def _apply_value_replacement(value, replacements):
    """Transform value using first-matching rule.

    Strict type matching. Returns original if no match.

    Args:
        value: Original value
        replacements: List of replacement rules

    Returns:
        Replaced value or original
    """
```

#### _handle_output_exists(output_path, action)
```python
def _handle_output_exists(output_path, action):
    """Handle file conflicts: Overwrite/Increment/Backup.

    Overwrite: Returns same path (existing file will be overwritten)
    Increment: Returns path with _1, _2, etc. suffix
    Backup: Renames existing to _backup_1, etc.; returns original path

    Args:
        output_path: Target file path
        action: "Overwrite" | "Increment" | "Backup"

    Returns:
        str: Actual path to use
    """
```

#### _get_named_range_dimensions(workbook, range_name)
```python
def _get_named_range_dimensions(workbook, range_name):
    """Extract position and dimensions from named range.

    Uses openpyxl defined names (workbook.defined_names) to find the range definition.

    Args:
        workbook: openpyxl Workbook object
        range_name: Name of the named range

    Returns:
        (sheet_name, start_cell, height, width)

    Raises:
        ValueError if range not found
    """
```
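
The dimension arithmetic behind the returned tuple can be illustrated without openpyxl. The sketch below assumes the range target is available as a reference string of the form `Sheet!$A$2:$D$2`, which is how Excel stores a defined name's target; the helper names `column_number` and `range_dimensions` are hypothetical, and how the real function obtains the reference from the workbook is not shown.

```python
import re

# Optional sheet name (possibly quoted), then a cell or a cell range with optional $ anchors
_REF = re.compile(
    r"^(?:'?(?P<sheet>[^'!]+)'?!)?"
    r"\$?(?P<c1>[A-Z]+)\$?(?P<r1>\d+)"
    r"(?::\$?(?P<c2>[A-Z]+)\$?(?P<r2>\d+))?$"
)


def column_number(letters):
    """Convert column letters to a 1-based index: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in letters:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def range_dimensions(reference):
    """Return (sheet_name, start_cell, height, width) for a reference string."""
    m = _REF.match(reference)
    if m is None:
        raise ValueError(f"Unsupported range reference: {reference!r}")
    c1, r1 = m["c1"], int(m["r1"])
    c2 = m["c2"] or c1                      # single cell: end equals start
    r2 = int(m["r2"]) if m["r2"] else r1
    height = r2 - r1 + 1
    width = column_number(c2) - column_number(c1) + 1
    return m["sheet"], f"{c1}{r1}", height, width
```

With these values, the validation rule "height=1 for tables" becomes a direct comparison on the third element of the tuple.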

#### _process_sheet_xlwings(workbook_xw, sheet_config, inclusions_data, organizations_data, ...)
```python
def _process_sheet_xlwings(workbook_xw, sheet_config, inclusions_data,
                           organizations_data, inclusions_mapping_config,
                           organizations_mapping_config, template_vars):
    """Fill single sheet using xlwings (native Excel COM API).

    Routes based on source_type:
    - Variable: Evaluate template string, write to cell
    - Inclusions/Organizations: Filter, sort, fill table (bulk operation)

    Automatic formula recalculation occurs via xlwings COM API.

    Args:
        workbook_xw: xlwings Book object (open)
        sheet_config: Single sheet configuration dict
        inclusions_data, organizations_data: Source data
        inclusions_mapping_config, organizations_mapping_config: Field mappings
        template_vars: Variables for template strings

    Returns:
        bool: Success status
    """
```

#### set_dependencies(console_obj, inclusions_file, organizations_file)
```python
def set_dependencies(console_instance, inclusions_filename,
                    organizations_filename, ...):
    """Inject module-level variables (dependency injection).

    Called from main dashboard to provide:
    - console: Rich Console instance
    - File names and configuration

    Args:
        console_instance: Rich Console object
        ... (other global references)

    Returns:
        None
    """
```

---

## Filter, Sort & Replacement Logic

### AND Filter Logic

Conditions combined with AND (all must be true):

```python
filter_condition = {"status": "active", "type": "inclusion"}
# Matches: {"status": "active", "type": "inclusion", "date": "2025-01-15"}
# Does NOT match: {"status": "active", "type": "follow-up"}  (type different)
```

**Nested Field Support:**
```python
filter_condition = {"patient.status": "active"}
# Matches: {"patient": {"status": "active"}}
```

### Multi-Key Sort Logic

Sort keys applied in order (first is primary):

```python
sort_keys = [
    ["status", "asc"],                    # Primary sort
    ["date_visit", "desc"],               # Secondary sort
    ["patient_id", "asc", "*natsort"]     # Tertiary sort with natural alphanumeric
]
```

**String Comparison:**
- **Case-insensitive by default:** `"Centre"` comes before `"CHU"` (natural alphabetical order)
- Tiebreaker: Case-sensitive if lowercase versions are equal

**Datetime Handling:**
- Provide strptime format as third parameter: `["date_field", "desc", "%Y-%m-%d"]`
- Custom formats supported: `"%d/%m/%Y"`, `"%Y-%m-%d %H:%M:%S"`, etc.

**Natural Alphanumeric Sorting:**
- Use `"*natsort"` as third parameter for proper numeric segment handling
- Correctly sorts: `"ENDOBEST-003-3-BA"` < `"ENDOBEST-003-20-BA"` < `"ENDOBEST-003-100-BA"`
- Also handles: `"v1.2"` < `"v1.10"`, `"file2.txt"` < `"file10.txt"`
- Perfect for patient IDs, version codes, sequential identifiers
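
Natural ordering can be illustrated with a small standard-library sort key. This is a sketch of the general technique, not necessarily how the module implements `*natsort`; the name `natural_key` is hypothetical.

```python
import re


def natural_key(text):
    """Split text into alternating text and digit runs so numbers compare numerically."""
    parts = re.split(r"(\d+)", text)
    # With a capturing group, odd positions are always the digit runs
    return [int(p) if i % 2 else p.casefold() for i, p in enumerate(parts)]


ids = ["ENDOBEST-003-100-BA", "ENDOBEST-003-3-BA", "ENDOBEST-003-20-BA"]
ordered = sorted(ids, key=natural_key)
```

Because text and number segments always alternate in the key, two keys never compare a string against an integer at the same position.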

**Missing Values:**
- Items with missing/null/undefined field values placed at end

### Value Replacement Rules

First-matching rule wins; strict type matching:

```python
replacements = [
    {"type": "bool", "true": "Yes", "false": "No"},
    {"type": "str", "from": "active", "to": "Active"},
]

# True (boolean) → "Yes"
# "active" (string) → "Active"
# "true" (string) → "true" (no match, unchanged)
```

---

## Template Variables

### Available Variables

Template variables available in `output_filename` and Variable cell content:

| Variable | Type | Example | Notes |
|----------|------|---------|-------|
| `extract_date_time` | ISO datetime | `2025-01-15T14:30:45+01:00` | Full timestamp (UTC→Paris TZ) |
| `extract_year` | Year | `2025` | 4-digit year |
| `extract_month` | Month | `01` | 2-digit month |
| `extract_day` | Day | `15` | 2-digit day |
| `workbook_name` | Text | `"Endobest_Output"` | From config |

### Usage Examples

**Filename Template:**
```
{workbook_name}_{extract_date_time}.xlsx
→ Endobest_Output_2025-01-15T14-30-45.xlsx
```

**Variable Cell Template:**
```
Extracted: {extract_date_time}
→ Extracted: 2025-01-15T14:30:45+01:00
```
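
How these variables feed `str.format` can be sketched as follows. The function name `build_template_variables` is hypothetical, the datetime is assumed to be already converted to Paris time by the caller, and a fixed `+01:00` offset stands in for the real time zone. Any filename sanitization (such as replacing colons) is not shown.

```python
from datetime import datetime, timedelta, timezone


def build_template_variables(moment, workbook_name):
    """Build the substitution dict from a timezone-aware datetime."""
    return {
        "extract_date_time": moment.isoformat(timespec="seconds"),
        "extract_year": f"{moment.year:04d}",
        "extract_month": f"{moment.month:02d}",
        "extract_day": f"{moment.day:02d}",
        "workbook_name": workbook_name,
    }


# Assumption: fixed offset used here instead of a real Europe/Paris zone lookup
paris_winter = timezone(timedelta(hours=1))
moment = datetime(2025, 1, 15, 14, 30, 45, tzinfo=paris_winter)
variables = build_template_variables(moment, "Endobest_Output")

cell_text = "Extracted: {extract_date_time}".format(**variables)
```

A template that references an unknown variable raises `KeyError` in `str.format`, so misspelled placeholders in the configuration surface immediately.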

---

## File Conflict Handling

### Three Strategies

#### 1. Overwrite
- Deletes existing file
- Writes new file with same name

```
output_path: report.xlsx
result: report.xlsx (new)
```

#### 2. Increment
- Finds next available number
- Appends _1, _2, etc. to filename

```
existing: report.xlsx, report_1.xlsx, report_2.xlsx
output_path: report.xlsx
result: report_3.xlsx
```

#### 3. Backup
- Renames existing to _backup_N
- Writes new file with original name

```
existing: report.xlsx
output_path: report.xlsx
result:
  - report_backup_1.xlsx (renamed)
  - report.xlsx (new)
```
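
The three strategies can be sketched with `pathlib`. This illustrates the naming rules shown above and is not the module's code: `resolve_output_path` and `_first_free` are hypothetical names, and the sketch returns a `Path` where the documented helper returns a string.

```python
from pathlib import Path


def _first_free(path, infix):
    """Return the first non-existing sibling named <stem><infix><n><suffix>, n = 1, 2, ..."""
    n = 1
    while True:
        candidate = path.with_name(f"{path.stem}{infix}{n}{path.suffix}")
        if not candidate.exists():
            return candidate
        n += 1


def resolve_output_path(output_path, action):
    """Return the path to write to, renaming the existing file when action is 'Backup'."""
    path = Path(output_path)
    if not path.exists() or action == "Overwrite":
        return path  # nothing to resolve, or the caller will overwrite in place
    if action == "Increment":
        return _first_free(path, "_")
    if action == "Backup":
        path.rename(_first_free(path, "_backup_"))  # move the old file out of the way
        return path
    raise ValueError(f"Unknown output_exists_action: {action!r}")
```

Only `Backup` touches the filesystem; `Overwrite` and `Increment` just compute a path, which keeps the conflict decision separate from the actual template copy.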

---

## Integration with Main Dashboard

### Integration Points

1. **Startup Validation (before collection):**
   ```python
   EXCEL_EXPORT_CONFIG, error = load_excel_export_config(console)
   if error:
       # Ask user confirmation
       EXCEL_EXPORT_ENABLED = False
   ```

2. **After JSON Export (after collection):**
   ```python
   if EXCEL_EXPORT_ENABLED:
       inclusions = load_json_file(INCLUSIONS_FILE_NAME)
       organizations = load_json_file(ORGANIZATIONS_FILE_NAME)
       export_to_excel(inclusions, organizations, EXCEL_EXPORT_CONFIG, console)
   ```

3. **--excel-only Mode:**
   ```python
   if "--excel-only" in sys.argv:
       inclusions = load_json_file(INCLUSIONS_FILE_NAME)
       organizations = load_json_file(ORGANIZATIONS_FILE_NAME)
       export_to_excel(inclusions, organizations, EXCEL_EXPORT_CONFIG, console)
   ```

### Global Variables

Added to `eb_dashboard.py`:

```python
EXCEL_EXPORT_CONFIG = None          # Loaded config
EXCEL_EXPORT_ENABLED = False        # Flag to enable/disable export

# Constants
EXCEL_WORKBOOKS_TABLE_NAME = "Excel_Workbooks"
EXCEL_SHEETS_TABLE_NAME = "Excel_Sheets"
```

---

## Error Handling & Validation

### Validation Stages

#### Stage 1: Config Loading (Startup)
- File exists and valid Excel format
- Required columns present
- JSON parsing succeeds
- Returns error status

#### Stage 2: Config Validation (Startup)
- Templates exist in `config/` folder
- Templates valid `.xlsx` files
- Named ranges exist
- Dimensions correct
- Returns critical error status

#### Stage 3: User Confirmation (Startup)
- If critical errors found:
  - Display error messages
  - Ask user to continue or abort
  - Set EXCEL_EXPORT_ENABLED flag

#### Stage 4: Runtime Error Handling
- Try/except wraps main export
- Logs detailed errors
- Continues with next workbook
- Displays summary

### Error Messages

**Critical Config Error:**
```
⚠ CRITICAL CONFIGURATION ERROR(S) DETECTED
────────────────────────────────────
Error 1: Template file missing: config/templates/Missing.xlsx
Error 2: Named range not found: MyRange in sheet MySheet
...
Do you want to continue anyway? [y/N]:
```

**Runtime Error:**
```
✗ Excel export failed: [Specific error message]
(See dashboard.log for full traceback)
```

---

## Configuration Examples

### Example 1: Simple Inclusion List

**Excel_Workbooks:**
| workbook_name | template_path | output_filename | output_exists_action |
|---|---|---|---|
| Inclusions_Report | templates/Simple.xlsx | Inclusions_{extract_date_time}.xlsx | Increment |

**Excel_Sheets:**
| workbook_name | sheet_name | source_type | target | column_mapping | filter_condition | sort_keys | value_replacement |
|---|---|---|---|---|---|---|---|
| Inclusions_Report | Data | Inclusions | DataTable | {"col_id": "patient_id", "col_name": "name"} | {"status": "active"} | [["date_inclusion", "asc"]] | null |

### Example 2: Multi-Sheet with Variables

**Excel_Sheets (multiple rows):**
| workbook_name | sheet_name | source_type | target | ... |
|---|---|---|---|---|
| Report | Title | Variable | TitleCell | ... |
| Report | Inclusions | Inclusions | InclusionTable | ... |
| Report | Organizations | Organizations | OrgTable | ... |

### Example 3: Value Replacement

**Excel_Sheets:**
```
value_replacement: [
    {
        "type": "bool",
        "true": "Yes",
        "false": "No"
    },
    {
        "type": "str",
        "from": "active",
        "to": "Active Status"
    }
]
```

---

## Troubleshooting & Debugging

### Common Issues

#### "Template file missing"
**Cause:** Template path incorrect or file not in `config/` folder
**Solution:** Verify file exists at `config/{template_path}`

#### "Named range not found"
**Cause:** Range name in config doesn't exist in template
**Solution:** Check the range name in Excel (Formulas → Defined Names → Name Manager)

#### "Dimensions mismatch"
**Cause:** Column count in mapping exceeds named range width
**Solution:** Verify named range dimensions and column mapping count match

#### "Formulas not recalculating"
**Cause:** xlwings not installed or Excel not available on system
**Solution:** Ensure xlwings is installed (`pip install xlwings`) and Excel is available. Formulas are automatically recalculated by xlwings via COM API.

### Debug Mode

```bash
python eb_dashboard.py --debug
```

Enables verbose logging with detailed Excel export operations.

### Log File

Check `dashboard.log` for:
- Configuration load/validation results
- Each workbook processing
- Filter/sort/replace operations
- File creation details
- Error details and tracebacks

---

## Notes for Developers

### Adding New Features

1. **New Transformation Step:** Add function to `eb_dashboard_excel_export.py`, call from `_process_sheet_xlwings()`
2. **New Source Type:** Add case to `_process_sheet_xlwings()` router (update SOURCE_TYPES in constants)
3. **New Template Variable:** Add to `_prepare_template_variables()`
4. **Update Constants:** Add new values to `eb_dashboard_constants.py` (single source of truth)

### Testing

- Unit tests: `test_core_logic.py` (26 tests, 100% pass)
- No external dependencies needed (pure function testing)
- Integration tests: Use `--excel-only` mode with real data

### Performance Considerations

- **Data Filtering:** O(n) per filter rule
- **Sorting:** O(n log n)
- **Excel Fill:** O(n) for cells, time depends on file size
- **Typical Duration:** 1-5 seconds per workbook (depends on data volume and template complexity)

---

**End of Excel Export Architecture Documentation**