33 KiB
Endobest Excel Export Feature & Architecture
Part 4: Configuration-Driven Excel Workbook Generation
Document Version: 1.1 Last Updated: 2025-11-11 Audience: Developers, Business Analysts, System Architects Language: English
Table of Contents
- Overview
- Architecture & Design
- Core Components
- High-Level Orchestration Functions (v1.1+)
- Configuration System
- Data Flow & Processing Pipeline
- Excel Export Functions
- Filter, Sort & Replacement Logic
- Template Variables
- File Conflict Handling
- Integration with Main Dashboard
- Error Handling & Validation
- Configuration Examples
- Troubleshooting & Debugging
Overview
The Excel Export Feature enables generation of configurable Excel workbooks from patient inclusion data and organization statistics. The system is entirely configuration-driven, allowing non-technical users to define export behavior through Excel configuration tables without code modifications.
Key Characteristics
Configuration-Driven Design:
- All export behavior defined in
Endobest_Dashboard_Config.xlsx - Two tables:
Excel_Workbooks(metadata) andExcel_Sheets(sheet definitions) - No code changes needed to modify export behavior
Modular Architecture:
- New module:
eb_dashboard_excel_export.py - Separation of concerns: Excel logic isolated from main dashboard
- Dependency injection for testing and flexibility
Data Transformation:
- Filter: AND conditions with nested field support
- Sort: Multi-key sorting with case-insensitive strings, datetime parsing, natural alphanumeric sorting (
*natsort) - Replace: Strict type matching with first-match-wins logic
- Fill: Direct cell or named range targeting
Note: For complete configuration details and up-to-date column specifications, refer to
DOCUMENTATION_99_CONFIG_GUIDE.md
Three Operating Modes:
- Normal: Full collection → Quality checks → JSON export → Excel export
- --excel-only: Load existing JSON → Excel export (fast iteration)
- --check-only: Quality checks only (unchanged, for backward compatibility)
Architecture & Design
Module Structure
eb_dashboard_excel_export.py
├── Imports & Dependencies
├── Constants & Configuration
│ └── EXCEL_RECALC_TIMEOUT = 60
├── Module-Level Variables (Injected)
│ ├── console (Rich Console instance)
│ ├── DASHBOARD_CONFIG_FILE_NAME
│ └── Other global references
│
├── Public API (Called from main)
│ ├── load_excel_export_config(console)
│ ├── validate_excel_config(excel_config, console, inclusions_mapping, organizations_mapping)
│ └── export_to_excel(inclusions_data, organizations_data, excel_config, console)
│
└── Internal Functions (Helpers)
├── _prepare_template_variables()
├── _apply_filter(item, filter_condition)
├── _apply_sort(items, sort_keys)
├── _apply_value_replacement(value, replacements)
├── _handle_output_exists(output_path, action)
├── _get_named_range_dimensions(workbook, range_name) [openpyxl - validation phase]
├── _get_table_dimensions_xlwings(workbook_xw, range_name) [xlwings - data processing]
├── _recalculate_workbook(workbook_path)
├── _process_sheet_xlwings(workbook_xw, sheet_config, ...) [xlwings - data fill]
└── set_dependencies(...)
Design Principles
- Configuration-First: Behavior determined by config, not code
- Pure Functions: Helper functions are pure (no side effects) except I/O
- xlwings-First Architecture: Data processing uses xlwings exclusively (native Excel COM API)
- Configuration validation uses openpyxl (read-only, lighter footprint)
- Data fill & processing uses xlwings (preserves workbook structure, formulas, images)
- Automatic formula recalculation via xlwings COM API during cell updates
- No redundant file reloads - metadata read via COM API without reloading
- Early Validation: Config errors detected at startup, before data collection
Core Components
1. Configuration Loading
Function: load_excel_export_config(console)
Loads Excel export configuration from the Endobest_Dashboard_Config.xlsx file.
Responsibilities:
- Read
Excel_Workbookstable - Read
Excel_Sheetstable - Parse JSON fields (filter_condition, sort_keys, value_replacement)
- Validate structure and presence of required columns
- Return parsed config and error status
Return Value:
(config_dict, has_error: bool)
Config Structure:
{
"workbooks": [
{
"workbook_name": str,
"template_path": str,
"output_filename": str,
"output_exists_action": "Overwrite" | "Increment" | "Backup"
},
...
],
"sheets": [
{
"workbook_name": str,
"sheet_name": str,
"source_type": "Variable" | "Inclusions" | "Organizations",
"target": str,
"column_mapping": dict | None,
"filter_condition": dict | None,
"sort_keys": list | None,
"value_replacement": list | None
},
...
]
}
2. Configuration Validation
Function: validate_excel_config(excel_config, console, inclusions_mapping, organizations_mapping)
Validates that all referenced templates exist and have correct structure.
Validations Performed:
- Template files exist in
config/directory - Template files are valid Excel (
.xlsx) - Named ranges exist in templates
- Named range dimensions correct (height=1 for tables, width≥max index)
- Column mappings reference valid fields
- Source types are valid
Return Value:
(has_critical_error: bool, error_messages: list)
3. Excel Export Orchestration
Function: export_to_excel(inclusions_data, organizations_data, excel_config, console)
Main orchestration function for Excel export.
Workflow:
- Prepare template variables (timestamp, extract_date_time, etc.)
- For each workbook in config:
- Resolve output filename using template variables
- Handle file conflicts (Overwrite/Increment/Backup)
- Copy template to output location
- XLWINGS PHASE (native Excel COM API):
- Load workbook with xlwings
- For each sheet config:
- Apply filters, sorts, replacements
- Read metadata via xlwings COM API (no file reloads)
- Fill cells/named ranges with data
- Formulas automatically recalculated by Excel COM API
- Save workbook
- Log summary and completion
Architecture Change (v1.2+):
- Migration from openpyxl to xlwings eliminated need for separate win32com recalculation phase
- xlwings uses native Excel COM API, which automatically recalculates formulas during cell updates
- Simplified workflow: one Excel session, no hand-off between libraries
High-Level Orchestration Functions (v1.1+)
New in v1.1: Three high-level orchestration functions were added to completely externalize Excel export orchestration from the main script. These functions follow the established pattern from the quality_checks module.
1. export_excel_only(sys_argv, console_instance, inclusions_filename, organizations_filename, inclusions_mapping_config, organizations_mapping_config)
Purpose: Complete orchestration of --excel-only CLI mode
Workflow:
- Initialize console and set default filenames
- Call
prepare_excel_export()to load and validate - Handle critical configuration errors with user confirmation
- Call
execute_excel_export()to perform export - Display results and return
Usage in Main Script:
if excel_only_mode:
export_excel_only(sys.argv, console, INCLUSIONS_FILE_NAME, ORGANIZATIONS_FILE_NAME,
INCLUSIONS_MAPPING_CONFIG, {})
return
Impact: Reduces main script from 34 lines to 4 lines (87% reduction)
2. run_normal_mode_export(inclusions_data, organizations_data, excel_enabled, excel_config, console_instance, inclusions_mapping_config, organizations_mapping_config)
Purpose: Orchestrates Excel export phase during normal workflow
Workflow:
- Check if export enabled (returns early if not)
- Load JSONs from filesystem (ensures consistency)
- Call
execute_excel_export()to perform export - Display results and return status tuple
Returns: (success: bool, error_message: str)
Usage in Main Script:
# After JSONs are written to disk
run_normal_mode_export(output_inclusions, organizations_list, EXCEL_EXPORT_ENABLED,
EXCEL_EXPORT_CONFIG, console, INCLUSIONS_MAPPING_CONFIG, {})
Impact: Reduces main script from 19 lines to 2 lines (89% reduction)
3. prepare_excel_export(inclusions_filename, organizations_filename, console_instance, inclusions_mapping_config, organizations_mapping_config)
Purpose: Centralized preparation function - loads JSONs, config, and validates
Responsibility:
- Load inclusions JSON from filesystem
- Load organizations JSON from filesystem
- Load Excel export configuration
- Validate configuration against templates
- Aggregate and return all errors
Returns: (prep_success: bool, inclusions_data, organizations_data, excel_config, has_critical_errors: bool, error_messages: list)
Used By: Both export_excel_only() and potentially run_normal_mode_export()
4. execute_excel_export(inclusions_data, organizations_data, excel_config, console_instance, inclusions_mapping_config, organizations_mapping_config)
Purpose: Execute Excel export with comprehensive error handling
Responsibility:
- Call core
export_to_excel()function - Catch and log all exceptions
- Return success/failure status to caller
Returns: (success: bool, error_message: str)
Error Handling: All exceptions caught and returned as error messages (never raises)
5. _load_json_file_internal(filename)
Purpose: Internal helper for safe JSON file loading
Responsibility:
- Check file existence
- Load and parse JSON
- Handle errors gracefully
- Return None on failure (instead of raising)
Used By: run_normal_mode_export() internally
Design Pattern: Consistency with Quality Checks
The orchestration functions follow the exact pattern established by run_check_only_mode() from the quality_checks module:
| Aspect | Quality Checks | Excel Export |
|---|---|---|
| Standalone mode orchestration | run_check_only_mode() |
export_excel_only() |
| Config loading in module | ✅ Yes | ✅ Yes |
| User confirmation in module | ✅ Yes | ✅ Yes |
| Error handling in module | ✅ Yes | ✅ Yes |
| Main script integration | 1 line call | 1 line call |
Result: Consistent architecture across all major features (quality checks, excel export, etc.)
Configuration System
Two-Table Configuration
The Excel export is configured through two tables in Endobest_Dashboard_Config.xlsx:
Table 1: Excel_Workbooks
Defines metadata for each Excel workbook to generate.
| Column | Type | Required | Example | Description |
|---|---|---|---|---|
| workbook_name | Text | Yes | "Endobest_Output" | Unique identifier for workbook |
| template_path | Text | Yes | "templates/Endobest_Template.xlsx" | Path relative to config/ folder |
| output_filename | Text | Yes | "{workbook_name}_{extract_date_time}.xlsx" | Template for output filename |
| output_exists_action | Text | Yes | "Increment" | How to handle conflicts (Overwrite/Increment/Backup) |
Table 2: Excel_Sheets
Defines how to fill each sheet in the workbooks.
| Column | Type | Required | Example | Description |
|---|---|---|---|---|
| workbook_name | Text | Yes | "Endobest_Output" | Must match Excel_Workbooks entry |
| sheet_name | Text | Yes | "Inclusions" | Sheet name in template |
| source_type | Text | Yes | "Inclusions" | Variable / Inclusions / Organizations |
| target | Text | Yes | "DataTable" | Named range or cell reference |
| column_mapping | JSON | Conditional | {"col_id": "patient_id"} |
For source_type=Inclusions/Organizations only |
| filter_condition | JSON | No | {"status": "active"} |
AND conditions for filtering |
| sort_keys | JSON | No | [["date", "asc"], ["id", "asc", "*natsort"]] |
Sort specification with optional datetime/natsort |
| value_replacement | JSON | No | [{"type": "bool", "true": "Yes", "false": "No"}] |
Value transformations |
Data Flow & Processing Pipeline
Overview
Input Data (inclusions + organizations)
↓
Filter (AND conditions)
↓
Sort (multi-key with datetime)
↓
Value Replacement (strict typing)
↓
Fill Excel Cells/Ranges (via xlwings)
↓
Save Workbook (xlwings)
↓
Formulas Automatically Recalculated (xlwings COM API)
↓
Final Excel File
Detailed Processing Steps
Step 1: Filter
Applies AND conditions to select matching items.
Logic:
- Start with all items
- For each field in filter_condition:
- Keep only items where field value equals expected value
- Support nested field paths (dot notation:
patient.status)
- Return filtered items
Example:
{
"status": "active",
"visit_type": "inclusion"
}
Keeps only items where BOTH conditions are true.
Step 2: Sort
Multi-key sort with datetime awareness and missing field handling.
Logic:
- Apply sort keys in order (first key is primary, second is secondary, etc.)
- Detect datetime fields automatically (ISO format: YYYY-MM-DD)
- Items with missing fields go to end of sort
- Reverse order for
"desc"order specification
Example:
[
{"field": "visit_type", "order": "asc"},
{"field": "date_visit", "order": "desc"}
]
Step 3: Value Replacement
Transform cell values based on rules (first-match-wins).
Logic:
- Evaluate rules in order
- Stop at first matching rule
- Strict type matching (e.g., boolean
True≠ string"true") - Return original value if no match
Supported Types:
"bool": Boolean replacement with"true"and"false"fields"str": String replacement with"from"and"to"fields"int": Integer replacement with"from"and"to"fields
Step 4: Fill Excel
Place transformed data into Excel cells or named ranges.
Two Modes:
- Variable (Single Cell): Write evaluated template string to target cell
- Table (Named Range): Write filtered/sorted/replaced items to target range
4.1 Variable Mode (Template String Substitution)
For source_type = "Variable":
- Evaluate the source template string using
.format(**template_vars) - Write result to the target named cell
- Example:
{extract_date_time_french}→"2025-01-15 14:30:45+01:00"
4.2 Table Mode (Data Fill with Column Mapping)
For source_type = "Inclusions" or "Organizations":
Key Concept: The first row of the table target serves as BOTH TEMPLATE and FIRST DATA ROW. Some columns may contain formulas that should NOT be overwritten (unmapped columns).
Algorithm:
-
Extract Column Mapping
- Load mapping from Inclusions_Mapping or Organizations_Mapping table
- Mapping column name comes from Excel_Sheets.source parameter
- Mapping contains indices (0, 1, 2...) indicating Excel column positions
- Example:
Inclusions_Mapping: | field_name | field_group | MainReport_PatientsList | | Patient_Id | Patient_Identification | 0 | | Status | Inclusion | 1 | | Date | Inclusion | 3 | (Column 2 not mapped - preserves template formula!) - Result:
{0: "Patient_Identification.Patient_Id", 1: "Inclusion.Status", 3: "Inclusion.Date"}
-
Filter and Sort Data
- Apply AND filter conditions
- Apply multi-key sort with datetime parsing
- Example: 5 items match filter, sorted by Patient_Id ascending
-
Extend Table Rows
- Delete any existing data rows below the template row
- Keep the first row (template + first data)
- For each filtered/sorted item: a. Create new row (or use template row for first item) b. Copy ALL cells from template row (preserves formulas!) c. Overwrite ONLY mapped columns with JSON data d. Apply value_replacement to mapped values
-
Preserve Formulas in Unmapped Columns
- Unmapped columns (those without index in mapping) keep template values
- If template column contains formula, it's preserved and recalculates later
- Allows mixed rows: some columns from JSON, some from formulas
Example:
Template Row (Row 1):
| A: P001 | B: Active | C: =SUM(...) | D: 2025-01 |
| (mapped 0) | (mapped 1) | (formula!) | (mapped 3) |
After processing (3 data items):
| A: P001 | B: Active | C: 45 | D: 2025-01 | ← Template + first data
| A: P002 | B: Active | C: 67 | D: 2025-02 | ← Data 2 (formula copied)
| A: P003 | B: Active | C: 89 | D: 2025-03 | ← Data 3 (formula copied)
Result:
- Columns A, B, D filled with JSON data and value replacement
- Column C: Formula
=SUM(...)copied to all rows, will recalculate - All rows have consistent formatting from template
Excel Export Functions
Public Functions (3)
load_excel_export_config(console=None)
def load_excel_export_config(console_instance=None):
"""Load Excel export configuration from config file.
Reads Excel_Workbooks and Excel_Sheets tables from
Endobest_Dashboard_Config.xlsx, parses JSON fields.
Args:
console_instance: Optional Rich Console for messages
Returns:
(config_dict, has_error: bool)
Raises:
None (returns error status instead)
"""
validate_excel_config(excel_config, console, inclusions_mapping, organizations_mapping)
def validate_excel_config(excel_config, console_instance,
inclusions_mapping_config,
organizations_mapping_config):
"""Validate Excel configuration against templates.
Checks that:
- Template files exist and are valid
- Named ranges exist in templates
- Dimensions are correct
- Mappings reference valid fields
Args:
excel_config: Config dict from load_excel_export_config()
console_instance: Rich Console instance
inclusions_mapping_config: List of valid inclusions fields
organizations_mapping_config: Dict of valid organizations fields
Returns:
(has_critical_error: bool, error_messages: list)
"""
export_to_excel(inclusions_data, organizations_data, excel_config, console=None)
def export_to_excel(inclusions_data, organizations_data, excel_config,
console_instance=None):
"""Main orchestration: Generate Excel files from data and config.
xlwings-based processing with automatic formula recalculation:
- Load template via xlwings
- Apply data transformations (filter, sort, replace)
- Fill cells/ranges with data
- Save workbook (formulas auto-recalculated by Excel COM API)
Args:
inclusions_data: List of inclusion dicts
organizations_data: List of organization dicts
excel_config: Config dict from load_excel_export_config()
console_instance: Optional Rich Console
Returns:
None (creates files as side effect)
Raises:
Catches and logs exceptions, continues with next workbook
"""
Internal Functions (10)
_prepare_template_variables()
def _prepare_template_variables():
"""Extract variables for template string substitution.
Variables:
- extract_date_time: Full ISO datetime (UTC→Paris TZ)
- extract_year: Year
- extract_month: Month (2-digit)
- extract_day: Day (2-digit)
Returns:
dict: Variables for .format(**locals())
"""
_apply_filter(item, filter_condition)
def _apply_filter(item, filter_condition):
"""Apply AND filter to item.
Returns True only if ALL conditions match.
Supports nested field paths (dot notation).
Args:
item: Dict to filter
filter_condition: Dict of field:value conditions
Returns:
bool: True if matches, False otherwise
"""
_apply_sort(items, sort_keys)
def _apply_sort(items, sort_keys):
"""Multi-key sort with datetime parsing and natural alphanumeric support.
Handles:
- String fields (case-insensitive comparison)
- Numeric and datetime fields
- Natural alphanumeric sorting (*natsort option)
- Missing fields (placed at end)
- Mixed ascending and descending order
Args:
items: List of dicts to sort
sort_keys: List of [field, order] or [field, order, option]
where option can be:
- datetime format string (e.g., "%Y-%m-%d")
- "*natsort" for natural alphanumeric sorting
Returns:
list: Sorted items
"""
_apply_value_replacement(value, replacements)
def _apply_value_replacement(value, replacements):
"""Transform value using first-matching rule.
Strict type matching. Returns original if no match.
Args:
value: Original value
replacements: List of replacement rules
Returns:
Replaced value or original
"""
_handle_output_exists(output_path, action)
def _handle_output_exists(output_path, action):
"""Handle file conflicts: Overwrite/Increment/Backup.
Overwrite: Returns same path (existing file will be overwritten)
Increment: Returns path with _1, _2, etc. suffix
Backup: Renames existing to _backup_1, etc.; returns original path
Args:
output_path: Target file path
action: "Overwrite" | "Increment" | "Backup"
Returns:
str: Actual path to use
"""
_get_named_range_dimensions(workbook, range_name)
def _get_named_range_dimensions(workbook, range_name):
"""Extract position and dimensions from named range.
Uses openpyxl named_ranges to find range definition.
Args:
workbook: openpyxl Workbook object
range_name: Name of the named range
Returns:
(sheet_name, start_cell, height, width)
Raises:
ValueError if range not found
"""
_process_sheet_xlwings(workbook_xw, sheet_config, inclusions_data, organizations_data, ...)
def _process_sheet_xlwings(workbook_xw, sheet_config, inclusions_data,
organizations_data, inclusions_mapping_config,
organizations_mapping_config, template_vars):
"""Fill single sheet using xlwings (native Excel COM API).
Routes based on source_type:
- Variable: Evaluate template string, write to cell
- Inclusions/Organizations: Filter, sort, fill table (bulk operation)
Automatic formula recalculation occurs via xlwings COM API.
Args:
workbook_xw: xlwings Book object (open)
sheet_config: Single sheet configuration dict
inclusions_data, organizations_data: Source data
inclusions_mapping_config, organizations_mapping_config: Field mappings
template_vars: Variables for template strings
Returns:
bool: Success status
"""
set_dependencies(console_obj, inclusions_file, organizations_file)
def set_dependencies(console_instance, inclusions_filename,
organizations_filename, ...):
"""Inject module-level variables (dependency injection).
Called from main dashboard to provide:
- console: Rich Console instance
- File names and configuration
Args:
console_instance: Rich Console object
... (other global references)
Returns:
None
"""
Filter, Sort & Replacement Logic
AND Filter Logic
Conditions combined with AND (all must be true):
filter_condition = {"status": "active", "type": "inclusion"}
# Matches: {"status": "active", "type": "inclusion", "date": "2025-01-15"}
# Does NOT match: {"status": "active", "type": "follow-up"} (type different)
Nested Field Support:
filter_condition = {"patient.status": "active"}
# Matches: {"patient": {"status": "active"}}
Multi-Key Sort Logic
Sort keys applied in order (first is primary):
sort_keys = [
["status", "asc"], # Primary sort
["date_visit", "desc"], # Secondary sort
["patient_id", "asc", "*natsort"] # Tertiary sort with natural alphanumeric
]
String Comparison:
- Case-insensitive by default:
"Centre"comes before"CHU"(natural alphabetical order) - Tiebreaker: Case-sensitive if lowercase versions are equal
Datetime Handling:
- Provide strptime format as third parameter:
["date_field", "desc", "%Y-%m-%d"] - Custom formats supported:
"%d/%m/%Y","%Y-%m-%d %H:%M:%S", etc.
Natural Alphanumeric Sorting:
- Use
"*natsort"as third parameter for proper numeric segment handling - Correctly sorts:
"ENDOBEST-003-3-BA"<"ENDOBEST-003-20-BA"<"ENDOBEST-003-100-BA" - Also handles:
"v1.2"<"v1.10","file2.txt"<"file10.txt" - Perfect for patient IDs, version codes, sequential identifiers
Missing Values:
- Items with missing/null/undefined field values placed at end
Value Replacement Rules
First-matching rule wins; strict type matching:
replacements = [
{"type": "bool", "true": "Yes", "false": "No"},
{"type": "str", "from": "active", "to": "Active"},
]
# True (boolean) → "Yes"
# "active" (string) → "Active"
# "true" (string) → "true" (no match, unchanged)
Template Variables
Available Variables
Template variables available in output_filename and Variable cell content:
| Variable | Type | Example | Notes |
|---|---|---|---|
extract_date_time |
ISO datetime | 2025-01-15T14:30:45+01:00 |
Full timestamp (UTC→Paris TZ) |
extract_year |
Year | 2025 |
4-digit year |
extract_month |
Month | 01 |
2-digit month |
extract_day |
Day | 15 |
2-digit day |
workbook_name |
Text | "Endobest_Output" |
From config |
Usage Examples
Filename Template:
{workbook_name}_{extract_date_time}.xlsx
→ Endobest_Output_2025-01-15T14-30-45.xlsx
Variable Cell Template:
Extracted: {extract_date_time}
→ Extracted: 2025-01-15T14:30:45+01:00
File Conflict Handling
Three Strategies
1. Overwrite
- Deletes existing file
- Writes new file with same name
output_path: report.xlsx
result: report.xlsx (new)
2. Increment
- Finds next available number
- Appends _1, _2, etc. to filename
existing: report.xlsx, report_1.xlsx, report_2.xlsx
output_path: report.xlsx
result: report_3.xlsx
3. Backup
- Renames existing to _backup_N
- Writes new file with original name
existing: report.xlsx
output_path: report.xlsx
result:
- report_backup_1.xlsx (renamed)
- report.xlsx (new)
Integration with Main Dashboard
Integration Points
-
Startup Validation (before collection):
EXCEL_EXPORT_CONFIG, error = load_excel_export_config(console) if error: # Ask user confirmation EXCEL_EXPORT_ENABLED = False -
After JSON Export (after collection):
if EXCEL_EXPORT_ENABLED: inclusions = load_json_file(INCLUSIONS_FILE_NAME) organizations = load_json_file(ORGANIZATIONS_FILE_NAME) export_to_excel(inclusions, organizations, EXCEL_EXPORT_CONFIG, console) -
--excel-only Mode:
if "--excel-only" in sys.argv: inclusions = load_json_file(INCLUSIONS_FILE_NAME) organizations = load_json_file(ORGANIZATIONS_FILE_NAME) export_to_excel(inclusions, organizations, EXCEL_EXPORT_CONFIG, console)
Global Variables
Added to eb_dashboard.py:
EXCEL_EXPORT_CONFIG = None # Loaded config
EXCEL_EXPORT_ENABLED = False # Flag to enable/disable export
# Constants
EXCEL_WORKBOOKS_TABLE_NAME = "Excel_Workbooks"
EXCEL_SHEETS_TABLE_NAME = "Excel_Sheets"
Error Handling & Validation
Validation Stages
Stage 1: Config Loading (Startup)
- File exists and valid Excel format
- Required columns present
- JSON parsing succeeds
- Returns error status
Stage 2: Config Validation (Startup)
- Templates exist in
config/folder - Templates valid
.xlsxfiles - Named ranges exist
- Dimensions correct
- Returns critical error status
Stage 3: User Confirmation (Startup)
- If critical errors found:
- Display error messages
- Ask user to continue or abort
- Set EXCEL_EXPORT_ENABLED flag
Stage 4: Runtime Error Handling
- Try/except wraps main export
- Logs detailed errors
- Continues with next workbook
- Displays summary
Error Messages
Critical Config Error:
⚠ CRITICAL CONFIGURATION ERROR(S) DETECTED
────────────────────────────────────
Error 1: Template file missing: config/templates/Missing.xlsx
Error 2: Named range not found: MyRange in sheet MySheet
...
Do you want to continue anyway? [y/N]:
Runtime Error:
✗ Excel export failed: [Specific error message]
(See dashboard.log for full traceback)
Configuration Examples
Example 1: Simple Inclusion List
Excel_Workbooks:
| workbook_name | template_path | output_filename | output_exists_action |
|---|---|---|---|
| Inclusions_Report | templates/Simple.xlsx | Inclusions_{extract_date_time}.xlsx | Increment |
Excel_Sheets:
| workbook_name | sheet_name | source_type | target | column_mapping | filter_condition | sort_keys | value_replacement |
|---|---|---|---|---|---|---|---|
| Inclusions_Report | Data | Inclusions | DataTable | {"col_id": "patient_id", "col_name": "name"} | {"status": "active"} | [{"field": "date_inclusion", "order": "asc"}] | null |
Example 2: Multi-Sheet with Variables
Excel_Sheets (multiple rows):
| workbook_name | sheet_name | source_type | target | ... |
|---|---|---|---|---|
| Report | Title | Variable | TitleCell | ... |
| Report | Inclusions | Inclusions | InclusionTable | ... |
| Report | Organizations | Organizations | OrgTable | ... |
Example 3: Value Replacement
Excel_Sheets:
value_replacement: [
{
"type": "bool",
"true": "Yes",
"false": "No"
},
{
"type": "str",
"from": "active",
"to": "Active Status"
}
]
Troubleshooting & Debugging
Common Issues
"Template file missing"
Cause: Template path incorrect or file not in config/ folder
Solution: Verify file exists at config/{template_path}
"Named range not found"
Cause: Range name in config doesn't exist in template Solution: Check range name in Excel (Formulas → Define Names → Name Manager)
"Dimensions mismatch"
Cause: Column count in mapping exceeds named range width Solution: Verify named range dimensions and column mapping count match
"Formulas not recalculating"
Cause: xlwings not installed or Excel not available on system
Solution: Ensure xlwings is installed (pip install xlwings) and Excel is available. Formulas are automatically recalculated by xlwings via COM API.
Debug Mode
python eb_dashboard.py --debug
Enables verbose logging with detailed Excel export operations.
Log File
Check dashboard.log for:
- Configuration load/validation results
- Each workbook processing
- Filter/sort/replace operations
- File creation details
- Error details and tracebacks
Notes for Developers
Adding New Features
- New Transformation Step: Add function to
eb_dashboard_excel_export.py, call from_process_sheet_xlwings() - New Source Type: Add case to
_process_sheet_xlwings()router (update SOURCE_TYPES in constants) - New Template Variable: Add to
_prepare_template_variables() - Update Constants: Add new values to
eb_dashboard_constants.py(single source of truth)
Testing
- Unit tests:
test_core_logic.py(26 tests, 100% pass) - No external dependencies needed (pure function testing)
- Integration tests: Use
--excel_onlymode with real data
Performance Considerations
- Data Filtering: O(n) per filter rule
- Sorting: O(n log n)
- Excel Fill: O(n) for cells, time depends on file size
- Typical Duration: 1-5 seconds per workbook (depends on data volume and template complexity)
End of Excel Export Architecture Documentation