Endobest Dashboard - Configuration Guide
Document Version: 1.0
Last Updated: 2025-11-08
Audience: System Administrators, Configuration Managers
Language: English
Configuration Overview
The Endobest Dashboard is configured entirely through Excel files - no code changes needed.
Main Configuration File
File Location: config/Endobest_Dashboard_Config.xlsx
Contains:
- Inclusions_Mapping - Field definitions for inclusion data
- Organizations_Mapping - Field definitions for organization data
- Excel_Workbooks - Metadata for Excel export
- Excel_Sheets - Sheet definitions and data transformation rules
- Regression_Check - Quality check rules
This guide focuses on Excel_Workbooks and Excel_Sheets tables (for Excel export configuration).
Table of Contents
- File Location & Structure
- Inclusions_Mapping (Reference)
- Organizations_Mapping (Reference)
- Excel_Workbooks Table
- Excel_Sheets Table
- Data Types & Formats
- JSON Field Specifications
- Naming Conventions
- Configuration Examples
- Validation & Error Messages
- Best Practices
- Troubleshooting
- Advanced Configuration
File Location & Structure
Directory Layout
Endobest Dashboard/
├── eb_dashboard.py (main script)
├── config/
│   ├── Endobest_Dashboard_Config.xlsx (← CONFIGURATION FILE)
│   ├── Endobest_Extended_Fields.xlsx (old, deprecated)
│   ├── eb_org_center_mapping.xlsx
│   └── templates/
│       ├── Endobest_Template.xlsx
│       ├── Statistics_Template.xlsx
│       └── (other templates)
├── endobest_inclusions.json (output)
├── endobest_organizations.json (output)
└── dashboard.log
Opening & Editing
- Open config/Endobest_Dashboard_Config.xlsx in Excel
- Go to the sheet tab you want to edit
- Edit rows as needed
- Save the file
- Run the script; changes take effect on the next run
Important: Do NOT change column order or delete required columns.
Inclusions_Mapping (Reference)
This table defines which patient fields to include in export.
Purpose
Specifies which inclusion data fields are available for use in:
- Excel export (column_mapping in Excel_Sheets)
- Quality checks
- Regression testing
Columns
| Column | Type | Example | Notes |
|---|---|---|---|
| Field_Selection | Action | [["include", "."]] | Pipeline of include/exclude actions |
| Field_Name | Text | patient_id | Internal name used in column_mapping |
Usage in Excel Export
The Field_Name values are used in column_mapping:
{
"col_patient_id": "patient_id",
"col_name": "patient_name",
"col_status": "inclusion_status"
}
Map Excel Column Name → Inclusion Field Name
Organizations_Mapping (Reference)
This table defines which organization fields to include in export.
Purpose
Specifies which organization data fields are available for use in:
- Excel export (column_mapping for Organizations source_type)
- Quality checks
Columns
| Column | Type | Example | Notes |
|---|---|---|---|
| Field_Name | Text | org_id | Internal name |
| org_id | Text | org.id | Data source path |
| org_name | Text | org.name | Organization name |
Usage in Excel Export
The Field_Name values are used in column_mapping:
{
"col_org_code": "org_id",
"col_org_name": "org_name"
}
Excel_Workbooks Table
Defines metadata for each Excel file to generate.
Purpose
Specifies WHAT Excel files to create, using which templates, with what naming.
Column Definitions
workbook_name (Required)
- Type: Text
- Length: 1-255 characters
- Example: Endobest_Output, Statistics_Report, Monthly_Summary
- Usage: Unique identifier referenced in Excel_Sheets table
- Rules: Must be unique within the table
- Notes: Used in template variables as {workbook_name}
template_path (Required)
- Type: Text (file path)
- Example: templates/Endobest_Template.xlsx
- Relative To: config/ folder
- Rules: Path is relative, not absolute
- Validation: Script checks file exists before export
- Notes: Template must be valid Excel (.xlsx) file
- Error if:
  - File doesn't exist
  - File is not .xlsx format
  - Path is absolute instead of relative
output_filename (Required)
- Type: Text (filename template)
- Example: {workbook_name}_{extract_date_time}.xlsx
- Available Variables:
  - {workbook_name} - From workbook_name column
  - {extract_date_time} - Full ISO datetime (2025-01-15T14:30:45+01:00)
  - {extract_year} - Year (2025)
  - {extract_month} - Month (01-12)
  - {extract_day} - Day (01-31)
- Processed As: Python format string, expanded with str.format() (not an f-string)
- Example Results:
  - Report_{extract_date_time}.xlsx → Report_2025-01-15T14-30-45.xlsx (colons are not valid in filenames, so the time appears with hyphens)
  - {workbook_name}_Month{extract_month}.xlsx → Endobest_Output_Month01.xlsx
- Rules:
  - Must include .xlsx extension
  - Must be a valid filename (no /, \, :, *, ?, ", <, >, |)
  - Variables are case-sensitive
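The expansion described above can be sketched in a few lines of Python. This is an illustration only, not the script's actual code: the helper name build_output_filename is hypothetical, and the filename-safe timestamp (hyphens instead of colons, no UTC offset) is an assumption inferred from the example result Report_2025-01-15T14-30-45.xlsx.

```python
from datetime import datetime, timedelta, timezone

# Characters the guide lists as invalid in filenames.
INVALID_CHARS = set('/\\:*?"<>|')


def build_output_filename(template: str, workbook_name: str, extracted_at: datetime) -> str:
    """Expand an output_filename template with str.format() (hypothetical helper)."""
    variables = {
        "workbook_name": workbook_name,
        # Assumption: filename-safe timestamp, matching the guide's example result.
        "extract_date_time": extracted_at.strftime("%Y-%m-%dT%H-%M-%S"),
        "extract_year": extracted_at.strftime("%Y"),
        "extract_month": extracted_at.strftime("%m"),
        "extract_day": extracted_at.strftime("%d"),
    }
    # Variable names are case-sensitive: an unknown name raises KeyError.
    filename = template.format(**variables)
    if not filename.endswith(".xlsx"):
        raise ValueError(f"output_filename must end with .xlsx: {filename!r}")
    bad = INVALID_CHARS.intersection(filename)
    if bad:
        raise ValueError(f"invalid characters {sorted(bad)} in {filename!r}")
    return filename


when = datetime(2025, 1, 15, 14, 30, 45, tzinfo=timezone(timedelta(hours=1)))
name_1 = build_output_filename("Report_{extract_date_time}.xlsx", "Endobest_Output", when)
name_2 = build_output_filename("{workbook_name}_Month{extract_month}.xlsx", "Endobest_Output", when)
print(name_1)
print(name_2)
```

Because str.format() looks names up exactly as written, a template containing {Workbook_Name} fails instead of silently producing an empty value.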
output_exists_action (Required)
- Type: Text (one of three values)
- Valid Values:
  - Overwrite - Replace existing file
  - Increment - Append _1, _2, etc.
  - Backup - Rename existing to _backup_1, etc.
- Recommended Value: Increment (safest option; the column is required, so a value must always be entered)
- Behavior:

| Action | If file exists | Result |
|---|---|---|
| Overwrite | report.xlsx | Deletes report.xlsx, creates new |
| Increment | report.xlsx, report_1.xlsx | Creates report_2.xlsx |
| Backup | report.xlsx | Renames to report_backup_1.xlsx, creates new report.xlsx |
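The behavior table can be read as the following sketch. It is not the script's implementation: resolve_output_path is a hypothetical helper, and the rule for picking the next free _N or _backup_N suffix is an assumption based on the examples in the table.

```python
import tempfile
from pathlib import Path


def resolve_output_path(path: Path, action: str) -> Path:
    """Return the path to write to, applying output_exists_action (hypothetical helper)."""
    if not path.exists():
        return path
    if action == "Overwrite":
        path.unlink()  # delete the existing file; the caller then writes a new one
        return path
    if action == "Increment":
        n = 1
        while (candidate := path.with_name(f"{path.stem}_{n}{path.suffix}")).exists():
            n += 1
        return candidate  # first free name: report_1.xlsx, report_2.xlsx, ...
    if action == "Backup":
        n = 1
        while (backup := path.with_name(f"{path.stem}_backup_{n}{path.suffix}")).exists():
            n += 1
        path.rename(backup)  # move the old file aside, then reuse the original name
        return path
    raise ValueError(f"Unknown output_exists_action: {action!r}")


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "report.xlsx").touch()
    (folder / "report_1.xlsx").touch()
    increment_target = resolve_output_path(folder / "report.xlsx", "Increment").name
    backup_target = resolve_output_path(folder / "report.xlsx", "Backup").name
    files_after_backup = sorted(p.name for p in folder.iterdir())

print(increment_target, backup_target, files_after_backup)
```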
Row Rules
- Each row generates ONE Excel file
- All columns must be filled (no empty cells)
- workbook_name must be unique
- Multiple workbooks allowed
Example Rows
Row 1:
workbook_name: Endobest_Output
template_path: templates/Endobest_Template.xlsx
output_filename: {workbook_name}_{extract_date_time}.xlsx
output_exists_action: Increment
Row 2:
workbook_name: Statistics_Report
template_path: templates/Statistics.xlsx
output_filename: {workbook_name}_{extract_year}-{extract_month}.xlsx
output_exists_action: Overwrite
Excel_Sheets Table
Defines how to fill sheets within the workbooks.
Purpose
Specifies HOW to fill each sheet:
- Which data to use (Inclusions/Organizations/Variable)
- How to transform it (filter, sort, replace)
- Where to put it (target cell/range)
Column Definitions
workbook_name (Required)
- Type: Text
- Example: Endobest_Output
- Rules: Must match exactly one row in Excel_Workbooks table
- Validation: Script checks reference exists
sheet_name (Required)
- Type: Text
- Example: Inclusions, Summary, Organizations
- Rules: Must match sheet name in template exactly
- Validation: Script checks sheet exists in template
source_type (Required)
- Type: Text (one of three values)
- Valid Values:
  - Variable - Single variable value (timestamp, text, etc.)
  - Inclusions - Patient inclusion data
  - Organizations - Organization data
- Rules: Determines whether column_mapping is required (required for Inclusions and Organizations, ignored for Variable)
target (Required)
- Type: Text (cell reference or named range)
- Format:
  - Cell reference: A1, B10, Title_Cell
  - Named range: DataTable, InclusionsRange, etc.
- For Variable: Single cell (not a range)
- For Inclusions/Organizations: Named range with height=1 (single row for headers, data below)
- Validation: Script checks target exists in template
column_mapping (Conditional)
- Required If: source_type = Inclusions OR Organizations
- Type: JSON object
- Format: {"excel_column_name": "data_field_name", ...}
- Example (Inclusions): { "col_id": "patient_id", "col_name": "patient_name", "col_status": "inclusion_status", "col_date": "date_inclusion" }
- Example (Organizations): { "col_code": "org_id", "col_name": "org_name", "col_count": "patient_count" }
- Field Names: Must match names in Inclusions_Mapping or Organizations_Mapping
- Column Order: Determines order of columns in Excel (left to right)
- Validation: Script checks all field names exist in mapping
- For Variable: Leave empty (NULL or omit)
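As a rough sketch of how a column_mapping cell could be turned into headers and rows: the helper apply_column_mapping and the sample field names are hypothetical, and writing None for a field an item lacks is an assumption, not confirmed behavior of the script.

```python
import json


def apply_column_mapping(mapping_json: str, items: list, known_fields: set) -> tuple:
    """Return (headers, rows) for a column_mapping cell (hypothetical helper)."""
    mapping = json.loads(mapping_json)  # dicts keep the JSON key order (Python 3.7+)
    unknown = [field for field in mapping.values() if field not in known_fields]
    if unknown:
        raise ValueError(f"Column mapping references invalid field(s): {unknown}")
    headers = list(mapping.keys())  # left-to-right column order in Excel
    # Assumption: a field missing from an item becomes a blank cell (None).
    rows = [[item.get(field) for field in mapping.values()] for item in items]
    return headers, rows


known = {"patient_id", "patient_name", "inclusion_status"}
items = [
    {"patient_id": "P001", "patient_name": "Alice", "inclusion_status": "active"},
    {"patient_id": "P002", "patient_name": "Bob"},
]
headers, rows = apply_column_mapping(
    '{"col_id": "patient_id", "col_name": "patient_name", "col_status": "inclusion_status"}',
    items,
    known,
)
print(headers)
print(rows)
```

Reordering the keys in the JSON text is enough to reorder the output columns, since the key order is preserved when the object is parsed.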
filter_condition (Optional)
- Type: JSON object (AND conditions)
- Default: NULL (no filtering, all items included)
- Format: {"field_name": expected_value, ...}
- Example: { "status": "active", "visit_type": "inclusion" }
- Logic: AND (all conditions must match)
  - Item with {"status": "active", "visit_type": "inclusion"} → MATCHES
  - Item with {"status": "active", "visit_type": "follow-up"} → DOES NOT MATCH
- Nested Fields: Support dot notation
  - "patient.status": "active" matches {"patient": {"status": "active"}}
- For Variable: Ignored (leave NULL)
- Types: String, number, boolean values all supported
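The AND logic and dot notation can be illustrated with a short sketch. The function names get_nested and matches are hypothetical, and treating a missing field as a non-match is an assumption.

```python
_MISSING = object()  # sentinel: distinguishes "field absent" from a stored null


def get_nested(item: dict, dotted_key: str):
    """Follow a dot-notation key such as "patient.status" through nested dicts."""
    current = item
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def matches(item: dict, condition: dict) -> bool:
    """True when every key/value pair in the condition matches (AND logic)."""
    return all(get_nested(item, key) == expected for key, expected in (condition or {}).items())


items = [
    {"status": "active", "visit_type": "inclusion"},
    {"status": "active", "visit_type": "follow-up"},
    {"patient": {"status": "active"}},
]
condition = {"status": "active", "visit_type": "inclusion"}
flags = [matches(item, condition) for item in items]
nested_match = matches(items[2], {"patient.status": "active"})
print(flags, nested_match)
```

An empty or NULL condition matches every item, because all() over no conditions is true.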
sort_keys (Optional)
- Type: JSON array of sort specifications
- Default: NULL (no sorting, original order)
- Format: [["field_name", "asc"|"desc"], ["field2", "order", "option"], ...]
- Example: [ ["date_visit", "desc"], ["patient_name", "asc"] ]
- Primary/Secondary: First array element is primary sort, second is secondary, etc.
- Options: Third element can be datetime format ("%Y-%m-%d") or "*natsort" for alphanumeric sorting
- Order Values:
  - "asc" - Ascending (A→Z, 0→9, old→new dates)
  - "desc" - Descending (Z→A, 9→0, new→old dates)
- Missing Fields: Items with missing field placed at end
- Datetime: Auto-detected from ISO format (YYYY-MM-DD) - no configuration needed
- For Variable: Ignored (leave NULL)
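One way to picture multi-key sorting with mixed directions is the sketch below. It relies on Python's stable sort and applies the keys from last to first. The helper sort_items is hypothetical, it handles only the two-element [field, order] form, and placing items with a missing field last in every direction is an assumption based on the "Missing Fields" rule above.

```python
def sort_items(items: list, sort_keys: list) -> list:
    """Sort dicts by several [field, order] specs; the first spec is the primary key."""
    result = list(items)
    # Apply keys from last to first: the sort is stable, so the primary key
    # (applied last) dominates and the later keys only break ties.
    for spec in reversed(sort_keys or []):
        field, order = spec[0], spec[1]
        present = [item for item in result if item.get(field) is not None]
        missing = [item for item in result if item.get(field) is None]

        def key(item, field=field):
            value = item[field]
            # Assumption: strings compare case-insensitively.
            return value.casefold() if isinstance(value, str) else value

        present.sort(key=key, reverse=(order == "desc"))
        result = present + missing  # items without the field go to the end
    return result


visits = [
    {"patient_name": "bob", "date_visit": "2025-02-01"},
    {"patient_name": "Alice", "date_visit": "2025-01-15"},
    {"patient_name": "Carol"},
    {"patient_name": "alice", "date_visit": "2025-01-15"},
]
ordered = sort_items(visits, [["date_visit", "desc"], ["patient_name", "asc"]])
names = [v["patient_name"] for v in ordered]
print(names)
```

ISO dates stored as text (YYYY-MM-DD) sort correctly as plain strings, which is why no date parsing is needed in this example.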
value_replacement (Optional)
- Type: JSON array of replacement rules
- Default: NULL (no replacement, original values used)
- Format: [{rule1}, {rule2}, ...]
- Logic: First matching rule wins (stop at first match)
- Types Supported:
  - Boolean replacement: { "type": "bool", "true": "Yes", "false": "No" }
    - Matches: Python boolean True/False (not strings)
    - Replaces: True → "Yes", False → "No"
  - String replacement: { "type": "str", "from": "active", "to": "Active Status" }
    - Matches: String "active" (exact, case-sensitive)
    - Does NOT match: "Active" or "ACTIVE"
  - Integer replacement: { "type": "int", "from": 0, "to": "Not Applicable" }
    - Matches: Integer 0 (not string "0")
    - Replaces: 0 → "Not Applicable"
- Type Matching: Strict - boolean True ≠ string "true"
- Multiple Rules Example: [ {"type": "bool", "true": "Yes", "false": "No"}, {"type": "str", "from": "active", "to": "Active"}, {"type": "str", "from": "inactive", "to": "Inactive"} ]
  - Booleans match first rule
  - "active" matches second rule
  - "inactive" matches third rule
  - Other strings pass through unchanged
- For Variable: Ignored (leave NULL)
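The "first match wins" and strict type matching rules can be sketched as follows. The function replace_value is hypothetical and is not taken from the script. It also shows one Python detail worth knowing: bool is a subclass of int, so an "int" rule has to exclude booleans explicitly.

```python
def replace_value(value, rules):
    """Apply the first matching value_replacement rule; otherwise return value unchanged."""
    for rule in rules or []:
        kind = rule["type"]
        if kind == "bool" and isinstance(value, bool):
            return rule["true"] if value else rule["false"]
        if kind == "str" and isinstance(value, str) and value == rule["from"]:
            return rule["to"]
        # bool is a subclass of int in Python, so exclude it explicitly here.
        if (kind == "int" and isinstance(value, int) and not isinstance(value, bool)
                and value == rule["from"]):
            return rule["to"]
    return value


rules = [
    {"type": "bool", "true": "Yes", "false": "No"},
    {"type": "str", "from": "active", "to": "Active Status"},
    {"type": "int", "from": 0, "to": "Not Applicable"},
]
samples = [True, False, "active", "Active", "true", 0, 7]
results = [replace_value(value, rules) for value in samples]
print(results)
```

In this run "Active" and "true" pass through unchanged: the first differs in case from the "str" rule, and the second is a string rather than a boolean.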
Row Rules
- Each row defines ONE sheet in ONE workbook
- source_type determines required fields:
  - Variable: column_mapping, filter_condition, sort_keys, value_replacement all ignored
  - Inclusions/Organizations: column_mapping REQUIRED, others optional
- Multiple rows for same workbook allowed (multiple sheets)
- Multiple rows for same sheet not recommended (last wins)
Example Configurations
Simple Inclusions Table:
workbook_name: Endobest_Output
sheet_name: Inclusions
source_type: Inclusions
target: DataTable
column_mapping: {"col_id": "patient_id", "col_name": "patient_name"}
filter_condition: {"status": "active"}
sort_keys: [["date_inclusion", "desc"]]
value_replacement: NULL
Multiple Sheets:
Row 1 (Title):
workbook_name: Report
sheet_name: Title
source_type: Variable
target: TitleCell
(other columns ignored)
Row 2 (Inclusions):
workbook_name: Report
sheet_name: Data
source_type: Inclusions
target: InclusionTable
column_mapping: {...}
Row 3 (Organizations):
workbook_name: Report
sheet_name: Orgs
source_type: Organizations
target: OrgTable
column_mapping: {...}
Complex Transformations:
workbook_name: Statistics
sheet_name: SummaryData
source_type: Inclusions
target: SummaryTable
column_mapping: {
"col_id": "patient_id",
"col_status": "status",
"col_activated": "is_activated"
}
filter_condition: {"status": "active"}
sort_keys: [
["status", "asc"],
["date_visit", "desc"]
]
value_replacement: [
{"type": "bool", "true": "✓", "false": "✗"},
{"type": "str", "from": "active", "to": "Active"},
{"type": "str", "from": "pending", "to": "Pending"}
]
Data Types & Formats
Text Fields
- Type: Plain text
- Length: As needed
- Special Characters: Allowed in values, but not in field names
- Examples:
patient_id,Inclusions,Endobest_Output
JSON Fields
- Type: Valid JSON format
- Validation: Must be valid JSON or NULL
- Common Mistakes:
  - Missing quotes: {col_id: "patient_id"} ✗ (should be {"col_id": "patient_id"})
  - Single quotes: {'col_id': 'patient_id'} ✗ (JSON uses double quotes)
  - Trailing commas: {"a": 1,} ✗ (not valid JSON)
- Validation: Script validates JSON parsing before use
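Each of the common mistakes above is rejected by Python's standard json module, as this sketch shows. The helper parse_json_cell is hypothetical, and treating the literal text NULL or an empty cell as "no value" is an assumption about how the cells are read.

```python
import json


def parse_json_cell(cell, column_name: str):
    """Parse a JSON configuration cell; empty or NULL cells return None."""
    text = "" if cell is None else str(cell).strip()
    if text in ("", "NULL"):  # assumption: both mean "no value"
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Invalid JSON in {column_name}: {text} ({exc.msg})") from exc


bad_cells = [
    '{col_id: "patient_id"}',    # missing quotes around the key
    "{'col_id': 'patient_id'}",  # single quotes
    '{"a": 1,}',                 # trailing comma
]
rejected = 0
for cell in bad_cells:
    try:
        parse_json_cell(cell, "column_mapping")
    except ValueError as exc:
        rejected += 1
        print(exc)

good = parse_json_cell('{"col_id": "patient_id"}', "column_mapping")
print(good, rejected)
```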
Dates & Times
- Format: ISO 8601 (YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS)
- Example: 2025-01-15, 2025-01-15T14:30:45
- Timezone: Convert to UTC before storing
- Auto-Detection: Script auto-detects datetime fields and parses correctly
JSON Field Specifications
column_mapping JSON
Structure:
{
"excel_column_1": "field_name_1",
"excel_column_2": "field_name_2",
...
}
Rules:
- Keys (left side): Column names (can be any text)
- Values (right side): Must match Inclusions_Mapping or Organizations_Mapping
- Order: Determines column order in Excel (left to right)
- Count: No limit, but must fit in target range
Validation:
- All values must exist in source mapping
- Extra columns cause error
- Missing columns fill with blanks
filter_condition JSON
Structure:
{
"field_1": value_1,
"field_2": value_2,
...
}
Rules:
- Keys (left side): Field names (from mapping)
- Values (right side): Literal values to match
- Logic: AND (all conditions must match)
- Empty object: {} matches all (no filtering)
Value Types Supported:
- String: "active"
- Number: 123, 45.67
- Boolean: true, false (JSON format, not quoted)
- NULL: null
Example:
{
"status": "active",
"center_code": "PARIS01",
"patient_count": 10
}
Matches only items with ALL three conditions.
sort_keys JSON
Structure:
[
["field_name_1", "asc"],
["field_name_2", "desc"],
["field_name_3", "asc", "option"]
]
Rules:
- Array of arrays format (ordered list)
- Each sort specification: [field, order] or [field, order, option]
- Field: Must exist in source data
- Order: "asc" or "desc" only
- Option (optional): Special sorting behavior (see below)
- Empty array: [] means no sorting
Field Matching:
- Exact field name match required
- Case-sensitive field names
- String comparison: Case-insensitive by default
  - "Centre Evidens" comes before "CHU Hospital" (natural alphabetical order; a case-sensitive comparison would put "CHU Hospital" first)
Optional Third Parameter:
- Datetime Format: ["date_field", "desc", "%Y-%m-%d"]
  - Provide Python strptime format for custom date parsing
  - Example formats: "%d/%m/%Y", "%Y-%m-%d %H:%M:%S"
- Natural Alphanumeric Sorting: ["patient_id", "asc", "*natsort"]
  - Use "*natsort" for natural sorting of alphanumeric codes
  - Correctly sorts: "ENDOBEST-003-3-BA" < "ENDOBEST-003-20-BA"
  - Also handles: "file2.txt" < "file10.txt", "v1.9" < "v1.10"
  - Perfect for patient IDs, version numbers, sequential codes
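To show what the two options change, here is a small sketch using only the standard library. The natural_key function is a hand-written stand-in for natural sorting; this guide does not state whether the script uses the third-party natsort package, so treat that as an open assumption.

```python
import re
from datetime import datetime


def natural_key(text: str) -> list:
    """Split text into digit and non-digit chunks so numbers compare numerically."""
    chunks = re.split(r"(\d+)", text)
    # With a capturing group, odd positions are always the digit runs.
    return [(1, int(c)) if i % 2 else (0, c.casefold()) for i, c in enumerate(chunks)]


codes = ["ENDOBEST-003-20-BA", "ENDOBEST-003-3-BA", "ENDOBEST-003-100-BA"]
plain_order = sorted(codes)                     # character by character: 100 < 20 < 3
natural_order = sorted(codes, key=natural_key)  # numeric chunks: 3 < 20 < 100

# Datetime option: parse with the given strptime format before comparing.
dates = ["05/03/2025", "20/01/2025"]
text_order = sorted(dates)
date_order = sorted(dates, key=lambda value: datetime.strptime(value, "%d/%m/%Y"))

print(plain_order)
print(natural_order)
print(text_order, date_order)
```

The day-first dates show why a format string matters: sorted as text, 05/03/2025 comes before 20/01/2025 even though it is the later date.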
value_replacement JSON
Structure:
[
{
"type": "TYPE_NAME",
"TYPE_SPECIFIC_FIELDS": values
},
...
]
Boolean Type:
{
"type": "bool",
"true": "Replacement for True",
"false": "Replacement for False"
}
String Type:
{
"type": "str",
"from": "Source string",
"to": "Replacement string"
}
Integer Type:
{
"type": "int",
"from": 123,
"to": "Replacement"
}
Rules:
- Each rule must have a "type" field
- Other fields required per type
- Evaluated in order (first match wins)
- NULL or empty array means no replacement
Naming Conventions
File & Path Naming
- Paths: Relative to config/ folder
- Separators: Use forward slash / (not backslash \)
- Extensions: Must include .xlsx
- Spaces: Avoid in filenames (use underscore or camelCase)
Column Naming
- No spaces: Use underscores or camelCase
- Avoid special characters: Letters, numbers, underscore only
- Length: Keep reasonable (avoid 100+ char names)
- Consistency: Use same names across configuration
Field Naming
- From Mapping: Use exact names from Inclusions_Mapping or Organizations_Mapping
- Case-Sensitive: Field_Name ≠ field_name
- Match Required: Must exist in mapping
Excel Named Ranges
- Define in Excel: Formulas → Name Manager → New
- Naming: Same rules as column naming
- Scope: Sheet-level or Workbook-level both OK
- Used in: target column of Excel_Sheets
Configuration Examples
Example 1: Simple Patient Report
Excel_Workbooks:
workbook_name: Endobest_Report
template_path: templates/Simple.xlsx
output_filename: Report_{extract_date_time}.xlsx
output_exists_action: Increment
Excel_Sheets:
workbook_name: Endobest_Report
sheet_name: Patients
source_type: Inclusions
target: PatientTbl
column_mapping: {"ID": "patient_id", "Name": "patient_name", "Date": "date_inclusion"}
filter_condition: {"status": "active"}
sort_keys: [["date_inclusion", "asc"]]
Example 2: Multi-Sheet Report
Excel_Workbooks:
workbook_name: FullReport
template_path: templates/Multi.xlsx
output_filename: {workbook_name}_{extract_month}.xlsx
output_exists_action: Overwrite
Excel_Sheets (3 rows):
Row 1 (Title):
workbook_name: FullReport
sheet_name: Cover
source_type: Variable
target: TitleCell
column_mapping: NULL
filter_condition: NULL
sort_keys: NULL
Row 2 (Inclusions):
workbook_name: FullReport
sheet_name: Inclusions
source_type: Inclusions
target: IncTbl
column_mapping: {"col_id": "patient_id", "col_name": "patient_name", "col_site": "site_id"}
filter_condition: {"status": "active"}
sort_keys: [["date_visit", "desc"]]
Row 3 (Organizations):
workbook_name: FullReport
sheet_name: Summary
source_type: Organizations
target: OrgTbl
column_mapping: {"Name": "org_name", "Count": "patient_count"}
filter_condition: NULL
sort_keys: [["org_name", "asc"]]
Validation & Error Messages
Configuration Errors (Startup)
Template file missing:
✗ CRITICAL: Template file missing: config/templates/Missing.xlsx
Fix: Verify file exists and path is correct
Named range not found:
✗ CRITICAL: Named range not found: 'DataTable' in sheet 'Inclusions'
Fix: Create named range in Excel or correct the name in configuration
Column reference invalid:
✗ CRITICAL: Column mapping references invalid field: 'unknown_field'
Fix: Check field name matches Inclusions_Mapping or Organizations_Mapping exactly
JSON parse error:
✗ CRITICAL: Invalid JSON in column_mapping: {col_id: "patient_id"}
Fix: Ensure all JSON fields use double quotes and valid syntax
Runtime Errors
No matching data:
⚠ WARNING: Filter condition found no matching items for sheet 'Inclusions'
Possible Causes:
- Filter too restrictive
- Filter field doesn't exist
- No data in source
Fix: Review filter_condition, check data exists
File write error:
✗ ERROR: Could not write file: Permission denied
Possible Causes:
- File open in another program
- No write permissions
- Disk full
Fix: Close Excel, check permissions, check disk space
Best Practices
Configuration Management
- Backup Config
  - Keep version history
  - Comment changes in Excel or separate document
- Test Changes
  - Use --excel_only mode for quick testing
  - Run full process periodically to verify
- Document Mappings
  - Maintain spreadsheet of field meanings
  - Update when fields change
- Naming Consistency
  - Use same field names across tables
  - Use descriptive, self-documenting names
Performance Optimization
- Filter Early
  - Use filter_condition to reduce data
  - Smaller datasets = faster processing
- Smart Sorting
  - Don't sort if not needed
  - Sort by indexed fields when possible
- Template Optimization
  - Minimize template complexity
  - Remove unnecessary formulas
Data Quality
- Validation
  - Verify filter_condition results
  - Check sort_keys order makes sense
  - Test value_replacement transformations
- Documentation
  - Document why each filter exists
  - Document expected results
  - Include contact info for questions
Security
- File Permissions
  - Restrict config file access (contains sensitive paths)
  - Backup encrypted if needed
- Data Privacy
  - Excel files contain patient data
  - Handle per organization policy
  - Ensure secure storage/transmission
Troubleshooting
Configuration Issues
"Excel config file not found"
- Path: config/Endobest_Dashboard_Config.xlsx
- Check file exists in correct location
"Required column missing"
- Check all required columns present
- Don't delete or rename columns
- Use exact column names
"Workbook name mismatch"
- Excel_Sheets.workbook_name must match Excel_Workbooks.workbook_name exactly
- Check spelling and case
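Before running the script, a mismatch like this can be caught with a small cross-check. The sketch below assumes the two tables have already been loaded as lists of dictionaries keyed by column name; the function check_workbook_references and its message wording are hypothetical.

```python
def check_workbook_references(workbook_rows: list, sheet_rows: list) -> list:
    """Return error messages for duplicate or unknown workbook_name values."""
    errors = []
    names = [row["workbook_name"] for row in workbook_rows]
    for name in sorted({n for n in names if names.count(n) > 1}):
        errors.append(f"Duplicate workbook_name in Excel_Workbooks: {name!r}")
    known = set(names)
    for index, row in enumerate(sheet_rows, start=1):
        # Exact comparison: spelling and case must both match.
        if row["workbook_name"] not in known:
            errors.append(
                f"Excel_Sheets row {index}: unknown workbook_name {row['workbook_name']!r}"
            )
    return errors


workbooks = [{"workbook_name": "Endobest_Output"}, {"workbook_name": "Statistics_Report"}]
sheets = [
    {"workbook_name": "Endobest_Output", "sheet_name": "Inclusions"},
    {"workbook_name": "endobest_output", "sheet_name": "Summary"},  # wrong case
]
errors = check_workbook_references(workbooks, sheets)
print(errors)
```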
Template Issues
"Template file not found"
- Verify file in config/templates/ folder
- Check path relative to config (not root)
- Example correct: templates/MyTemplate.xlsx
- Example incorrect: config/templates/MyTemplate.xlsx
"Named range not found"
- Open template in Excel
- Formulas → Name Manager
- Verify range exists and spelling matches
"Invalid target cell"
- Check cell reference format (A1, B10, etc.) or range name
- Verify cell/range exists in sheet
Data Issues
"No data in Excel cells"
- Check filter_condition isn't too restrictive
- Verify source data exists (run --check-only)
- Check column_mapping field names are correct
"Column order wrong"
- Column order determined by column_mapping object key order
- In newer Excel: right-click → "Edit in formula bar" to see order
- Reorder keys in JSON to change column order
"Values not replaced"
- Check value_replacement type matches actual data type
- Boolean True ≠ string "true"
- Check rule order (first match wins)
"Dates sorting incorrectly"
- Dates must be ISO format: YYYY-MM-DD
- Check field value format
- If text looks like date but formats as text in Excel, may sort alphabetically
Advanced Configuration
Template Variables in Variable Cells
Use variables to populate single cells:
target: TimestampCell
source_type: Variable
In Excel template, cell value:
"Extracted: {extract_date_time}"
Result:
"Extracted: 2025-01-15T14:30:45+01:00"
Dynamic Filenames
Create filenames that reflect data/content:
output_filename: "{workbook_name}_{extract_year}_{extract_month}.xlsx"
Results in:
"Statistics_2025_01.xlsx"
"Endobest_Output_2025_01.xlsx"
Cascading Filters & Sorts
Apply multiple rules:
filter_condition: {"status": "active", "center": "PARIS01", "type": "inclusion"}
sort_keys: [
["visit_order", "asc"],
["date_visit", "desc"],
["patient_name", "asc"]
]
End of Configuration Guide
For user guide, see DOCUMENTATION_98_USER_GUIDE.md
For architecture details, see DOCUMENTATION_13_EXCEL_EXPORT.md