# Endobest Dashboard - Configuration Guide

**Document Version:** 1.0
**Last Updated:** 2025-11-08
**Audience:** System Administrators, Configuration Managers
**Language:** English

---

## Configuration Overview

The Endobest Dashboard is configured entirely through Excel files - no code changes needed.

### Main Configuration File

**File Location:** `config/Endobest_Dashboard_Config.xlsx`

**Contains:**
- `Inclusions_Mapping` - Field definitions for inclusion data
- `Organizations_Mapping` - Field definitions for organization data
- `Excel_Workbooks` - Metadata for Excel export
- `Excel_Sheets` - Sheet definitions and data transformation rules
- `Regression_Check` - Quality check rules

This guide focuses on the **Excel_Workbooks** and **Excel_Sheets** tables (for Excel export configuration).

---

## Table of Contents

1. [File Location & Structure](#file-location--structure)
2. [Inclusions_Mapping (Reference)](#inclusions_mapping-reference)
3. [Organizations_Mapping (Reference)](#organizations_mapping-reference)
4. [Excel_Workbooks Table](#excel_workbooks-table)
5. [Excel_Sheets Table](#excel_sheets-table)
6. [Data Types & Formats](#data-types--formats)
7. [JSON Field Specifications](#json-field-specifications)
8. [Naming Conventions](#naming-conventions)
9. [Configuration Examples](#configuration-examples)
10. [Validation & Error Messages](#validation--error-messages)
11. [Best Practices](#best-practices)
12. [Troubleshooting](#troubleshooting)
13. [Advanced Configuration](#advanced-configuration)

---

## File Location & Structure

### Directory Layout

```
Endobest Dashboard/
├── eb_dashboard.py                      (main script)
├── config/
│   ├── Endobest_Dashboard_Config.xlsx   (← CONFIGURATION FILE)
│   ├── Endobest_Extended_Fields.xlsx    (old, deprecated)
│   ├── eb_org_center_mapping.xlsx
│   └── templates/
│       ├── Endobest_Template.xlsx
│       ├── Statistics_Template.xlsx
│       └── (other templates)
├── endobest_inclusions.json             (output)
├── endobest_organizations.json          (output)
└── dashboard.log
```

### Opening & Editing

1. Open `config/Endobest_Dashboard_Config.xlsx` in Excel
2. Go to the sheet tab you want to change
3. Edit rows as needed
4. Save the file
5. Run the script - changes take effect on the next run

**Important:** Do NOT change the column order or delete required columns.

---

## Inclusions_Mapping (Reference)

This table defines which patient fields to include in export.

### Purpose
Specifies which inclusion data fields are available for use in:
- Excel export (column_mapping in Excel_Sheets)
- Quality checks
- Regression testing

### Columns

| Column | Type | Example | Notes |
|--------|------|---------|-------|
| Field_Selection | Action | `[["include", "*.*"]]` | Pipeline of include/exclude actions |
| Field_Name | Text | patient_id | Internal name used in column_mapping |

### Usage in Excel Export

The Field_Name values are used in `column_mapping`:

```json
{
  "col_patient_id": "patient_id",
  "col_name": "patient_name",
  "col_status": "inclusion_status"
}
```

**Map Excel Column Name → Inclusion Field Name**

---

## Organizations_Mapping (Reference)

This table defines which organization fields to include in export.

### Purpose
Specifies which organization data fields are available for use in:
- Excel export (column_mapping for Organizations source_type)
- Quality checks

### Columns

| Column | Type | Example | Notes |
|--------|------|---------|-------|
| Field_Name | Text | org_id | Internal name |
| org_id | Text | org.id | Data source path |
| org_name | Text | org.name | Organization name |

### Usage in Excel Export

The Field_Name values are used in `column_mapping`:

```json
{
  "col_org_code": "org_id",
  "col_org_name": "org_name"
}
```

---

## Excel_Workbooks Table

Defines metadata for each Excel file to generate.

### Purpose
Specifies WHAT Excel files to create, using which templates, with what naming.

### Column Definitions

#### workbook_name (Required)
- **Type:** Text
- **Length:** 1-255 characters
- **Example:** `Endobest_Output`, `Statistics_Report`, `Monthly_Summary`
- **Usage:** Unique identifier referenced in the Excel_Sheets table
- **Rules:** Must be unique within the table
- **Notes:** Available in filename templates as `{workbook_name}`

#### template_path (Required)
- **Type:** Text (file path)
- **Example:** `templates/Endobest_Template.xlsx`
- **Relative To:** `config/` folder
- **Rules:** Path is relative, not absolute
- **Validation:** Script checks that the file exists before export
- **Notes:** Template must be a valid Excel (.xlsx) file
- **Error if:**
  - File doesn't exist
  - File is not in .xlsx format
  - Path is absolute instead of relative

#### output_filename (Required)
- **Type:** Text (filename template)
- **Example:** `{workbook_name}_{extract_date_time}.xlsx`
- **Available Variables:**
  - `{workbook_name}` - From workbook_name column
  - `{extract_date_time}` - Full ISO datetime (2025-01-15T14:30:45+01:00)
  - `{extract_year}` - Year (2025)
  - `{extract_month}` - Month (01-12)
  - `{extract_day}` - Day (01-31)
- **Processed As:** Python format string, expanded with `.format()` (not an f-string)
- **Example Results:**
  - `Report_{extract_date_time}.xlsx` → `Report_2025-01-15T14-30-45.xlsx` (the result shows hyphens in place of colons, which are not valid in filenames)
  - `{workbook_name}_Month{extract_month}.xlsx` → `Endobest_Output_Month01.xlsx`
- **Rules:**
  - Must include the `.xlsx` extension
  - Must be a valid filename (none of `/ \ : * ? " < > |`)
  - Variables are case-sensitive

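
The template expansion described above can be sketched in a few lines of Python. This is a minimal illustration, not the script's actual code: the function name `render_output_filename` is hypothetical, and the filename-safe timestamp format (hyphens instead of colons, no timezone offset) is an assumption inferred from the example result shown in this section.

```python
from datetime import datetime

# Characters the guide lists as invalid in output filenames.
INVALID_FILENAME_CHARS = '/\\:*?"<>|'


def render_output_filename(template: str, workbook_name: str, extract_dt: datetime) -> str:
    """Expand an output_filename template with str.format() (hypothetical helper)."""
    values = {
        "workbook_name": workbook_name,
        # Assumption: colons are replaced so the timestamp is filename-safe.
        "extract_date_time": extract_dt.strftime("%Y-%m-%dT%H-%M-%S"),
        "extract_year": f"{extract_dt.year:04d}",
        "extract_month": f"{extract_dt.month:02d}",
        "extract_day": f"{extract_dt.day:02d}",
    }
    # str.format() raises KeyError for an unknown or mis-cased variable.
    name = template.format(**values)
    if not name.endswith(".xlsx"):
        raise ValueError(f"output_filename must end with .xlsx: {name}")
    if any(char in INVALID_FILENAME_CHARS for char in name):
        raise ValueError(f"output_filename contains invalid characters: {name}")
    return name


extract_dt = datetime(2025, 1, 15, 14, 30, 45)
report_name = render_output_filename("Report_{extract_date_time}.xlsx", "Endobest_Output", extract_dt)
month_name = render_output_filename("{workbook_name}_Month{extract_month}.xlsx", "Endobest_Output", extract_dt)
print(report_name)  # Report_2025-01-15T14-30-45.xlsx
print(month_name)   # Endobest_Output_Month01.xlsx
```

Because variables are case-sensitive, a template such as `{Workbook_Name}.xlsx` fails with a `KeyError` in this sketch instead of producing a wrong filename.
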
#### output_exists_action (Required)
- **Type:** Text (one of three values)
- **Valid Values:**
  - `Overwrite` - Replace existing file
  - `Increment` - Append _1, _2, etc.
  - `Backup` - Rename existing to _backup_1, etc.
- **Recommended:** `Increment` (safest choice; the column is required, so a value must always be entered)
- **Behavior:**

| Action | Existing files | Result |
|--------|---|---|
| **Overwrite** | `report.xlsx` | Deletes `report.xlsx`, creates new |
| **Increment** | `report.xlsx`, `report_1.xlsx` | Creates `report_2.xlsx` |
| **Backup** | `report.xlsx` | Renames to `report_backup_1.xlsx`, creates new `report.xlsx` |

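
The three behaviors in the table can be sketched as follows. This is an illustrative sketch under stated assumptions, not the script's implementation: `resolve_output_path` and `_first_free` are hypothetical names, and the numbering rule (use the first free suffix starting at 1) is inferred from the table.

```python
import tempfile
from pathlib import Path


def _first_free(path: Path, infix: str) -> Path:
    """Return the first non-existing sibling such as report_1.xlsx, report_2.xlsx, ..."""
    n = 1
    while True:
        candidate = path.with_name(f"{path.stem}{infix}{n}{path.suffix}")
        if not candidate.exists():
            return candidate
        n += 1


def resolve_output_path(path: Path, action: str) -> Path:
    """Decide where the new workbook should be written (hypothetical helper)."""
    if action not in ("Overwrite", "Increment", "Backup"):
        raise ValueError(f"Unknown output_exists_action: {action!r}")
    if action == "Overwrite" or not path.exists():
        return path                      # writing to this path replaces any old file
    if action == "Increment":
        return _first_free(path, "_")    # keep the old file, pick a new name
    path.rename(_first_free(path, "_backup_"))  # Backup: move the old file aside
    return path


with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / "report.xlsx"
    report.touch()
    (Path(tmp) / "report_1.xlsx").touch()

    incremented = resolve_output_path(report, "Increment")
    print(incremented.name)  # report_2.xlsx

    target = resolve_output_path(report, "Backup")
    backup_created = (Path(tmp) / "report_backup_1.xlsx").exists()
    original_moved = not report.exists()
    print(target.name, backup_created, original_moved)  # report.xlsx True True
```
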
### Row Rules

- Each row generates ONE Excel file
- All columns must be filled (no empty cells)
- workbook_name must be unique
- Multiple workbooks allowed

### Example Rows

```
Row 1:
  workbook_name: Endobest_Output
  template_path: templates/Endobest_Template.xlsx
  output_filename: {workbook_name}_{extract_date_time}.xlsx
  output_exists_action: Increment

Row 2:
  workbook_name: Statistics_Report
  template_path: templates/Statistics.xlsx
  output_filename: {workbook_name}_{extract_year}-{extract_month}.xlsx
  output_exists_action: Overwrite
```

---

## Excel_Sheets Table

Defines how to fill sheets within the workbooks.

### Purpose
Specifies HOW to fill each sheet:
- Which data to use (Inclusions/Organizations/Variable)
- How to transform it (filter, sort, replace)
- Where to put it (target cell/range)

### Column Definitions

#### workbook_name (Required)
- **Type:** Text
- **Example:** `Endobest_Output`
- **Rules:** Must match exactly one row in the Excel_Workbooks table
- **Validation:** Script checks that the reference exists

#### sheet_name (Required)
- **Type:** Text
- **Example:** `Inclusions`, `Summary`, `Organizations`
- **Rules:** Must match the sheet name in the template exactly
- **Validation:** Script checks that the sheet exists in the template

#### source_type (Required)
- **Type:** Text (one of three values)
- **Valid Values:**
  - `Variable` - Single variable value (timestamp, text, etc.)
  - `Inclusions` - Patient inclusion data
  - `Organizations` - Organization data
- **Rules:** Determines whether column_mapping is required

#### target (Required)
- **Type:** Text (cell reference or named range)
- **Format:**
  - Cell reference: `A1`, `B10`
  - Named range: `Title_Cell`, `DataTable`, `InclusionsRange`, etc.
- **For Variable:** Single cell (not a range)
- **For Inclusions/Organizations:** Named range with height=1 (single row for headers, data below)
- **Validation:** Script checks that the target exists in the template

#### column_mapping (Conditional)
- **Required If:** source_type = `Inclusions` OR `Organizations`
- **Type:** JSON object
- **Format:** `{"excel_column_name": "data_field_name", ...}`
- **Example (Inclusions):**
  ```json
  {
    "col_id": "patient_id",
    "col_name": "patient_name",
    "col_status": "inclusion_status",
    "col_date": "date_inclusion"
  }
  ```
- **Example (Organizations):**
  ```json
  {
    "col_code": "org_id",
    "col_name": "org_name",
    "col_count": "patient_count"
  }
  ```
- **Field Names:** Must match names in Inclusions_Mapping or Organizations_Mapping
- **Column Order:** Determines order of columns in Excel (left to right)
- **Validation:** Script checks all field names exist in mapping
- **For Variable:** Leave empty (NULL or omit)

#### filter_condition (Optional)
- **Type:** JSON object (AND conditions)
- **Default:** NULL (no filtering, all items included)
- **Format:** `{"field_name": expected_value, ...}`
- **Example:**
  ```json
  {
    "status": "active",
    "visit_type": "inclusion"
  }
  ```
- **Logic:** AND (all conditions must match)
  - Item with `{"status": "active", "visit_type": "inclusion"}` → MATCHES
  - Item with `{"status": "active", "visit_type": "follow-up"}` → DOES NOT MATCH
- **Nested Fields:** Support dot notation
  - `"patient.status": "active"` matches `{"patient": {"status": "active"}}`
- **For Variable:** Ignored (leave NULL)
- **Types:** String, number, boolean values all supported

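
The AND logic and dot notation described above can be sketched like this. It is a minimal illustration under assumptions: `get_nested` and `matches` are hypothetical names, and a field that is absent from an item is treated as "does not match".

```python
from typing import Any, Optional

_MISSING = object()  # sentinel: distinguishes "field absent" from a stored None


def get_nested(item: dict, dotted_key: str) -> Any:
    """Resolve 'patient.status' against {'patient': {'status': ...}}."""
    current: Any = item
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def matches(item: dict, condition: Optional[dict]) -> bool:
    """True when every key/value pair of the condition matches (AND logic)."""
    if not condition:  # NULL or {} means no filtering
        return True
    return all(get_nested(item, key) == expected for key, expected in condition.items())


items = [
    {"status": "active", "visit_type": "inclusion", "patient": {"status": "active"}},
    {"status": "active", "visit_type": "follow-up", "patient": {"status": "inactive"}},
    {"status": "inactive", "visit_type": "inclusion"},
]

kept = [i for i in items if matches(i, {"status": "active", "visit_type": "inclusion"})]
nested = [i for i in items if matches(i, {"patient.status": "active"})]
print(len(kept), len(nested))  # 1 1
```
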
#### sort_keys (Optional)
- **Type:** JSON array of sort specifications
- **Default:** NULL (no sorting, original order)
- **Format:** `[["field_name", "asc"|"desc"], ["field2", "order", "option"], ...]`
- **Example:**
  ```json
  [
    ["date_visit", "desc"],
    ["patient_name", "asc"]
  ]
  ```
- **Primary/Secondary:** First array element is primary sort, second is secondary, etc.
- **Options:** Third element can be datetime format (`"%Y-%m-%d"`) or `"*natsort"` for alphanumeric sorting
- **Order Values:**
  - `"asc"` - Ascending (A→Z, 0→9, old→new dates)
  - `"desc"` - Descending (Z→A, 9→0, new→old dates)
- **Missing Fields:** Items with missing field placed at end
- **Datetime:** Auto-detected from ISO format (YYYY-MM-DD) - no configuration needed
- **For Variable:** Ignored (leave NULL)

#### value_replacement (Optional)
- **Type:** JSON array of replacement rules
- **Default:** NULL (no replacement, original values used)
- **Format:** `[{rule1}, {rule2}, ...]`
- **Logic:** First matching rule wins (stop at first match)
- **Types Supported:**

**Boolean replacement:**
```json
{
  "type": "bool",
  "true": "Yes",
  "false": "No"
}
```
- Matches: Python boolean `True` / `False` (not strings)
- Replaces: `True` → "Yes", `False` → "No"

**String replacement:**
```json
{
  "type": "str",
  "from": "active",
  "to": "Active Status"
}
```
- Matches: String "active" (exact, case-sensitive)
- Does NOT match: "Active" or "ACTIVE"

**Integer replacement:**
```json
{
  "type": "int",
  "from": 0,
  "to": "Not Applicable"
}
```
- Matches: Integer 0 (not string "0")
- Replaces: 0 → "Not Applicable"

- **Type Matching:** Strict - boolean True ≠ string "true"
- **Multiple Rules Example:**
  ```json
  [
    {"type": "bool", "true": "Yes", "false": "No"},
    {"type": "str", "from": "active", "to": "Active"},
    {"type": "str", "from": "inactive", "to": "Inactive"}
  ]
  ```
  - Booleans match first rule
  - "active" matches second rule
  - "inactive" matches third rule
  - Other strings pass through unchanged
- **For Variable:** Ignored (leave NULL)

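
The "first matching rule wins" and strict type matching rules can be sketched as below. This is an illustration only: `apply_value_replacement` is a hypothetical name, and the strictness shown (a boolean never matches an `int` rule, even though Python treats `False == 0` as true) is an assumption based on the "Type Matching: Strict" note.

```python
from typing import Any, Optional

# Rule "type" names mapped to the exact Python types they accept.
_RULE_TYPES = {"bool": bool, "str": str, "int": int}


def apply_value_replacement(value: Any, rules: Optional[list]) -> Any:
    """Return the replacement from the first matching rule, else the value itself."""
    for rule in rules or []:
        expected_type = _RULE_TYPES[rule["type"]]
        # Exact type check: bool is a subclass of int in Python, so isinstance()
        # would wrongly let True/False match "int" rules.
        if type(value) is not expected_type:
            continue
        if rule["type"] == "bool":
            return rule["true"] if value else rule["false"]
        if value == rule["from"]:
            return rule["to"]
    return value  # no rule matched: pass through unchanged


rules = [
    {"type": "bool", "true": "Yes", "false": "No"},
    {"type": "str", "from": "active", "to": "Active"},
    {"type": "int", "from": 0, "to": "Not Applicable"},
]

replaced = [apply_value_replacement(v, rules) for v in (True, False, "active", "Active", 0, "0")]
print(replaced)  # ['Yes', 'No', 'Active', 'Active', 'Not Applicable', '0']
```

In the output, the fourth value is the unchanged input `"Active"` (the string rule is case-sensitive and only matches `"active"`), and the string `"0"` is left alone because the integer rule requires an actual integer.
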
### Row Rules

- Each row defines ONE sheet in ONE workbook
- `source_type` determines required fields:
  - **Variable:** column_mapping, filter_condition, sort_keys, value_replacement all ignored
  - **Inclusions/Organizations:** column_mapping REQUIRED, others optional
- Multiple rows for same workbook allowed (multiple sheets)
- Multiple rows for same sheet not recommended (last wins)

### Example Configurations

**Simple Inclusions Table:**
```
workbook_name: Endobest_Output
sheet_name: Inclusions
source_type: Inclusions
target: DataTable
column_mapping: {"col_id": "patient_id", "col_name": "patient_name"}
filter_condition: {"status": "active"}
sort_keys: [["date_inclusion", "desc"]]
value_replacement: NULL
```

**Multiple Sheets:**
```
Row 1 (Title):
  workbook_name: Report
  sheet_name: Title
  source_type: Variable
  target: TitleCell
  (other columns ignored)

Row 2 (Inclusions):
  workbook_name: Report
  sheet_name: Data
  source_type: Inclusions
  target: InclusionTable
  column_mapping: {...}

Row 3 (Organizations):
  workbook_name: Report
  sheet_name: Orgs
  source_type: Organizations
  target: OrgTable
  column_mapping: {...}
```

**Complex Transformations:**
```
workbook_name: Statistics
sheet_name: SummaryData
source_type: Inclusions
target: SummaryTable
column_mapping: {
  "col_id": "patient_id",
  "col_status": "status",
  "col_activated": "is_activated"
}
filter_condition: {"status": "active"}
sort_keys: [
  ["status", "asc"],
  ["date_visit", "desc"]
]
value_replacement: [
  {"type": "bool", "true": "✓", "false": "✗"},
  {"type": "str", "from": "active", "to": "Active"},
  {"type": "str", "from": "pending", "to": "Pending"}
]
```

---

## Data Types & Formats

### Text Fields
- **Type:** Plain text
- **Length:** As needed
- **Special Characters:** Allowed in values, but not in field names
- **Examples:** `patient_id`, `Inclusions`, `Endobest_Output`

### JSON Fields
- **Type:** Valid JSON format
- **Validation:** Must be valid JSON or NULL; the script validates JSON parsing before use
- **Common Mistakes:**
  - Missing quotes: `{col_id: "patient_id"}` ✗ (should be `{"col_id": "patient_id"}`)
  - Single quotes: `{'col_id': 'patient_id'}` ✗ (JSON uses double quotes)
  - Trailing commas: `{"a": 1,}` ✗ (not valid JSON)

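
Before saving the configuration, a JSON cell can be checked with Python's standard `json` module. The sketch below is only an illustration: `parse_json_cell` is a hypothetical helper, and treating an empty cell or the literal text `NULL` as "no value" is an assumption about how the script reads the sheet.

```python
import json
from typing import Any


def parse_json_cell(raw: Any, column: str) -> Any:
    """Parse one configuration cell; return None for empty/NULL cells."""
    text = "" if raw is None else str(raw).strip()
    if text in ("", "NULL"):
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Invalid JSON in {column}: {text} ({exc.msg})") from exc


mapping = parse_json_cell('{"col_id": "patient_id", "col_name": "patient_name"}', "column_mapping")
print(list(mapping))  # ['col_id', 'col_name']

# The three common mistakes listed above are all rejected.
rejected = []
for bad in ('{col_id: "patient_id"}', "{'col_id': 'patient_id'}", '{"a": 1,}'):
    try:
        parse_json_cell(bad, "column_mapping")
    except ValueError:
        rejected.append(bad)
print(len(rejected))  # 3
```

`json.loads` returns a `dict` that keeps the key order of the JSON text, which is why the key order of `column_mapping` can determine the column order in Excel.
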
### Dates & Times
- **Format:** ISO 8601 (YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS)
- **Example:** `2025-01-15`, `2025-01-15T14:30:45`
- **Timezone:** Convert to UTC before storing
- **Auto-Detection:** Script auto-detects datetime fields and parses correctly

---

## JSON Field Specifications

### column_mapping JSON

**Structure:**
```json
{
  "excel_column_1": "field_name_1",
  "excel_column_2": "field_name_2",
  ...
}
```

**Rules:**
- Keys (left side): Column names (can be any text)
- Values (right side): Must match Inclusions_Mapping or Organizations_Mapping
- Order: Determines column order in Excel (left to right)
- Count: No limit, but must fit in target range

**Validation:**
- All values must exist in source mapping
- Extra columns cause error
- Missing columns fill with blanks

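
The rule that every value must exist in the source mapping can be checked with a short helper. This is a sketch under assumptions: `find_unknown_fields` is a hypothetical name, and `inclusion_fields` stands in for the `Field_Name` values read from Inclusions_Mapping.

```python
def find_unknown_fields(column_mapping: dict, known_fields: set) -> list:
    """Return mapped field names that are absent from the source mapping."""
    return [field for field in column_mapping.values() if field not in known_fields]


# Hypothetical Field_Name values taken from Inclusions_Mapping.
inclusion_fields = {"patient_id", "patient_name", "inclusion_status"}

column_mapping = {
    "col_id": "patient_id",
    "col_status": "inclusion_status",
    "col_extra": "unknown_field",  # not defined in the mapping
}

unknown = find_unknown_fields(column_mapping, inclusion_fields)
print(unknown)  # ['unknown_field']

# Dict key order is preserved, so this is the left-to-right column order.
headers = list(column_mapping)
print(headers)  # ['col_id', 'col_status', 'col_extra']
```
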
### filter_condition JSON

**Structure:**
```json
{
  "field_1": value_1,
  "field_2": value_2,
  ...
}
```

**Rules:**
- Keys (left side): Field names (from mapping)
- Values (right side): Literal values to match
- Logic: AND (all conditions must match)
- Empty object: `{}` matches all (no filtering)

**Value Types Supported:**
- String: `"active"`
- Number: `123`, `45.67`
- Boolean: `true`, `false` (JSON format, not quoted)
- NULL: `null`

**Example:**
```json
{
  "status": "active",
  "center_code": "PARIS01",
  "patient_count": 10
}
```
Matches only items with ALL three conditions.

### sort_keys JSON

**Structure:**
```json
[
  ["field_name_1", "asc"],
  ["field_name_2", "desc"],
  ["field_name_3", "asc", "option"]
]
```

**Rules:**
- Array of arrays format (ordered list)
- Each sort specification: `[field, order]` or `[field, order, option]`
- Field: Must exist in source data
- Order: `"asc"` or `"desc"` only
- Option (optional): Special sorting behavior (see below)
- Empty array: `[]` means no sorting

**Field Matching:**
- Exact field name match required
- Case-sensitive field names
- **String comparison:** Case-insensitive by default
  - `"Centre Evidens"` comes before `"CHU Hospital"` (natural alphabetical order)

**Optional Third Parameter:**

1. **Datetime Format:**
   ```json
   ["date_field", "desc", "%Y-%m-%d"]
   ```
   - Provide Python strptime format for custom date parsing
   - Example formats: `"%d/%m/%Y"`, `"%Y-%m-%d %H:%M:%S"`

2. **Natural Alphanumeric Sorting:**
   ```json
   ["patient_id", "asc", "*natsort"]
   ```
   - Use `"*natsort"` for natural sorting of alphanumeric codes
   - Correctly sorts: `"ENDOBEST-003-3-BA"` < `"ENDOBEST-003-20-BA"`
   - Also handles: `"file2.txt"` < `"file10.txt"`, `"v1.9"` < `"v1.10"`
   - Perfect for patient IDs, version numbers, sequential codes

### value_replacement JSON

**Structure:**
```json
[
  {
    "type": "TYPE_NAME",
    "TYPE_SPECIFIC_FIELDS": values
  },
  ...
]
```

**Boolean Type:**
```json
{
  "type": "bool",
  "true": "Replacement for True",
  "false": "Replacement for False"
}
```

**String Type:**
```json
{
  "type": "str",
  "from": "Source string",
  "to": "Replacement string"
}
```

**Integer Type:**
```json
{
  "type": "int",
  "from": 123,
  "to": "Replacement"
}
```

**Rules:**
- Each rule must have `"type"` field
- Other fields required per type
- Evaluated in order (first match wins)
- NULL or empty array means no replacement

---

## Naming Conventions

### File & Path Naming
- **Paths:** Relative to `config/` folder
- **Separators:** Use forward slash `/` (not backslash `\`)
- **Extensions:** Must include `.xlsx`
- **Spaces:** Avoid in filenames (use underscore or camelCase)

### Column Naming
- **No spaces:** Use underscores or camelCase
- **Avoid special characters:** Letters, numbers, underscore only
- **Length:** Keep reasonable (avoid 100+ char names)
- **Consistency:** Use same names across configuration

### Field Naming
- **From Mapping:** Use exact names from Inclusions_Mapping or Organizations_Mapping
- **Case-Sensitive:** Field_Name ≠ field_name
- **Match Required:** Must exist in mapping

### Excel Named Ranges
- **Define in Excel:** Formulas → Name Manager → New
- **Naming:** Same rules as column naming
- **Scope:** Sheet-level or Workbook-level both OK
- **Used in:** `target` column of Excel_Sheets

---

## Configuration Examples

### Example 1: Simple Patient Report

**Excel_Workbooks:**
```
workbook_name   | template_path         | output_filename                 | output_exists_action
Endobest_Report | templates/Simple.xlsx | Report_{extract_date_time}.xlsx | Increment
```

**Excel_Sheets:**
```
workbook_name: Endobest_Report
sheet_name: Patients
source_type: Inclusions
target: PatientTbl
column_mapping: {"ID": "patient_id", "Name": "patient_name", "Date": "date_inclusion"}
filter_condition: {"status": "active"}
sort_keys: [["date_inclusion", "asc"]]
```

### Example 2: Multi-Sheet Report

**Excel_Workbooks:**
```
workbook_name | template_path        | output_filename                      | output_exists_action
FullReport    | templates/Multi.xlsx | {workbook_name}_{extract_month}.xlsx | Overwrite
```

**Excel_Sheets (3 rows):**
```
Row 1 (Title):
  workbook_name: FullReport
  sheet_name: Cover
  source_type: Variable
  target: TitleCell
  column_mapping: NULL
  filter_condition: NULL
  sort_keys: NULL

Row 2 (Inclusions):
  workbook_name: FullReport
  sheet_name: Inclusions
  source_type: Inclusions
  target: IncTbl
  column_mapping: {"col_id": "patient_id", "col_name": "patient_name", "col_site": "site_id"}
  filter_condition: {"status": "active"}
  sort_keys: [["date_visit", "desc"]]

Row 3 (Organizations):
  workbook_name: FullReport
  sheet_name: Summary
  source_type: Organizations
  target: OrgTbl
  column_mapping: {"Name": "org_name", "Count": "patient_count"}
  filter_condition: NULL
  sort_keys: [["org_name", "asc"]]
```

---

## Validation & Error Messages

### Configuration Errors (Startup)

**Template file missing:**
```
✗ CRITICAL: Template file missing: config/templates/Missing.xlsx
```
**Fix:** Verify file exists and path is correct

**Named range not found:**
```
✗ CRITICAL: Named range not found: 'DataTable' in sheet 'Inclusions'
```
**Fix:** Create named range in Excel or correct the name in configuration

**Column reference invalid:**
```
✗ CRITICAL: Column mapping references invalid field: 'unknown_field'
```
**Fix:** Check field name matches Inclusions_Mapping or Organizations_Mapping exactly

**JSON parse error:**
```
✗ CRITICAL: Invalid JSON in column_mapping: {col_id: "patient_id"}
```
**Fix:** Ensure all JSON fields use double quotes and valid syntax

### Runtime Errors

**No matching data:**
```
⚠ WARNING: Filter condition found no matching items for sheet 'Inclusions'
```
**Possible Causes:**
- Filter too restrictive
- Filter field doesn't exist
- No data in source

**Fix:** Review filter_condition, check data exists

**File write error:**
```
✗ ERROR: Could not write file: Permission denied
```
**Possible Causes:**
- File open in another program
- No write permissions
- Disk full

**Fix:** Close Excel, check permissions, check disk space

---

## Best Practices

### Configuration Management

1. **Backup Config**
   - Keep version history
   - Comment changes in Excel or separate document

2. **Test Changes**
   - Use `--excel_only` mode for quick testing
   - Run full process periodically to verify

3. **Document Mappings**
   - Maintain spreadsheet of field meanings
   - Update when fields change

4. **Naming Consistency**
   - Use same field names across tables
   - Use descriptive, self-documenting names

### Performance Optimization

1. **Filter Early**
   - Use filter_condition to reduce data
   - Smaller datasets = faster processing

2. **Smart Sorting**
   - Don't sort if not needed
   - Sort by indexed fields when possible

3. **Template Optimization**
   - Minimize template complexity
   - Remove unnecessary formulas

### Data Quality

1. **Validation**
   - Verify filter_condition results
   - Check sort_keys order makes sense
   - Test value_replacement transformations

2. **Documentation**
   - Document why each filter exists
   - Document expected results
   - Include contact info for questions

### Security

1. **File Permissions**
   - Restrict config file access (contains sensitive paths)
   - Backup encrypted if needed

2. **Data Privacy**
   - Excel files contain patient data
   - Handle per organization policy
   - Ensure secure storage/transmission

---

## Troubleshooting

### Configuration Issues

**"Excel config file not found"**
- Path: `config/Endobest_Dashboard_Config.xlsx`
- Check file exists in correct location

**"Required column missing"**
- Check all required columns present
- Don't delete or rename columns
- Use exact column names

**"Workbook name mismatch"**
- Excel_Sheets.workbook_name must match Excel_Workbooks.workbook_name exactly
- Check spelling and case

### Template Issues

**"Template file not found"**
- Verify file in `config/templates/` folder
- Check path relative to config (not root)
- Example correct: `templates/MyTemplate.xlsx`
- Example incorrect: `config/templates/MyTemplate.xlsx`

**"Named range not found"**
- Open template in Excel
- Formulas → Name Manager
- Verify range exists and spelling matches

**"Invalid target cell"**
- Check cell reference format (A1, B10, etc.) or range name
- Verify cell/range exists in sheet

### Data Issues

**"No data in Excel cells"**
- Check filter_condition isn't too restrictive
- Verify source data exists (run --check-only)
- Check column_mapping field names are correct

**"Column order wrong"**
- Column order determined by column_mapping object key order
- In newer Excel: right-click → "Edit in formula bar" to see order
- Reorder keys in JSON to change column order

**"Values not replaced"**
- Check value_replacement type matches actual data type
- Boolean True ≠ string "true"
- Check rule order (first match wins)

**"Dates sorting incorrectly"**
- Dates must be ISO format: YYYY-MM-DD
- Check field value format
- If text looks like date but formats as text in Excel, may sort alphabetically

---

## Advanced Configuration

### Template Variables in Variable Cells

Use variables to populate single cells:

```
target: TimestampCell
source_type: Variable

In Excel template, cell value:
  "Extracted: {extract_date_time}"

Result:
  "Extracted: 2025-01-15T14:30:45+01:00"
```

### Dynamic Filenames

Create filenames that reflect data/content:

```
output_filename: "{workbook_name}_{extract_year}_{extract_month}.xlsx"

Results in:
  "Statistics_2025_01.xlsx"
  "Endobest_Output_2025_01.xlsx"
```

### Cascading Filters & Sorts

Apply multiple rules:

```
filter_condition: {"status": "active", "center": "PARIS01", "type": "inclusion"}
sort_keys: [
  ["visit_order", "asc"],
  ["date_visit", "desc"],
  ["patient_name", "asc"]
]
```

---

**End of Configuration Guide**

For user guide, see DOCUMENTATION_98_USER_GUIDE.md
For architecture details, see DOCUMENTATION_13_EXCEL_EXPORT.md