# Endobest Field Mapping Configuration Guide

## Part 2: Field Mapping & Configuration

**Document Version:** 2.0 (Updated with new module references)
**Last Updated:** 2025-11-08
**Audience:** Developers, Business Analysts, Data Managers
**Language:** English

**Note:** Configuration file `Endobest_Dashboard_Config.xlsx` uses `Inclusions_Mapping` sheet for field definitions (see DOCUMENTATION_13_EXCEL_EXPORT.md and DOCUMENTATION_99_CONFIG_GUIDE.md for Excel export configuration)

---

## Table of Contents

1. [Overview](#overview)
2. [Technical Architecture](#technical-architecture)
3. [Field Processing Logic](#field-processing-logic)
4. [Configuration File Structure](#configuration-file-structure)
5. [Column Reference](#column-reference)
6. [Special Value Prefixes](#special-value-prefixes)
7. [Data Sources Explained](#data-sources-explained)
8. [Field Path Syntax](#field-path-syntax)
9. [Custom Functions Reference](#custom-functions-reference)
10. [Post-Processing Transformations](#post-processing-transformations)
11. [Configuration Examples](#configuration-examples)
12. [User Guide: Adding/Modifying Fields](#user-guide-adding-modifying-fields)
13. [Common Patterns & Recipes](#common-patterns--recipes)
14. [Troubleshooting](#troubleshooting)

---

## Overview

The **Field Mapping Configuration** defines which data points are extracted from multiple APIs (RC, GDD, questionnaires) and how they are transformed before export. The configuration is **100% externalized in an Excel file**, enabling non-technical users to add new fields without code modifications.

### Key Concepts

- **Field Group:** Logical container for related fields (e.g., "Patient_Identification", "Inclusion", "Endotest")
- **Field Name:** Unique identifier for the field within its group
- **Source:** Where the data comes from (questionnaire, record, inclusion, request)
- **Field Path:** JSON path to navigate nested structures
- **Transformations:** Post-processing rules (labels, templates, conditions)
- **Custom Functions:** Calculated fields with business logic

---

## Technical Architecture

### Field Extraction Pipeline (Detailed)

```
CONFIGURATION LOADING (startup):
├─ Load Endobest_Dashboard_Config.xlsx
├─ Parse Inclusions_Mapping sheet (rows 2 onwards)
├─ Validate each field configuration
├─ Parse JSON fields (field_path, value_labels, true_if_any, field_condition)
└─ Store in DASHBOARD_CONFIG array

FIELD PROCESSING (per patient):
├─ For each field in DASHBOARD_CONFIG:
│  ├─ Determine source type (questionnaire, record, inclusion, request, calculated)
│  │
│  ├─ IF source == questionnaire:
│  │  ├─ Method: Search by q_id, q_name, or q_category
│  │  ├─ Data: All questionnaires already fetched for patient
│  │  ├─ Path: Navigate to field_path within questionnaire answers
│  │  └─ Result: raw_value or "undefined"
│  │
│  ├─ IF source == record:
│  │  ├─ Data: Patient's clinical record
│  │  ├─ Path: Navigate JSON structure using field_path
│  │  └─ Result: raw_value or "undefined"
│  │
│  ├─ IF source == inclusion:
│  │  ├─ Data: Patient inclusion metadata
│  │  ├─ Path: Navigate nested inclusion structure
│  │  └─ Result: raw_value or "undefined"
│  │
│  ├─ IF source == request:
│  │  ├─ Data: Lab test request/results
│  │  ├─ Path: Navigate request JSON structure
│  │  └─ Result: raw_value or "undefined"
│  │
│  ├─ IF source == calculated:
│  │  ├─ Function: Custom business logic function
│  │  ├─ Arguments: From field_path
│  │  ├─ Access: Other fields already processed in output_inclusion
│  │  └─ Result: Computed value
│  │
│  ├─ CHECK field_condition (optional):
│  │  ├─ If condition is false → Set to "N/A"
│  │  ├─ If condition is undefined → Set to "undefined"
│  │  └─ If condition is true → Continue processing
│  │
│  ├─ APPLY post-processing transformations:
│  │  ├─ true_if_any: Convert to boolean
│  │  ├─ value_labels: Map to localized text
│  │  ├─ field_template: Apply formatting
│  │  └─ List joining: Flatten arrays
│  │
│  └─ STORE: output_inclusion[field_group][field_name] = final_value
│
└─ Result: Complete inclusion with all extended fields
```

### Questionnaire Finding Strategy

The system supports **3 methods** to locate questionnaires:

```python
def find_questionnaire(all_questionnaires, source_type, source_value):
    if source_type == "q_id":
        # Direct lookup by questionnaire ID (fastest)
        return all_questionnaires.get(source_value, {}).get("answers")

    elif source_type == "q_name":
        # Sequential search by questionnaire name
        for qcm_data in all_questionnaires.values():
            if qcm_data["questionnaire"]["name"] == source_value:
                return qcm_data.get("answers")
        return None

    elif source_type == "q_category":
        # Sequential search by questionnaire category
        for qcm_data in all_questionnaires.values():
            if qcm_data["questionnaire"]["category"] == source_value:
                return qcm_data.get("answers")
        return None

    # Unknown source_type: nothing to return
    return None
```

**Recommendation:** Use `q_id=` for best performance (direct lookup)

### Questionnaire Data Optimization

Instead of multiple filtered API calls:

```
BEFORE (slow):
  GET /api/surveys/{qcm_id_1}/answers?subject={patient_id}
  GET /api/surveys/{qcm_id_2}/answers?subject={patient_id}
  GET /api/surveys/{qcm_id_3}/answers?subject={patient_id}
  ... (N calls per patient)

AFTER (optimized - single call):
  POST /api/surveys/filter/with-answers
  payload: {"context": "clinic_research", "subject": patient_id}
  returns: [
    {"questionnaire": {id, name, category}, "answers": {...}},
    {"questionnaire": {id, name, category}, "answers": {...}},
    ...
  ]
```

All questionnaires are returned in a single call and then indexed by ID for fast lookup.

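The indexing step can be pictured with a small sketch. It assumes the response is a list shaped like the `returns:` example above; the helper name `index_questionnaires` and the sample data are hypothetical and only illustrate how a list could become the ID-keyed dictionary that `find_questionnaire` expects.

```python
def index_questionnaires(response_items):
    """Build a dict keyed by questionnaire ID from a list-shaped response.

    Hypothetical helper: the real implementation may differ.
    """
    indexed = {}
    for item in response_items:
        q_id = item.get("questionnaire", {}).get("id")
        if q_id is None:
            # Skip entries without an ID; they cannot be looked up by q_id
            continue
        indexed[q_id] = item
    return indexed


# Illustrative data only (not real API output)
sample_response = [
    {
        "questionnaire": {"id": "qcm-uuid-1", "name": "Symptom Questionnaire", "category": "Symptoms"},
        "answers": {"question_1": "answer_value"},
    },
    {
        "questionnaire": {"id": "qcm-uuid-2", "name": "Medical History", "category": "History"},
        "answers": {"has_diabetes": False},
    },
]

all_questionnaires = index_questionnaires(sample_response)
answers = all_questionnaires.get("qcm-uuid-1", {}).get("answers")
```

With this shape, a `q_id=` lookup is a single dictionary access, while `q_name=` and `q_category=` still have to scan the values.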
---

## Field Processing Logic

### Step 1: Source Type Determination

| Source Prefix | Meaning | Example | Data Location |
|---------------|---------|---------|----------------|
| `q_id=` | Questionnaire by ID | `q_id=uuid-123` | `all_questionnaires[uuid-123]["answers"]` |
| `q_name=` | Questionnaire by name | `q_name=Symptom Check` | Search by `["questionnaire"]["name"]` |
| `q_category=` | Questionnaire by category | `q_category=Symptoms` | Search by `["questionnaire"]["category"]` |
| `record` | Clinical record | `record` | `record_data["record"]` |
| `inclusion` | Inclusion metadata | `inclusion` | `inclusion_data` |
| `request` | Lab test request | `request` | `request_data` |
| (Calculated) | Custom function | N/A | Function result |

### Step 2: Raw Value Extraction

The `field_path` defines how to navigate nested JSON structures:

```python
# Simple path
field_path = ["patient", "name"]
# Equivalent to: data["patient"]["name"]

# Nested path
field_path = ["record", "clinicResearchData", 0, "data"]
# Equivalent to: data["record"]["clinicResearchData"][0]["data"]

# Wildcard path (returns array)
field_path = ["record", "clinicResearchData", "*", "test_name"]
# Returns: [test_name_1, test_name_2, test_name_3, ...]

# Deep wildcard
field_path = ["record", "*", "results", "*", "value"]
# Matches all results.*.value across all record items
```

### Step 3: Field Condition Checking (Optional)

The `field_condition` allows skipping field processing based on another field's value:

```
IF field_condition is specified:
├─ Look up condition field value in output_inclusion
├─ IF condition value is None or "undefined":
│  └─ Set final_value = "undefined" (skip further processing)
├─ IF condition value is not a boolean:
│  └─ Set final_value = "$$$$ Condition Field Error"
├─ IF condition value is False:
│  └─ Set final_value = "N/A" (field not applicable)
└─ IF condition value is True:
   └─ Continue with post-processing
```

**Example:**
```json
{
  "field_group": "Endotest",
  "field_name": "Request_Status",
  "source_id": "request",
  "field_path": ["status"],
  "field_condition": "Endotest.Request_Sent"
}
```

Meaning: Only populate "Request_Status" if "Request_Sent" is True. Otherwise set to "N/A".

### Step 4: Post-Processing Transformations

#### 4a. Array Flattening

If `raw_value` is an array → Join with `|` delimiter:
```
Input:  ["Active", "Pending", "Resolved"]
Output: "Active|Pending|Resolved"
```

#### 4b. Score Dictionary Formatting

If `raw_value` is dict with keys `['total', 'max']` → Format as string:
```
Input:  {"total": 8, "max": 10}
Output: "8/10"
```

#### 4c. true_if_any Transformation

If `true_if_any` is specified → Convert to boolean:
```
true_if_any: ["Active", "Pending"]
raw_value: "Active"
→ Does raw_value match ANY value in true_if_any list?
→ TRUE
```

#### 4d. value_labels Mapping

If `value_labels` is specified → Map value to localized text:
```json
{
  "raw_value": "active",
  "value_labels": [
    {"value": "active", "text": {"fr": "Actif", "en": "Active"}},
    {"value": "inactive", "text": {"fr": "Inactif", "en": "Inactive"}}
  ]
}
→ Output: "Actif" (French text)
```

#### 4e. field_template Formatting

If `field_template` is specified → Apply template with `$value` placeholder:
```
field_template: "Score: $value/100"
final_value: 85
→ Output: "Score: 85/100"
```

---

## Configuration File Structure

### File Location

```
Endobest_Dashboard_Config.xlsx
├─ Sheet 1: "Inclusions_Mapping" (field mapping definition)
└─ Sheet 2: "Regression_Check" (non-regression rules)
   [See DOCUMENTATION_12_QUALITY_CHECKS.md]
```

### Inclusions_Mapping Sheet Overview

```
Row 1 (Headers):
  A            B           C            D          E
  field_group  field_name  source_name  source_id  field_path

  F               G                H            I
  field_template  field_condition  true_if_any  value_labels

Row 2+: Field definitions (one per row)
```

**Color Coding** (for visual identification):
- **Yellow:** Extended fields or Calculated fields (requires special attention)
- **Blue:** Questionnaire-sourced fields (q_id, q_name, q_category)
- **Red:** Fields with errors or missing required data
- **White:** Record/Inclusion/Request fields

---

## Column Reference

### Column A: field_group

**Type:** String (required)
**Description:** Logical grouping of related fields in output JSON

**Rules:**
- Must be unique within context (same field_name can exist in different groups)
- Becomes a dictionary key in JSON: `output[field_group][field_name]`
- Controls field visibility in regression checks

**Examples:**
```
Patient_Identification  → Contains patient metadata
Inclusion               → Inclusion status and data
Endotest                → Lab test information
Custom_Data             → Default for general fields
Infos_Générales         → General information
Antécédents Médicaux    → Medical history
```

### Column B: field_name

**Type:** String (required)
**Description:** Unique field identifier within its group

**Rules:**
- Must not be empty
- Can contain letters, numbers, underscores, hyphens
- Special text in parentheses is automatically removed
  - Example: `Patient_Age (years)` → `Patient_Age`

**Excel Behavior:**
When cell contains `Patient_Age (years)`, the system parses it as:
```
field_name = "Patient_Age"  # Parenthetical text stripped
```

### Column C: source_name

**Type:** String (enum)
**Required:** Yes (unless cell contains "Not Specified")

**Valid Values:**
```
Inclusion            → Field from inclusion data
Record               → Field from clinical record
Request              → Field from lab test request
Patient / Douleurs   → Questionnaire name (implicit q_name=)
Signes et symptômes  → Questionnaire name (implicit q_name=)
Calculated           → Custom function (no direct source)
Not Specified        → Skip this row (used for spacing/comments)
```

### Column D: source_id

**Type:** String (enum with prefixes or JSON array)
**Description:** Specifies how to identify the data source

#### Format Options:

**1. Questionnaire by ID (Recommended)**
```
Syntax:  q_id=<questionnaire_id>
Example: q_id=550e8400-e29b-41d4-a716-446655440000
Speed:   Fastest (direct lookup)
```

**2. Questionnaire by Name**
```
Syntax:  q_name=<questionnaire_name>
Example: q_name=Symptom Questionnaire
Speed:   Slower (sequential search)
```

**3. Questionnaire by Category**
```
Syntax:  q_category=<questionnaire_category>
Example: q_category=Medical History
Speed:   Slower (sequential search)
```

**4. Record Source**
```
Value: record
Means: Extract from clinical record data
```

**5. Inclusion Source**
```
Value: inclusion
Means: Extract from inclusion metadata
```

**6. Request Source**
```
Value: request
Means: Extract from lab test request
```

**7. Calculated Function**
```
Syntax:  <function_name>
Example: search_in_fields_using_regex, if_then_else, extract_parentheses_content
See Section: Custom Functions Reference
```

### Column E: field_path

**Type:** JSON array (required when field is specified)
**Description:** Path to navigate nested JSON structure

#### Syntax Examples:

**Simple field:**
```json
["name"]
// Equivalent to: data["name"]
```

**Nested path:**
```json
["record", "patient", "demographics", "age"]
// Equivalent to: data["record"]["patient"]["demographics"]["age"]
```

**Array index:**
```json
["record", "clinicResearchData", 0, "test_name"]
// Equivalent to: data["record"]["clinicResearchData"][0]["test_name"]
```

**Wildcard (all elements):**
```json
["record", "clinicResearchData", "*", "test_name"]
// Returns: [test_name_1, test_name_2, test_name_3, ...]
// Result: Automatically joined with "|" in final value
```

**For Calculated Functions (arguments):**
```json
[
  "search_in_fields_using_regex",
  ".*surgery.*",
  "Previous_Surgery",
  "Recent_Surgery"
]
// First element: function name
// Rest: arguments to pass to function
```

### Column F: field_template

**Type:** String with `$value` placeholder (optional)
**Description:** Apply formatting to the final value

**Rules:**
- Only applied if final_value is not "undefined" or "N/A"
- Must contain `$value` placeholder
- Result: Template with `$value` replaced by actual value

**Examples:**
```
Template: "$value%"
Value:    85
Result:   "85%"

Template: "Score: $value/100"
Value:    42
Result:   "Score: 42/100"

Template: "Status: $value (Updated)"
Value:    "Active"
Result:   "Status: Active (Updated)"
```

### Column G: field_condition

**Type:** String (field name reference, optional)
**Description:** Conditional field inclusion based on another field's value

**Rules:**
- If specified, must reference another field name already processed
- Must evaluate to a boolean value
- Referenced as `<field_group>.<field_name>`

**Logic:**
```
IF field_condition_value == True:
    Process field normally
ELIF field_condition_value == False:
    Set final_value = "N/A"
ELSE (undefined/null/non-boolean):
    Set final_value = "undefined"
```

**Examples:**
```
field_condition: Inclusion.isPrematurelyTerminated
Meaning: Only process this field if patient is prematurely terminated

field_condition: Endotest.Request_Sent
Meaning: Only process if test request was sent
```

### Column H: true_if_any

**Type:** JSON array (optional)
**Description:** Convert to boolean if value matches ANY item in array

**Syntax:**
```json
["value1", "value2", "value3"]
```

**Logic:**
```
LOOP through true_if_any array:
    IF raw_value == any_item:
        RETURN True
RETURN False
```

**Example:**
```json
{
  "field_name": "Is_Active",
  "true_if_any": ["active", "pending", "processing"]
}

raw_value = "pending"
→ Does "pending" exist in ["active", "pending", "processing"]?
→ YES → Final value = True

raw_value = "completed"
→ Does "completed" exist in list?
→ NO → Final value = False
```

### Column I: value_labels

**Type:** JSON array of mapping objects (optional)
**Description:** Map field values to localized text labels

**Syntax:**
```json
[
  {
    "value": "raw_value_1",
    "text": {
      "fr": "Libellé Français",
      "en": "English Label"
    }
  },
  {
    "value": "raw_value_2",
    "text": {
      "fr": "Autre Libellé",
      "en": "Another Label"
    }
  }
]
```

**Logic:**
```
LOOP through value_labels array:
    IF label_map.value == raw_value:
        RETURN label_map.text.fr (French text)
IF no match found:
    RETURN "$$$$ Value Error: {raw_value}"
```

**Example:**
```json
{
  "field_name": "Status",
  "value_labels": [
    {
      "value": 1,
      "text": {"fr": "Inclus", "en": "Included"}
    },
    {
      "value": 0,
      "text": {"fr": "Pré-inclus", "en": "Pre-included"}
    }
  ]
}

raw_value = 1
→ Map to French label: "Inclus"
```

---

## Special Value Prefixes

This section documents special prefixes and keywords used in Extended Fields configuration for value resolution and field references.

### Prefix: `$` (String Literal)

**Location:** In function arguments (like `if_then_else` parameters)
**Meaning:** Marks a string value as a literal (not a field reference)
**Syntax:** `$value` (just prefix with `$`, no quotes needed)

**Without `$` prefix:**
```json
{
  "field_path": ["is_true", "Has_Consent", "YES", "NO"]
}
// "YES" is interpreted as a FIELD NAME to look up
// This will fail because no field named "YES" exists
```

**With `$` prefix (correct):**
```json
{
  "field_path": ["is_true", "Has_Consent", "$YES", "$NO"]
}
// $YES is interpreted as LITERAL STRING "YES"
// $NO is interpreted as LITERAL STRING "NO"
// Has_Consent is interpreted as FIELD NAME (no prefix)
```

**Why It Matters:**
The system needs to distinguish between:
- **Field references** (look up values): `Status`, `Is_Active`, `Patient_Id`
- **Literal values** (use as-is): `$Active`, `$N/A`, `$Ready`

---

### No Prefix: Field References

**Location:** Arguments where field names are expected
**Meaning:** Refers to a field in the current inclusion data

**Examples:**
```json
{
  "field_path": ["is_true", "Has_Consent", "$YES", "$NO"]
}
// Has_Consent ← field reference (look up this field's value)
// Status ← field reference
// Is_Active ← field reference
```

**Resolution:** The system looks up the field in the current inclusion object.

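A minimal sketch of the distinction described above, under stated assumptions: processed fields are held in a flat dictionary, a missing field resolves to `"undefined"`, and non-string arguments are passed through unchanged. The function name `resolve_argument` is hypothetical and not taken from the actual codebase.

```python
def resolve_argument(arg, inclusion_fields):
    """Resolve one function argument to a concrete value (illustrative only)."""
    if isinstance(arg, str):
        if arg.startswith("$"):
            # "$" prefix: string literal, drop the prefix and use the rest as-is
            return arg[1:]
        # No prefix: field reference, look it up among processed fields
        # (falling back to "undefined" is an assumption of this sketch)
        return inclusion_fields.get(arg, "undefined")
    # Non-string values are returned unchanged (assumption)
    return arg


fields = {"Has_Consent": True, "Status": "Active"}

literal = resolve_argument("$YES", fields)          # literal string "YES"
reference = resolve_argument("Has_Consent", fields)  # value of the field
missing = resolve_argument("YES", fields)            # no such field
```

The last call shows why the prefix matters: without `$`, `"YES"` is treated as a field name and the lookup finds nothing.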
---

### Wildcard: `*` in Field Paths

**Location:** In `field_path` column (Column E in Mapping sheet)
**Meaning:** Match all elements at this level

**Syntax:**
```json
["record", "*", "results", "*", "value"]
```

**Example 1: Single Level Wildcard**
```json
{
  "field_path": ["items", "*", "name"]
}
// Returns all "name" values from each item
// If items = [
//   {name: "Item 1", ...},
//   {name: "Item 2", ...},
//   {name: "Item 3", ...}
// ]
// Result: ["Item 1", "Item 2", "Item 3"]
// Final output: "Item 1|Item 2|Item 3" (pipe-joined)
```

**Example 2: Multiple Level Wildcard**
```json
{
  "field_path": ["record", "*", "data", "*", "test"]
}
// Matches test values at multiple nesting levels
```

**Post-Processing:**
- Arrays are automatically joined with `|` delimiter
- Scalar values are kept as-is

---

### Value Resolution in if_then_else

When using the `if_then_else` function, values are resolved based on their format:

| Format | Type | Resolution |
|--------|------|-----------|
| `true`, `false` | Boolean literal | Used directly |
| `42`, `3.14` | Numeric literal | Used directly |
| `$string` | String literal | Remove `$` prefix and use value |
| `field_name` | Field reference | Look up field value |

**Examples:**
```json
{
  "field_path": ["is_true", "Has_Consent", "$APPROVED", "$NOT_APPROVED"]
}
// Has_Consent → field reference (look it up)
// $APPROVED → string literal (use "APPROVED")
// $NOT_APPROVED → string literal (use "NOT_APPROVED")

{
  "field_path": ["==", "Status", "$Active", "Overall_Status", "$MISSING"]
}
// Status → field reference
// $Active → string literal (use "Active")
// Overall_Status → field reference
// $MISSING → string literal (use "MISSING")
```

---

## Summary Table: Special Prefixes

| Symbol | Meaning | Example |
|--------|---------|---------|
| `$value` | String literal (remove `$` prefix) | `$YES`, `$READY`, `$N/A` |
| No prefix | Field reference (look up) | `Status`, `Patient_Id` |
| `*` | Wildcard in field_path (all array elements) | `["items", "*", "name"]` |

---

## Data Sources Explained

### 1. Questionnaire Sources (q_id, q_name, q_category)

#### What Are Questionnaires?

Questionnaires are forms/surveys filled out by patients or clinicians in the Research Clinic system. Each questionnaire has:
- **ID:** Unique identifier (UUID)
- **Name:** Display name (e.g., "Symptom Assessment")
- **Category:** Logical grouping (e.g., "Medical History")
- **Answers:** Key-value pairs of responses

#### Data Structure

```json
all_questionnaires: {
  "qcm-uuid-1": {
    "questionnaire": {
      "id": "qcm-uuid-1",
      "name": "Symptom Questionnaire",
      "category": "Symptoms"
    },
    "answers": {
      "question_1": "answer_value",
      "question_2": true,
      "question_3": 42
    }
  },
  "qcm-uuid-2": {
    "questionnaire": {
      "id": "qcm-uuid-2",
      "name": "Medical History",
      "category": "History"
    },
    "answers": {
      "has_diabetes": false,
      "has_hypertension": true
    }
  }
}
```

#### Finding Questionnaires

**Option 1: By ID (Fastest)**
```json
{
  "source_id": "q_id=qcm-uuid-1",
  "field_path": ["answers", "question_1"]
}
// Direct lookup in dictionary by ID
// Performance: O(1) constant time
```

**Option 2: By Name**
```json
{
  "source_id": "q_name=Symptom Questionnaire",
  "field_path": ["answers", "question_1"]
}
// Sequential search through all questionnaires
// Performance: O(n) proportional to questionnaire count
```

**Option 3: By Category**
```json
{
  "source_id": "q_category=Symptoms",
  "field_path": ["answers", "question_1"]
}
// Sequential search for category match
// Performance: O(n)
```

**Recommendation:** Use `q_id=` for best performance. Name and category searches are slower but acceptable if IDs are not available.

### 2. Record Source (Clinical Data)

#### What Is Record Data?

The clinical record contains all medical information for a patient within the Research Clinic context:
- Protocol inclusions status
- Clinical research data (test requests, results)
- Patient demographics
- Medical history

#### Data Structure

```json
record_data: {
  "record": {
    "id": "record-uuid",
    "patientId": "patient-uuid",
    "protocol_inclusions": [
      {
        "status": "incluse",
        "blockedQcmVersions": [],
        "clinicResearchData": [
          {
            "requestMetaData": {
              "tubeId": "tube-uuid-123"
            },
            "needRcp": false
          }
        ]
      }
    ]
  }
}
```

#### Example Extraction

```json
{
  "source_id": "record",
  "field_path": ["record", "protocol_inclusions", 0, "status"]
}
// Result: "incluse"

{
  "source_id": "record",
  "field_path": ["record", "clinicResearchData", "*", "requestMetaData", "tubeId"]
}
// Result: ["tube-uuid-1", "tube-uuid-2"]
// Final: "tube-uuid-1|tube-uuid-2"
```

### 3. Inclusion Source (Inclusion Metadata)

#### What Is Inclusion Data?

Inclusion data contains metadata about the patient's inclusion in the research protocol:
- Basic patient information (name, birthday)
- Organization assignment
- Inclusion status
- Inclusion date

#### Data Structure

```json
inclusion_data: {
  "id": "patient-uuid",
  "name": "Doe, John",
  "birthday": "1975-05-15",
  "status": "incluse",
  "inclusionDate": "2024-10-15",
  "organization_id": "org-uuid-added-by-system",
  "organization_name": "Center Name-added-by-system"
}
```

#### Example Extraction

```json
{
  "source_id": "inclusion",
  "field_path": ["name"]
}
// Result: "Doe, John"

{
  "source_id": "inclusion",
  "field_path": ["status"]
}
// Result: "incluse"
```

### 4. Request Source (Lab Test Data)

#### What Is Request Data?

Request data contains information about laboratory tests ordered and their results:
- Test request status
- Diagnostic status
- Individual test results
- Result values

#### Data Structure

```json
request_data: {
  "id": "request-uuid",
  "tubeId": "tube-uuid-123",
  "status": "completed",
  "diagnostic_status": "Completed",
  "results": [
    {
      "testName": "Complete Blood Count",
      "value": "Normal",
      "unit": ""
    },
    {
      "testName": "Coelioscopie",
      "value": "Findings documented",
      "unit": ""
    }
  ]
}
```

#### Example Extraction

```json
{
  "source_id": "request",
  "field_path": ["status"]
}
// Result: "completed"

{
  "source_id": "request",
  "field_path": ["results", "*", "testName"]
}
// Result: ["Complete Blood Count", "Coelioscopie"]
// Final: "Complete Blood Count|Coelioscopie"
```

### 5. Calculated Source (Custom Functions)

#### What Are Calculated Fields?

Calculated fields derive their values from custom business logic functions, not direct data extraction. The function can access other already-processed fields and perform complex transformations.

#### Examples

```json
{
  "source_name": "Calculated",
  "source_id": "search_in_fields_using_regex",
  "field_path": [".*SURGERY.*", "Previous_Surgery", "Recent_Surgery"]
}
// Function searches multiple fields using regex

{
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": ["is_true", "Requested", "$\"YES\"", "$\"NO\""]
}
// Function applies conditional logic

{
  "source_name": "Calculated",
  "source_id": "extract_parentheses_content",
  "field_path": ["Status_Field"]
}
// Function extracts text from within parentheses
```

See **Section: Custom Functions Reference** for detailed function documentation.

### 6. Inclusion Source with Organization Enrichment (center_name)

#### What Is Organization Center Mapping?

The organization center mapping feature enriches patient inclusion data with standardized center identifiers.
When configured, the `center_name` field is automatically added to each inclusion record, allowing you to group patients by center codes.

#### Data Source: Inclusion Type

```json
{
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "source_type": "inclusion",
  "field_path": ["center_name"]
}
```

#### Fields Available from Organization Enrichment

| Field | Type | Description | Availability |
|-------|------|-------------|--------------|
| `center_name` | String | Standardized center identifier | If mapping file exists |
| `organization_name` | String | Full organization name | Always |
| `organization_id` | String | Organization UUID | Always |

#### Data Structure

```json
inclusion_data: {
  "organization_id": "org-uuid",
  "organization_name": "Hospital Cardiology Research Lab",
  "center_name": "HCR-MAIN",  // ← Added by organization mapping
  "id": "patient-uuid",
  ...
}
```

#### Example Extraction

```json
{
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "source_type": "inclusion",
  "field_path": ["center_name"]
}
// Result: "HCR-MAIN"

{
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "source_type": "inclusion",
  "field_path": ["organization_name"]
}
// Result: "Hospital Cardiology Research Lab"
```

#### Configuration Requirements

**To use this feature:**
1. Create `eb_org_center_mapping.xlsx` in script directory (see [DOCUMENTATION_10_ARCHITECTURE.md](DOCUMENTATION_10_ARCHITECTURE.md) Organization ↔ Center Mapping section)
2. Define mapping rules in the `Org_Center_Mapping` sheet
3. Add extended field with source type "inclusion" and field_path ["center_name"]

**Availability:**
- ✅ If mapping file exists and organization is mapped → `center_name` = mapped value
- ⚠️ If mapping file missing or organization not in mapping → `center_name` = organization name (fallback)

#### Example Configuration

```json
{
  "field_group": "Patient_Identification",
  "field_name": "Center_Name",
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "source_type": "inclusion",
  "field_path": ["center_name"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}
```

**Result in output:**
```json
{
  "Patient_Identification": {
    "Organisation_Name": "Hospital Cardiology Research Lab",
    "Center_Name": "HCR-MAIN",
    ...
  }
}
```

---

## Field Path Syntax

### Basic Path Navigation

#### Single-Level Access
```json
["field_name"]
// JavaScript equivalent: data.field_name
// Result: value or undefined
```

#### Multi-Level Nesting
```json
["record", "patient", "demographics", "age"]
// JavaScript: data.record.patient.demographics.age
```

#### Array Index Access
```json
["items", 0, "name"]
// JavaScript: data.items[0].name
// Accesses first element of array
```

#### Negative Index (from end)
```json
["items", -1, "name"]
// JavaScript: data.items[data.items.length - 1].name
// Accesses last element of array
```

### Wildcard Paths (Multiple Values)

#### Single Wildcard (One Level)
```json
["questionnaire", "answers", "*", "value"]
// Returns all values from each answer object
// Result: Array of values [value1, value2, value3, ...]
```

#### Multiple Wildcards (Deep)
```json
["record", "*", "data", "*", "test"]
// Matches nested wildcards at multiple levels
// Returns: All tests at matching paths
```

#### Wildcard Result Flattening
```json
path: ["items", "*", "values", "*", "score"]
items: [
  {
    "values": [
      {"score": 10},
      {"score": 20}
    ]
  },
  {
    "values": [
      {"score": 30},
      {"score": 40}
    ]
  }
]

// Without flattening: [[10, 20], [30, 40]]
// With flattening (used): [10, 20, 30, 40]
```

### Edge Cases & Behavior

#### Missing Path
```json
field_path: ["missing", "field"]
data: {}
Result: "undefined" (not null or empty string)
```

#### Null/None Values in Path
```json
field_path: ["patient", "contact", "phone"]
data: {"patient": {"contact": null}}
Result: "undefined" (stops at null)
```

#### Non-Dictionary/Non-List Element
```json
field_path: ["patient", "name", "first"]
data: {"patient": {"name": "John"}}  // "name" is string, not dict
Result: "undefined" (cannot navigate string)
```

---

## Custom Functions Reference

### Function 1: search_in_fields_using_regex

**Purpose:** Search multiple fields for regex pattern match (case-insensitive)

**Syntax:**
```json
{
  "source_name": "Calculated",
  "source_id": "search_in_fields_using_regex",
  "field_path": ["regex_pattern", "field_1", "field_2", ...]
}
```

**Parameters:**
- **regex_pattern** (string): Regular expression pattern (case-insensitive)
- **field_1, field_2, ...** (strings): Field names to search (looked up in output_inclusion)

**Logic:**
```
FOR EACH field in [field_1, field_2, ...]:
    value = get_value_from_inclusion(field_name)
    IF value is string AND value matches regex_pattern:
        RETURN True
RETURN False
```

**Return Value:**
- `True` if ANY field matches the pattern
- `False` if NO fields match
- `"undefined"` if ALL fields are undefined

**Examples:**

Example 1: Detect if any surgery field contains "surgery"
```json
{
  "field_name": "Has_Surgery_History",
  "source_id": "search_in_fields_using_regex",
  "field_path": [".*surgery.*", "Previous_Surgery", "Recent_Surgery", "Planned_Surgery"]
}

If any of these fields contains "surgery" → True
Otherwise → False
```

Example 2: Check for specific procedures
```json
{
  "field_name": "Is_Endoscopy_Planned",
  "source_id": "search_in_fields_using_regex",
  "field_path": ["endoscopy|colonoscopy", "Procedure_Type", "Procedure_Notes"]
}

Matches if "endoscopy" OR "colonoscopy" appears in either field
```

### Function 2: extract_parentheses_content

**Purpose:** Extract text within the first set of parentheses

**Syntax:**
```json
{
  "source_name": "Calculated",
  "source_id": "extract_parentheses_content",
  "field_path": ["field_name"]
}
```

**Parameters:**
- **field_name** (string): Field to extract from (looked up in output_inclusion)

**Logic:**
```
value = get_value_from_inclusion(field_name)
IF value is not defined:
    RETURN "undefined"
MATCH first occurrence of (content) pattern
IF match found:
    RETURN content
ELSE:
    RETURN "undefined"
```

**Return Value:**
- Text extracted from parentheses (e.g., "Active")
- `"undefined"` if no parentheses found or field undefined

**Examples:**

Example 1: Extract status from formatted field
```
Input:  "Patient Status (Active)"
Output: "Active"
```

Example 2: Extract category name
```
Input:  "Medical Condition (Hypertension)"
Output: "Hypertension"
```

Example 3: Nested extraction
```
Input:  "Surgery Scheduled (Appendectomy - Jan 15)"
Output: "Appendectomy - Jan 15"
```

### Function 3: append_terminated_suffix

**Purpose:** Add " - AP" suffix to status if patient prematurely terminated

**Syntax:**
```json
{
  "source_name": "Calculated",
  "source_id": "append_terminated_suffix",
  "field_path": ["status_field_name", "is_terminated_field_name"]
}
```

**Parameters:**
- **status_field_name** (string): Field containing status value
- **is_terminated_field_name** (string): Boolean field indicating termination

**Logic:**
```
status = get_value_from_inclusion(status_field_name)
is_terminated = get_value_from_inclusion(is_terminated_field_name)

IF status is undefined:
    RETURN "undefined"
IF is_terminated is TRUE:
    RETURN status + " - AP"
ELSE:
    RETURN status
```

**Return Value:**
- Status with " - AP" suffix if terminated
- Original status if not terminated
- `"undefined"` if status field undefined

**Examples:**

Example 1: Mark prematurely terminated patients
```json
{
  "field_name": "Inclusion_Status",
  "source_id": "append_terminated_suffix",
  "field_path": ["Base_Status", "isPrematurelyTerminated"]
}

If isPrematurelyTerminated = True:
  "incluse" → "incluse - AP"

If isPrematurelyTerminated = False:
  "incluse" → "incluse"
```

### Function 4: if_then_else

**Purpose:** Unified conditional logic with 8 different operators

**Syntax:**
```json
{
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": ["operator", arg1, arg2_optional, result_if_true, result_if_false]
}
```

#### Operator Reference

##### Operator 1: is_true

**Signature:** `["is_true", field_name, result_if_true, result_if_false]`

**Logic:** IF field == True THEN result_if_true ELSE result_if_false

**Example:**
```json
{
  "field_path": ["is_true", "Has_Consent", "$\"Consented\"", "$\"Not Consented\""]
}
// If Has_Consent = True → "Consented"
// If Has_Consent = False → "Not Consented"
```

##### Operator 2: is_false

**Signature:**
`["is_false", field_name, result_if_true, result_if_false]`

**Logic:** IF field == False THEN result_if_true ELSE result_if_false

**Example:**
```json
{
  "field_path": ["is_false", "Has_Exclusion", "$\"Eligible\"", "$\"Excluded\""]
}
```

##### Operator 3: is_defined

**Signature:** `["is_defined", field_name, result_if_true, result_if_false]`

**Logic:** IF field is not undefined THEN result_if_true ELSE result_if_false

**Example:**
```json
{
  "field_path": ["is_defined", "Surgery_Date", "$\"Date Available\"", "$\"No Date\""]
}
```

##### Operator 4: is_undefined

**Signature:** `["is_undefined", field_name, result_if_true, result_if_false]`

**Logic:** IF field is undefined THEN result_if_true ELSE result_if_false

**Example:**
```json
{
  "field_path": ["is_undefined", "Last_Contact", "$\"Never Contacted\"", "$\"Contacted\""]
}
```

##### Operator 5: all_true

**Signature:** `["all_true", [field_1, field_2, ...], result_if_true, result_if_false]`

**Logic:** IF all fields == True THEN result_if_true ELSE result_if_false

**Example:**
```json
{
  "field_path": ["all_true", ["Has_Consent", "Has_Results", "Is_Complete"], "$\"READY\"", "$\"INCOMPLETE\""]
}
// Returns "READY" only if ALL three fields are True
```

##### Operator 6: all_defined

**Signature:** `["all_defined", [field_1, field_2, ...], result_if_true, result_if_false]`

**Logic:** IF all fields are defined THEN result_if_true ELSE result_if_false

**Example:**
```json
{
  "field_path": ["all_defined", ["First_Name", "Last_Name", "Birth_Date"], "$\"COMPLETE\"", "$\"INCOMPLETE\""]
}
// Returns "COMPLETE" only if ALL three fields have values
```

##### Operator 7: ==

**Signature:** `["==", value1, value2, result_if_true, result_if_false]`

**Logic:** IF value1 == value2 THEN result_if_true ELSE result_if_false

**Example:**
```json
{
  "field_path": ["==", "Status", "$\"Active\"", "$\"Is Active\"", "$\"Not Active\""]
}
// If Status equals "Active" → "Is Active"
```

##### Operator 8: !=

**Signature:** `["!=", value1, value2,
result_if_true, result_if_false]`

**Logic:** IF value1 != value2 THEN result_if_true ELSE result_if_false

**Example:**
```json
{
  "field_path": ["!=", "Status", "$\"Inactive\"", "$\"Active\"", "$\"Inactive\""]
}
// If Status NOT equal to "Inactive" → "Active"
```

#### Value Resolution

The function supports multiple value types:

**Boolean Literals:**
```json
true, false
// Used directly without field lookup
```

**Numeric Literals:**
```json
42, 3.14, 0, -1
// Used directly without field lookup
```

**String Literals (Prefixed with $):**
```json
"$\"Active\"", "$\"Ready\"", "$\"N/A\""
// Remove $ prefix before using
// $ prefix signals: don't look this up as field name
```

**Field References (No Prefix):**
```json
"Status", "Is_Active", "Patient_Name"
// Looked up in output_inclusion
```

**Complex Examples:**
```json
{
  "field_path": ["==", "Status_Code", 1, "$\"Active\"", "$\"Inactive\""]
}
// Compare Status_Code field against numeric value 1

{
  "field_path": ["all_true", ["Consent_Received", "Test_Completed"], "Overall_Status", "$\"MISSING\""]
}
// If both conditions true, use Overall_Status value
// If either false, use literal "MISSING"
```

---

## Post-Processing Transformations

### Transformation Order

```
Raw Value Extraction
    ↓
Condition Check
    ↓
IF final_value is list:
    └─ Join with "|" delimiter
    ↓
IF final_value is score dict (has 'total' and 'max'):
    └─ Format as "total/max"
    ↓
IF true_if_any is specified:
    └─ Apply boolean conversion
    ↓
IF value_labels is specified:
    └─ Apply label mapping
    ↓
IF field_template is specified:
    └─ Apply formatting with $value
```

### Transformation 1: Array Flattening

**When:** Raw value is an array/list

**Action:** Join elements with `|` delimiter

**Example:**
```
Raw:    ["Active", "Pending", "Resolved"]
Output: "Active|Pending|Resolved"
```

### Transformation 2: Score Dictionary Formatting

**When:** Raw value is dict with keys ['total', 'max']

**Action:** Convert to "total/max" string format

**Example:**
```
Raw:    {"total": 8, "max": 10}
Output: "8/10"
```

### Transformation 3: true_if_any

**When:** true_if_any is specified in configuration

**Action:** Check if raw value matches ANY item in the array

**Example:**
```json
{
  "true_if_any": ["Active", "Pending", "Processing"],
  "raw_value": "Active"
}
// Result: true

{
  "true_if_any": ["Active", "Pending"],
  "raw_value": "Completed"
}
// Result: false
```

### Transformation 4: value_labels

**When:** value_labels is specified in configuration

**Action:** Map raw value to localized text

**Logic:**
```
FOR EACH label_map in value_labels:
    IF label_map.value == raw_value:
        RETURN label_map.text.fr  (French label)
IF no match:
    RETURN "$$$$ Value Error: {raw_value}"
```

**Example:**
```json
{
  "value_labels": [
    {"value": "active", "text": {"fr": "Actif", "en": "Active"}},
    {"value": "inactive", "text": {"fr": "Inactif", "en": "Inactive"}}
  ],
  "raw_value": "active"
}
// Result: "Actif"
```

### Transformation 5: field_template

**When:** field_template is specified (and value is not "undefined" or "N/A")

**Action:** Replace $value placeholder with actual value

**Example:**
```
template:  "Score: $value/100"
raw_value: 85
Result:    "Score: 85/100"

template:  "Status [$value]"
raw_value: "Active"
Result:    "Status [Active]"
```

---

## Configuration Examples

### Example 1: Simple Field Extraction

**Requirement:** Extract patient name from inclusion data

```json
{
  "field_group": "Patient_Identification",
  "field_name": "Patient_Name",
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "field_path": ["name"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}
```

**Flow:**
1. Source: inclusion data
2. Extract: data["name"]
3. Result: "Doe, John"
4. Output: {"Patient_Identification": {"Patient_Name": "Doe, John"}}

### Example 2: Questionnaire Field with Label Mapping

**Requirement:** Extract symptom severity and map to French labels

```json
{
  "field_group": "Symptoms",
  "field_name": "Severity",
  "source_name": "Symptoms (OUI/NON)",
  "source_id": "q_id=77e488a1-d3c-148af-a6bc-8fe1f55e82e4",
  "field_path": ["answers", "question5"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": [
    {"value": 1, "text": {"fr": "Léger", "en": "Mild"}},
    {"value": 2, "text": {"fr": "Modéré", "en": "Moderate"}},
    {"value": 3, "text": {"fr": "Sévère", "en": "Severe"}}
  ]
}
```

**Flow:**
1. Source: Questionnaire with ID 77e488a1-...
2. Extract: answers["question5"] → 2
3. Apply value_labels: 2 → "Modéré"
4. Output: {"Symptoms": {"Severity": "Modéré"}}

### Example 3: Conditional Field

**Requirement:** Only show request status if test was requested

```json
{
  "field_group": "Endotest",
  "field_name": "Request_Status",
  "source_name": "Request",
  "source_id": "request",
  "field_path": ["status"],
  "field_template": null,
  "field_condition": "Endotest.Request_Sent",
  "true_if_any": null,
  "value_labels": null
}
```

**Flow:**
1. Check condition: Endotest.Request_Sent
2. If False → Set to "N/A"
3. If True → Extract status from request data
4. Output: {"Endotest": {"Request_Status": "completed"}}, with "N/A" as the value when the condition is False

### Example 4: Calculated Field with if_then_else

**Requirement:** Show overall status based on inclusion and termination

```json
{
  "field_group": "Inclusion",
  "field_name": "Inclusion_Status_Complete",
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": ["is_true", "isPrematurelyTerminated", "$\"incluse - AP\"", "Inclusion_Status"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}
```

**Flow:**
1. Check: Is isPrematurelyTerminated == True?
2. If YES → Return literal "incluse - AP"
3. If NO → Return value of Inclusion_Status field
4. Output: {"Inclusion": {"Inclusion_Status_Complete": "incluse - AP"}}, or the Inclusion_Status value (e.g., "incluse") when not terminated

### Example 5: Array Field with Formatting

**Requirement:** Extract all test names and format them

```json
{
  "field_group": "Endotest",
  "field_name": "Tests_Performed",
  "source_name": "Request",
  "source_id": "request",
  "field_path": ["results", "*", "testName"],
  "field_template": "Tests: $value",
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}
```

**Flow:**
1. Source: request data
2. Extract: results[*].testName → ["Blood Test", "Imaging", "ECG"]
3. Array flattening → "Blood Test|Imaging|ECG"
4. Apply template → "Tests: Blood Test|Imaging|ECG"
5. Output: {"Endotest": {"Tests_Performed": "Tests: Blood Test|Imaging|ECG"}}

### Example 6: Complex Conditional Logic

**Requirement:** Show surgery type based on multiple conditions

```json
{
  "field_group": "Surgery",
  "field_name": "Surgery_Status",
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": [
    "all_true",
    ["Surgery_Planned", "Surgeon_Assigned", "Date_Set"],
    "$\"READY_FOR_SURGERY\"",
    "$\"INCOMPLETE_PREPARATION\""
  ],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}
```

**Flow:**
1. Check: Are ALL of [Surgery_Planned, Surgeon_Assigned, Date_Set] == True?
2. If YES → "READY_FOR_SURGERY"
3. If NO → "INCOMPLETE_PREPARATION"
4. Output: Conditional status

### Example 7: Search and Boolean Conversion

**Requirement:** Detect if patient has surgery history

```json
{
  "field_group": "Medical_History",
  "field_name": "Has_Prior_Surgery",
  "source_name": "Calculated",
  "source_id": "search_in_fields_using_regex",
  "field_path": [".*surgery|.*intervention.*", "History_Notes", "Previous_Procedures"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}
```

**Flow:**
1. Search History_Notes and Previous_Procedures
2. Pattern: ".*surgery|.*intervention.*" (case-insensitive)
3. If ANY field matches → true
4. If NO matches → false
5. Output: {"Medical_History": {"Has_Prior_Surgery": true}}

---

## User Guide: Adding/Modifying Fields

### Step 1: Identify Data Source

Determine where the data lives:

```
Patient Name      → inclusion (inclusion_data)
Symptom Severity  → questionnaire (q_id, q_name, or q_category)
Clinical Notes    → record (record_data)
Test Results      → request (request_data)
Derived Value     → calculated (custom function)
```

### Step 2: Locate Field Path

Navigate the JSON structure to find the exact path:

**For Inclusion:**
```
Open endobest_inclusions_old.json
Find a patient record
Look for field under "Patient_Identification"
Example path: ["name"]
```

**For Questionnaire:**
```
Need questionnaire ID/name/category
Look inside answers object
Example: q_id=abc-123, field_path: ["answers", "question_5"]
```

**For Record:**
```
Open a record with GET /api/records/byPatient
Navigate structure
Example: ["record", "clinicResearchData", 0, "requestMetaData"]
```

**For Request:**
```
Field from lab request response
Example: ["results", "*", "testName"]
```

### Step 3: Create Configuration Row

Open Endobest_Dashboard_Config.xlsx → Inclusions_Mapping sheet

```
Row N:
  A: field_group      (e.g., "Custom_Data")
  B: field_name       (e.g., "Patient_Status")
  C: source_name      (e.g., "Inclusion")
  D: source_id        (e.g., "inclusion")
  E: field_path       (e.g., ["status"])
  F: field_template   (optional, e.g., "Status: $value")
  G: field_condition  (optional, e.g., "Inclusion.Is_Active")
  H: true_if_any      (optional, e.g., ["active", "pending"])
  I: value_labels     (optional, complex JSON)
```

### Step 4: Validate Configuration

Run the dashboard in check-only mode:

```bash
python eb_dashboard.py --check-only
```

**Expected Output:**
```
✓ Loaded 81 fields from extended configuration.
✓ All checks passed successfully!
```

**If errors occur:**
```
Error in config file, row 42, field 'field_path': Invalid JSON format.
```

→ Fix the JSON syntax in the cell

### Step 5: Test with Full Collection

```bash
python eb_dashboard.py
```

After collection completes, verify:
1. New field appears in endobest_inclusions.json
2. Values are populated correctly
3. No data quality issues reported

### Step 6: Document the Field

Add comments in a separate notes section (if available) explaining:
- Purpose of the field
- Data source and ID
- Any special transformations
- Expected value ranges/types

---

## Common Patterns & Recipes

### Pattern 1: Boolean Flag from Multiple Conditions

**Requirement:** Create true/false flag based on multiple fields

```json
{
  "field_group": "Flags",
  "field_name": "Is_Ready_For_Export",
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": [
    "all_true",
    ["Has_Consent", "Data_Complete", "Approved"],
    true,
    false
  ]
}
```

### Pattern 2: Score Display Formatting

**Requirement:** Show quality of life score in "X/100" format

```json
{
  "field_group": "Quality_Metrics",
  "field_name": "QOL_Score_Display",
  "source_name": "q_id=...",
  "source_id": "q_id=...",
  "field_path": ["answers", "overall_score"],
  "field_template": "$value/100"
}
```

### Pattern 3: Status Translation with Suffix

**Requirement:** Show inclusion status with " - AP" for terminated patients

```json
{
  "field_group": "Inclusion",
  "field_name": "Status_With_Termination",
  "source_name": "Calculated",
  "source_id": "append_terminated_suffix",
  "field_path": ["Inclusion_Status", "isPrematurelyTerminated"]
}
```

### Pattern 4: List-to-String Conversion

**Requirement:** Show all diagnoses as pipe-separated text

```json
{
  "field_group": "Medical_Data",
  "field_name": "All_Diagnoses",
  "source_name": "Record",
  "source_id": "record",
  "field_path": ["record", "diagnoses", "*", "code"]
  // Result: "ICD-001|ICD-002|ICD-003"
}
```

### Pattern 5: Optional Field Based on Condition

**Requirement:** Only show surgery details if surgery was performed

```json
{
  "field_group": "Surgery",
  "field_name": "Surgery_Details",
  "source_name": "Record",
  "source_id": "record",
  "field_path": ["record", "surgery", "details"],
  "field_condition": "Surgery.Surgery_Performed"
  // If Surgery_Performed = false → "N/A"
}
```

### Pattern 6: Enum-to-Text Mapping

**Requirement:** Convert numeric status codes to readable text

```json
{
  "field_group": "Status",
  "field_name": "Inclusion_Status_Text",
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "field_path": ["status_code"],
  "value_labels": [
    {"value": 0, "text": {"fr": "Pré-inclus", "en": "Pre-included"}},
    {"value": 1, "text": {"fr": "Inclus", "en": "Included"}},
    {"value": 2, "text": {"fr": "Exclus", "en": "Excluded"}}
  ]
}
```

### Pattern 7: Pattern Matching in Multiple Fields

**Requirement:** Check if any medical note mentions specific condition

```json
{
  "field_group": "Medical",
  "field_name": "Mentions_Hypertension",
  "source_name": "Calculated",
  "source_id": "search_in_fields_using_regex",
  "field_path": [
    "hypertension|high.*pressure|HBP",
    "Medical_History",
    "Current_Conditions",
    "Medication_Notes"
  ]
}
```

### Pattern 8: Extracted Parenthetical Classification

**Requirement:** Extract diagnosis type from formatted text like "Disease (Type A)"

```json
{
  "field_group": "Classification",
  "field_name": "Diagnosis_Type",
  "source_name": "Calculated",
  "source_id": "extract_parentheses_content",
  "field_path": ["Formatted_Diagnosis"]
}
```

---

## Troubleshooting

### Issue 1: "Invalid JSON format" Error

**Symptom:** Configuration validation fails with JSON parsing error

**Cause:** Malformed JSON in field_path, value_labels, or field_condition

**Solution:**
1. Paste the cell contents into a JSON validator (e.g., jsonlint.com)
2. Verify all:
   - Array brackets: `[...]`
   - Object braces: `{...}`
   - String quotes: `"..."`
   - Commas between elements
3. Fix syntax errors
4. Re-run validation

**Example:**
```json
["name" "address"]   // WRONG: no comma after "name"
["name", "address"]  // CORRECT
```

### Issue 2: Field Returns "undefined"

**Symptom:** Field value always "undefined" in output

**Causes:**
1. Field path doesn't match actual data structure
2. Questionnaire ID incorrect
3. Source type mismatch

**Solution:**
1. Check if source data exists in endobest_inclusions_old.json
2. Verify JSON path by stepping through manually
3. Check questionnaire ID (use `q_id` for fastest lookup)
4. Enable debug mode to see detailed errors

```bash
python eb_dashboard.py --debug
```

### Issue 3: Empty Array Result

**Symptom:** Wildcard path returns empty array instead of values

**Causes:**
1. Array elements don't exist at specified path
2. Wildcard position incorrect in path

**Solution:**
1. Verify array exists in source data
2. Check array element structure
3. Test path manually in JSON tool

**Example:**
```json
// WRONG: No elements at this path
["record", "items", "*", "nonexistent_field"]

// CORRECT: Match actual structure
["record", "items", "*", "existing_field"]
```

### Issue 4: Calculated Field Returns Error

**Symptom:** Calculated field value starts with "$$$$ "

**Causes:**
1. Function name wrong
2. Function argument count mismatch
3. Referenced fields not yet processed

**Solution:**
1. Check function name spelling
2. Verify argument count in field_path
3. Ensure referenced fields are defined BEFORE calculated field
4. Check for circular dependencies

**Common Errors:**
```
"$$$$ Unknown Custom Function: typo_name"
→ Check function name spelling

"$$$$ Argument Error: function requires N arguments"
→ Check field_path array length

"$$$$ Value Error: undefined"
→ Referenced field is undefined; check order in config
```

### Issue 5: value_labels Not Applied

**Symptom:** Raw value shown instead of mapped label

**Causes:**
1. Raw value doesn't match any entry in value_labels
2. JSON syntax error in value_labels
3. Case sensitivity mismatch

**Solution:**
1. Check raw value type (string vs. number)
2. Verify exact match in value_labels
3. Check for case mismatches (e.g., "Active" vs "active")
4. Add wildcard entry if needed

**Example:**
```json
{
  "value_labels": [
    {"value": "active", "text": {"fr": "Actif"}},
    {"value": "inactive", "text": {"fr": "Inactif"}},
    {"value": "*", "text": {"fr": "Autre"}}  // Catch-all for unmapped values
  ]
}
```

### Issue 6: Performance Degradation After Adding Field

**Symptom:** Collection takes significantly longer after adding field

**Causes:**
1. Sequential questionnaire search (use q_id instead)
2. Expensive regex in search_in_fields_using_regex
3. Deep wildcard paths (multiple levels)

**Solution:**
1. Use `q_id=` instead of `q_name=` or `q_category=`
2. Simplify regex patterns
3. Flatten wildcard paths where possible

---

## Summary

The Field Mapping Configuration provides:

- ✅ **100% Externalized:** No code changes needed to add fields
- ✅ **Flexible Sourcing:** Support for questionnaires, records, inclusions, requests, calculated fields
- ✅ **Rich Transformations:** Labels, templates, conditions, custom functions
- ✅ **User-Friendly:** Excel-based configuration with validation
- ✅ **Performance Optimized:** Single-call questionnaire fetching, field batching

This architecture enables rapid iteration on data extraction without deploying code changes.

---

**Document End**