EB_Dashboard/DOCUMENTATION/DOCUMENTATION_11_FIELD_MAPPING.md

# Endobest Field Mapping Configuration Guide

## Part 2: Field Mapping & Configuration

**Document Version:** 2.0 (Updated with new module references)
**Last Updated:** 2025-11-08
**Audience:** Developers, Business Analysts, Data Managers
**Language:** English

**Note:** The configuration file `Endobest_Dashboard_Config.xlsx` defines fields in its `Inclusions_Mapping` sheet. For Excel export configuration, see DOCUMENTATION_13_EXCEL_EXPORT.md and DOCUMENTATION_99_CONFIG_GUIDE.md.

---

## Table of Contents

1. [Overview](#overview)
2. [Technical Architecture](#technical-architecture)
3. [Field Processing Logic](#field-processing-logic)
4. [Configuration File Structure](#configuration-file-structure)
5. [Column Reference](#column-reference)
6. [Special Value Prefixes](#special-value-prefixes)
7. [Data Sources Explained](#data-sources-explained)
8. [Field Path Syntax](#field-path-syntax)
9. [Custom Functions Reference](#custom-functions-reference)
10. [Post-Processing Transformations](#post-processing-transformations)
11. [Configuration Examples](#configuration-examples)
12. [User Guide: Adding/Modifying Fields](#user-guide-addingmodifying-fields)
13. [Common Patterns & Recipes](#common-patterns--recipes)
14. [Troubleshooting](#troubleshooting)

---

## Overview

The **Field Mapping Configuration** defines which data points are extracted from multiple APIs (RC, GDD, questionnaires) and how they are transformed before export. The configuration is **100% externalized in an Excel file**, enabling non-technical users to add new fields without code modifications.

### Key Concepts

- **Field Group:** Logical container for related fields (e.g., "Patient_Identification", "Inclusion", "Endotest")
- **Field Name:** Unique identifier for the field within its group
- **Source:** Where the data comes from (questionnaire, record, inclusion, request)
- **Field Path:** JSON path to navigate nested structures
- **Transformations:** Post-processing rules (labels, templates, conditions)
- **Custom Functions:** Calculated fields with business logic

---

## Technical Architecture

### Field Extraction Pipeline (Detailed)

```
CONFIGURATION LOADING (startup):
├─ Load Endobest_Dashboard_Config.xlsx
├─ Parse Inclusions_Mapping sheet (rows 2 onwards)
├─ Validate each field configuration
├─ Parse JSON fields (field_path, value_labels, true_if_any, field_condition)
└─ Store in DASHBOARD_CONFIG array

FIELD PROCESSING (per patient):
├─ For each field in DASHBOARD_CONFIG:
│  ├─ Determine source type (questionnaire, record, inclusion, request, calculated)
│  │
│  ├─ IF source == questionnaire:
│  │  ├─ Method: Search by q_id, q_name, or q_category
│  │  ├─ Data: All questionnaires already fetched for patient
│  │  ├─ Path: Navigate to field_path within questionnaire answers
│  │  └─ Result: raw_value or "undefined"
│  │
│  ├─ IF source == record:
│  │  ├─ Data: Patient's clinical record
│  │  ├─ Path: Navigate JSON structure using field_path
│  │  └─ Result: raw_value or "undefined"
│  │
│  ├─ IF source == inclusion:
│  │  ├─ Data: Patient inclusion metadata
│  │  ├─ Path: Navigate nested inclusion structure
│  │  └─ Result: raw_value or "undefined"
│  │
│  ├─ IF source == request:
│  │  ├─ Data: Lab test request/results
│  │  ├─ Path: Navigate request JSON structure
│  │  └─ Result: raw_value or "undefined"
│  │
│  ├─ IF source == calculated:
│  │  ├─ Function: Custom business logic function
│  │  ├─ Arguments: From field_path
│  │  ├─ Access: Other fields already processed in output_inclusion
│  │  └─ Result: Computed value
│  │
│  ├─ CHECK field_condition (optional):
│  │  ├─ If condition is false → Set to "N/A"
│  │  ├─ If condition is undefined → Set to "undefined"
│  │  └─ If condition is true → Continue processing
│  │
│  ├─ APPLY post-processing transformations:
│  │  ├─ true_if_any: Convert to boolean
│  │  ├─ value_labels: Map to localized text
│  │  ├─ field_template: Apply formatting
│  │  └─ List joining: Flatten arrays
│  │
│  └─ STORE: output_inclusion[field_group][field_name] = final_value
│
└─ Result: Complete inclusion with all extended fields
```

### Questionnaire Finding Strategy

The system supports **3 methods** to locate questionnaires:

```python
def find_questionnaire(all_questionnaires, source_type, source_value):
    """Return the answers of the first matching questionnaire, or None."""
    if source_type == "q_id":
        # Direct lookup by questionnaire ID (fastest)
        return all_questionnaires.get(source_value, {}).get("answers")

    elif source_type == "q_name":
        # Sequential search by questionnaire name
        for qcm_data in all_questionnaires.values():
            if qcm_data.get("questionnaire", {}).get("name") == source_value:
                return qcm_data.get("answers")
        return None

    elif source_type == "q_category":
        # Sequential search by questionnaire category
        for qcm_data in all_questionnaires.values():
            if qcm_data.get("questionnaire", {}).get("category") == source_value:
                return qcm_data.get("answers")
        return None

    # Unknown source_type: nothing to search
    return None
```

**Recommendation:** Use `q_id=` for best performance (direct lookup)

### Questionnaire Data Optimization

Instead of multiple filtered API calls:
```
BEFORE (slow):
GET /api/surveys/{qcm_id_1}/answers?subject={patient_id}
GET /api/surveys/{qcm_id_2}/answers?subject={patient_id}
GET /api/surveys/{qcm_id_3}/answers?subject={patient_id}
... (N calls per patient)

AFTER (optimized - single call):
POST /api/surveys/filter/with-answers
  payload: {"context": "clinic_research", "subject": patient_id}
  returns: [
    {"questionnaire": {id, name, category}, "answers": {...}},
    {"questionnaire": {id, name, category}, "answers": {...}},
    ...
  ]
```

All questionnaires are returned as a list in a single call; the system then indexes them by questionnaire ID for fast lookup.
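
The indexing step can be sketched as follows. This is a minimal illustration, assuming the response has the list shape shown above; `index_questionnaires` is a hypothetical helper name, not necessarily the function used in the codebase.

```python
def index_questionnaires(response_items):
    """Index the list returned by the single call by questionnaire ID.

    Hypothetical helper: assumes each item looks like
    {"questionnaire": {"id": ..., "name": ..., "category": ...}, "answers": {...}}.
    """
    indexed = {}
    for item in response_items:
        questionnaire = item.get("questionnaire") or {}
        q_id = questionnaire.get("id")
        if q_id is not None:  # skip malformed entries that carry no ID
            indexed[q_id] = item
    return indexed


# Illustrative data only
response = [
    {"questionnaire": {"id": "qcm-1", "name": "Symptoms", "category": "Clinical"},
     "answers": {"pain_level": 7}},
    {"questionnaire": {"id": "qcm-2", "name": "History", "category": "Medical"},
     "answers": {"has_diabetes": False}},
]
all_questionnaires = index_questionnaires(response)
```

The resulting dictionary has the shape that `find_questionnaire` expects: a `q_id=` lookup becomes a single dictionary access, while name and category lookups iterate over `all_questionnaires.values()`.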

---

## Field Processing Logic

### Step 1: Source Type Determination

| Source Prefix | Meaning | Example | Data Location |
|---------------|---------|---------|----------------|
| `q_id=` | Questionnaire by ID | `q_id=uuid-123` | `all_questionnaires[uuid-123]["answers"]` |
| `q_name=` | Questionnaire by name | `q_name=Symptom Check` | Search by `["questionnaire"]["name"]` |
| `q_category=` | Questionnaire by category | `q_category=Symptoms` | Search by `["questionnaire"]["category"]` |
| `record` | Clinical record | `record` | `record_data["record"]` |
| `inclusion` | Inclusion metadata | `inclusion` | `inclusion_data` |
| `request` | Lab test request | `request` | `request_data` |
| (Calculated) | Custom function | N/A | Function result |

### Step 2: Raw Value Extraction

The `field_path` defines how to navigate nested JSON structures:

```python
# Simple path
field_path = ["patient", "name"]
# Equivalent to: data["patient"]["name"]

# Nested path
field_path = ["record", "clinicResearchData", 0, "data"]
# Equivalent to: data["record"]["clinicResearchData"][0]["data"]

# Wildcard path (returns array)
field_path = ["record", "clinicResearchData", "*", "test_name"]
# Returns: [test_name_1, test_name_2, test_name_3, ...]

# Deep wildcard
field_path = ["record", "*", "results", "*", "value"]
# Matches all results.*.value across all record items
```

### Step 3: Field Condition Checking (Optional)

The `field_condition` allows skipping field processing based on another field's value:

```
IF field_condition is specified:
  ├─ Look up condition field value in output_inclusion
  ├─ IF condition value is None or "undefined":
  │  └─ Set final_value = "undefined" (skip further processing)
  ├─ IF condition value is not a boolean:
  │  └─ Set final_value = "$$$$ Condition Field Error"
  ├─ IF condition value is False:
  │  └─ Set final_value = "N/A" (field not applicable)
  └─ IF condition value is True:
     └─ Continue with post-processing
```

**Example:**
```json
{
  "field_group": "Endotest",
  "field_name": "Request_Status",
  "source_id": "request",
  "field_path": ["status"],
  "field_condition": "Endotest.Request_Sent"
}
```

Meaning: Only populate "Request_Status" if "Request_Sent" is True. Otherwise set to "N/A".
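
The decision tree above could be expressed roughly as follows. This is a sketch under stated assumptions: `check_field_condition` is a hypothetical name, the condition is written as `<field_group>.<field_name>`, and `output_inclusion` is a dictionary keyed by group and then by field name.

```python
UNDEFINED = "undefined"
NOT_APPLICABLE = "N/A"
CONDITION_ERROR = "$$$$ Condition Field Error"


def check_field_condition(output_inclusion, field_condition):
    """Return None when processing should continue, else the final value to store.

    Hypothetical helper: the real implementation may resolve fields differently.
    """
    if not field_condition:
        return None  # no condition configured for this field
    group, _, name = field_condition.partition(".")
    value = output_inclusion.get(group, {}).get(name)
    if value is None or value == UNDEFINED:
        return UNDEFINED
    if not isinstance(value, bool):
        return CONDITION_ERROR
    # True lets processing continue, False marks the field as not applicable
    return None if value else NOT_APPLICABLE


inclusion = {"Endotest": {"Request_Sent": False}}
status_override = check_field_condition(inclusion, "Endotest.Request_Sent")
```

Here `status_override` is `"N/A"`, so `Request_Status` would be stored as not applicable without reading the request data.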

### Step 4: Post-Processing Transformations

#### 4a. Array Flattening
If `raw_value` is an array → Join with `|` delimiter:
```
Input: ["Active", "Pending", "Resolved"]
Output: "Active|Pending|Resolved"
```

#### 4b. Score Dictionary Formatting
If `raw_value` is dict with keys `['total', 'max']` → Format as string:
```
Input: {"total": 8, "max": 10}
Output: "8/10"
```
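
Steps 4a and 4b can be illustrated with a short sketch. `format_raw_value` is a hypothetical helper; it assumes the score rule applies only when the dictionary has exactly the keys `total` and `max`, which the text above suggests but does not state outright.

```python
def format_raw_value(raw_value):
    """Apply the array-flattening and score-formatting rules (Steps 4a and 4b)."""
    if isinstance(raw_value, list):
        # 4a: join array items with the pipe delimiter
        return "|".join(str(item) for item in raw_value)
    if isinstance(raw_value, dict) and set(raw_value) == {"total", "max"}:
        # 4b: assumption, only dicts with exactly these two keys are scores
        return f"{raw_value['total']}/{raw_value['max']}"
    return raw_value  # scalars are kept as-is


joined = format_raw_value(["Active", "Pending", "Resolved"])
score = format_raw_value({"total": 8, "max": 10})
```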

#### 4c. true_if_any Transformation
If `true_if_any` is specified → Convert to boolean:
```
true_if_any: ["Active", "Pending"]
raw_value: "Active"

→ Does raw_value match ANY value in true_if_any list?
→ TRUE
```

#### 4d. value_labels Mapping
If `value_labels` is specified → Map value to localized text:
```json
{
  "raw_value": "active",
  "value_labels": [
    {"value": "active", "text": {"fr": "Actif", "en": "Active"}},
    {"value": "inactive", "text": {"fr": "Inactif", "en": "Inactive"}}
  ]
}

→ Output: "Actif" (French text)
```

#### 4e. field_template Formatting
If `field_template` is specified → Apply template with `$value` placeholder:
```
field_template: "Score: $value/100"
final_value: 85

→ Output: "Score: 85/100"
```

---

## Configuration File Structure

### File Location
```
Endobest_Dashboard_Config.xlsx
├─ Sheet 1: "Inclusions_Mapping" (field mapping definition)
└─ Sheet 2: "Regression_Check" (non-regression rules)
             [See DOCUMENTATION_12_QUALITY_CHECKS.md]
```

### Inclusions_Mapping Sheet Overview

```
Row 1 (Headers):
A                 B              C                D           E
field_group      field_name     source_name      source_id   field_path
F                 G              H                I
field_template   field_condition  true_if_any     value_labels

Row 2+: Field definitions (one per row)
```

**Color Coding** (for visual identification):
- **Yellow:** Extended fields or Calculated fields (require special attention)
- **Blue:** Questionnaire-sourced fields (q_id, q_name, q_category)
- **Red:** Fields with errors or missing required data
- **White:** Record/Inclusion/Request fields

---

## Column Reference

### Column A: field_group
**Type:** String (required)
**Description:** Logical grouping of related fields in output JSON
**Rules:**
- Acts as a namespace for field names: a `field_name` must be unique within its group, but the same `field_name` can exist in different groups
- Becomes a dictionary key in JSON: `output[field_group][field_name]`
- Controls field visibility in regression checks

**Examples:**
```
Patient_Identification    → Contains patient metadata
Inclusion                 → Inclusion status and data
Endotest                  → Lab test information
Custom_Data              → Default for general fields
Infos_Générales          → General information
Antécédents Médicaux     → Medical history
```

### Column B: field_name
**Type:** String (required)
**Description:** Unique field identifier within its group
**Rules:**
- Must not be empty
- Can contain letters, numbers, underscores, hyphens
- Special text in parentheses is automatically removed
  - Example: `Patient_Age (years)` → `Patient_Age`

**Excel Behavior:** When cell contains `Patient_Age (years)`, the system parses it as:
```
field_name = "Patient_Age"  # Parenthetical text stripped
```
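
One way to strip the parenthetical text is a regular expression, sketched below. `clean_field_name` is a hypothetical helper, and the exact pattern used by the system is an assumption.

```python
import re


def clean_field_name(cell_value):
    """Remove any "(...)" annotation from a field_name cell.

    Hypothetical helper: assumes annotations never contain nested parentheses.
    """
    return re.sub(r"\s*\([^)]*\)", "", cell_value).strip()


name = clean_field_name("Patient_Age (years)")
```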

### Column C: source_name
**Type:** String (enum)
**Required:** Yes (unless cell contains "Not Specified")
**Valid Values:**
```
Inclusion               → Field from inclusion data
Record                  → Field from clinical record
Request                 → Field from lab test request
Patient / Douleurs      → Questionnaire name (implicit q_name=)
Signes et symptômes    → Questionnaire name (implicit q_name=)
Calculated              → Custom function (no direct source)
Not Specified           → Skip this row (used for spacing/comments)
```

### Column D: source_id
**Type:** String (enum with prefixes or JSON array)
**Description:** Specifies how to identify the data source

#### Format Options:

**1. Questionnaire by ID (Recommended)**
```
Syntax: q_id=<uuid>
Example: q_id=550e8400-e29b-41d4-a716-446655440000
Speed: Fastest (direct lookup)
```

**2. Questionnaire by Name**
```
Syntax: q_name=<name>
Example: q_name=Symptom Questionnaire
Speed: Slower (sequential search)
```

**3. Questionnaire by Category**
```
Syntax: q_category=<category>
Example: q_category=Medical History
Speed: Slower (sequential search)
```

**4. Record Source**
```
Value: record
Means: Extract from clinical record data
```

**5. Inclusion Source**
```
Value: inclusion
Means: Extract from inclusion metadata
```

**6. Request Source**
```
Value: request
Means: Extract from lab test request
```

**7. Calculated Function**
```
Syntax: <function_name>
Example: search_in_fields_using_regex, if_then_else, extract_parentheses_content
See Section: Custom Functions Reference
```

### Column E: field_path
**Type:** JSON array (required when field is specified)
**Description:** Path to navigate nested JSON structure

#### Syntax Examples:

**Simple field:**
```json
["name"]
// Equivalent to: data["name"]
```

**Nested path:**
```json
["record", "patient", "demographics", "age"]
// Equivalent to: data["record"]["patient"]["demographics"]["age"]
```

**Array index:**
```json
["record", "clinicResearchData", 0, "test_name"]
// Equivalent to: data["record"]["clinicResearchData"][0]["test_name"]
```

**Wildcard (all elements):**
```json
["record", "clinicResearchData", "*", "test_name"]
// Returns: [test_name_1, test_name_2, test_name_3, ...]
// Result: Automatically joined with "|" in final value
```

**For Calculated Functions (arguments):**
```json
[
  ".*surgery.*",
  "Previous_Surgery",
  "Recent_Surgery"
]
// The function name goes in source_id (Column D), e.g. search_in_fields_using_regex
// field_path holds only the arguments passed to that function
```

### Column F: field_template
**Type:** String with `$value` placeholder (optional)
**Description:** Apply formatting to the final value
**Rules:**
- Only applied if final_value is not "undefined" or "N/A"
- Must contain `$value` placeholder
- Result: Template with `$value` replaced by actual value

**Examples:**
```
Template: "$value%"
Value: 85
Result: "85%"

Template: "Score: $value/100"
Value: 42
Result: "Score: 42/100"

Template: "Status: $value (Updated)"
Value: "Active"
Result: "Status: Active (Updated)"
```
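
The template rules amount to a guarded string substitution. The sketch below assumes plain text replacement of `$value`; `apply_field_template` is a hypothetical name.

```python
def apply_field_template(final_value, field_template):
    """Replace the $value placeholder unless the value is "undefined" or "N/A"."""
    if not field_template or final_value in ("undefined", "N/A"):
        return final_value  # no template, or nothing meaningful to format
    return field_template.replace("$value", str(final_value))


percent = apply_field_template(85, "$value%")
labelled = apply_field_template("Active", "Status: $value (Updated)")
```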

### Column G: field_condition
**Type:** String (field name reference, optional)
**Description:** Conditional field inclusion based on another field's value
**Rules:**
- If specified, must reference another field name already processed
- Must evaluate to a boolean value
- Referenced as `<field_group>.<field_name>`

**Logic:**
```
IF field_condition_value == True:
  Process field normally
ELIF field_condition_value == False:
  Set final_value = "N/A"
ELIF field_condition_value is undefined/null:
  Set final_value = "undefined"
ELSE (non-boolean value):
  Set final_value = "$$$$ Condition Field Error"
```

**Examples:**
```
field_condition: Inclusion.isPrematurelyTerminated
Meaning: Only process this field if patient is prematurely terminated

field_condition: Endotest.Request_Sent
Meaning: Only process if test request was sent
```

### Column H: true_if_any
**Type:** JSON array (optional)
**Description:** Convert to boolean if value matches ANY item in array

**Syntax:**
```json
["value1", "value2", "value3"]
```

**Logic:**
```
LOOP through true_if_any array:
  IF raw_value == any_item:
    RETURN True

RETURN False
```

**Example:**
```json
{
  "field_name": "Is_Active",
  "true_if_any": ["active", "pending", "processing"]
}

raw_value = "pending"
→ Does "pending" exist in ["active", "pending", "processing"]?
→ YES → Final value = True

raw_value = "completed"
→ Does "completed" exist in list?
→ NO → Final value = False
```
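
In Python the loop above reduces to a membership test. This is a minimal sketch for scalar raw values; `apply_true_if_any` is a hypothetical name, and how the system treats array or `"undefined"` raw values is not covered here.

```python
def apply_true_if_any(raw_value, true_if_any):
    """Return True when raw_value equals any item of the true_if_any list."""
    return any(raw_value == item for item in true_if_any)


is_active = apply_true_if_any("pending", ["active", "pending", "processing"])
is_done = apply_true_if_any("completed", ["active", "pending", "processing"])
```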

### Column I: value_labels
**Type:** JSON array of mapping objects (optional)
**Description:** Map field values to localized text labels

**Syntax:**
```json
[
  {
    "value": "raw_value_1",
    "text": {
      "fr": "Libellé Français",
      "en": "English Label"
    }
  },
  {
    "value": "raw_value_2",
    "text": {
      "fr": "Autre Libellé",
      "en": "Another Label"
    }
  }
]
```

**Logic:**
```
LOOP through value_labels array:
  IF label_map.value == raw_value:
    RETURN label_map.text.fr  (French text)

IF no match found:
  RETURN "$$$$ Value Error: {raw_value}"
```

**Example:**
```json
{
  "field_name": "Status",
  "value_labels": [
    {
      "value": 1,
      "text": {"fr": "Inclus", "en": "Included"}
    },
    {
      "value": 0,
      "text": {"fr": "Pré-inclus", "en": "Pre-included"}
    }
  ]
}

raw_value = 1
→ Map to French label: "Inclus"
```
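
The lookup logic can be sketched as below. `apply_value_labels` is a hypothetical helper; the `lang` parameter is an assumption added for illustration, since the documented behavior always returns the French text.

```python
def apply_value_labels(raw_value, value_labels, lang="fr"):
    """Map raw_value to its label text, or return the documented error marker."""
    for label_map in value_labels:
        if label_map["value"] == raw_value:
            return label_map["text"][lang]
    return f"$$$$ Value Error: {raw_value}"


status_labels = [
    {"value": 1, "text": {"fr": "Inclus", "en": "Included"}},
    {"value": 0, "text": {"fr": "Pré-inclus", "en": "Pre-included"}},
]
label = apply_value_labels(1, status_labels)
unknown = apply_value_labels(5, status_labels)
```

An unmapped value produces the `$$$$ Value Error` marker rather than failing silently, which makes missing labels easy to spot in the export.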

---

## Special Value Prefixes

This section documents special prefixes and keywords used in Extended Fields configuration for value resolution and field references.

### Prefix: `$` (String Literal)

**Location:** In function arguments (like `if_then_else` parameters)

**Meaning:** Marks a string value as a literal (not a field reference)

**Syntax:** `$value` (just prefix with `$`, no quotes needed)

**Without `$` prefix:**
```json
{
  "field_path": ["is_true", "Has_Consent", "YES", "NO"]
}
// "YES" is interpreted as a FIELD NAME to look up
// This will fail because no field named "YES" exists
```

**With `$` prefix (correct):**
```json
{
  "field_path": ["is_true", "Has_Consent", "$YES", "$NO"]
}
// $YES is interpreted as LITERAL STRING "YES"
// $NO is interpreted as LITERAL STRING "NO"
// Has_Consent is interpreted as FIELD NAME (no prefix)
```

**Why It Matters:** The system needs to distinguish between:
- **Field references** (look up values): `Status`, `Is_Active`, `Patient_Id`
- **Literal values** (use as-is): `$Active`, `$N/A`, `$Ready`

---

### No Prefix: Field References

**Location:** Arguments where field names are expected

**Meaning:** Refers to a field in the current inclusion data

**Examples:**
```json
{
  "field_path": ["is_true", "Has_Consent", "$YES", "$NO"]
}
// Has_Consent ← field reference (look up this field's value)
// Any other unprefixed name (e.g. Status, Is_Active) is resolved the same way
```

**Resolution:** The system looks up the field in the current inclusion object.

---

### Wildcard: `*` in Field Paths

**Location:** In `field_path` column (Column E in Mapping sheet)

**Meaning:** Match all elements at this level

**Syntax:**
```json
["record", "*", "results", "*", "value"]
```

**Example 1: Single Level Wildcard**
```json
{
  "field_path": ["items", "*", "name"]
}
// Returns all "name" values from each item
// If items = [
//   {name: "Item 1", ...},
//   {name: "Item 2", ...},
//   {name: "Item 3", ...}
// ]
// Result: ["Item 1", "Item 2", "Item 3"]
// Final output: "Item 1|Item 2|Item 3" (pipe-joined)
```

**Example 2: Multiple Level Wildcard**
```json
{
  "field_path": ["record", "*", "data", "*", "test"]
}
// Matches test values at multiple nesting levels
```

**Post-Processing:**
- Arrays are automatically joined with `|` delimiter
- Scalar values are kept as-is

---

### Value Resolution in if_then_else

When using the `if_then_else` function, values are resolved based on their format:

| Format | Type | Resolution |
|--------|------|-----------|
| `true`, `false` | Boolean literal | Used directly |
| `42`, `3.14` | Numeric literal | Used directly |
| `$string` | String literal | Remove `$` prefix and use value |
| `field_name` | Field reference | Look up field value |

**Examples:**
```json
{
  "field_path": ["is_true", "Has_Consent", "$APPROVED", "$NOT_APPROVED"]
}
// Has_Consent → field reference (look it up)
// $APPROVED → string literal (use "APPROVED")
// $NOT_APPROVED → string literal (use "NOT_APPROVED")

{
  "field_path": ["==", "Status", "$Active", "Overall_Status", "$MISSING"]
}
// Status → field reference
// $Active → string literal (use "Active")
// Overall_Status → field reference
// $MISSING → string literal (use "MISSING")
```
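
The resolution table can be read as a small dispatch on the argument's type and prefix. The sketch below is illustrative: `resolve_argument` is a hypothetical name, and field lookup is simplified to a flat dictionary rather than the real inclusion object.

```python
def resolve_argument(arg, fields):
    """Resolve one if_then_else argument according to the table above.

    Hypothetical helper: `fields` stands in for the current inclusion data.
    """
    if isinstance(arg, (bool, int, float)):
        return arg  # boolean and numeric literals are used directly
    if isinstance(arg, str) and arg.startswith("$"):
        return arg[1:]  # string literal: drop the "$" prefix
    return fields.get(arg, "undefined")  # field reference


fields = {"Status": "Active", "Has_Consent": True}
literal = resolve_argument("$APPROVED", fields)
looked_up = resolve_argument("Status", fields)
```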

---

### Summary Table: Special Prefixes

| Symbol | Meaning | Example |
|--------|---------|---------|
| `$value` | String literal (remove `$` prefix) | `$YES`, `$READY`, `$N/A` |
| No prefix | Field reference (look up) | `Status`, `Patient_Id` |
| `*` | Wildcard in field_path (all array elements) | `["items", "*", "name"]` |

---

## Data Sources Explained

### 1. Questionnaire Sources (q_id, q_name, q_category)

#### What Are Questionnaires?
Questionnaires are forms/surveys filled out by patients or clinicians in the Research Clinic system. Each questionnaire has:
- **ID:** Unique identifier (UUID)
- **Name:** Display name (e.g., "Symptom Assessment")
- **Category:** Logical grouping (e.g., "Medical History")
- **Answers:** Key-value pairs of responses

#### Data Structure
```json
all_questionnaires: {
  "qcm-uuid-1": {
    "questionnaire": {
      "id": "qcm-uuid-1",
      "name": "Symptom Questionnaire",
      "category": "Symptoms"
    },
    "answers": {
      "question_1": "answer_value",
      "question_2": true,
      "question_3": 42
    }
  },
  "qcm-uuid-2": {
    "questionnaire": {
      "id": "qcm-uuid-2",
      "name": "Medical History",
      "category": "History"
    },
    "answers": {
      "has_diabetes": false,
      "has_hypertension": true
    }
  }
}
```

#### Finding Questionnaires

**Option 1: By ID (Fastest)**
```json
{
  "source_id": "q_id=qcm-uuid-1",
  "field_path": ["question_1"]
}
// Direct lookup in dictionary by ID
// field_path is resolved inside the questionnaire's "answers"
// Performance: O(1) constant time
```

**Option 2: By Name**
```json
{
  "source_id": "q_name=Symptom Questionnaire",
  "field_path": ["question_1"]
}
// Sequential search through all questionnaires
// Performance: O(n) proportional to questionnaire count
```

**Option 3: By Category**
```json
{
  "source_id": "q_category=Symptoms",
  "field_path": ["question_1"]
}
// Sequential search for category match
// Performance: O(n)
```

**Recommendation:** Use `q_id=` for best performance. Name and category searches are slower but acceptable if IDs are not available.

### 2. Record Source (Clinical Data)

#### What Is Record Data?
The clinical record contains all medical information for a patient within the Research Clinic context:
- Protocol inclusion status
- Clinical research data (test requests, results)
- Patient demographics
- Medical history

#### Data Structure
```json
record_data: {
  "record": {
    "id": "record-uuid",
    "patientId": "patient-uuid",
    "protocol_inclusions": [
      {
        "status": "incluse",
        "blockedQcmVersions": [],
        "clinicResearchData": [
          {
            "requestMetaData": {
              "tubeId": "tube-uuid-123"
            },
            "needRcp": false
          }
        ]
      }
    ]
  }
}
```

#### Example Extraction
```json
{
  "source_id": "record",
  "field_path": ["record", "protocol_inclusions", 0, "status"]
}
// Result: "incluse"

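{
  "source_id": "record",
  "field_path": ["record", "protocol_inclusions", 0, "clinicResearchData", "*", "requestMetaData", "tubeId"]
}
// Result for the structure above: ["tube-uuid-123"]
// With several clinicResearchData entries, the values are pipe-joined, e.g. "tube-uuid-1|tube-uuid-2"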
```

### 3. Inclusion Source (Inclusion Metadata)

#### What Is Inclusion Data?
Inclusion data contains metadata about the patient's inclusion in the research protocol:
- Basic patient information (name, birthday)
- Organization assignment
- Inclusion status
- Inclusion date

#### Data Structure
```json
inclusion_data: {
  "id": "patient-uuid",
  "name": "Doe, John",
  "birthday": "1975-05-15",
  "status": "incluse",
  "inclusionDate": "2024-10-15",
  "organization_id": "org-uuid-added-by-system",
  "organization_name": "Center Name-added-by-system"
}
```

#### Example Extraction
```json
{
  "source_id": "inclusion",
  "field_path": ["name"]
}
// Result: "Doe, John"

{
  "source_id": "inclusion",
  "field_path": ["status"]
}
// Result: "incluse"
```

### 4. Request Source (Lab Test Data)

#### What Is Request Data?
Request data contains information about laboratory tests ordered and their results:
- Test request status
- Diagnostic status
- Individual test results
- Result values

#### Data Structure
```json
request_data: {
  "id": "request-uuid",
  "tubeId": "tube-uuid-123",
  "status": "completed",
  "diagnostic_status": "Completed",
  "results": [
    {
      "testName": "Complete Blood Count",
      "value": "Normal",
      "unit": ""
    },
    {
      "testName": "Coelioscopie",
      "value": "Findings documented",
      "unit": ""
    }
  ]
}
```

#### Example Extraction
```json
{
  "source_id": "request",
  "field_path": ["status"]
}
// Result: "completed"

{
  "source_id": "request",
  "field_path": ["results", "*", "testName"]
}
// Result: ["Complete Blood Count", "Coelioscopie"]
// Final: "Complete Blood Count|Coelioscopie"
```

### 5. Calculated Source (Custom Functions)

#### What Are Calculated Fields?
Calculated fields derive their values from custom business logic functions, not direct data extraction. The function can access other already-processed fields and perform complex transformations.

#### Examples
```json
{
  "source_name": "Calculated",
  "source_id": "search_in_fields_using_regex",
  "field_path": [".*SURGERY.*", "Previous_Surgery", "Recent_Surgery"]
}
// Function searches multiple fields using regex

{
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": ["is_true", "Requested", "$YES", "$NO"]
}
// Function applies conditional logic ($YES and $NO are string literals)

{
  "source_name": "Calculated",
  "source_id": "extract_parentheses_content",
  "field_path": ["Status_Field"]
}
// Function extracts text from within parentheses
```

See **Section: Custom Functions Reference** for detailed function documentation.

### 6. Inclusion Source with Organization Enrichment (center_name)

#### What Is Organization Center Mapping?

The organization center mapping feature enriches patient inclusion data with standardized center identifiers. When configured, the `center_name` field is automatically added to each inclusion record, allowing you to group patients by center codes.

#### Data Source: Inclusion Type

```json
{
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "source_type": "inclusion",
  "field_path": ["center_name"]
}
```

#### Fields Available from Organization Enrichment

| Field | Type | Description | Availability |
|-------|------|-------------|--------------|
| `center_name` | String | Standardized center identifier | Mapped value if the organization is in the mapping file; otherwise falls back to the organization name |
| `organization_name` | String | Full organization name | Always |
| `organization_id` | String | Organization UUID | Always |

#### Data Structure

```json
inclusion_data: {
  "organization_id": "org-uuid",
  "organization_name": "Hospital Cardiology Research Lab",
  "center_name": "HCR-MAIN",  // ← Added by organization mapping
  "id": "patient-uuid",
  ...
}
```

#### Example Extraction

```json
{
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "source_type": "inclusion",
  "field_path": ["center_name"]
}
// Result: "HCR-MAIN"

{
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "source_type": "inclusion",
  "field_path": ["organization_name"]
}
// Result: "Hospital Cardiology Research Lab"
```

#### Configuration Requirements

**To use this feature:**

1. Create `eb_org_center_mapping.xlsx` in the script directory (see the Organization ↔ Center Mapping section of [DOCUMENTATION_10_ARCHITECTURE.md](DOCUMENTATION_10_ARCHITECTURE.md))
2. Define mapping rules in the `Org_Center_Mapping` sheet
3. Add an extended field with source type `inclusion` and field_path `["center_name"]`

**Availability:**
- ✅ If mapping file exists and organization is mapped → `center_name` = mapped value
- ⚠️ If mapping file missing or organization not in mapping → `center_name` = organization name (fallback)
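
The fallback rule can be sketched as a dictionary lookup with a default. This is only an illustration: `resolve_center_name` is a hypothetical helper, and keying the mapping by organization name is an assumption, since the actual key used by the `Org_Center_Mapping` sheet is not described here.

```python
def resolve_center_name(organization_name, org_center_mapping):
    """Return the mapped center code, or the organization name as fallback.

    Hypothetical helper: org_center_mapping is None when the mapping file is
    missing, otherwise a dict assumed to be keyed by organization name.
    """
    if not org_center_mapping:
        return organization_name  # mapping file missing or empty
    return org_center_mapping.get(organization_name, organization_name)


mapping = {"Hospital Cardiology Research Lab": "HCR-MAIN"}
mapped = resolve_center_name("Hospital Cardiology Research Lab", mapping)
fallback = resolve_center_name("Unmapped Clinic", mapping)
```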

#### Example Configuration

```json
{
  "field_group": "Patient_Identification",
  "field_name": "Center_Name",
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "source_type": "inclusion",
  "field_path": ["center_name"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}
```

**Result in output:**
```json
{
  "Patient_Identification": {
    "Organisation_Name": "Hospital Cardiology Research Lab",
    "Center_Name": "HCR-MAIN",
    ...
  }
}
```

---

## Field Path Syntax

### Basic Path Navigation

#### Single-Level Access
```json
["field_name"]
// JavaScript equivalent: data.field_name
// Result: value or undefined
```

#### Multi-Level Nesting
```json
["record", "patient", "demographics", "age"]
// JavaScript: data.record.patient.demographics.age
```

#### Array Index Access
```json
["items", 0, "name"]
// JavaScript: data.items[0].name
// Accesses first element of array
```

#### Negative Index (from end)
```json
["items", -1, "name"]
// JavaScript: data.items[data.items.length - 1].name
// Accesses last element of array
```

### Wildcard Paths (Multiple Values)

#### Single Wildcard (One Level)
```json
["questionnaire", "answers", "*", "value"]
// Returns all values from each answer object
// Result: Array of values [value1, value2, value3, ...]
```

#### Multiple Wildcards (Deep)
```json
["record", "*", "data", "*", "test"]
// Matches nested wildcards at multiple levels
// Returns: All tests at matching paths
```

#### Wildcard Result Flattening
```json
path: ["items", "*", "values", "*", "score"]
items: [
  {
    "values": [
      {"score": 10},
      {"score": 20}
    ]
  },
  {
    "values": [
      {"score": 30},
      {"score": 40}
    ]
  }
]

// Without flattening: [[10, 20], [30, 40]]
// With flattening (used): [10, 20, 30, 40]
```

### Edge Cases & Behavior

#### Missing Path
```json
field_path: ["missing", "field"]
data: {}

Result: "undefined" (not null or empty string)
```

#### Null/None Values in Path
```json
field_path: ["patient", "contact", "phone"]
data: {"patient": {"contact": null}}

Result: "undefined" (stops at null)
```

#### Non-Dictionary/Non-List Element
```json
field_path: ["patient", "name", "first"]
data: {"patient": {"name": "John"}}  // "name" is string, not dict

Result: "undefined" (cannot navigate string)
```
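
The navigation rules in this section (keys, indexes, wildcards, flattening, and the `"undefined"` edge cases) can be combined into one resolver. The sketch below is a minimal illustration, not the project's implementation: `get_nested_value` is a hypothetical name, wildcards are assumed to apply to lists only, and elements that resolve to `"undefined"` under a wildcard are assumed to be dropped.

```python
UNDEFINED = "undefined"
WILDCARD = "*"


def _flatten(values):
    """Flatten nested lists produced by chained wildcards."""
    flat = []
    for value in values:
        if isinstance(value, list):
            flat.extend(_flatten(value))
        else:
            flat.append(value)
    return flat


def get_nested_value(data, path):
    """Follow path through dicts and lists, returning "undefined" on any miss."""
    current = data
    for position, key in enumerate(path):
        if key == WILDCARD:
            if not isinstance(current, list):
                return UNDEFINED
            rest = path[position + 1:]
            results = [get_nested_value(item, rest) for item in current]
            # Assumption: elements without the requested value are skipped
            return _flatten([r for r in results if r != UNDEFINED])
        if isinstance(current, dict):
            if key not in current:
                return UNDEFINED
            current = current[key]
        elif isinstance(current, list) and isinstance(key, int):
            if not -len(current) <= key < len(current):
                return UNDEFINED  # index out of range (negative indexes allowed)
            current = current[key]
        else:
            return UNDEFINED  # cannot navigate into a scalar
        if current is None:
            return UNDEFINED  # stop at null values
    return current


data = {"items": [{"values": [{"score": 10}, {"score": 20}]},
                  {"values": [{"score": 30}, {"score": 40}]}]}
scores = get_nested_value(data, ["items", "*", "values", "*", "score"])
```

With the sample data, `scores` is the flattened list `[10, 20, 30, 40]`, which post-processing would then join with `|`.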

---

## Custom Functions Reference

### Function 1: search_in_fields_using_regex

**Purpose:** Search multiple fields for regex pattern match (case-insensitive)

**Syntax:**
```json
{
  "source_name": "Calculated",
  "source_id": "search_in_fields_using_regex",
  "field_path": ["regex_pattern", "field_1", "field_2", ...]
}
```

**Parameters:**
- **regex_pattern** (string): Regular expression pattern (case-insensitive)
- **field_1, field_2, ...** (strings): Field names to search (looked up in output_inclusion)

**Logic:**
```
values = [get_value_from_inclusion(field) FOR EACH field in [field_1, field_2, ...]]

IF ALL values are undefined:
  RETURN "undefined"

FOR EACH value in values:
  IF value is string AND value matches regex_pattern (case-insensitive):
    RETURN True

RETURN False
```

**Return Value:**
- `True` if ANY field matches the pattern
- `False` if NO fields match
- `"undefined"` if ALL fields are undefined

**Examples:**

Example 1: Detect if any surgery field contains "surgery"
```json
{
  "field_name": "Has_Surgery_History",
  "source_id": "search_in_fields_using_regex",
  "field_path": [".*surgery.*", "Previous_Surgery", "Recent_Surgery", "Planned_Surgery"]
}

If any of these fields contains "surgery" → True
Otherwise → False
```

Example 2: Check for specific procedures
```json
{
  "field_name": "Is_Endoscopy_Planned",
  "source_id": "search_in_fields_using_regex",
  "field_path": ["endoscopy|colonoscopy", "Procedure_Type", "Procedure_Notes"]
}

Matches if "endoscopy" OR "colonoscopy" appears in either field
```
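
A rough Python equivalent of this function is sketched below. It is not the project's code: `search_in_fields_sketch` is a hypothetical name, fields are read from a flat dictionary, and the use of `re.search` (match anywhere in the value) is an assumption consistent with Example 2.

```python
import re

UNDEFINED = "undefined"


def search_in_fields_sketch(fields, pattern, *field_names):
    """Return True if any named field matches pattern, ignoring case.

    Returns "undefined" when every named field is undefined or missing.
    """
    values = [fields.get(name, UNDEFINED) for name in field_names]
    if all(value == UNDEFINED for value in values):
        return UNDEFINED
    regex = re.compile(pattern, re.IGNORECASE)
    return any(
        isinstance(value, str) and value != UNDEFINED and regex.search(value) is not None
        for value in values
    )


fields = {"Previous_Surgery": "Knee SURGERY in 2019", "Recent_Surgery": "undefined"}
has_surgery = search_in_fields_sketch(fields, ".*surgery.*", "Previous_Surgery", "Recent_Surgery")
```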

### Function 2: extract_parentheses_content

**Purpose:** Extract text within the first set of parentheses

**Syntax:**
```json
{
  "source_name": "Calculated",
  "source_id": "extract_parentheses_content",
  "field_path": ["field_name"]
}
```

**Parameters:**
- **field_name** (string): Field to extract from (looked up in output_inclusion)

**Logic:**
```
value = get_value_from_inclusion(field_name)
IF value is not defined:
  RETURN "undefined"

MATCH first occurrence of (content) pattern
IF match found:
  RETURN content
ELSE:
  RETURN "undefined"
```

**Return Value:**
- Text extracted from parentheses (e.g., "Active")
- `"undefined"` if no parentheses found or field undefined

**Examples:**

Example 1: Extract status from formatted field
```
Input: "Patient Status (Active)"
Output: "Active"
```

Example 2: Extract category name
```
Input: "Medical Condition (Hypertension)"
Output: "Hypertension"
```

Example 3: Content containing spaces and punctuation
```
Input: "Surgery Scheduled (Appendectomy - Jan 15)"
Output: "Appendectomy - Jan 15"
```
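
A minimal Python sketch of this extraction is shown below. It assumes a flat `output_inclusion` dictionary and a non-greedy regular expression for the first `(...)` pair; how the real function treats nested parentheses is not stated here, so the sketch does not handle them.

```python
import re

UNDEFINED = "undefined"
# Non-greedy match of the first "(...)" pair; nested parentheses are not handled.
_PARENS = re.compile(r"\((.*?)\)")


def extract_parentheses_content(output_inclusion, field_name):
    """Return the text inside the first pair of parentheses, or "undefined"."""
    value = output_inclusion.get(field_name, UNDEFINED)
    if not isinstance(value, str) or value == UNDEFINED:
        return UNDEFINED
    match = _PARENS.search(value)
    return match.group(1) if match else UNDEFINED


print(extract_parentheses_content({"Status": "Patient Status (Active)"}, "Status"))
```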

### Function 3: append_terminated_suffix

**Purpose:** Append the " - AP" suffix to the status when the patient is flagged as prematurely terminated

**Syntax:**
```json
{
  "source_name": "Calculated",
  "source_id": "append_terminated_suffix",
  "field_path": ["status_field_name", "is_terminated_field_name"]
}
```

**Parameters:**
- **status_field_name** (string): Field containing status value
- **is_terminated_field_name** (string): Boolean field indicating termination

**Logic:**
```
status = get_value_from_inclusion(status_field_name)
is_terminated = get_value_from_inclusion(is_terminated_field_name)

IF status is undefined:
  RETURN "undefined"

IF is_terminated is TRUE:
  RETURN status + " - AP"
ELSE:
  RETURN status
```

**Return Value:**
- Status with " - AP" suffix if terminated
- Original status if not terminated
- `"undefined"` if status field undefined

**Examples:**

Example 1: Mark prematurely terminated patients
```json
{
  "field_name": "Inclusion_Status",
  "source_id": "append_terminated_suffix",
  "field_path": ["Base_Status", "isPrematurelyTerminated"]
}

// If isPrematurelyTerminated = True:
//   "incluse" → "incluse - AP"
//
// If isPrematurelyTerminated = False:
//   "incluse" → "incluse"
```
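
The same behavior, sketched in Python under the assumption of a flat `output_inclusion` dictionary. Treating only a strict boolean `True` as "terminated" is also an assumption of this sketch.

```python
UNDEFINED = "undefined"


def append_terminated_suffix(output_inclusion, status_field_name, is_terminated_field_name):
    """Return the status, with " - AP" appended when the termination flag is True."""
    status = output_inclusion.get(status_field_name, UNDEFINED)
    if status == UNDEFINED:
        return UNDEFINED
    is_terminated = output_inclusion.get(is_terminated_field_name, UNDEFINED)
    # Only a strict True adds the suffix; False or "undefined" leave the status unchanged.
    if is_terminated is True:
        return f"{status} - AP"
    return status


patient = {"Base_Status": "incluse", "isPrematurelyTerminated": True}
print(append_terminated_suffix(patient, "Base_Status", "isPrematurelyTerminated"))
```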

### Function 4: if_then_else

**Purpose:** Unified conditional logic with 8 different operators

**Syntax:**
```json
{
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": ["operator", arg1, arg2_optional, result_if_true, result_if_false]
}
```

#### Operator Reference

##### Operator 1: is_true
**Signature:** `["is_true", field_name, result_if_true, result_if_false]`
**Logic:** IF field == True THEN result_if_true ELSE result_if_false
**Example:**
```json
{
  "field_path": ["is_true", "Has_Consent", "$\"Consented\"", "$\"Not Consented\""]
}
// If Has_Consent = True → "Consented"
// If Has_Consent = False → "Not Consented"
```

##### Operator 2: is_false
**Signature:** `["is_false", field_name, result_if_true, result_if_false]`
**Logic:** IF field == False THEN result_if_true ELSE result_if_false
**Example:**
```json
{
  "field_path": ["is_false", "Has_Exclusion", "$\"Eligible\"", "$\"Excluded\""]
}
```

##### Operator 3: is_defined
**Signature:** `["is_defined", field_name, result_if_true, result_if_false]`
**Logic:** IF field is not undefined THEN result_if_true ELSE result_if_false
**Example:**
```json
{
  "field_path": ["is_defined", "Surgery_Date", "$\"Date Available\"", "$\"No Date\""]
}
```

##### Operator 4: is_undefined
**Signature:** `["is_undefined", field_name, result_if_true, result_if_false]`
**Logic:** IF field is undefined THEN result_if_true ELSE result_if_false
**Example:**
```json
{
  "field_path": ["is_undefined", "Last_Contact", "$\"Never Contacted\"", "$\"Contacted\""]
}
```

##### Operator 5: all_true
**Signature:** `["all_true", [field_1, field_2, ...], result_if_true, result_if_false]`
**Logic:** IF all fields == True THEN result_if_true ELSE result_if_false
**Example:**
```json
{
  "field_path": ["all_true", ["Has_Consent", "Has_Results", "Is_Complete"], "$\"READY\"", "$\"INCOMPLETE\""]
}
// Returns "READY" only if ALL three fields are True
```

##### Operator 6: all_defined
**Signature:** `["all_defined", [field_1, field_2, ...], result_if_true, result_if_false]`
**Logic:** IF all fields are defined THEN result_if_true ELSE result_if_false
**Example:**
```json
{
  "field_path": ["all_defined", ["First_Name", "Last_Name", "Birth_Date"], "$\"COMPLETE\"", "$\"INCOMPLETE\""]
}
// Returns "COMPLETE" only if ALL three fields have values
```

##### Operator 7: ==
**Signature:** `["==", value1, value2, result_if_true, result_if_false]`
**Logic:** IF value1 == value2 THEN result_if_true ELSE result_if_false
**Example:**
```json
{
  "field_path": ["==", "Status", "$\"Active\"", "$\"Is Active\"", "$\"Not Active\""]
}
// If Status equals "Active" → "Is Active"
```

##### Operator 8: !=
**Signature:** `["!=", value1, value2, result_if_true, result_if_false]`
**Logic:** IF value1 != value2 THEN result_if_true ELSE result_if_false
**Example:**
```json
{
  "field_path": ["!=", "Status", "$\"Inactive\"", "$\"Active\"", "$\"Inactive\""]
}
// If Status NOT equal to "Inactive" → "Active"
```

#### Value Resolution

The function supports multiple value types:

**Boolean Literals:**
```json
true, false
// Used directly without field lookup
```

**Numeric Literals:**
```json
42, 3.14, 0, -1
// Used directly without field lookup
```

**String Literals (Prefixed with $):**
```json
"$\"Active\"", "$\"Ready\"", "$\"N/A\""
// The $ prefix is removed before the value is used
// It signals a literal string, not a field name to look up
```

**Field References (No Prefix):**
```json
"Status", "Is_Active", "Patient_Name"
// Looked up in output_inclusion
```

**Complex Examples:**
```json
{
  "field_path": ["==", "Status_Code", 1, "$\"Active\"", "$\"Inactive\""]
}
// Compare Status_Code field against numeric value 1

{
  "field_path": ["all_true", ["Consent_Received", "Test_Completed"], "Overall_Status", "$\"MISSING\""]
}
// If both conditions true, use Overall_Status value
// If either false, use literal "MISSING"
```
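
The operator table and value resolution rules can be combined into one short Python sketch. It is illustrative only: the flat `output_inclusion` dictionary, the helper names, and parsing the text after `$` with `json.loads` are assumptions, and the `$$$$` error reporting for unknown operators or wrong argument counts is omitted.

```python
import json

UNDEFINED = "undefined"


def resolve_value(token, output_inclusion):
    """Resolve one field_path argument to a concrete value."""
    if isinstance(token, (bool, int, float)):
        return token  # boolean and numeric literals are used as-is
    if isinstance(token, str) and token.startswith("$"):
        literal = token[1:]  # drop the $ prefix
        try:
            return json.loads(literal)  # '"Active"' becomes 'Active'
        except json.JSONDecodeError:
            return literal  # assumption: unquoted text is kept verbatim
    return output_inclusion.get(token, UNDEFINED)  # field reference


def if_then_else(output_inclusion, field_path):
    """Evaluate one if_then_else field_path against already processed fields."""
    operator, *args = field_path

    def lookup(name):
        return output_inclusion.get(name, UNDEFINED)

    if operator in ("==", "!="):
        left, right, if_true, if_false = args
        equal = resolve_value(left, output_inclusion) == resolve_value(right, output_inclusion)
        condition = equal if operator == "==" else not equal
    else:
        target, if_true, if_false = args
        checks = {
            "is_true": lambda: lookup(target) is True,
            "is_false": lambda: lookup(target) is False,
            "is_defined": lambda: lookup(target) != UNDEFINED,
            "is_undefined": lambda: lookup(target) == UNDEFINED,
            "all_true": lambda: all(lookup(name) is True for name in target),
            "all_defined": lambda: all(lookup(name) != UNDEFINED for name in target),
        }
        condition = checks[operator]()
    return resolve_value(if_true if condition else if_false, output_inclusion)


data = {"Consent_Received": True, "Test_Completed": True, "Overall_Status": "OK"}
print(if_then_else(
    data, ["all_true", ["Consent_Received", "Test_Completed"], "Overall_Status", '$"MISSING"']))
```

Because results pass through `resolve_value`, a result argument without the `$` prefix is read as a field reference, which is why `Overall_Status` above yields that field's value rather than the text "Overall_Status".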

---

## Post-Processing Transformations

### Transformation Order

```
Raw Value Extraction
    ↓
Condition Check
    ↓
IF final_value is list:
  └─ Join with "|" delimiter
    ↓
IF final_value is score dict (has 'total' and 'max'):
  └─ Format as "total/max"
    ↓
IF true_if_any is specified:
  └─ Apply boolean conversion
    ↓
IF value_labels is specified:
  └─ Apply label mapping
    ↓
IF field_template is specified:
  └─ Apply formatting with $value
```

### Transformation 1: Array Flattening

**When:** Raw value is an array/list
**Action:** Join elements with `|` delimiter
**Example:**
```
Raw: ["Active", "Pending", "Resolved"]
Output: "Active|Pending|Resolved"
```

### Transformation 2: Score Dictionary Formatting

**When:** Raw value is dict with keys ['total', 'max']
**Action:** Convert to "total/max" string format
**Example:**
```
Raw: {"total": 8, "max": 10}
Output: "8/10"
```

### Transformation 3: true_if_any

**When:** true_if_any is specified in configuration
**Action:** Check if raw value matches ANY item in the array
**Example:**
```json
{
  "true_if_any": ["Active", "Pending", "Processing"],
  "raw_value": "Active"
}
// Result: true

{
  "true_if_any": ["Active", "Pending"],
  "raw_value": "Completed"
}
// Result: false
```

### Transformation 4: value_labels

**When:** value_labels is specified in configuration
**Action:** Map raw value to localized text
**Logic:**
```
FOR EACH label_map in value_labels:
  IF label_map.value == raw_value:
    RETURN label_map.text.fr  (French label)

IF no match:
  RETURN "$$$$ Value Error: {raw_value}"
```

**Example:**
```json
{
  "value_labels": [
    {"value": "active", "text": {"fr": "Actif", "en": "Active"}},
    {"value": "inactive", "text": {"fr": "Inactif", "en": "Inactive"}}
  ],
  "raw_value": "active"
}
// Result: "Actif"
```

### Transformation 5: field_template

**When:** field_template is specified (and value is not "undefined" or "N/A")
**Action:** Replace $value placeholder with actual value
**Example:**
```
template: "Score: $value/100"
raw_value: 85
Result: "Score: 85/100"

template: "Status [$value]"
raw_value: "Active"
Result: "Status [Active]"
```
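
The five transformations can be chained in the documented order as in the sketch below. The function name and the `config` dictionary (keyed by the column names) are assumptions made for illustration; the French label and the `$$$$ Value Error` marker follow the logic described above.

```python
UNDEFINED = "undefined"
NOT_APPLICABLE = "N/A"


def apply_post_processing(value, config):
    """Apply the post-processing steps in the documented order."""
    # 1. Array flattening
    if isinstance(value, list):
        value = "|".join(str(item) for item in value)
    # 2. Score dictionary formatting
    if isinstance(value, dict) and "total" in value and "max" in value:
        value = f"{value['total']}/{value['max']}"
    # 3. true_if_any: convert to a boolean
    true_if_any = config.get("true_if_any")
    if true_if_any:
        value = value in true_if_any
    # 4. value_labels: map to the French label, or flag an unmapped value
    value_labels = config.get("value_labels")
    if value_labels:
        for label_map in value_labels:
            if label_map["value"] == value:
                value = label_map["text"]["fr"]
                break
        else:
            value = f"$$$$ Value Error: {value}"
    # 5. field_template: skipped for "undefined" and "N/A"
    template = config.get("field_template")
    if template and value not in (UNDEFINED, NOT_APPLICABLE):
        value = template.replace("$value", str(value))
    return value


print(apply_post_processing(["Blood Test", "ECG"], {"field_template": "Tests: $value"}))
```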

---

## Configuration Examples

### Example 1: Simple Field Extraction

**Requirement:** Extract patient name from inclusion data

```json
{
  "field_group": "Patient_Identification",
  "field_name": "Patient_Name",
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "field_path": ["name"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}
```

**Flow:**
1. Source: inclusion data
2. Extract: data["name"]
3. Result: "Doe, John"
4. Output: {"Patient_Identification": {"Patient_Name": "Doe, John"}}

### Example 2: Questionnaire Field with Label Mapping

**Requirement:** Extract symptom severity and map to French labels

```json
{
  "field_group": "Symptoms",
  "field_name": "Severity",
  "source_name": "Symptoms (OUI/NON)",
  "source_id": "q_id=77e488a1-d3c-148af-a6bc-8fe1f55e82e4",
  "field_path": ["answers", "question5"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": [
    {"value": 1, "text": {"fr": "Léger", "en": "Mild"}},
    {"value": 2, "text": {"fr": "Modéré", "en": "Moderate"}},
    {"value": 3, "text": {"fr": "Sévère", "en": "Severe"}}
  ]
}
```

**Flow:**
1. Source: Questionnaire with ID 77e488a1-...
2. Extract: answers["question5"] → 2
3. Apply value_labels: 2 → "Modéré"
4. Output: {"Symptoms": {"Severity": "Modéré"}}

### Example 3: Conditional Field

**Requirement:** Only show request status if test was requested

```json
{
  "field_group": "Endotest",
  "field_name": "Request_Status",
  "source_name": "Request",
  "source_id": "request",
  "field_path": ["status"],
  "field_template": null,
  "field_condition": "Endotest.Request_Sent",
  "true_if_any": null,
  "value_labels": null
}
```

**Flow:**
1. Check condition: Endotest.Request_Sent
2. If False → Set to "N/A"
3. If True → Extract status from request data
4. Output: {"Endotest": {"Request_Status": "completed"}} or "N/A"
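
A minimal sketch of the condition check in this flow, assuming the nested `{group: {field: value}}` output shape shown in the Output lines. The helper name is hypothetical, and letting an undefined condition pass is an assumption, since this section only specifies the True and False cases.

```python
NOT_APPLICABLE = "N/A"


def passes_field_condition(output, field_condition):
    """Return True when the field should be extracted."""
    if not field_condition:
        return True  # no condition configured
    group, _, name = field_condition.partition(".")
    # Assumption: only an explicit False suppresses the field.
    return output.get(group, {}).get(name) is not False


output = {"Endotest": {"Request_Sent": False}}
if passes_field_condition(output, "Endotest.Request_Sent"):
    value = "completed"  # stands in for the value extracted from the request data
else:
    value = NOT_APPLICABLE
print(value)
```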

### Example 4: Calculated Field with if_then_else

**Requirement:** Show overall status based on inclusion and termination

```json
{
  "field_group": "Inclusion",
  "field_name": "Inclusion_Status_Complete",
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": ["is_true", "isPrematurelyTerminated", "$\"incluse - AP\"", "Inclusion_Status"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}
```

**Flow:**
1. Check: Is isPrematurelyTerminated == True?
2. If YES → Return literal "incluse - AP"
3. If NO → Return value of Inclusion_Status field
4. Output: {"Inclusion": {"Inclusion_Status_Complete": "incluse - AP"}} or "incluse"

### Example 5: Array Field with Formatting

**Requirement:** Extract all test names and format them

```json
{
  "field_group": "Endotest",
  "field_name": "Tests_Performed",
  "source_name": "Request",
  "source_id": "request",
  "field_path": ["results", "*", "testName"],
  "field_template": "Tests: $value",
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}
```

**Flow:**
1. Source: request data
2. Extract: results[*].testName → ["Blood Test", "Imaging", "ECG"]
3. Array flattening → "Blood Test|Imaging|ECG"
4. Apply template → "Tests: Blood Test|Imaging|ECG"
5. Output: {"Endotest": {"Tests_Performed": "Tests: Blood Test|Imaging|ECG"}}
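
Step 2 of this flow, the wildcard extraction, could look roughly like the following. The function name and the choice to skip list elements that lack the requested key are assumptions; the latter is consistent with wildcard paths returning an empty array when no element has the field.

```python
UNDEFINED = "undefined"


def extract_path(data, field_path):
    """Walk field_path through nested dicts and lists; "*" fans out over a list."""
    if not field_path:
        return data
    key, rest = field_path[0], field_path[1:]
    if key == "*":
        if not isinstance(data, list):
            return UNDEFINED
        results = [extract_path(item, rest) for item in data]
        # Elements that lack the requested field are skipped.
        return [result for result in results if result != UNDEFINED]
    if isinstance(data, dict) and key in data:
        return extract_path(data[key], rest)
    if isinstance(data, list) and isinstance(key, int) and -len(data) <= key < len(data):
        return extract_path(data[key], rest)
    return UNDEFINED


request = {"results": [{"testName": "Blood Test"}, {"testName": "Imaging"}, {"testName": "ECG"}]}
names = extract_path(request, ["results", "*", "testName"])
print("Tests: " + "|".join(names))
```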

### Example 6: Complex Conditional Logic

**Requirement:** Show surgery readiness status based on multiple conditions

```json
{
  "field_group": "Surgery",
  "field_name": "Surgery_Status",
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": [
    "all_true",
    ["Surgery_Planned", "Surgeon_Assigned", "Date_Set"],
    "$\"READY_FOR_SURGERY\"",
    "$\"INCOMPLETE_PREPARATION\""
  ],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}
```

**Flow:**
1. Check: Are ALL of [Surgery_Planned, Surgeon_Assigned, Date_Set] == True?
2. If YES → "READY_FOR_SURGERY"
3. If NO → "INCOMPLETE_PREPARATION"
4. Output: {"Surgery": {"Surgery_Status": "READY_FOR_SURGERY"}} or "INCOMPLETE_PREPARATION"

### Example 7: Search and Boolean Conversion

**Requirement:** Detect if patient has surgery history

```json
{
  "field_group": "Medical_History",
  "field_name": "Has_Prior_Surgery",
  "source_name": "Calculated",
  "source_id": "search_in_fields_using_regex",
  "field_path": [".*surgery|.*intervention.*", "History_Notes", "Previous_Procedures"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}
```

**Flow:**
1. Search History_Notes and Previous_Procedures
2. Pattern: ".*surgery|.*intervention.*" (case-insensitive)
3. If ANY field matches → true
4. If NO matches → false
5. Output: {"Medical_History": {"Has_Prior_Surgery": true}}

---

## User Guide: Adding/Modifying Fields

### Step 1: Identify Data Source

Determine where the data lives:
```
Patient Name          → inclusion (inclusion_data)
Symptom Severity      → questionnaire (q_id, q_name, or q_category)
Clinical Notes        → record (record_data)
Test Results          → request (request_data)
Derived Value         → calculated (custom function)
```

### Step 2: Locate Field Path

Navigate the JSON structure to find the exact path:

**For Inclusion:**
```
Open endobest_inclusions_old.json
Find a patient record
Look for field under "Patient_Identification"
Example path: ["name"]
```

**For Questionnaire:**
```
Need questionnaire ID/name/category
Look inside answers object
Example: q_id=abc-123, field_path: ["answers", "question_5"]
```

**For Record:**
```
Open a record with GET /api/records/byPatient
Navigate structure
Example: ["record", "clinicResearchData", 0, "requestMetaData"]
```

**For Request:**
```
Field from lab request response
Example: ["results", "*", "testName"]
```

### Step 3: Create Configuration Row

Open Endobest_Dashboard_Config.xlsx → Inclusions_Mapping sheet

```
Row N:
A: field_group          (e.g., "Custom_Data")
B: field_name           (e.g., "Patient_Status")
C: source_name          (e.g., "Inclusion")
D: source_id            (e.g., "inclusion")
E: field_path           (e.g., ["status"])
F: field_template       (optional, e.g., "Status: $value")
G: field_condition      (optional, e.g., "Inclusion.Is_Active")
H: true_if_any          (optional, e.g., ["active", "pending"])
I: value_labels         (optional, complex JSON)
```
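
To make the column layout concrete, the sketch below turns one row of cell values (columns A to I) into a field definition. It is not the loader used by `eb_dashboard.py`: the function name is hypothetical, and treating `field_path`, `true_if_any`, and `value_labels` as the JSON columns is an assumption based on the examples in this guide. The error text mirrors the message format shown in Step 4.

```python
import json

COLUMNS = ["field_group", "field_name", "source_name", "source_id", "field_path",
           "field_template", "field_condition", "true_if_any", "value_labels"]
# Assumption: these columns hold JSON text; the others are plain strings.
JSON_COLUMNS = {"field_path", "true_if_any", "value_labels"}


def parse_config_row(row_number, cells):
    """Turn one row of cell values (columns A..I) into a field definition dict."""
    cells = list(cells) + [None] * (len(COLUMNS) - len(cells))  # pad short rows
    field = {}
    for name, cell in zip(COLUMNS, cells):
        if cell is None or str(cell).strip() == "":
            field[name] = None  # empty optional cell
        elif name in JSON_COLUMNS:
            try:
                field[name] = json.loads(str(cell))
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"Error in config file, row {row_number}, field '{name}': "
                    "Invalid JSON format."
                ) from exc
        else:
            field[name] = cell
    return field


row = ("Custom_Data", "Patient_Status", "Inclusion", "inclusion", '["status"]',
       "Status: $value", "Inclusion.Is_Active", '["active", "pending"]', None)
print(parse_config_row(12, row))
```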

### Step 4: Validate Configuration

Run the dashboard in check-only mode:
```bash
python eb_dashboard.py --check-only
```

**Expected Output:**
```
✓ Loaded 81 fields from extended configuration.
✓ All checks passed successfully!
```

**If errors occur:**
```
Error in config file, row 42, field 'field_path': Invalid JSON format.
```
→ Fix the JSON syntax in the cell

### Step 5: Test with Full Collection

```bash
python eb_dashboard.py
```

After collection completes, verify:
1. New field appears in endobest_inclusions.json
2. Values are populated correctly
3. No data quality issues reported

### Step 6: Document the Field

Add comments in a separate notes section (if available) explaining:
- Purpose of the field
- Data source and ID
- Any special transformations
- Expected value ranges/types

---

## Common Patterns & Recipes

### Pattern 1: Boolean Flag from Multiple Conditions

**Requirement:** Create true/false flag based on multiple fields

```json
{
  "field_group": "Flags",
  "field_name": "Is_Ready_For_Export",
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": [
    "all_true",
    ["Has_Consent", "Data_Complete", "Approved"],
    true,
    false
  ]
}
```

### Pattern 2: Score Display Formatting

**Requirement:** Show quality of life score as "X/100" format

```json
{
  "field_group": "Quality_Metrics",
  "field_name": "QOL_Score_Display",
  "source_name": "q_id=...",
  "source_id": "q_id=...",
  "field_path": ["answers", "overall_score"],
  "field_template": "$value/100"
}
```

### Pattern 3: Status Translation with Suffix

**Requirement:** Show inclusion status with " - AP" for terminated patients

```json
{
  "field_group": "Inclusion",
  "field_name": "Status_With_Termination",
  "source_name": "Calculated",
  "source_id": "append_terminated_suffix",
  "field_path": ["Inclusion_Status", "isPrematurelyTerminated"]
}
```

### Pattern 4: List-to-String Conversion

**Requirement:** Show all diagnoses as pipe-separated text

```json
{
  "field_group": "Medical_Data",
  "field_name": "All_Diagnoses",
  "source_name": "Record",
  "source_id": "record",
  "field_path": ["record", "diagnoses", "*", "code"]
  // Result: "ICD-001|ICD-002|ICD-003"
}
```

### Pattern 5: Optional Field Based on Condition

**Requirement:** Only show surgery details if surgery was performed

```json
{
  "field_group": "Surgery",
  "field_name": "Surgery_Details",
  "source_name": "Record",
  "source_id": "record",
  "field_path": ["record", "surgery", "details"],
  "field_condition": "Surgery.Surgery_Performed"
  // If Surgery_Performed = false → "N/A"
}
```

### Pattern 6: Enum-to-Text Mapping

**Requirement:** Convert numeric status codes to readable text

```json
{
  "field_group": "Status",
  "field_name": "Inclusion_Status_Text",
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "field_path": ["status_code"],
  "value_labels": [
    {"value": 0, "text": {"fr": "Pré-inclus", "en": "Pre-included"}},
    {"value": 1, "text": {"fr": "Inclus", "en": "Included"}},
    {"value": 2, "text": {"fr": "Exclus", "en": "Excluded"}}
  ]
}
```

### Pattern 7: Pattern Matching in Multiple Fields

**Requirement:** Check if any medical note mentions specific condition

```json
{
  "field_group": "Medical",
  "field_name": "Mentions_Hypertension",
  "source_name": "Calculated",
  "source_id": "search_in_fields_using_regex",
  "field_path": [
    "hypertension|high.*pressure|HBP",
    "Medical_History",
    "Current_Conditions",
    "Medication_Notes"
  ]
}
```

### Pattern 8: Extracted Parenthetical Classification

**Requirement:** Extract diagnosis type from formatted text like "Disease (Type A)"

```json
{
  "field_group": "Classification",
  "field_name": "Diagnosis_Type",
  "source_name": "Calculated",
  "source_id": "extract_parentheses_content",
  "field_path": ["Formatted_Diagnosis"]
}
```

---

## Troubleshooting

### Issue 1: "Invalid JSON format" Error

**Symptom:** Configuration validation fails with JSON parsing error

**Cause:** Malformed JSON in field_path, true_if_any, or value_labels

**Solution:**
1. Paste the cell contents into a JSON validator (e.g., jsonlint.com)
2. Verify all:
   - Array brackets: `[...]`
   - Object braces: `{...}`
   - String quotes: `"..."`
   - Commas between elements
3. Fix syntax errors
4. Re-run validation

**Example - WRONG:**
```json
["name" "address"]    // WRONG: no comma after "name"

["name", "address"]   // CORRECT
```
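
As an offline alternative to a web validator, Python's standard `json` module reports where parsing failed. The helper below is a hypothetical convenience for checking one cell, not part of the dashboard.

```python
import json


def describe_json_error(cell_text):
    """Return None for valid JSON, otherwise a message marking the error position."""
    try:
        json.loads(cell_text)
    except json.JSONDecodeError as exc:
        marked = cell_text[:exc.pos] + "<<HERE>>" + cell_text[exc.pos:]
        return f"{exc.msg} at column {exc.colno}: {marked}"
    return None


print(describe_json_error('["name" "address"]'))
print(describe_json_error('["name", "address"]'))
```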

### Issue 2: Field Returns "undefined"

**Symptom:** Field value always "undefined" in output

**Causes:**
1. Field path doesn't match actual data structure
2. Questionnaire ID incorrect
3. Source type mismatch

**Solution:**
1. Check if source data exists in endobest_inclusions_old.json
2. Verify JSON path by stepping through manually
3. Check questionnaire ID (use `q_id` for fastest lookup)
4. Enable debug mode to see detailed errors

```bash
python eb_dashboard.py --debug
```
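
Stepping through a path manually (solution step 2) can also be done with a few lines of Python. This hypothetical helper reports the first path element that cannot be followed in a sample of the source data; for a `*` element it inspects only the first list item, which is a simplification.

```python
def first_missing_step(data, field_path):
    """Return (index, key) of the first path element that cannot be followed, or None."""
    current = data
    for index, key in enumerate(field_path):
        if key == "*":
            if not isinstance(current, list) or not current:
                return index, key
            current = current[0]  # inspect the first element only
        elif isinstance(current, dict) and key in current:
            current = current[key]
        elif isinstance(current, list) and isinstance(key, int) and 0 <= key < len(current):
            current = current[key]
        else:
            return index, key
    return None


sample = {"record": {"items": [{"existing_field": 1}]}}
print(first_missing_step(sample, ["record", "items", "*", "nonexistent_field"]))
```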

### Issue 3: Empty Array Result

**Symptom:** Wildcard path returns empty array instead of values

**Causes:**
1. Array elements don't exist at specified path
2. Wildcard position incorrect in path

**Solution:**
1. Verify array exists in source data
2. Check array element structure
3. Test path manually in JSON tool

**Example:**
```json
// WRONG: No elements at this path
["record", "items", "*", "nonexistent_field"]

// CORRECT: Match actual structure
["record", "items", "*", "existing_field"]
```

### Issue 4: Calculated Field Returns Error

**Symptom:** Calculated field value starts with "$$$$ "

**Causes:**
1. Function name wrong
2. Function argument count mismatch
3. Referenced fields not yet processed

**Solution:**
1. Check function name spelling
2. Verify argument count in field_path
3. Ensure referenced fields are defined BEFORE calculated field
4. Check for circular dependencies

**Common Errors:**
```
"$$$$ Unknown Custom Function: typo_name"
→ Check function name spelling

"$$$$ Argument Error: function requires N arguments"
→ Check field_path array length

"$$$$ Value Error: undefined"
→ Referenced field is undefined; check order in config
```
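
Ordering problems can be spotted before a run with a check like the one below. It is a sketch under several assumptions: field definitions are dictionaries keyed by the column names, referenced names are other configured field names, groups are ignored, and the way references are read from `field_path` is inferred from the function signatures in this guide.

```python
def referenced_fields(field):
    """Field names a Calculated field reads (inferred from the field_path shape)."""
    if field.get("source_name") != "Calculated":
        return []
    path = field["field_path"]
    if field["source_id"] == "search_in_fields_using_regex":
        candidates = path[1:]  # first element is the regex pattern
    elif field["source_id"] == "if_then_else":
        candidates = []
        for arg in path[1:]:  # first element is the operator
            candidates.extend(arg if isinstance(arg, list) else [arg])
    else:
        candidates = path
    # Keep plain strings only: literals are non-strings or start with "$".
    return [c for c in candidates if isinstance(c, str) and not c.startswith("$")]


def find_order_problems(fields):
    """Report (calculated_field, reference) pairs not defined on an earlier row."""
    seen, problems = set(), []
    for field in fields:
        for name in referenced_fields(field):
            if name not in seen:
                problems.append((field["field_name"], name))
        seen.add(field["field_name"])
    return problems


config = [
    {"field_name": "Inclusion_Status", "source_name": "Inclusion",
     "source_id": "inclusion", "field_path": ["status"]},
    {"field_name": "Status_Complete", "source_name": "Calculated",
     "source_id": "if_then_else",
     "field_path": ["is_true", "Is_Terminated", '$"incluse - AP"', "Inclusion_Status"]},
    {"field_name": "Is_Terminated", "source_name": "Inclusion",
     "source_id": "inclusion", "field_path": ["isPrematurelyTerminated"]},
]
print(find_order_problems(config))
```

A field that references itself is reported as well, because its own name is only recorded after its references are checked.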

### Issue 5: value_labels Not Applied

**Symptom:** Mapped label is missing; the output shows the raw value or a `$$$$ Value Error: {raw_value}` marker

**Causes:**
1. Raw value doesn't match any entry in value_labels
2. JSON syntax error in value_labels
3. Case sensitivity mismatch

**Solution:**
1. Check raw value type (string vs. number)
2. Verify exact match in value_labels
3. Check for case mismatches (e.g., "Active" vs "active")
4. Add wildcard entry if needed

**Example:**
```json
{
  "value_labels": [
    {"value": "active", "text": {"fr": "Actif"}},
    {"value": "inactive", "text": {"fr": "Inactif"}},
    {"value": "*", "text": {"fr": "Autre"}}  // Catch-all for unmapped values
  ]
}
```
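
Type and case mismatches are easy to miss by eye. The hypothetical helper below compares a raw value against the configured `value` entries and names the most likely cause; it is a diagnostic sketch and not part of the dashboard.

```python
def diagnose_label_mismatch(raw_value, value_labels):
    """Explain why raw_value did or did not match an entry in value_labels."""
    values = [entry["value"] for entry in value_labels]
    if any(value == raw_value for value in values):
        return "match"
    for value in values:
        if str(value) == str(raw_value):
            return (f"type mismatch: config has {type(value).__name__}, "
                    f"data has {type(raw_value).__name__}")
    for value in values:
        if str(value).casefold() == str(raw_value).casefold():
            return f"case mismatch: config has {value!r}, data has {raw_value!r}"
    return "no similar value configured"


labels = [{"value": 1, "text": {"fr": "Inclus"}},
          {"value": "active", "text": {"fr": "Actif"}}]
print(diagnose_label_mismatch("1", labels))
print(diagnose_label_mismatch("Active", labels))
```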

### Issue 6: Performance Degradation After Adding Field

**Symptom:** Collection takes significantly longer after adding field

**Causes:**
1. Sequential questionnaire search (use q_id instead)
2. Expensive regex in search_in_fields_using_regex
3. Deep wildcard paths (multiple levels)

**Solution:**
1. Use `q_id=` instead of `q_name=` or `q_category=`
2. Simplify regex patterns
3. Flatten wildcard paths where possible

---

## Summary

The Field Mapping Configuration provides:

✅ **100% Externalized:** No code changes needed to add fields
✅ **Flexible Sourcing:** Support for questionnaires, records, requests, calculated fields
✅ **Rich Transformations:** Labels, templates, conditions, custom functions
✅ **User-Friendly:** Excel-based configuration with validation
✅ **Performance Optimized:** Single-call questionnaire fetching, field batching

This architecture enables rapid iteration on data extraction without deploying code changes.

---

**Document End**