Abdelkouddous LHACHIMI cb8b5d9a12 Version fonctionnelle

2025-12-12 23:07:26 +01:00

Endobest Field Mapping Configuration Guide

Part 2: Field Mapping & Configuration

Document Version: 2.0 (Updated with new module references)
Last Updated: 2025-11-08
Audience: Developers, Business Analysts, Data Managers
Language: English

Note: Configuration file Endobest_Dashboard_Config.xlsx uses Inclusions_Mapping sheet for field definitions (see DOCUMENTATION_13_EXCEL_EXPORT.md and DOCUMENTATION_99_CONFIG_GUIDE.md for Excel export configuration)

Overview
Technical Architecture
Field Processing Logic
Configuration File Structure
Column Reference
Special Value Prefixes
Data Sources Explained
Field Path Syntax
Custom Functions Reference
Post-Processing Transformations
Configuration Examples
User Guide: Adding/Modifying Fields
Common Patterns & Recipes
Troubleshooting

Overview

The Field Mapping Configuration defines which data points are extracted from multiple APIs (RC, GDD, questionnaires) and how they are transformed before export. The configuration is 100% externalized in an Excel file, enabling non-technical users to add new fields without code modifications.

Key Concepts

Field Group: Logical container for related fields (e.g., "Patient_Identification", "Inclusion", "Endotest")
Field Name: Unique identifier for the field within its group
Source: Where the data comes from (questionnaire, record, inclusion, request, or a calculated function)
Field Path: JSON path to navigate nested structures
Transformations: Post-processing rules (labels, templates, conditions)
Custom Functions: Calculated fields with business logic

Technical Architecture

Field Extraction Pipeline (Detailed)

CONFIGURATION LOADING (startup):
├─ Load Endobest_Dashboard_Config.xlsx
├─ Parse Inclusions_Mapping sheet (rows 2 onwards)
├─ Validate each field configuration
├─ Parse JSON fields (field_path, value_labels, true_if_any, field_condition)
└─ Store in DASHBOARD_CONFIG array

FIELD PROCESSING (per patient):
├─ For each field in DASHBOARD_CONFIG:
│  ├─ Determine source type (questionnaire, record, inclusion, request, calculated)
│  │
│  ├─ IF source == questionnaire:
│  │  ├─ Method: Search by q_id, q_name, or q_category
│  │  ├─ Data: All questionnaires already fetched for patient
│  │  ├─ Path: Navigate to field_path within questionnaire answers
│  │  └─ Result: raw_value or "undefined"
│  │
│  ├─ IF source == record:
│  │  ├─ Data: Patient's clinical record
│  │  ├─ Path: Navigate JSON structure using field_path
│  │  └─ Result: raw_value or "undefined"
│  │
│  ├─ IF source == inclusion:
│  │  ├─ Data: Patient inclusion metadata
│  │  ├─ Path: Navigate nested inclusion structure
│  │  └─ Result: raw_value or "undefined"
│  │
│  ├─ IF source == request:
│  │  ├─ Data: Lab test request/results
│  │  ├─ Path: Navigate request JSON structure
│  │  └─ Result: raw_value or "undefined"
│  │
│  ├─ IF source == calculated:
│  │  ├─ Function: Custom business logic function
│  │  ├─ Arguments: From field_path
│  │  ├─ Access: Other fields already processed in output_inclusion
│  │  └─ Result: Computed value
│  │
│  ├─ CHECK field_condition (optional):
│  │  ├─ If condition is false → Set to "N/A"
│  │  ├─ If condition is undefined → Set to "undefined"
│  │  └─ If condition is true → Continue processing
│  │
│  ├─ APPLY post-processing transformations:
│  │  ├─ true_if_any: Convert to boolean
│  │  ├─ value_labels: Map to localized text
│  │  ├─ field_template: Apply formatting
│  │  └─ List joining: Flatten arrays
│  │
│  └─ STORE: output_inclusion[field_group][field_name] = final_value
│
└─ Result: Complete inclusion with all extended fields

Questionnaire Finding Strategy

The system supports 3 methods to locate questionnaires:

def find_questionnaire(all_questionnaires, source_type, source_value):
    """Return the answers of the matching questionnaire, or None if not found."""
    if source_type == "q_id":
        # Direct lookup by questionnaire ID (fastest)
        return all_questionnaires.get(source_value, {}).get("answers")

    if source_type == "q_name":
        # Sequential search by questionnaire name (first match wins)
        for qcm_data in all_questionnaires.values():
            if qcm_data.get("questionnaire", {}).get("name") == source_value:
                return qcm_data.get("answers")
        return None

    if source_type == "q_category":
        # Sequential search by questionnaire category (first match wins)
        for qcm_data in all_questionnaires.values():
            if qcm_data.get("questionnaire", {}).get("category") == source_value:
                return qcm_data.get("answers")
        return None

    # Unknown source type: nothing to search
    return None

Recommendation: Use q_id= for best performance (direct lookup)

Questionnaire Data Optimization

Instead of issuing one filtered API call per questionnaire, all of a patient's questionnaires are fetched in a single request:

BEFORE (slow):
GET /api/surveys/{qcm_id_1}/answers?subject={patient_id}
GET /api/surveys/{qcm_id_2}/answers?subject={patient_id}
GET /api/surveys/{qcm_id_3}/answers?subject={patient_id}
... (N calls per patient)

AFTER (optimized - single call):
POST /api/surveys/filter/with-answers
  payload: {"context": "clinic_research", "subject": patient_id}
  returns: [
    {"questionnaire": {id, name, category}, "answers": {...}},
    {"questionnaire": {id, name, category}, "answers": {...}},
    ...
  ]

All questionnaires are returned in a single call and then indexed by ID for fast lookup.
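
The indexing step can be sketched as follows. This is a minimal illustration, assuming the response is a list of objects shaped like the payload above; the helper name index_questionnaires is hypothetical and not taken from the codebase.

```python
def index_questionnaires(response_items):
    """Build {questionnaire_id: item} from the list returned by the single call."""
    indexed = {}
    for item in response_items:
        questionnaire = item.get("questionnaire") or {}
        q_id = questionnaire.get("id")
        if q_id is not None:  # skip malformed entries that carry no ID
            indexed[q_id] = item
    return indexed


# Sample data mirroring the response shape shown above (values are illustrative)
response_items = [
    {"questionnaire": {"id": "qcm-uuid-1", "name": "Symptom Questionnaire", "category": "Symptoms"},
     "answers": {"question_1": "answer_value"}},
    {"questionnaire": {"id": "qcm-uuid-2", "name": "Medical History", "category": "History"},
     "answers": {"has_diabetes": False}},
]
all_questionnaires = index_questionnaires(response_items)
```

With this shape, a q_id lookup is a plain dictionary access, while q_name and q_category still require a scan over the values.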

Field Processing Logic

Step 1: Source Type Determination

Source Prefix	Meaning	Example	Data Location
`q_id=`	Questionnaire by ID	`q_id=uuid-123`	`all_questionnaires[uuid-123]["answers"]`
`q_name=`	Questionnaire by name	`q_name=Symptom Check`	Search by `["questionnaire"]["name"]`
`q_category=`	Questionnaire by category	`q_category=Symptoms`	Search by `["questionnaire"]["category"]`
`record`	Clinical record	`record`	`record_data["record"]`
`inclusion`	Inclusion metadata	`inclusion`	`inclusion_data`
`request`	Lab test request	`request`	`request_data`
(Calculated)	Custom function	N/A	Function result

Step 2: Raw Value Extraction

The field_path defines how to navigate nested JSON structures:

# Simple path
field_path = ["patient", "name"]
# Equivalent to: data["patient"]["name"]

# Nested path
field_path = ["record", "clinicResearchData", 0, "data"]
# Equivalent to: data["record"]["clinicResearchData"][0]["data"]

# Wildcard path (returns array)
field_path = ["record", "clinicResearchData", "*", "test_name"]
# Returns: [test_name_1, test_name_2, test_name_3, ...]

# Deep wildcard
field_path = ["record", "*", "results", "*", "value"]
# Matches all results.*.value across all record items

Step 3: Field Condition Checking (Optional)

The field_condition allows skipping field processing based on another field's value:

IF field_condition is specified:
  ├─ Look up condition field value in output_inclusion
  ├─ IF condition value is None or "undefined":
  │  └─ Set final_value = "undefined" (skip further processing)
  ├─ IF condition value is not a boolean:
  │  └─ Set final_value = "$$$$ Condition Field Error"
  ├─ IF condition value is False:
  │  └─ Set final_value = "N/A" (field not applicable)
  └─ IF condition value is True:
     └─ Continue with post-processing

Example:

{
  "field_group": "Endotest",
  "field_name": "Request_Status",
  "source_id": "request",
  "field_path": ["status"],
  "field_condition": "Endotest.Request_Sent"
}

Meaning: Only populate "Request_Status" if "Request_Sent" is True. Otherwise set to "N/A".
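
The condition check described in this step might look like the sketch below. It assumes output_inclusion is a nested dict keyed by group and then field name, and that the condition string has the form group.field; the function name check_field_condition is hypothetical.

```python
CONDITION_ERROR = "$$$$ Condition Field Error"


def check_field_condition(output_inclusion, field_condition):
    """Return (proceed, final_value); final_value is only meaningful when proceed is False."""
    if not field_condition:
        return True, None  # no condition configured: always process the field
    group, _, name = field_condition.partition(".")
    value = output_inclusion.get(group, {}).get(name)
    if value is None or value == "undefined":
        return False, "undefined"
    if not isinstance(value, bool):
        return False, CONDITION_ERROR
    if value is False:
        return False, "N/A"
    return True, None
```

For example, with output_inclusion = {"Endotest": {"Request_Sent": False}}, the condition "Endotest.Request_Sent" yields (False, "N/A"), so Request_Status is set to "N/A" without further processing.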

Step 4: Post-Processing Transformations

4a. Array Flattening

If raw_value is an array → Join with | delimiter:

Input: ["Active", "Pending", "Resolved"]
Output: "Active|Pending|Resolved"

4b. Score Dictionary Formatting

If raw_value is dict with keys ['total', 'max'] → Format as string:

Input: {"total": 8, "max": 10}
Output: "8/10"
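
Steps 4a and 4b can be sketched together. This is an illustration only: the helper name normalize_raw_value is hypothetical, and it assumes a score dictionary is recognized when its keys are exactly total and max.

```python
def normalize_raw_value(raw_value):
    """Format score dicts as 'total/max' and join lists with '|'; leave other values as-is."""
    if isinstance(raw_value, dict) and set(raw_value) == {"total", "max"}:
        return f"{raw_value['total']}/{raw_value['max']}"
    if isinstance(raw_value, list):
        return "|".join(str(item) for item in raw_value)
    return raw_value
```

Calling normalize_raw_value(["Active", "Pending", "Resolved"]) gives "Active|Pending|Resolved", and normalize_raw_value({"total": 8, "max": 10}) gives "8/10".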

4c. true_if_any Transformation

If true_if_any is specified → Convert to boolean:

true_if_any: ["Active", "Pending"]
raw_value: "Active"

→ Does raw_value match ANY value in true_if_any list?
→ TRUE

4d. value_labels Mapping

If value_labels is specified → Map value to localized text:

{
  "raw_value": "active",
  "value_labels": [
    {"value": "active", "text": {"fr": "Actif", "en": "Active"}},
    {"value": "inactive", "text": {"fr": "Inactif", "en": "Inactive"}}
  ]
}

→ Output: "Actif" (French text)

4e. field_template Formatting

If field_template is specified → Apply template with $value placeholder:

field_template: "Score: $value/100"
final_value: 85

→ Output: "Score: 85/100"

Configuration File Structure

File Location

Endobest_Dashboard_Config.xlsx
├─ Sheet 1: "Inclusions_Mapping" (field mapping definition)
└─ Sheet 2: "Regression_Check" (non-regression rules)
             [See DOCUMENTATION_12_QUALITY_CHECKS.md]

Inclusions_Mapping Sheet Overview

Row 1 (Headers):
A                 B              C                D           E
field_group      field_name     source_name      source_id   field_path
F                 G              H                I
field_template   field_condition  true_if_any     value_labels

Row 2+: Field definitions (one per row)

Color Coding (for visual identification):

Yellow: Extended fields or Calculated fields (requires special attention)
Blue: Questionnaire-sourced fields (q_id, q_name, q_category)
Red: Fields with errors or missing required data
White: Record/Inclusion/Request fields

Column Reference

Column A: field_group

Type: String (required)
Description: Logical grouping of related fields in output JSON
Rules:

The combination of field_group and field_name must be unique (the same field_name can exist in different groups)
Becomes a dictionary key in JSON: output[field_group][field_name]
Controls field visibility in regression checks

Examples:

Patient_Identification    → Contains patient metadata
Inclusion                 → Inclusion status and data
Endotest                  → Lab test information
Custom_Data              → Default for general fields
Infos_Générales          → General information
Antécédents Médicaux     → Medical history

Column B: field_name

Type: String (required)
Description: Unique field identifier within its group
Rules:

Must not be empty
Can contain letters, numbers, underscores, hyphens
Special text in parentheses is automatically removed
- Example: Patient_Age (years) → Patient_Age

Excel Behavior: When cell contains Patient_Age (years), the system parses it as:

field_name = "Patient_Age"  # Parenthetical text stripped
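
The parenthetical stripping can be sketched with a regular expression. The helper name clean_field_name is hypothetical, and the exact pattern used by the system is not documented here, so treat this as one plausible reading of the rule.

```python
import re


def clean_field_name(cell_value):
    """Drop any '(...)' annotation and surrounding whitespace from a field_name cell."""
    return re.sub(r"\s*\([^)]*\)", "", cell_value).strip()
```

For instance, clean_field_name("Patient_Age (years)") returns "Patient_Age".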

Column C: source_name

Type: String (enum)
Required: Yes (unless cell contains "Not Specified")
Valid Values:

Inclusion               → Field from inclusion data
Record                  → Field from clinical record
Request                 → Field from lab test request
Patient / Douleurs      → Questionnaire name (implicit q_name=)
Signes et symptômes    → Questionnaire name (implicit q_name=)
Calculated              → Custom function (no direct source)
Not Specified           → Skip this row (used for spacing/comments)

Column D: source_id

Type: String (enum with prefixes or JSON array)
Description: Specifies how to identify the data source

Format Options:

1. Questionnaire by ID (Recommended)

Syntax: q_id=<uuid>
Example: q_id=550e8400-e29b-41d4-a716-446655440000
Speed: Fastest (direct lookup)

2. Questionnaire by Name

Syntax: q_name=<name>
Example: q_name=Symptom Questionnaire
Speed: Slower (sequential search)

3. Questionnaire by Category

Syntax: q_category=<category>
Example: q_category=Medical History
Speed: Slower (sequential search)

4. Record Source

Value: record
Means: Extract from clinical record data

5. Inclusion Source

Value: inclusion
Means: Extract from inclusion metadata

6. Request Source

Value: request
Means: Extract from lab test request

7. Calculated Function

Syntax: <function_name>
Example: search_in_fields_using_regex, if_then_else, extract_parentheses_content
See Section: Custom Functions Reference
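
As an illustration of how the seven formats above can be told apart, the sketch below splits a source_id cell into a source type and a value. The function name parse_source_id and the "calculated" label for function names are assumptions made for this example.

```python
QUESTIONNAIRE_PREFIXES = ("q_id", "q_name", "q_category")
DIRECT_SOURCES = ("record", "inclusion", "request")


def parse_source_id(source_id):
    """Return (source_type, value) for a source_id cell."""
    source_id = source_id.strip()
    prefix, separator, value = source_id.partition("=")
    if separator and prefix in QUESTIONNAIRE_PREFIXES:
        return prefix, value
    if source_id in DIRECT_SOURCES:
        return source_id, None
    # Anything else is treated as the name of a custom function
    return "calculated", source_id
```

With this sketch, "q_name=Symptom Questionnaire" maps to ("q_name", "Symptom Questionnaire"), "record" maps to ("record", None), and "if_then_else" maps to ("calculated", "if_then_else").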

Column E: field_path

Type: JSON array (required when field is specified)
Description: Path to navigate nested JSON structure

Syntax Examples:

Simple field:

["name"]
// Equivalent to: data["name"]

Nested path:

["record", "patient", "demographics", "age"]
// Equivalent to: data["record"]["patient"]["demographics"]["age"]

Array index:

["record", "clinicResearchData", 0, "test_name"]
// Equivalent to: data["record"]["clinicResearchData"][0]["test_name"]

Wildcard (all elements):

["record", "clinicResearchData", "*", "test_name"]
// Returns: [test_name_1, test_name_2, test_name_3, ...]
// Result: Automatically joined with "|" in final value

For Calculated Functions (arguments):

[
  ".*surgery.*",
  "Previous_Surgery",
  "Recent_Surgery"
]
// The function name goes in source_id (Column D), e.g. search_in_fields_using_regex
// field_path holds only the arguments passed to that function

Column F: field_template

Type: String with $value placeholder (optional)
Description: Apply formatting to the final value
Rules:

Only applied if final_value is not "undefined" or "N/A"
Must contain $value placeholder
Result: Template with $value replaced by actual value

Examples:

Template: "$value%"
Value: 85
Result: "85%"

Template: "Score: $value/100"
Value: 42
Result: "Score: 42/100"

Template: "Status: $value (Updated)"
Value: "Active"
Result: "Status: Active (Updated)"
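
A minimal sketch of the template rule, assuming a plain text substitution of the $value placeholder; the helper name apply_field_template is hypothetical.

```python
def apply_field_template(final_value, field_template):
    """Substitute $value in the template unless the value is 'undefined' or 'N/A'."""
    if not field_template or final_value in ("undefined", "N/A"):
        return final_value
    return field_template.replace("$value", str(final_value))
```

apply_field_template(85, "$value%") returns "85%", while apply_field_template("N/A", "$value%") leaves "N/A" untouched.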

Column G: field_condition

Type: String (field name reference, optional)
Description: Conditional field inclusion based on another field's value
Rules:

If specified, must reference another field name already processed
Must evaluate to a boolean value
Referenced as <field_group>.<field_name>

Logic:

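IF field_condition_value is undefined/null:
  Set final_value = "undefined"
ELIF field_condition_value is not a boolean:
  Set final_value = "$$$$ Condition Field Error"
ELIF field_condition_value == False:
  Set final_value = "N/A"
ELSE (field_condition_value == True):
  Process field normally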

Examples:

field_condition: Inclusion.isPrematurelyTerminated
Meaning: Only process this field if patient is prematurely terminated

field_condition: Endotest.Request_Sent
Meaning: Only process if test request was sent

Column H: true_if_any

Type: JSON array (optional)
Description: Convert to boolean if value matches ANY item in array

Syntax:

["value1", "value2", "value3"]

Logic:

LOOP through true_if_any array:
  IF raw_value == any_item:
    RETURN True

RETURN False

Example:

{
  "field_name": "Is_Active",
  "true_if_any": ["active", "pending", "processing"]
}

raw_value = "pending"
→ Does "pending" exist in ["active", "pending", "processing"]?
→ YES → Final value = True

raw_value = "completed"
→ Does "completed" exist in list?
→ NO → Final value = False

Column I: value_labels

Type: JSON array of mapping objects (optional)
Description: Map field values to localized text labels

Syntax:

[
  {
    "value": "raw_value_1",
    "text": {
      "fr": "Libellé Français",
      "en": "English Label"
    }
  },
  {
    "value": "raw_value_2",
    "text": {
      "fr": "Autre Libellé",
      "en": "Another Label"
    }
  }
]

Logic:

LOOP through value_labels array:
  IF label_map.value == raw_value:
    RETURN label_map.text.fr  (French text)

IF no match found:
  RETURN "$$$$ Value Error: {raw_value}"

Example:

{
  "field_name": "Status",
  "value_labels": [
    {
      "value": 1,
      "text": {"fr": "Inclus", "en": "Included"}
    },
    {
      "value": 0,
      "text": {"fr": "Pré-inclus", "en": "Pre-included"}
    }
  ]
}

raw_value = 1
→ Map to French label: "Inclus"
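
The lookup logic above can be sketched as follows. The function name apply_value_labels and the language parameter are assumptions for illustration; the documented behavior always returns the French text.

```python
def apply_value_labels(raw_value, value_labels, language="fr"):
    """Return the label text for raw_value, or the documented error marker if unmapped."""
    for label_map in value_labels:
        if label_map.get("value") == raw_value:
            return label_map["text"][language]
    return f"$$$$ Value Error: {raw_value}"


status_labels = [
    {"value": 1, "text": {"fr": "Inclus", "en": "Included"}},
    {"value": 0, "text": {"fr": "Pré-inclus", "en": "Pre-included"}},
]
```

Here apply_value_labels(1, status_labels) returns "Inclus", and an unmapped value such as 2 returns "$$$$ Value Error: 2".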

Special Value Prefixes

This section documents special prefixes and keywords used in Extended Fields configuration for value resolution and field references.

Prefix: `$` (String Literal)

Location: In function arguments (like if_then_else parameters)

Meaning: Marks a string value as a literal (not a field reference)

Syntax: $value (just prefix with $, no quotes needed)

Without $ prefix:

{
  "field_path": ["is_true", "Has_Consent", "YES", "NO"]
}
// "YES" is interpreted as a FIELD NAME to look up
// This will fail because no field named "YES" exists

With $ prefix (correct):

{
  "field_path": ["is_true", "Has_Consent", "$YES", "$NO"]
}
// $YES is interpreted as LITERAL STRING "YES"
// $NO is interpreted as LITERAL STRING "NO"
// Has_Consent is interpreted as FIELD NAME (no prefix)

Why It Matters: The system needs to distinguish between:

Field references (look up values): Status, Is_Active, Patient_Id
Literal values (use as-is): $Active, $N/A, $Ready

No Prefix: Field References

Location: Arguments where field names are expected

Meaning: Refers to a field in the current inclusion data

Examples:

{
  "field_path": ["is_true", "Has_Consent", "$YES", "$NO"]
}
// Has_Consent ← field reference (look up this field's value)
// Other unprefixed names such as Status or Is_Active are resolved the same way

Resolution: The system looks up the field in the current inclusion object.

Wildcard: `*` in Field Paths

Location: In field_path column (Column E in Mapping sheet)

Meaning: Match all elements at this level

Syntax:

["record", "*", "results", "*", "value"]

Example 1: Single Level Wildcard

{
  "field_path": ["items", "*", "name"]
}
// Returns all "name" values from each item
// If items = [
//   {name: "Item 1", ...},
//   {name: "Item 2", ...},
//   {name: "Item 3", ...}
// ]
// Result: ["Item 1", "Item 2", "Item 3"]
// Final output: "Item 1|Item 2|Item 3" (pipe-joined)

Example 2: Multiple Level Wildcard

{
  "field_path": ["record", "*", "data", "*", "test"]
}
// Matches test values at multiple nesting levels

Post-Processing:

Arrays are automatically joined with | delimiter
Scalar values are kept as-is

Value Resolution in if_then_else

When using the if_then_else function, values are resolved based on their format:

Format	Type	Resolution
`true`, `false`	Boolean literal	Used directly
`42`, `3.14`	Numeric literal	Used directly
`$string`	String literal	Remove `$` prefix and use value
`field_name`	Field reference	Look up field value

Examples:

{
  "field_path": ["is_true", "Has_Consent", "$APPROVED", "$NOT_APPROVED"]
}
// Has_Consent → field reference (look it up)
// $APPROVED → string literal (use "APPROVED")
// $NOT_APPROVED → string literal (use "NOT_APPROVED")

{
  "field_path": ["==", "Status", "$Active", "Overall_Status", "$MISSING"]
}
// Status → field reference
// $Active → string literal (use "Active")
// Overall_Status → field reference
// $MISSING → string literal (use "MISSING")

Summary Table: Special Prefixes

Symbol	Meaning	Example
`$value`	String literal (remove `$` prefix)	`$YES`, `$READY`, `$N/A`
No prefix	Field reference (look up)	`Status`, `Patient_Id`
`*`	Wildcard in field_path (all array elements)	`["items", "*", "name"]`

Data Sources Explained

1. Questionnaire Sources (q_id, q_name, q_category)

What Are Questionnaires?

Questionnaires are forms/surveys filled out by patients or clinicians in the Research Clinic system. Each questionnaire has:

ID: Unique identifier (UUID)
Name: Display name (e.g., "Symptom Assessment")
Category: Logical grouping (e.g., "Medical History")
Answers: Key-value pairs of responses

Data Structure

all_questionnaires: {
  "qcm-uuid-1": {
    "questionnaire": {
      "id": "qcm-uuid-1",
      "name": "Symptom Questionnaire",
      "category": "Symptoms"
    },
    "answers": {
      "question_1": "answer_value",
      "question_2": true,
      "question_3": 42
    }
  },
  "qcm-uuid-2": {
    "questionnaire": {
      "id": "qcm-uuid-2",
      "name": "Medical History",
      "category": "History"
    },
    "answers": {
      "has_diabetes": false,
      "has_hypertension": true
    }
  }
}

Finding Questionnaires

Option 1: By ID (Fastest)

{
  "source_id": "q_id=qcm-uuid-1",
  "field_path": ["answers", "question_1"]
}
// Direct lookup in dictionary by ID
// Performance: O(1) constant time

Option 2: By Name

{
  "source_id": "q_name=Symptom Questionnaire",
  "field_path": ["answers", "question_1"]
}
// Sequential search through all questionnaires
// Performance: O(n) proportional to questionnaire count

Option 3: By Category

{
  "source_id": "q_category=Symptoms",
  "field_path": ["answers", "question_1"]
}
// Sequential search for category match
// Performance: O(n)

Recommendation: Use q_id= for best performance. Name and category searches are slower but acceptable if IDs are not available.

2. Record Source (Clinical Data)

What Is Record Data?

The clinical record contains all medical information for a patient within the Research Clinic context:

Protocol inclusions status
Clinical research data (test requests, results)
Patient demographics
Medical history

Data Structure

record_data: {
  "record": {
    "id": "record-uuid",
    "patientId": "patient-uuid",
    "protocol_inclusions": [
      {
        "status": "incluse",
        "blockedQcmVersions": [],
        "clinicResearchData": [
          {
            "requestMetaData": {
              "tubeId": "tube-uuid-123"
            },
            "needRcp": false
          }
        ]
      }
    ]
  }
}

Example Extraction

{
  "source_id": "record",
  "field_path": ["record", "protocol_inclusions", 0, "status"]
}
// Result: "incluse"

{
  "source_id": "record",
  "field_path": ["record", "protocol_inclusions", 0, "clinicResearchData", "*", "requestMetaData", "tubeId"]
}
// Result: ["tube-uuid-123"]
// Final: "tube-uuid-123" (multiple values would be joined with "|")

3. Inclusion Source (Inclusion Metadata)

What Is Inclusion Data?

Inclusion data contains metadata about the patient's inclusion in the research protocol:

Basic patient information (name, birthday)
Organization assignment
Inclusion status
Inclusion date

Data Structure

inclusion_data: {
  "id": "patient-uuid",
  "name": "Doe, John",
  "birthday": "1975-05-15",
  "status": "incluse",
  "inclusionDate": "2024-10-15",
  "organization_id": "org-uuid-added-by-system",
  "organization_name": "Center Name-added-by-system"
}

Example Extraction

{
  "source_id": "inclusion",
  "field_path": ["name"]
}
// Result: "Doe, John"

{
  "source_id": "inclusion",
  "field_path": ["status"]
}
// Result: "incluse"

4. Request Source (Lab Test Data)

What Is Request Data?

Request data contains information about laboratory tests ordered and their results:

Test request status
Diagnostic status
Individual test results
Result values

Data Structure

request_data: {
  "id": "request-uuid",
  "tubeId": "tube-uuid-123",
  "status": "completed",
  "diagnostic_status": "Completed",
  "results": [
    {
      "testName": "Complete Blood Count",
      "value": "Normal",
      "unit": ""
    },
    {
      "testName": "Coelioscopie",
      "value": "Findings documented",
      "unit": ""
    }
  ]
}

Example Extraction

{
  "source_id": "request",
  "field_path": ["status"]
}
// Result: "completed"

{
  "source_id": "request",
  "field_path": ["results", "*", "testName"]
}
// Result: ["Complete Blood Count", "Coelioscopie"]
// Final: "Complete Blood Count|Coelioscopie"

5. Calculated Source (Custom Functions)

What Are Calculated Fields?

Calculated fields derive their values from custom business logic functions, not direct data extraction. The function can access other already-processed fields and perform complex transformations.

Examples

{
  "source_name": "Calculated",
  "source_id": "search_in_fields_using_regex",
  "field_path": [".*SURGERY.*", "Previous_Surgery", "Recent_Surgery"]
}
// Function searches multiple fields using regex

{
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": ["is_true", "Requested", "$\"YES\"", "$\"NO\""]
}
// Function applies conditional logic

{
  "source_name": "Calculated",
  "source_id": "extract_parentheses_content",
  "field_path": ["Status_Field"]
}
// Function extracts text from within parentheses

See Section: Custom Functions Reference for detailed function documentation.

6. Inclusion Source with Organization Enrichment (center_name)

What Is Organization Center Mapping?

The organization center mapping feature enriches patient inclusion data with standardized center identifiers. When configured, the center_name field is automatically added to each inclusion record, allowing you to group patients by center codes.

Data Source: Inclusion Type

{
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "source_type": "inclusion",
  "field_path": ["center_name"]
}

Fields Available from Organization Enrichment

Field	Type	Description	Availability
`center_name`	String	Standardized center identifier	If mapping file exists
`organization_name`	String	Full organization name	Always
`organization_id`	String	Organization UUID	Always

Data Structure

inclusion_data: {
  "organization_id": "org-uuid",
  "organization_name": "Hospital Cardiology Research Lab",
  "center_name": "HCR-MAIN",  // ← Added by organization mapping
  "id": "patient-uuid",
  ...
}

Example Extraction

{
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "source_type": "inclusion",
  "field_path": ["center_name"]
}
// Result: "HCR-MAIN"

{
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "source_type": "inclusion",
  "field_path": ["organization_name"]
}
// Result: "Hospital Cardiology Research Lab"

Configuration Requirements

To use this feature:

Create eb_org_center_mapping.xlsx in script directory (see DOCUMENTATION_10_ARCHITECTURE.md Organization ↔ Center Mapping section)
Define mapping rules in the Org_Center_Mapping sheet
Add extended field with source type "inclusion" and field_path ["center_name"]

Availability:

✅ If mapping file exists and organization is mapped → center_name = mapped value
⚠️ If mapping file missing or organization not in mapping → center_name = organization name (fallback)

Example Configuration

{
  "field_group": "Patient_Identification",
  "field_name": "Center_Name",
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "source_type": "inclusion",
  "field_path": ["center_name"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}

Result in output:

{
  "Patient_Identification": {
    "Organisation_Name": "Hospital Cardiology Research Lab",
    "Center_Name": "HCR-MAIN",
    ...
  }
}

Field Path Syntax

Single-Level Access

["field_name"]
// JavaScript equivalent: data.field_name
// Result: value or undefined

Multi-Level Nesting

["record", "patient", "demographics", "age"]
// JavaScript: data.record.patient.demographics.age

Array Index Access

["items", 0, "name"]
// JavaScript: data.items[0].name
// Accesses first element of array

Negative Index (from end)

["items", -1, "name"]
// JavaScript: data.items[data.items.length - 1].name
// Accesses last element of array

Wildcard Paths (Multiple Values)

Single Wildcard (One Level)

["questionnaire", "answers", "*", "value"]
// Returns all values from each answer object
// Result: Array of values [value1, value2, value3, ...]

Multiple Wildcards (Deep)

["record", "*", "data", "*", "test"]
// Matches nested wildcards at multiple levels
// Returns: All tests at matching paths

Wildcard Result Flattening

path: ["items", "*", "values", "*", "score"]
items: [
  {
    "values": [
      {"score": 10},
      {"score": 20}
    ]
  },
  {
    "values": [
      {"score": 30},
      {"score": 40}
    ]
  }
]

// Without flattening: [[10, 20], [30, 40]]
// With flattening (used): [10, 20, 30, 40]

Edge Cases & Behavior

Missing Path

field_path: ["missing", "field"]
data: {}

Result: "undefined" (not null or empty string)

Null/None Values in Path

field_path: ["patient", "contact", "phone"]
data: {"patient": {"contact": null}}

Result: "undefined" (stops at null)

Non-Dictionary/Non-List Element

field_path: ["patient", "name", "first"]
data: {"patient": {"name": "John"}}  // "name" is string, not dict

Result: "undefined" (cannot navigate string)
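
The path rules in this section (keys, positive and negative indexes, wildcards with flattening, and the edge cases above) can be sketched in one recursive function. This is an illustration under stated assumptions, not the project's implementation: the name get_nested_value is hypothetical, wildcards are applied to lists only, and a null found anywhere along the path (including the last step) yields "undefined".

```python
UNDEFINED = "undefined"


def get_nested_value(data, path):
    """Resolve a field_path against nested dicts/lists, returning 'undefined' on any dead end."""
    if data is None:
        return UNDEFINED
    if not path:
        return data
    key, rest = path[0], path[1:]
    if key == "*":
        if not isinstance(data, list):
            return UNDEFINED
        results = []
        for item in data:
            value = get_nested_value(item, rest)
            if isinstance(value, list) and "*" in rest:
                results.extend(value)  # flatten results of deeper wildcards
            elif value != UNDEFINED:
                results.append(value)
        return results
    if isinstance(data, dict):
        return get_nested_value(data[key], rest) if key in data else UNDEFINED
    if isinstance(data, list) and isinstance(key, int) and -len(data) <= key < len(data):
        return get_nested_value(data[key], rest)
    return UNDEFINED  # e.g. trying to navigate into a string


scores = {"items": [{"values": [{"score": 10}, {"score": 20}]},
                    {"values": [{"score": 30}, {"score": 40}]}]}
flat = get_nested_value(scores, ["items", "*", "values", "*", "score"])
```

In this example flat is [10, 20, 30, 40], matching the flattened result shown under Wildcard Result Flattening.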

Custom Functions Reference

Function 1: search_in_fields_using_regex

Purpose: Search multiple fields for regex pattern match (case-insensitive)

Syntax:

{
  "source_name": "Calculated",
  "source_id": "search_in_fields_using_regex",
  "field_path": ["regex_pattern", "field_1", "field_2", ...]
}

Parameters:

regex_pattern (string): Regular expression pattern (case-insensitive)
field_1, field_2, ... (strings): Field names to search (looked up in output_inclusion)

Logic:

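all_undefined = True
FOR EACH field_name in [field_1, field_2, ...]:
  value = get_value_from_inclusion(field_name)
  IF value is "undefined":
    CONTINUE
  all_undefined = False
  IF value is string AND value matches regex_pattern:
    RETURN True

IF all_undefined:
  RETURN "undefined"
RETURN False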

Return Value:

True if ANY field matches the pattern
False if NO fields match
"undefined" if ALL fields are undefined

Examples:

Example 1: Detect if any surgery field contains "surgery"

{
  "field_name": "Has_Surgery_History",
  "source_id": "search_in_fields_using_regex",
  "field_path": [".*surgery.*", "Previous_Surgery", "Recent_Surgery", "Planned_Surgery"]
}

If any of these fields contains "surgery" → True
Otherwise → False

Example 2: Check for specific procedures

{
  "field_name": "Is_Endoscopy_Planned",
  "source_id": "search_in_fields_using_regex",
  "field_path": ["endoscopy|colonoscopy", "Procedure_Type", "Procedure_Notes"]
}

Matches if "endoscopy" OR "colonoscopy" appears in either field
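
A runnable sketch of this function is shown below. It assumes the already-processed fields are available as a flat dict named fields (the real lookup in output_inclusion may differ) and that the pattern may match anywhere in the value.

```python
import re


def search_in_fields_using_regex(fields, pattern, *field_names):
    """Return True if any named field matches pattern, 'undefined' if all are undefined."""
    regex = re.compile(pattern, re.IGNORECASE)
    values = [fields.get(name, "undefined") for name in field_names]
    if all(value == "undefined" for value in values):
        return "undefined"
    return any(
        isinstance(value, str) and value != "undefined" and regex.search(value) is not None
        for value in values
    )


fields = {"Procedure_Type": "Colonoscopy", "Procedure_Notes": "undefined"}
is_planned = search_in_fields_using_regex(
    fields, "endoscopy|colonoscopy", "Procedure_Type", "Procedure_Notes"
)
```

Here is_planned is True because the match is case-insensitive and one of the two fields contains "Colonoscopy".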

Function 2: extract_parentheses_content

Purpose: Extract text within the first set of parentheses

Syntax:

{
  "source_name": "Calculated",
  "source_id": "extract_parentheses_content",
  "field_path": ["field_name"]
}

Parameters:

field_name (string): Field to extract from (looked up in output_inclusion)

Logic:

value = get_value_from_inclusion(field_name)
IF value is not defined:
  RETURN "undefined"

MATCH first occurrence of (content) pattern
IF match found:
  RETURN content
ELSE:
  RETURN "undefined"

Return Value:

Text extracted from parentheses (e.g., "Active")
"undefined" if no parentheses found or field undefined

Examples:

Example 1: Extract status from formatted field

Input: "Patient Status (Active)"
Output: "Active"

Example 2: Extract category name

Input: "Medical Condition (Hypertension)"
Output: "Hypertension"

Example 3: Content with spaces and punctuation

Input: "Surgery Scheduled (Appendectomy - Jan 15)"
Output: "Appendectomy - Jan 15"
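
The extraction can be sketched with a single regular expression. As in the other sketches, fields is assumed to be a flat dict of already-processed values, which is a simplification of the real output_inclusion lookup.

```python
import re


def extract_parentheses_content(fields, field_name):
    """Return the text inside the first '(...)' of the field, or 'undefined'."""
    value = fields.get(field_name, "undefined")
    if value is None or value == "undefined":
        return "undefined"
    match = re.search(r"\(([^)]*)\)", str(value))
    return match.group(1) if match else "undefined"
```

With fields = {"Status_Field": "Patient Status (Active)"}, calling extract_parentheses_content(fields, "Status_Field") returns "Active".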

Function 3: append_terminated_suffix

Purpose: Add " - AP" suffix to status if patient prematurely terminated

Syntax:

{
  "source_name": "Calculated",
  "source_id": "append_terminated_suffix",
  "field_path": ["status_field_name", "is_terminated_field_name"]
}

Parameters:

status_field_name (string): Field containing status value
is_terminated_field_name (string): Boolean field indicating termination

Logic:

status = get_value_from_inclusion(status_field_name)
is_terminated = get_value_from_inclusion(is_terminated_field_name)

IF status is undefined:
  RETURN "undefined"

IF is_terminated is TRUE:
  RETURN status + " - AP"
ELSE:
  RETURN status
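
The same logic as a short Python sketch, again assuming a flat dict of already-processed fields named fields (an assumption made for illustration):

```python
def append_terminated_suffix(fields, status_field_name, is_terminated_field_name):
    """Append ' - AP' to the status when the termination flag is True."""
    status = fields.get(status_field_name, "undefined")
    if status is None or status == "undefined":
        return "undefined"
    if fields.get(is_terminated_field_name) is True:
        return f"{status} - AP"
    return status
```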

Return Value:

Status with " - AP" suffix if terminated
Original status if not terminated
"undefined" if status field undefined

Examples:

Example 1: Mark prematurely terminated patients

{
  "field_name": "Inclusion_Status",
  "source_id": "append_terminated_suffix",
  "field_path": ["Base_Status", "isPrematurelyTerminated"]
}

If isPrematurelyTerminated = True:
  "incluse" → "incluse - AP"

If isPrematurelyTerminated = False:
  "incluse" → "incluse"

Function 4: if_then_else

Purpose: Unified conditional logic with 8 different operators

Syntax:

{
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": ["operator", arg1, arg2_optional, result_if_true, result_if_false]
}

Operator Reference

Operator 1: is_true

Signature: ["is_true", field_name, result_if_true, result_if_false]
Logic: IF field == True THEN result_if_true ELSE result_if_false
Example:

{
  "field_path": ["is_true", "Has_Consent", "$\"Consented\"", "$\"Not Consented\""]
}
// If Has_Consent = True → "Consented"
// If Has_Consent = False → "Not Consented"

Operator 2: is_false

Signature: ["is_false", field_name, result_if_true, result_if_false]
Logic: IF field == False THEN result_if_true ELSE result_if_false
Example:

{
  "field_path": ["is_false", "Has_Exclusion", "$\"Eligible\"", "$\"Excluded\""]
}

Operator 3: is_defined

Signature: ["is_defined", field_name, result_if_true, result_if_false]
Logic: IF field is not undefined THEN result_if_true ELSE result_if_false
Example:

{
  "field_path": ["is_defined", "Surgery_Date", "$\"Date Available\"", "$\"No Date\""]
}

Operator 4: is_undefined

Signature: ["is_undefined", field_name, result_if_true, result_if_false]

Logic: IF field is undefined THEN result_if_true ELSE result_if_false

Example:

{
  "field_path": ["is_undefined", "Last_Contact", "$\"Never Contacted\"", "$\"Contacted\""]
}

Operator 5: all_true

Signature: ["all_true", [field_1, field_2, ...], result_if_true, result_if_false]

Logic: IF all fields == True THEN result_if_true ELSE result_if_false

Example:

{
  "field_path": ["all_true", ["Has_Consent", "Has_Results", "Is_Complete"], "$\"READY\"", "$\"INCOMPLETE\""]
}
// Returns "READY" only if ALL three fields are True

Operator 6: all_defined

Signature: ["all_defined", [field_1, field_2, ...], result_if_true, result_if_false]

Logic: IF all fields are defined THEN result_if_true ELSE result_if_false

Example:

{
  "field_path": ["all_defined", ["First_Name", "Last_Name", "Birth_Date"], "$\"COMPLETE\"", "$\"INCOMPLETE\""]
}
// Returns "COMPLETE" only if ALL three fields have values

Operator 7: ==

Signature: ["==", value1, value2, result_if_true, result_if_false]

Logic: IF value1 == value2 THEN result_if_true ELSE result_if_false

Example:

{
  "field_path": ["==", "Status", "$\"Active\"", "$\"Is Active\"", "$\"Not Active\""]
}
// If Status equals "Active" → "Is Active"

Operator 8: !=

Signature: ["!=", value1, value2, result_if_true, result_if_false]

Logic: IF value1 != value2 THEN result_if_true ELSE result_if_false

Example:

{
  "field_path": ["!=", "Status", "$\"Inactive\"", "$\"Active\"", "$\"Inactive\""]
}
// If Status NOT equal to "Inactive" → "Active"

Value Resolution

The function supports multiple value types:

Boolean Literals:

true, false
// Used directly without field lookup

Numeric Literals:

42, 3.14, 0, -1
// Used directly without field lookup

String Literals (Prefixed with $):

"$\"Active\"", "$\"Ready\"", "$\"N/A\""
// Remove $ prefix before using
// $ prefix signals: don't look this up as field name

Field References (No Prefix):

"Status", "Is_Active", "Patient_Name"
// Looked up in output_inclusion

Complex Examples:

{
  "field_path": ["==", "Status_Code", 1, "$\"Active\"", "$\"Inactive\""]
}
// Compare Status_Code field against numeric value 1

{
  "field_path": ["all_true", ["Consent_Received", "Test_Completed"], "Overall_Status", "$\"MISSING\""]
}
// If both conditions true, use Overall_Status value
// If either false, use literal "MISSING"
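
The eight operators and the value resolution rules can be combined into one small evaluator. The sketch below is illustrative only: the function names `resolve_value` and `if_then_else`, the flat `fields` dictionary, the "undefined" marker, and raising `ValueError` for an unknown operator are assumptions, not the documented implementation.

```python
UNDEFINED = "undefined"  # assumed marker for missing values


def resolve_value(arg, fields):
    """Resolve one field_path argument to a concrete value."""
    if isinstance(arg, (bool, int, float)):
        return arg  # boolean and numeric literals are used directly
    if isinstance(arg, str) and arg.startswith("$"):
        literal = arg[1:]  # drop the $ prefix
        # Strip the surrounding quotes of forms such as $"Active"
        if len(literal) >= 2 and literal[0] == literal[-1] == '"':
            literal = literal[1:-1]
        return literal
    if isinstance(arg, str):
        return fields.get(arg, UNDEFINED)  # unprefixed string: field reference
    return arg


def if_then_else(field_path, fields):
    """Evaluate ["operator", arg1, arg2_optional, result_if_true, result_if_false]."""
    operator, *args = field_path
    if operator in ("==", "!="):
        left, right, if_true, if_false = args
        equal = resolve_value(left, fields) == resolve_value(right, fields)
        condition = equal if operator == "==" else not equal
    else:
        target, if_true, if_false = args
        if operator == "is_true":
            condition = resolve_value(target, fields) is True
        elif operator == "is_false":
            condition = resolve_value(target, fields) is False
        elif operator == "is_defined":
            condition = resolve_value(target, fields) != UNDEFINED
        elif operator == "is_undefined":
            condition = resolve_value(target, fields) == UNDEFINED
        elif operator == "all_true":
            condition = all(resolve_value(name, fields) is True for name in target)
        elif operator == "all_defined":
            condition = all(resolve_value(name, fields) != UNDEFINED for name in target)
        else:
            raise ValueError(f"Unknown operator: {operator}")
    return resolve_value(if_true if condition else if_false, fields)


fields = {"Status_Code": 1, "Consent_Received": True, "Test_Completed": False}
print(if_then_else(["==", "Status_Code", 1, '$"Active"', '$"Inactive"'], fields))
```

Resolving the result arguments through the same function as the condition arguments is what allows a result to be either a literal or the value of another field.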

Post-Processing Transformations

Transformation Order

Raw Value Extraction
    ↓
Condition Check
    ↓
IF final_value is list:
  └─ Join with "|" delimiter
    ↓
IF final_value is score dict (has 'total' and 'max'):
  └─ Format as "total/max"
    ↓
IF true_if_any is specified:
  └─ Apply boolean conversion
    ↓
IF value_labels is specified:
  └─ Apply label mapping
    ↓
IF field_template is specified:
  └─ Apply formatting with $value

Transformation 1: Array Flattening

When: Raw value is an array/list

Action: Join elements with | delimiter

Example:

Raw: ["Active", "Pending", "Resolved"]
Output: "Active|Pending|Resolved"

Transformation 2: Score Dictionary Formatting

When: Raw value is dict with keys ['total', 'max']

Action: Convert to "total/max" string format

Example:

Raw: {"total": 8, "max": 10}
Output: "8/10"

Transformation 3: true_if_any

When: true_if_any is specified in configuration

Action: Check if raw value matches ANY item in the array

Example:

{
  "true_if_any": ["Active", "Pending", "Processing"],
  "raw_value": "Active"
}
// Result: true

{
  "true_if_any": ["Active", "Pending"],
  "raw_value": "Completed"
}
// Result: false

Transformation 4: value_labels

When: value_labels is specified in configuration

Action: Map raw value to localized text

Logic:

FOR EACH label_map in value_labels:
  IF label_map.value == raw_value:
    RETURN label_map.text.fr  (French label)

IF no match:
  RETURN "$$$$ Value Error: {raw_value}"

Example:

{
  "value_labels": [
    {"value": "active", "text": {"fr": "Actif", "en": "Active"}},
    {"value": "inactive", "text": {"fr": "Inactif", "en": "Inactive"}}
  ],
  "raw_value": "active"
}
// Result: "Actif"

Transformation 5: field_template

When: field_template is specified (and value is not "undefined" or "N/A")

Action: Replace $value placeholder with actual value

Example:

template: "Score: $value/100"
raw_value: 85
Result: "Score: 85/100"

template: "Status [$value]"
raw_value: "Active"
Result: "Status [Active]"
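
The five transformations can be chained in the documented order as in the sketch below. It is an illustration under assumptions: the function name `post_process` and the `config` dictionary shape are hypothetical, and the real pipeline may guard "undefined" or "N/A" values at more steps than the template step shown here.

```python
def post_process(value, config):
    """Apply the documented transformations in order to one raw value."""
    # 1. Array flattening
    if isinstance(value, list):
        value = "|".join(str(item) for item in value)

    # 2. Score dictionary formatting
    if isinstance(value, dict) and "total" in value and "max" in value:
        value = f"{value['total']}/{value['max']}"

    # 3. true_if_any: boolean conversion
    true_if_any = config.get("true_if_any")
    if true_if_any:
        value = value in true_if_any

    # 4. value_labels: map to the French label, or flag an unmapped value
    value_labels = config.get("value_labels")
    if value_labels:
        for label_map in value_labels:
            if label_map["value"] == value:
                value = label_map["text"]["fr"]
                break
        else:
            value = f"$$$$ Value Error: {value}"

    # 5. field_template: skipped for "undefined" and "N/A"
    template = config.get("field_template")
    if template and value not in ("undefined", "N/A"):
        value = template.replace("$value", str(value))

    return value


print(post_process({"total": 8, "max": 10}, {}))
print(post_process(85, {"field_template": "Score: $value/100"}))
```

Because each step reads the result of the previous one, a list is already a pipe-joined string by the time `true_if_any` or `value_labels` sees it.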

Configuration Examples

Example 1: Simple Field Extraction

Requirement: Extract patient name from inclusion data

{
  "field_group": "Patient_Identification",
  "field_name": "Patient_Name",
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "field_path": ["name"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}

Flow:

Source: inclusion data
Extract: data["name"]
Result: "Doe, John"
Output: {"Patient_Identification": {"Patient_Name": "Doe, John"}}

Example 2: Questionnaire Field with Label Mapping

Requirement: Extract symptom severity and map to French labels

{
  "field_group": "Symptoms",
  "field_name": "Severity",
  "source_name": "Symptoms (OUI/NON)",
  "source_id": "q_id=77e488a1-d3c-148af-a6bc-8fe1f55e82e4",
  "field_path": ["answers", "question5"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": [
    {"value": 1, "text": {"fr": "Léger", "en": "Mild"}},
    {"value": 2, "text": {"fr": "Modéré", "en": "Moderate"}},
    {"value": 3, "text": {"fr": "Sévère", "en": "Severe"}}
  ]
}

Flow:

Source: Questionnaire with ID 77e488a1-...
Extract: answers["question5"] → 2
Apply value_labels: 2 → "Modéré"
Output: {"Symptoms": {"Severity": "Modéré"}}

Example 3: Conditional Field

Requirement: Only show request status if test was requested

{
  "field_group": "Endotest",
  "field_name": "Request_Status",
  "source_name": "Request",
  "source_id": "request",
  "field_path": ["status"],
  "field_template": null,
  "field_condition": "Endotest.Request_Sent",
  "true_if_any": null,
  "value_labels": null
}

Flow:

Check condition: Endotest.Request_Sent
If False → Set to "N/A"
If True → Extract status from request data
Output: {"Endotest": {"Request_Status": "completed"}} or "N/A"
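
The condition check in this flow can be sketched as a small guard around the extraction step. The helper name `apply_field_condition`, the nested `output` dictionary, and the "Group.Field" lookup are assumptions for illustration. The sketch also treats any condition value other than `True` the same as `False`, which the flow above does not specify.

```python
def apply_field_condition(condition, output, extract):
    """Run extract() only when the referenced condition field is True.

    condition: "Group.Field" reference, or None when no condition is set
    output:    fields processed so far, as {group: {field_name: value}}
    extract:   zero-argument callable that performs the raw extraction
    """
    if condition is None:
        return extract()
    group, _, name = condition.partition(".")
    if output.get(group, {}).get(name) is True:
        return extract()
    return "N/A"


output = {"Endotest": {"Request_Sent": False}}
print(apply_field_condition("Endotest.Request_Sent", output, lambda: "completed"))
```

Passing the extraction as a callable means no source data is read when the condition fails.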

Example 4: Calculated Field with if_then_else

Requirement: Show overall status based on inclusion and termination

{
  "field_group": "Inclusion",
  "field_name": "Inclusion_Status_Complete",
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": ["is_true", "isPrematurelyTerminated", "$\"incluse - AP\"", "Inclusion_Status"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}

Flow:

Check: Is isPrematurelyTerminated == True?
If YES → Return literal "incluse - AP"
If NO → Return value of Inclusion_Status field
Output: {"Inclusion": {"Inclusion_Status_Complete": "incluse - AP"}} or "incluse"

Example 5: Array Field with Formatting

Requirement: Extract all test names and format them

{
  "field_group": "Endotest",
  "field_name": "Tests_Performed",
  "source_name": "Request",
  "source_id": "request",
  "field_path": ["results", "*", "testName"],
  "field_template": "Tests: $value",
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}

Flow:

Source: request data
Extract: results[*].testName → ["Blood Test", "Imaging", "ECG"]
Array flattening → "Blood Test|Imaging|ECG"
Apply template → "Tests: Blood Test|Imaging|ECG"
Output: {"Endotest": {"Tests_Performed": "Tests: Blood Test|Imaging|ECG"}}
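
The wildcard extraction step in this flow can be sketched with a recursive path walker. This is a simplified illustration: the function name `get_nested_value`, the "undefined" marker, and the choice to skip elements that lack the requested key are assumptions about behavior, not the project's code.

```python
UNDEFINED = "undefined"  # assumed marker for missing values


def get_nested_value(data, path):
    """Follow a field_path made of dict keys, list indexes, and "*" wildcards."""
    if not path:
        return data
    key, rest = path[0], path[1:]

    if key == "*":
        if not isinstance(data, list):
            return UNDEFINED
        results = []
        for item in data:
            value = get_nested_value(item, rest)
            if value == UNDEFINED:
                continue  # skip elements that lack the requested path
            if "*" in rest and isinstance(value, list):
                results.extend(value)  # flatten nested wildcard results
            else:
                results.append(value)
        return results

    if isinstance(key, int) and not isinstance(key, bool):
        if isinstance(data, list) and -len(data) <= key < len(data):
            return get_nested_value(data[key], rest)
        return UNDEFINED

    if isinstance(data, dict) and key in data:
        return get_nested_value(data[key], rest)
    return UNDEFINED


request = {"results": [{"testName": "Blood Test"}, {"testName": "Imaging"}, {"testName": "ECG"}]}
names = get_nested_value(request, ["results", "*", "testName"])
print("Tests: " + "|".join(names))  # Tests: Blood Test|Imaging|ECG
```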

Example 6: Complex Conditional Logic

Requirement: Show surgery readiness status based on multiple conditions

{
  "field_group": "Surgery",
  "field_name": "Surgery_Status",
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": [
    "all_true",
    ["Surgery_Planned", "Surgeon_Assigned", "Date_Set"],
    "$\"READY_FOR_SURGERY\"",
    "$\"INCOMPLETE_PREPARATION\""
  ],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}

Flow:

Check: Are ALL of [Surgery_Planned, Surgeon_Assigned, Date_Set] == True?
If YES → "READY_FOR_SURGERY"
If NO → "INCOMPLETE_PREPARATION"
Output: {"Surgery": {"Surgery_Status": "READY_FOR_SURGERY"}} or "INCOMPLETE_PREPARATION"

Example 7: Search and Boolean Conversion

Requirement: Detect if patient has surgery history

{
  "field_group": "Medical_History",
  "field_name": "Has_Prior_Surgery",
  "source_name": "Calculated",
  "source_id": "search_in_fields_using_regex",
  "field_path": [".*surgery|.*intervention.*", "History_Notes", "Previous_Procedures"],
  "field_template": null,
  "field_condition": null,
  "true_if_any": null,
  "value_labels": null
}

Flow:

Search History_Notes and Previous_Procedures
Pattern: ".*surgery|.*intervention.*" (case-insensitive)
If ANY field matches → true
If NO matches → false
Output: {"Medical_History": {"Has_Prior_Surgery": true}}
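
A rough Python equivalent of this search is shown below. It assumes the fields are held in a flat dictionary and that the pattern may match anywhere in the text (`re.search`). Whether the real function anchors the match is not stated here, so treat this as a sketch.

```python
import re


def search_in_fields_using_regex(pattern, fields, *field_names):
    """Return True if the pattern matches the text of any listed field."""
    regex = re.compile(pattern, re.IGNORECASE)  # case-insensitive, as described
    for name in field_names:
        value = fields.get(name)
        if isinstance(value, str) and regex.search(value):
            return True
    return False


fields = {
    "History_Notes": "Laparoscopic SURGERY in 2019",
    "Previous_Procedures": "none reported",
}
print(search_in_fields_using_regex(
    ".*surgery|.*intervention.*", fields, "History_Notes", "Previous_Procedures"
))  # True
```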

User Guide: Adding/Modifying Fields

Step 1: Identify Data Source

Determine where the data lives:

Patient Name          → inclusion (inclusion_data)
Symptom Severity      → questionnaire (q_id, q_name, or q_category)
Clinical Notes        → record (record_data)
Test Results          → request (request_data)
Derived Value         → calculated (custom function)

Step 2: Locate Field Path

Navigate the JSON structure to find the exact path:

For Inclusion:

Open endobest_inclusions_old.json
Find a patient record
Look for field under "Patient_Identification"
Example path: ["name"]

For Questionnaire:

Need questionnaire ID/name/category
Look inside answers object
Example: q_id=abc-123, field_path: ["answers", "question_5"]

For Record:

Open a record with GET /api/records/byPatient
Navigate structure
Example: ["record", "clinicResearchData", 0, "requestMetaData"]

For Request:

Field from lab request response
Example: ["results", "*", "testName"]

Step 3: Create Configuration Row

Open Endobest_Dashboard_Config.xlsx → Inclusions_Mapping sheet

Row N:
A: field_group          (e.g., "Custom_Data")
B: field_name           (e.g., "Patient_Status")
C: source_name          (e.g., "Inclusion")
D: source_id            (e.g., "inclusion")
E: field_path           (e.g., ["status"])
F: field_template       (optional, e.g., "Status: $value")
G: field_condition      (optional, e.g., "Inclusion.Is_Active")
H: true_if_any          (optional, e.g., ["active", "pending"])
I: value_labels         (optional, complex JSON)

Step 4: Validate Configuration

Run the dashboard in check-only mode:

python eb_dashboard.py --check-only

Expected Output:

✓ Loaded 81 fields from extended configuration.
✓ All checks passed successfully!

If errors occur:

Error in config file, row 42, field 'field_path': Invalid JSON format.

→ Fix the JSON syntax in the cell
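
Before running the full check, the JSON cells of a row can be pre-validated with a few lines of Python. The sketch below is a hypothetical helper, not part of eb_dashboard.py: it assumes each row has already been read into a dictionary keyed by column name, and it reuses the error wording shown above.

```python
import json

# Columns parsed as JSON when the configuration is loaded
JSON_COLUMNS = ("field_path", "value_labels", "true_if_any", "field_condition")


def validate_row(row_number, row):
    """Return a list of error messages for the JSON cells of one config row."""
    errors = []
    for column in JSON_COLUMNS:
        cell = row.get(column)
        if cell is None or str(cell).strip() == "":
            continue  # optional columns may be left empty
        try:
            json.loads(cell)
        except (TypeError, ValueError):
            errors.append(
                f"Error in config file, row {row_number}, "
                f"field '{column}': Invalid JSON format."
            )
    return errors


print(validate_row(42, {"field_path": '["name" "address"]'}))
```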

Step 5: Test with Full Collection

python eb_dashboard.py

After collection completes, verify:

New field appears in endobest_inclusions.json
Values are populated correctly
No data quality issues reported

Step 6: Document the Field

Add comments in a separate notes section (if available) explaining:

Purpose of the field
Data source and ID
Any special transformations
Expected value ranges/types

Common Patterns & Recipes

Pattern 1: Boolean Flag from Multiple Conditions

Requirement: Create true/false flag based on multiple fields

{
  "field_group": "Flags",
  "field_name": "Is_Ready_For_Export",
  "source_name": "Calculated",
  "source_id": "if_then_else",
  "field_path": [
    "all_true",
    ["Has_Consent", "Data_Complete", "Approved"],
    true,
    false
  ]
}

Pattern 2: Score Display Formatting

Requirement: Show quality of life score as "X/100" format

{
  "field_group": "Quality_Metrics",
  "field_name": "QOL_Score_Display",
  "source_name": "q_id=...",
  "source_id": "q_id=...",
  "field_path": ["answers", "overall_score"],
  "field_template": "$value/100"
}

Pattern 3: Status Translation with Suffix

Requirement: Show inclusion status with " - AP" for terminated patients

{
  "field_group": "Inclusion",
  "field_name": "Status_With_Termination",
  "source_name": "Calculated",
  "source_id": "append_terminated_suffix",
  "field_path": ["Inclusion_Status", "isPrematurelyTerminated"]
}

Pattern 4: List-to-String Conversion

Requirement: Show all diagnoses as pipe-separated text

{
  "field_group": "Medical_Data",
  "field_name": "All_Diagnoses",
  "source_name": "Record",
  "source_id": "record",
  "field_path": ["record", "diagnoses", "*", "code"]
  // Result: "ICD-001|ICD-002|ICD-003"
}

Pattern 5: Optional Field Based on Condition

Requirement: Only show surgery details if surgery was performed

{
  "field_group": "Surgery",
  "field_name": "Surgery_Details",
  "source_name": "Record",
  "source_id": "record",
  "field_path": ["record", "surgery", "details"],
  "field_condition": "Surgery.Surgery_Performed"
  // If Surgery_Performed = false → "N/A"
}

Pattern 6: Enum-to-Text Mapping

Requirement: Convert numeric status codes to readable text

{
  "field_group": "Status",
  "field_name": "Inclusion_Status_Text",
  "source_name": "Inclusion",
  "source_id": "inclusion",
  "field_path": ["status_code"],
  "value_labels": [
    {"value": 0, "text": {"fr": "Pré-inclus", "en": "Pre-included"}},
    {"value": 1, "text": {"fr": "Inclus", "en": "Included"}},
    {"value": 2, "text": {"fr": "Exclus", "en": "Excluded"}}
  ]
}

Pattern 7: Pattern Matching in Multiple Fields

Requirement: Check if any medical note mentions specific condition

{
  "field_group": "Medical",
  "field_name": "Mentions_Hypertension",
  "source_name": "Calculated",
  "source_id": "search_in_fields_using_regex",
  "field_path": [
    "hypertension|high.*pressure|HBP",
    "Medical_History",
    "Current_Conditions",
    "Medication_Notes"
  ]
}

Pattern 8: Extracted Parenthetical Classification

Requirement: Extract diagnosis type from formatted text like "Disease (Type A)"

{
  "field_group": "Classification",
  "field_name": "Diagnosis_Type",
  "source_name": "Calculated",
  "source_id": "extract_parentheses_content",
  "field_path": ["Formatted_Diagnosis"]
}

Troubleshooting

Issue 1: "Invalid JSON format" Error

Symptom: Configuration validation fails with JSON parsing error

Cause: Malformed JSON in field_path, value_labels, or field_condition

Solution:

Open cell in JSON validator (jsonlint.com)
Verify all:
- Array brackets: [...]
- Object braces: {...}
- String quotes: "..."
- Commas between elements
Fix syntax errors
Re-run validation

Example:

["name" "address"]    // WRONG: no comma after "name"

["name", "address"]   // CORRECT

Issue 2: Field Returns "undefined"

Symptom: Field value always "undefined" in output

Causes:

Field path doesn't match actual data structure
Questionnaire ID incorrect
Source type mismatch

Solution:

Check if source data exists in endobest_inclusions_old.json
Verify JSON path by stepping through manually
Check questionnaire ID (use q_id for fastest lookup)
Enable debug mode to see detailed errors

python eb_dashboard.py --debug

Issue 3: Empty Array Result

Symptom: Wildcard path returns empty array instead of values

Causes:

Array elements don't exist at specified path
Wildcard position incorrect in path

Solution:

Verify array exists in source data
Check array element structure
Test path manually in JSON tool

Example:

// WRONG: No elements at this path
["record", "items", "*", "nonexistent_field"]

// CORRECT: Match actual structure
["record", "items", "*", "existing_field"]

Issue 4: Calculated Field Returns Error

Symptom: Calculated field value starts with "$$$$"

Causes:

Function name wrong
Function argument count mismatch
Referenced fields not yet processed

Solution:

Check function name spelling
Verify argument count in field_path
Ensure referenced fields are defined BEFORE calculated field
Check for circular dependencies

Common Errors:

"$$$$ Unknown Custom Function: typo_name"
→ Check function name spelling

"$$$$ Argument Error: function requires N arguments"
→ Check field_path array length

"$$$$ Value Error: undefined"
→ Referenced field is undefined; check order in config

Issue 5: value_labels Not Applied

Symptom: Raw value shown instead of mapped label

Causes:

Raw value doesn't match any entry in value_labels
JSON syntax error in value_labels
Case sensitivity mismatch

Solution:

Check raw value type (string vs. number)
Verify exact match in value_labels
Check for case mismatches (e.g., "Active" vs "active")
Add wildcard entry if needed

Example:

{
  "value_labels": [
    {"value": "active", "text": {"fr": "Actif"}},
    {"value": "inactive", "text": {"fr": "Inactif"}},
    {"value": "*", "text": {"fr": "Autre"}}  // Catch-all for unmapped values
  ]
}
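
To tell a type mismatch from a case mismatch, the raw value can be compared against the configured entries with a small diagnostic. The helper below is hypothetical (not part of the dashboard) and only classifies why a value fails to match.

```python
def explain_label_mismatch(raw_value, value_labels):
    """Classify how raw_value relates to the configured value_labels entries."""
    known = [entry["value"] for entry in value_labels]
    if raw_value in known:
        return "exact match"
    if any(str(value) == str(raw_value) for value in known):
        return "type mismatch"  # e.g. "1" (string) vs 1 (number)
    if any(str(value).lower() == str(raw_value).lower() for value in known):
        return "case mismatch"  # e.g. "Active" vs "active"
    return "no match"


labels = [
    {"value": "active", "text": {"fr": "Actif"}},
    {"value": 1, "text": {"fr": "Inclus"}},
]
print(explain_label_mismatch("Active", labels))  # case mismatch
print(explain_label_mismatch("1", labels))       # type mismatch
```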

Issue 6: Performance Degradation After Adding Field

Symptom: Collection takes significantly longer after adding field

Causes:

Sequential questionnaire search (use q_id instead)
Expensive regex in search_in_fields_using_regex
Deep wildcard paths (multiple levels)

Solution:

Use q_id= instead of q_name= or q_category=
Simplify regex patterns
Flatten wildcard paths where possible

Summary

The Field Mapping Configuration provides:

✅ 100% Externalized: No code changes needed to add fields
✅ Flexible Sourcing: Support for questionnaires, records, requests, calculated fields
✅ Rich Transformations: Labels, templates, conditions, custom functions
✅ User-Friendly: Excel-based configuration with validation
✅ Performance Optimized: Single-call questionnaire fetching, field batching

This architecture enables rapid iteration on data extraction without deploying code changes.

Document End
