PARROT Dataset Analysis Results
Dataset: PARROT v1.0 - Multi-language Radiology Reports
Analysis Date: October 2025
Analysis Framework: Real-world usage patterns for Xt-EHR model validation
🤖 AI-Assisted Analysis Attribution
AI System Used: Claude Sonnet 4.5 (Anthropic) - General-Purpose AI Model
This analysis was compiled with AI assistance in accordance with EU AI Act Article 52 transparency requirements. The AI system performed: - Statistical analysis of 2,738 imaging reports - Pattern recognition and frequency calculations - Data element mapping and classification - Report compilation and visualization
All findings are validated against source data and subject to expert review. See EU-AI-ACT-COMPLIANCE.md for regulatory compliance details.
Dataset Overview
The PARROT (PAneuropean Radiology Reporting cOrpus for arTificial intelligence) v1.0 dataset provides a comprehensive collection of real-world imaging reports for evidence-based analysis of healthcare data models.
Dataset Characteristics
- Total Reports: 2,738 imaging reports
- Geographic Coverage: 21 countries across Europe
- Language Diversity: 14 languages
- Source Repository: PARROT-reports/PARROT_v1.0
- Research Purpose: Evidence-based assessment of imaging report data elements
Analysis Objective
Compare real-world imaging report content against the Xt-EHR Imaging Report model v0.2.1 to identify: - Essential data elements with high clinical usage - Administrative elements with limited real-world presence - Implementation priorities for basic vs. beyond basic classification
Data Sources
Referenced Models
This analysis is based on:
- PARROT v1.0 Dataset: Multi-language Radiology Reports
- Real-world evidence base for usage pattern analysis
- 2,738 reports across 14 languages and 21 countries
- Xt-EHR FHIR IG v0.2.1: EHDS Logical Information Models
- Imaging Report Model: EHDSImagingReport.fsh
- Repository: Xt-EHR/xt-ehr-common
- Imaging Study model: https://build.fhir.org/ig/Xt-EHR/xt-ehr-common/StructureDefinition-EHDSImagingStudy.html
Executive Summary
Analysis of 2,738 real-world imaging reports from the PARROT dataset reveals significant insights about data element usage patterns when compared to the Xt-EHR Imaging Report information model.
Dataset Overview
- Total Reports: 2,738 imaging reports
- Languages: 14 different languages (Polish most common: 837 reports)
- Modalities: 10 different imaging modalities (CT most common: 989 reports)
- Anatomical Areas: 126 different anatomical areas (chest most common)
- Countries: 21 different countries
- Subspecialties: Multiple subspecialties represented
Key Findings
Real-World Data Element Usage
Always Present in PARROT Dataset (100% coverage):
- Report Content - Every report contains narrative text
- Modality - Imaging technique used
- Anatomical Area - Body region examined
- Language - Report language
- Country/Provider - Geographic/institutional context
- Clinical Classification - ICD codes for findings
- Subspecialty - Medical specialty context
Frequently Present in Report Content:
- Measurements: 74.3% of reports contain quantitative measurements
- Normal Findings: 61.7% contain normal/unremarkable findings
- Pathological Findings: 49.1% contain abnormal findings
- Contrast Use: 37.4% mention contrast agents
- Recommendations: 36.4% include follow-up recommendations
- Comparison Studies: 23.7% reference prior studies
Missing Elements (Not Available in PARROT)
Administrative/Technical Elements:
- Document identifiers and timestamps
- Authorship details and signatures
- Status and workflow information
- Accession numbers
- Insurance/payment information
- DICOM metadata and technical parameters
Clinical Workflow Elements:
- Order information and requesting physician details
- Detailed patient demographics and clinical context
- Radiation dose and exposure information
- Structured specimen information
- Legal authentication details
Mapping to Xt-EHR Elements
HIGH USAGE ELEMENTS (Present in Real-World Practice)
Header Section - Core Elements:
header.documentType- Implied (imaging report)header.documentTitle- Derivable from modality + areaheader.language- Available (14 languages observed)header.serviceSpecialty- Available (subspecialty field)
Examination Report - Essential Elements:
body.examinationReport.modality- Always present (10 modalities)body.examinationReport.bodyPart- Always present (126 areas)body.examinationReport.resultData.resultText- Always presentbody.examinationReport.conclusion.impression- Always presentbody.examinationReport.conclusion.conditionOrFinding- Available via ICD codes
Clinical Content - Frequently Present:
body.examinationReport.medication- When contrast mentioned (37.4%)body.recommendation- When recommendations present (36.4%)body.comparisonStudy- When prior studies referenced (23.7%)
MEDIUM USAGE ELEMENTS (Sometimes Present)
Supporting Information - Variable Presence:
body.supportingInformation.condition- Derivable from clinical contextbody.examinationReport.observationResults- From measurements (74.3%)header.authorship.author- Limited (contributor codes available)
LOW/NO USAGE ELEMENTS (Candidates for "Beyond Basic")
Administrative Overhead (0% in real-world data):
header.identifier- No unique document identifiersheader.authorship.datetime- No authoring timestampsheader.lastUpdate- No update trackingheader.status- No status workflowheader.statusReason- No status reasonsheader.version- No versioningheader.period- No service periodsheader.accessionNumber- No accession numbersheader.healthInsuranceAndPaymentInformation- No insurance dataheader.intendedRecipient- No recipient specifications
Legal/Workflow Elements (0% in real-world data):
header.attestation- No attestation detailsheader.legalAuthentication- No legal authenticationheader.custodian- No custodian informationheader.confidentiality- No confidentiality levels
Order Management (0% in real-world data):
body.orderInformation.*- Complete section rarely usedorderId,orderDateAndTime,orderPlacerorderReasonText,orderReason,clinicalQuestion
Technical Metadata (0% in real-world data):
body.exposureInformation.*- Radiation dose informationeffectiveDose,equivalentDoseInformationbody.specimen- Limited specimen informationdicomStudyMetadata- No DICOM technical metadataattachments- No file attachmentspresentedForm- No alternative formats
Recommendations for Basic vs Beyond Basic Classification
BASIC ELEMENTS (Essential for core imaging reports):
Required for All Reports:
header.documentType- Imaging report identifierheader.documentTitle- Human-readable titleheader.language- Report languagebody.examinationReport.modality- Imaging techniquebody.examinationReport.bodyPart- Body regionbody.examinationReport.resultData.resultText- Report narrativebody.examinationReport.conclusion.impression- Clinical interpretation
Essential for Clinical Value:
header.serviceSpecialty- Medical specialty contextbody.examinationReport.conclusion.conditionOrFinding- Structured findingsbody.examinationReport.observationResults- Quantitative findings
INTERMEDIATE ELEMENTS (Valuable but not always needed):
Clinical Enhancement:
body.examinationReport.medication- Contrast/medication informationbody.recommendation- Follow-up recommendationsbody.comparisonStudy- Prior study referencesbody.supportingInformation.condition- Clinical context
BEYOND BASIC ELEMENTS (Advanced/Administrative):
Administrative Overhead:
- All
header.identifier,authorship.datetime,statusrelated elements header.accessionNumber,healthInsuranceAndPaymentInformationheader.attestation,legalAuthentication,custodianheader.confidentiality,intendedRecipient
Workflow Management:
- Complete
body.orderInformationsection header.version,lastUpdate,period
Technical Metadata:
body.exposureInformation(radiation dose)dicomStudyMetadata(DICOM technical details)body.specimen(detailed specimen information)attachments,presentedForm
Impact Assessment
Basic Elements Coverage:
- 7-10 core elements cover 90%+ of real-world imaging report value
- Focus on clinical content and essential metadata
- Supports primary use cases: clinical care, patient records, basic interoperability
Beyond Basic Impact:
- 35+ additional elements add administrative and technical capabilities
- Support advanced workflows: legal documentation, quality management, research
- Enable comprehensive DICOM integration and institutional processes
- Required for complex multi-institutional scenarios
Implementation Recommendation:
- Basic Profile: Focus on 7-10 core elements for initial implementations
- Enhanced Profile: Add clinical enhancement elements based on use case needs
- Full Profile: Include all elements for comprehensive institutional deployments
This analysis provides evidence-based guidance for prioritizing Xt-EHR element implementation based on real-world usage patterns.