VR Komari

Healthcare Data Engineer & Data Architect

Building production-grade healthcare data platforms with OMOP CDM standardization, multi-cloud architecture (Azure + Snowflake), and comprehensive data governance through Microsoft Purview and Fabric


🚧 Portfolio Under Active Development

Interactive dashboards are being deployed incrementally. Check back soon for live analytics across all domains!

Multi-Domain Expertise

Production-ready data engineering solutions across diverse industries

🏥

Healthcare Analytics

FHIR-OMOP interoperability showcase with synthetic patient data and CMS hospital insights

  • OMOP CDM implementation
  • FHIR transformation pipeline
  • 9.9M+ clinical concepts
  • Population health analytics
Dashboards Coming Soon
🛒

Retail Analytics

Brazilian e-commerce marketplace analysis with order lifecycle and customer segmentation

  • 100K+ order transactions
  • Customer lifetime value
  • Product performance metrics
  • Delivery optimization
Dashboards Coming Soon
📊

Marketing Intelligence

Sample Superstore sales performance and customer profitability analysis

  • Sales performance tracking
  • Regional analytics
  • Customer segmentation
  • Profit margin analysis
Dashboards Coming Soon
💹

Financial Markets

Near-real-time S&P 100 monitoring with 15-minute incremental updates

  • Real-time market data
  • Sector performance analysis
  • Volatility tracking
  • 1M+ data points
Dashboards Coming Soon

Technology Stack

Multi-cloud platforms and enterprise-grade tools for production data engineering

☁️ Cloud & Data Platforms

Azure Data Factory, Snowflake SQL, Microsoft Fabric, Azure Synapse, AWS S3

🔐 Governance & Compliance

Microsoft Purview, Data Lineage, Metadata Catalog, OpenLineage

🏥 Healthcare Standards

OMOP CDM v5.4, FHIR R4, SNOMED CT, RxNorm, LOINC, ICD-10

⚙️ ETL/ELT & Orchestration

SSIS, dbt, Apache Iceberg, CDC Patterns, Mage AI

🗄️ Databases

PostgreSQL, SQL Server, DuckDB, SSAS Tabular

🐍 Data Processing

Python, Polars, pandas, R, NumPy

🤖 ML & Analytics

scikit-learn, OHDSI HADES, spaCy NLP, PatientLevelPrediction

📊 BI & Visualization

Power BI Premium, Tableau, DAX, Plotly, Streamlit

🔧 DevOps & Infrastructure

Docker, CI/CD, nginx, Git, systemd

Microsoft Stack Integration

Unified data platform leveraging the Microsoft ecosystem

Microsoft Fabric serves as the unified analytics platform, integrating Data Factory for ETL orchestration, Synapse for data warehousing, and Power BI for visualization, all within a single SaaS environment. Microsoft Purview provides the governance layer: it automatically catalogs data assets, tracks lineage across the entire pipeline, and enforces compliance policies. Combined with Azure's enterprise-grade security and Snowflake's compute efficiency, this stack supports scalable, governed, and interoperable healthcare data platforms that meet both regulatory requirements and analytical demands.

Architecture & Design Principles

Production-grade patterns for scalable healthcare data platforms

✅ Medallion Architecture

Bronze (raw) → Silver (OMOP standardized) → Gold (analytics-ready) with clear separation of concerns and data lineage tracking
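
As a rough illustration of the bronze, silver, and gold layering, the sketch below uses plain Python lists in place of real tables. The sample encounter records, the `to_silver` and `to_gold` helpers, and the `source_encounter_id` and `source_patient_id` lineage columns are hypothetical; the silver rows only loosely follow the OMOP `visit_occurrence` table.

```python
from collections import Counter

# Bronze: hypothetical raw encounter records, kept exactly as received
# (inconsistent casing and one duplicate are left in on purpose).
bronze = [
    {"encounter_id": "e1", "patient_id": "p1", "class": "inpatient", "start": "2024-01-05"},
    {"encounter_id": "e2", "patient_id": "p2", "class": "Outpatient", "start": "2024-01-06"},
    {"encounter_id": "e2", "patient_id": "p2", "class": "Outpatient", "start": "2024-01-06"},
    {"encounter_id": "e3", "patient_id": "p1", "class": "emergency", "start": "2024-02-01"},
    {"encounter_id": "e4", "patient_id": "p3", "class": "outpatient", "start": "2024-02-10"},
]

# OMOP standard visit concepts: 9201 inpatient, 9202 outpatient, 9203 emergency room.
VISIT_CONCEPTS = {"inpatient": 9201, "outpatient": 9202, "emergency": 9203}


def to_silver(rows):
    """Deduplicate bronze rows and standardize them into an OMOP-like shape."""
    seen = set()
    silver = []
    for row in rows:
        if row["encounter_id"] in seen:
            continue  # drop exact re-deliveries of the same encounter
        seen.add(row["encounter_id"])
        silver.append({
            "visit_occurrence_id": len(silver) + 1,
            "visit_concept_id": VISIT_CONCEPTS.get(row["class"].lower(), 0),
            "visit_start_date": row["start"],
            "visit_source_value": row["class"],            # verbatim source value
            "source_encounter_id": row["encounter_id"],    # hypothetical lineage column
            "source_patient_id": row["patient_id"],        # hypothetical lineage column
        })
    return silver


def to_gold(silver_rows):
    """Aggregate silver rows into visit counts per standard concept."""
    return dict(Counter(r["visit_concept_id"] for r in silver_rows))


silver = to_silver(bronze)
gold = to_gold(silver)
```

Each layer reads only from the one before it, so the gold counts can always be traced back through the silver lineage columns to the original bronze record.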

✅ Multi-Cloud Engineering

Azure Data Factory + Snowflake SQL for scalable ETL/ELT, SSIS for legacy integration, cross-cloud data movement patterns

✅ OMOP as Semantic Layer

OMOP CDM v5.4 as canonical data model enabling federated analytics, standardized vocabularies, and research network participation

✅ Data Quality Automation

OHDSI DataQualityDashboard with 3,500+ validation checks organized by the Kahn framework dimensions (conformance, completeness, plausibility), plus automated monitoring that achieves 95%+ quality scores
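
To make the idea of dimension-based checks concrete, here is a minimal sketch of three checks, one per Kahn framework category. It is not the DataQualityDashboard package (which is an R tool) and does not reproduce its checks; the sample rows, the year bounds, and the `run_checks` and `overall_score` helpers are assumptions made for illustration.

```python
# Hypothetical person rows with one deliberate failure per check.
rows = [
    {"person_id": 1, "year_of_birth": 1980, "gender_concept_id": 8532},
    {"person_id": 2, "year_of_birth": None, "gender_concept_id": 8507},
    {"person_id": 3, "year_of_birth": 2150, "gender_concept_id": 8507},
    {"person_id": 4, "year_of_birth": 1999, "gender_concept_id": 12345},
]

# 8507 = male, 8532 = female, 0 = no matching concept.
ALLOWED_GENDER_CONCEPTS = {0, 8507, 8532}
MIN_YEAR, MAX_YEAR = 1850, 2025  # hypothetical plausibility bounds

# One illustrative check per Kahn category; each returns True when a row passes.
CHECKS = {
    "completeness": lambda r: r["year_of_birth"] is not None,
    "plausibility": lambda r: r["year_of_birth"] is None
    or MIN_YEAR <= r["year_of_birth"] <= MAX_YEAR,
    "conformance": lambda r: r["gender_concept_id"] in ALLOWED_GENDER_CONCEPTS,
}


def run_checks(data):
    """Return the fraction of rows passing each check."""
    return {
        name: sum(1 for r in data if check(r)) / len(data)
        for name, check in CHECKS.items()
    }


def overall_score(results):
    """Average the per-check pass rates into a single quality score."""
    return sum(results.values()) / len(results)


results = run_checks(rows)
score = overall_score(results)
```

A monitoring job could compare `score` against a threshold such as 0.95 and alert when it drops below it.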

✅ FHIR-OMOP Interoperability

Bidirectional transformation pipelines, USCDI compliance, Bulk FHIR exports, legacy HL7v2 to FHIR R4 migration workflows
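
One direction of that mapping can be sketched as follows: a FHIR R4 `Patient` resource (as a plain dict) converted to an OMOP `person`-style row. This is a simplified assumption-laden example, not the pipeline itself; the function name `fhir_patient_to_omop_person` is hypothetical, only a few fields are handled, and real mappings also cover race, ethnicity, identifiers, and vocabulary lookups.

```python
# OMOP standard gender concepts: 8507 = male, 8532 = female; 0 = no matching concept.
GENDER_TO_CONCEPT = {"male": 8507, "female": 8532}


def fhir_patient_to_omop_person(patient, person_id):
    """Map a FHIR R4 Patient dict to a simplified OMOP person row.

    FHIR birthDate may be partial (YYYY, YYYY-MM, or YYYY-MM-DD),
    so missing month or day components are returned as None.
    """
    birth = patient.get("birthDate") or ""
    parts = birth.split("-") if birth else []

    def part(index):
        return int(parts[index]) if len(parts) > index else None

    gender = patient.get("gender")
    return {
        "person_id": person_id,
        "gender_concept_id": GENDER_TO_CONCEPT.get(gender, 0),
        "year_of_birth": part(0),
        "month_of_birth": part(1),
        "day_of_birth": part(2),
        "person_source_value": patient.get("id"),
        "gender_source_value": gender,
    }


# Hypothetical synthetic patient with a partial birth date.
example_patient = {
    "resourceType": "Patient",
    "id": "pat-1",
    "gender": "female",
    "birthDate": "1984-07",
}
person = fhir_patient_to_omop_person(example_patient, person_id=1)
```

Keeping the source values alongside the mapped concept IDs is what makes the reverse (OMOP to FHIR) direction feasible later.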

✅ Production ETL Patterns

Change data capture (CDC), incremental loading, watermark-based processing, error handling, idempotency, comprehensive monitoring
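
A minimal sketch of watermark-based incremental loading with an idempotent upsert might look like the following. It uses an in-memory dict as the target and ISO 8601 timestamp strings (which sort correctly as text); the `incremental_load` function, the row fields, and the sample data are hypothetical.

```python
def incremental_load(source_rows, target, watermark):
    """Upsert rows changed after `watermark` and return the new watermark.

    Rows are applied in timestamp order and keyed by `id`, so rerunning the
    same batch leaves the target unchanged (idempotent).
    """
    new_watermark = watermark
    for row in sorted(source_rows, key=lambda r: r["updated_at"]):
        if row["updated_at"] > watermark:
            target[row["id"]] = row  # keyed upsert: latest version wins
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


# Hypothetical change feed: id 1 is updated after the first extract.
source = [
    {"id": 1, "updated_at": "2024-01-01T00:00:00", "value": "a"},
    {"id": 2, "updated_at": "2024-01-02T00:00:00", "value": "b"},
    {"id": 1, "updated_at": "2024-01-03T00:00:00", "value": "c"},
]

target = {}
first_watermark = incremental_load(source, target, "2024-01-01T12:00:00")
snapshot = {key: dict(row) for key, row in target.items()}

# Rerunning with the advanced watermark picks up nothing new.
second_watermark = incremental_load(source, target, first_watermark)
```

Only rows newer than the stored watermark are processed, and because the write is a keyed upsert rather than an append, a retry after a failure cannot create duplicates.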

Design Philosophy

Core principles that guide architecture decisions

Standards before scale

OMOP CDM standardization enables trust, interoperability, and federated analytics. Don't scale fragmentation.

Governance is a feature, not overhead

Data quality frameworks, lineage tracking, and metadata catalogs are product features that enable AI and ML

AI outputs must be clinically validated

In healthcare, machine learning predictions require review by clinical subject-matter experts, bias detection, and continuous monitoring

Multi-cloud by design, not by accident

Azure + Snowflake integration patterns provide flexibility, cost optimization, and best-of-breed capabilities

Interoperability is a product, not a project

FHIR-OMOP bidirectional mapping is ongoing architecture work, not a one-time integration. Plan for evolution.