Engineering Manager and Platform Architect
At Atlan, I have worked across multiple roles spanning senior individual contributor, platform architect, and engineering manager. My work has focused on building and scaling the metadata backbone that powers Atlan's enterprise data platform across AWS, Azure, and Google Cloud.
- Problem space
Atlan operates at the intersection of data governance, metadata management, and enterprise scale. The platform supports hundreds of enterprise customers, each with strict requirements around reliability, security, isolation, and performance. As usage scaled, the underlying systems needed to evolve from infrastructure-focused execution to a purpose-built data platform capable of supporting AI-native use cases.
- What I owned
I led the design and evolution of Atlan's core platform and data infrastructure, including the Metastore and Metadata Lakehouse. These systems form the control plane for metadata, search, and governance across billions of metadata records.
My scope included platform architecture, multi-tenancy design, reliability engineering, and team leadership. Over time, I grew the team from a small infrastructure-focused group into a data platform organization supporting enterprise-scale workloads.
- Key contributions and outcomes
- • Led the transition from infrastructure-centric systems to core data platform ownership
- • Built and scaled platform and infrastructure teams from 4 to 10+ engineers
- • Delivered a 7x improvement in tag propagation throughput while sustaining 99.9% reliability
- • Improved search performance by 3x, enabling AI-driven discovery and governance use cases
- • Reduced customer-reported incidents by 58% through systematic reliability engineering
- • Designed and operated multi-cloud systems across AWS, Azure, and Google Cloud
- • Established architecture review processes to drive long-term technical quality and consistency
- Current focus
My current work centers on building enterprise-grade data platforms that can support unstructured metadata at scale for AI-native use cases. This includes:
- • Defining and operationalizing reliability standards across distributed systems
- • Driving multi-cloud orchestration and operational maturity
- • Operating distributed systems handling billions of metadata records while raising the platform reliability bar toward a 99.99% SLA
- Website: https://www.atlan.com/
- Article: Loft - Customer Success Story
- Webinar: How Atlan Built A Kubernetes Platform In the Cloud
