June 28, 2026

Database & Data Engineering Consulting

Pragmatic consulting for PostgreSQL operations, data modeling, and data engineering infrastructure.

I help startups and engineering teams set up, scale, and improve operations for their databases and data platforms.

Whether you are deploying new database infrastructure or optimizing existing operations at scale, I provide pragmatic engineering support.

Here are the two areas I focus on:

1. PostgreSQL Operations & Data Modeling

I have worked on database internals and production operations. I help teams get the most out of Postgres without adding unnecessary operational complexity.

Operations & Scale: Designing and bootstrapping new PostgreSQL deployments, scaling connection pooling, and establishing high availability patterns.
Diagnostics & Optimization: Isolating performance bottlenecks, latency spikes, and lock contention using query execution tracing and active session monitoring.
Data Modeling & Schema Design: Designing schemas for high throughput, planning table partitioning strategies, and laying out efficient indexes.
Change Data Capture (CDC) & Security: Building reliable CDC pipelines (Debezium/logical replication) and auditing complex database access control lists (ACLs).
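
As a rough illustration of the active session monitoring mentioned above, here is a minimal sketch in Python. It assumes session state is sampled at a fixed interval (for example by polling pg_stat_activity) and that each sample keeps the wait_event column. The function name, the sample rows, and the choice to label non-waiting sessions as "CPU" are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def top_wait_events(samples, limit=3):
    """Rank wait events by their share of sampled active sessions.

    With fixed-interval sampling, the share of samples spent on a wait
    event approximates the share of database time spent on it.
    """
    # Assumption: a NULL wait_event means the session was active and not
    # waiting, which is labeled "CPU" here for readability.
    counts = Counter(s["wait_event"] or "CPU" for s in samples)
    total = sum(counts.values())
    return [(event, count / total) for event, count in counts.most_common(limit)]


# Hypothetical samples, shaped like rows polled from pg_stat_activity.
samples = [
    {"pid": 101, "wait_event": "transactionid"},  # waiting on a row lock holder
    {"pid": 102, "wait_event": "transactionid"},
    {"pid": 103, "wait_event": None},             # active, not waiting
    {"pid": 104, "wait_event": "DataFileRead"},   # waiting on disk I/O
]

print(top_wait_events(samples))
```

In this made-up sample, half of the active sessions are waiting on transactionid, which would point toward lock contention as the first thing to investigate.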

2. Data Engineering & Analytics Infrastructure

I have built and scaled query engines and data lake platforms processing petabytes of data. I help teams optimize their data platform architecture and query performance.

Query Engines at Scale: Setting up and tuning query engines like Apache Spark and Presto/Trino, and table formats like Apache Iceberg, for high-throughput analytics.
Data Pipeline Operations: Designing robust ETL/ELT pipelines and streamlining existing scripts to reduce execution times and compute costs.
Data Infrastructure Observability: Building end-to-end monitoring, logging, and metrics across platforms (Spark, YARN, Trino, HDFS) to ensure reliability.
Data Platform Architecture: Designing storage layouts, partitioning schemes, and choosing catalog tools for scalable, cost-efficient data lakes.
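
To make "partitioning schemes" concrete, here is a minimal sketch of one common option: a Hive-style daily partition layout, where each day's data lands in its own directory so engines can skip irrelevant days. The bucket name, the event_date column name, and the function itself are hypothetical. Table formats such as Iceberg manage partitioning through table metadata instead, so this shows only the plain directory-based approach.

```python
from datetime import datetime, timezone


def partition_path(table_root, event_time):
    """Build a Hive-style daily partition directory for an event timestamp."""
    # Normalize to UTC so one event always maps to the same partition,
    # regardless of the time zone the writer runs in.
    day = event_time.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return f"{table_root.rstrip('/')}/event_date={day}/"


# Hypothetical table location and event time, for illustration only.
event_time = datetime(2026, 6, 28, 23, 30, tzinfo=timezone.utc)
print(partition_path("s3://example-bucket/events", event_time))
```

Daily partitions are only one choice. The right granularity depends on data volume and query patterns, since partitions that are too fine produce many small files.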

How I Work

I prefer focused, high-impact engagements to open-ended arrangements:

Infrastructure Design & Setup: Partnering with your team to design, bootstrap, and deploy new Postgres setups, CDC pipelines, or data lake architectures (Iceberg/Trino).
Operations & Performance Audits: Conducting a 1-2 week review of your database metrics, schema designs, or ETL execution profiles to provide an optimization roadmap.
Technical Advisory: Collaborating with your team as a fractional expert on data architecture, scalability reviews, and query engine configuration.

Technical Background

I have spent my career building, scaling, and contributing to database engines and data platforms:

Yugabyte: Implemented transactional streaming CDC, OpenTelemetry tracing, and Active Session History.
LinkedIn: Led big data infrastructure observability initiatives and consulted with startups on database scaling and ETL optimization.
StarTree: Worked on real-time analytics infrastructure, query performance, and scaling Apache Pinot for high-throughput, low-latency workloads.
Qubole: Led the SQL teams for the Hive and Presto services, founded the data engineering team, and scaled usage.
Yahoo: Built and scaled Perl/C++ query engines processing petabytes of data daily.

Contact

If you have a problem you would like to discuss, email me at [email protected] or find me on LinkedIn.