Inference Systems
Serving, tuning, and simplifying model workflows so product teams can run AI features without guesswork.
I work on model-serving infrastructure, developer tooling, and technical writing that helps teams move from prototype to production.
Current Route
Status
Shipping
Serving models, building tooling, and turning implementation details into usable docs.
Systems
Triton, vLLM, Ollama, TensorRT, and LLM product workflows.
Output
Projects, explainers, and technical notes built to be inspected and reused.
Focus
Production AI
Strength
Clear Interfaces
Approach
Hands-On
The focus is practical AI engineering with enough product and frontend discipline to make the systems useful, observable, and understandable.
Serving, tuning, and simplifying model workflows so product teams can run AI features without guesswork.
Building frontends and explainers that make technical systems easier to inspect and use.
Turning experiments into notes, tutorials, and implementation guides that other engineers can reuse.
In this post, we outline the Galton-Watson Process.
Policy iteration, worked through a maze example.
Checking whether an agent is doing the right thing with policy evaluation methods.
Trained a Deep Q-Learning agent to navigate tracks autonomously in a simulated racing environment by learning from reward feedback.
Fine-tuned a pretrained EfficientNet backbone to classify images into multiple waste categories, using data augmentation to improve model performance.
Explored prompt-engineering strategies with large language models to generate structured haikus, enforcing syllable constraints and thematic coherence.
Interactive web guide demonstrating core D3.js features and visualization techniques.
Real-time chat app with WebSocket messaging; Elm frontend and Express.js backend.
Python Discord bot with Twitch API integration for live stream alerts and moderation.
Cross-platform desktop color picker built with Rust and Tauri, showcasing native UI integration.
Interactive Sudoku solver with step-by-step guidance, built in Python using backtracking with heuristic improvements.
Interactive Wordle solver using entropy, with a practice mode for improving your play.
The roadmap tracks topics worth studying deeply enough to turn into usable systems, notes, or experiments.
Reading
Reinforcement Learning (Sutton & Barto), Deep Learning (Goodfellow et al.)
Exploring
RL algorithms • Policy gradients
Exploring
Distributed systems • GPU acceleration
Operating Principles
Reading
Graph Representation Learning (Hamilton)
Exploring
Knowledge graphs • GNNs
Reading
Generative Agents (Park et al.)
Exploring
Multi-agent systems • Emergent behavior
Yard Notes
These are references for clarity, speed, and product restraint rather than trend chasing.
Generative UI prototyping for fast idea validation.
Sketch-first thinking for product and systems work.
Clear execution and issue tracking without clutter.
Diagramming that stays rough, legible, and quick to share.
Research synthesis and study support from messy source material.
Motion and storytelling references for interface exploration.