Orchestrated backend development for the FAR AI decentralized inference network at FAR Labs, distributing compute across independent networks rather than centralized cloud infrastructure.
Provisioned GPU clusters with Kubernetes to support scalable, decentralized AI inference across heterogeneous compute environments.
Accelerated distributed inference for large language models exceeding 70B parameters on low-performance GPUs, reaching ~10 tokens/sec through model and network optimization.
Constructed a Rust (Axum) service for managing centralized exchange (CEX) accounts, including a data pipeline and dashboards for financial and business insights.
Fine-tuned Stable Diffusion XL (SDXL) with LoRA on RTX 5090 GPUs and integrated the workflow with ComfyUI for avatar generation.
Launched multiple Telegram applications, including an AGI-powered chatbot and an AI assistant tailored for gaming users.
Architected Go-based microservices that aggregate exchange data and relay it to backend systems over Kafka, deployed on Amazon EKS for auto-scaling and load balancing.
Engineered a multi-threaded Rust WebSocket proxy using the Tokio runtime and Axum to manage subscriptions and distribute market data.
Extended the SaaS backend with NestJS, improving trade operation workflows and user interaction handling.
Rewrote the order book implementation from TypeScript to Rust, improving execution speed and memory efficiency.
Integrated APIs from Coinbase, Binance, Kraken, and Deribit for spot, futures, and options trading, including conformance testing.
Reduced latency and improved concurrency in TypeScript services by optimizing database queries and introducing Redis caching.
Strengthened logging and monitoring with Elasticsearch and Grafana, improving system visibility and debugging.
Resolved over 50 production issues by working with client support and applying targeted fixes.