High-performance AI inference platform with intelligent model routing: classifies query complexity via embeddings and heuristics, sending simple queries to fast, cheap models and complex ones to GPT-4-class models.
Multi-level caching: L1 in-memory (Moka) + L2 distributed (Redis) with semantic similarity matching via cosine similarity on embeddings.
Full observability stack with Prometheus metrics and pre-configured Grafana dashboards for inference latency, cache hit rates, and model health.
Fully async Rust architecture with horizontal scaling via Docker and Kubernetes.
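
The routing and two-level semantic cache described above can be illustrated with a minimal sketch. It is written in plain Python for brevity and is not the platform's Rust code: the in-process lists stand in for Moka and Redis, and the names `SemanticCache`, `lookup`, `route`, the 0.95 similarity threshold, and the keyword heuristic are all assumptions made for illustration.

```python
import math
from typing import Optional


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class SemanticCache:
    """One cache tier: returns a stored answer whose embedding is close enough."""

    def __init__(self, threshold: float = 0.95):  # assumed threshold
        self.threshold = threshold
        self.entries: list[tuple[list[float], str]] = []

    def get(self, emb: list[float]) -> Optional[str]:
        # Linear scan for the most similar stored embedding.
        best = max(self.entries, key=lambda e: cosine(emb, e[0]), default=None)
        if best is not None and cosine(emb, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, emb: list[float], answer: str) -> None:
        self.entries.append((emb, answer))


def lookup(l1: SemanticCache, l2: SemanticCache, emb: list[float]) -> Optional[str]:
    """Check L1 first, then L2; promote an L2 hit into L1."""
    hit = l1.get(emb)
    if hit is not None:
        return hit
    hit = l2.get(emb)
    if hit is not None:
        l1.put(emb, hit)
    return hit


def route(query: str, max_simple_words: int = 12) -> str:
    """Toy heuristic router: long queries or reasoning keywords go to the large tier."""
    hard_markers = ("prove", "derive", "step by step", "compare")  # hypothetical markers
    text = query.lower()
    if len(text.split()) > max_simple_words or any(m in text for m in hard_markers):
        return "large-model"
    return "small-model"
```

A request would first be embedded, then passed to `lookup`; only on a miss in both tiers would `route` pick a model. A production version would likely replace the linear scan with an approximate nearest-neighbour index.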
Curated a research repository exploring quantum computing's potential in cryptanalysis, particularly attacks on public-key encryption such as RSA and on hash functions such as SHA-256.
Developed and optimized quantum circuits in Qiskit, including the Quantum Fourier Transform, GHZ state preparation, and various commutation-relation experiments.
Worked on key OpenQASM programs for quantum registers and entanglement scenarios, improving execution times by 20%.
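
As an illustration of the GHZ state mentioned above, the following is a small state-vector simulation in plain Python. It is a sketch and not the repository's Qiskit or OpenQASM code; the helper names `apply_h`, `apply_cx`, and `ghz` are hypothetical. It applies a Hadamard to qubit 0 and then a CNOT from qubit 0 to every other qubit, which in Qiskit would correspond to `h(0)` followed by `cx(0, t)` calls on a `QuantumCircuit`.

```python
import math


def apply_h(state: list[float], q: int, n: int) -> list[float]:
    """Apply a Hadamard to qubit q of an n-qubit state (qubit 0 = least significant bit)."""
    s = 1 / math.sqrt(2)
    out = state[:]
    for i in range(2 ** n):
        if not (i >> q) & 1:          # visit each amplitude pair once
            j = i | (1 << q)
            a, b = state[i], state[j]
            out[i] = s * (a + b)
            out[j] = s * (a - b)
    return out


def apply_cx(state: list[float], control: int, target: int, n: int) -> list[float]:
    """Apply a CNOT: swap amplitudes that differ in the target bit when the control bit is 1."""
    out = state[:]
    for i in range(2 ** n):
        if (i >> control) & 1 and not (i >> target) & 1:
            j = i | (1 << target)
            out[i], out[j] = state[j], state[i]
    return out


def ghz(n: int) -> list[float]:
    """Prepare the n-qubit GHZ state (|0...0> + |1...1>) / sqrt(2) from |0...0>."""
    state = [0.0] * (2 ** n)
    state[0] = 1.0
    state = apply_h(state, 0, n)
    for t in range(1, n):
        state = apply_cx(state, 0, t, n)
    return state
```

Calling `ghz(3)` yields a vector whose only non-zero amplitudes sit at indices 0 and 7, each with probability one half, which is the signature of three-qubit GHZ entanglement.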