R*****i
About Candidate
I usually describe myself as a senior full-stack engineer who has spent most of my career building production systems that people actually depend on. I've worked across healthcare, finance, and large enterprise platforms, and over time I grew into roles where I wasn't just coding but also thinking about architecture, scale, and reliability. My strongest skills are backend development, cloud systems, and data-heavy applications, but I'm also comfortable on the frontend, working with modern JavaScript frameworks and building clean user experiences.
In the last several years, I've focused heavily on AI and machine learning systems, especially in regulated environments like healthcare, where accuracy, security, and explainability really matter. I'm very hands-on: I like digging into problems, understanding why something breaks, and fixing it properly instead of layering quick patches on top. At the same time, I enjoy working with people, mentoring teammates, and translating technical ideas into simple language for product and business partners.
Overall, I see myself as someone who’s reliable, curious, and calm under pressure. When something is unclear or complex, I slow down, break it into small pieces, and move forward step by step. That’s usually how I earn trust quickly and make a strong first impression.
Work & Experience
At Capgemini, I led AI engineering and platform optimization for Virtual Care AI, a secure enterprise healthcare platform supporting large-scale predictive modeling, multimodal AI, and LLM-driven automation. In addition to model development, I owned GPU performance optimization, inference reliability, and production-grade infrastructure across cloud and bare-metal environments, ensuring systems were explainable, auditable, and clinically trustworthy.

Key Contributions & Results

Predictive Modeling, Training & AI Workloads
• Scaled distributed training and inference across 10M+ EMR records using PySpark + Ray, improving patient risk detection accuracy by 27%.
• Trained and served PyTorch and Hugging Face transformer models for classification, summarization, and retrieval-augmented generation (RAG) workloads.
• Supported large-scale inference pipelines using vLLM, Triton Inference Server, and ONNX Runtime, optimizing throughput for LLM-based clinical assistants.
• Benchmarked training and inference performance using representative workloads (Megatron-style transformer configurations, long-context LLMs), comparing latency, throughput, and GPU utilization across model variants.
• Fine-tuned domain-specific LLMs via LoRA, instruction tuning, and task chaining, improving response accuracy and clinical relevance.

NVIDIA GPU Stack & Performance Optimization
• Optimized workloads on NVIDIA A100 GPUs, tuning CUDA, cuDNN, TensorRT, NCCL, and NVLink configurations for multi-GPU performance.
• Diagnosed GPU bottlenecks using nvidia-smi, DCGM metrics, CUDA profiling, and application-level telemetry.
• Tuned NCCL collective operations and batch sizing to improve multi-GPU training and inference efficiency.
• Ensured driver, CUDA, and firmware compatibility across environments to prevent performance regressions during upgrades.
• Piloted AWS Inferentia for cost-efficient inference where clinically appropriate, benchmarking against GPU baselines.

Bare-Metal Linux & Systems Engineering
• Supported bare-metal Linux GPU nodes for high-performance workloads, collaborating with platform teams on kernel options, driver stacks, and system tuning.
• Validated GPU nodes post-provisioning, ensuring correct PCIe/NVLink topology, NUMA alignment, and memory configuration (a minimal health-check sketch follows this section).
• Troubleshot Linux-level performance issues (I/O, memory pressure, CPU pinning) impacting AI workloads.
• Authored operational runbooks for GPU node validation, failure recovery, and performance troubleshooting.
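For illustration, a minimal sketch of the kind of GPU node health check behind those runbooks, assuming the nvidia-ml-py (pynvml) bindings, the same NVML library nvidia-smi uses; the free-memory threshold and ECC check are simplified placeholders rather than the actual validation suite:

```python
# Illustrative GPU node health check (simplified placeholder for the
# post-provisioning validation described above). Requires nvidia-ml-py.
import pynvml

def check_gpu_node(min_free_mem_gb: float = 10.0) -> list[str]:
    """Return human-readable warnings for this node's GPUs."""
    warnings = []
    pynvml.nvmlInit()
    try:
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            name = pynvml.nvmlDeviceGetName(handle)
            if isinstance(name, bytes):  # older bindings return bytes
                name = name.decode()
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            free_gb = mem.free / 1024**3
            if free_gb < min_free_mem_gb:
                warnings.append(f"GPU {i} ({name}): only {free_gb:.1f} GiB free")
            # Uncorrected ECC errors often mean the node should be drained.
            try:
                ecc = pynvml.nvmlDeviceGetTotalEccErrors(
                    handle,
                    pynvml.NVML_MEMORY_ERROR_TYPE_UNCORRECTED,
                    pynvml.NVML_VOLATILE_ECC,
                )
                if ecc > 0:
                    warnings.append(f"GPU {i} ({name}): {ecc} uncorrected ECC errors")
            except pynvml.NVMLError:
                pass  # ECC reporting unavailable on some GPUs
    finally:
        pynvml.nvmlShutdown()
    return warnings

if __name__ == "__main__":
    for w in check_gpu_node():
        print("WARN:", w)
```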
Workload Orchestration & Cluster Management
• Ran AI workloads across Kubernetes clusters (EKS, GKE, AKS) using GPU scheduling, node affinity, and resource quotas.
• Deployed inference and training services via Helm + ArgoCD, enabling reproducible, auditable rollouts.
• Integrated Slurm-based batch scheduling for large training and benchmarking jobs in hybrid environments.
• Used Ansible for cluster-level configuration management and environment consistency.
• Coordinated capacity planning across Kubernetes and batch workloads to meet clinical SLAs.

Containers & Runtime Environments
• Containerized AI workloads using Docker and Singularity, packaging CUDA, cuDNN, NCCL, and model dependencies for reproducibility.
• Published GPU-optimized containers via NGC/Singularity images, reducing environment drift across teams.
• Supported secure container execution in HIPAA-regulated environments with a minimal attack surface.

Cloud Provisioning & Automation
• Automated GPU infrastructure provisioning with Terraform and cloud-init, reducing cluster setup time from weeks to minutes.
• Built golden GPU images bundling drivers, CUDA libraries, and monitoring agents for rapid scaling.
• Standardized infrastructure templates across AWS and GCP to support multi-client deployments.

Networking & East-West Traffic Awareness
• Worked with platform teams on Layer 2/Layer 3 networking fundamentals (TCP/IP, DNS, VLANs, bonding) for GPU clusters.
• Supported east-west traffic optimization for multi-node inference and training workloads.
• Gained hands-on familiarity with RoCE/InfiniBand-backed GPU environments during benchmarking and capacity planning discussions.

Observability, Monitoring & Reliability
• Implemented GPU and application observability using Prometheus + Grafana, tracking utilization, memory, thermals, and error rates.
• Integrated CloudWatch and ELK-style logging for centralized debugging and audit trails.
• Built alerts for GPU saturation, memory leaks, and inference latency regressions.
• Delivered executive dashboards (Tableau, Power BI, Looker) summarizing AI system health and cost efficiency.

Generative AI, RAG & Safety
• Built production RAG pipelines using LangChain + FAISS, benchmarking Pinecone, Weaviate, and Chroma for scale and cost (a minimal retrieval sketch follows this role).
• Enforced guardrails, grounding filters, fallback chains, and PHI-safe workflows for clinical GenAI.
• Led LLM red-teaming sessions with clinicians and security teams to identify failure modes and harden deployments.

Leadership & Impact
• Led and mentored a 5-person team spanning ML, LLMOps, and platform engineering.
• Authored internal standards for GPU operations, benchmarking, observability, and model deployment.
• Partnered with product, compliance, and clinical leadership to align AI platform capabilities with regulatory and patient-care requirements.
• Secured $500K+ in follow-on funding by demonstrating scalable, trustworthy AI infrastructure.
• Established the client's reputation for production-grade, audit-ready healthcare AI.
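For illustration, a minimal sketch of the FAISS retrieval core of such a RAG pipeline, with sentence-transformers standing in for the production embedding service; the document snippets and model name are placeholders, and the real system layered LangChain orchestration, guardrails, and PHI-safe filtering on top:

```python
# Minimal RAG retrieval sketch: embed documents, index them in FAISS,
# and fetch the top-k passages for grounding an LLM prompt.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [  # illustrative placeholders, not clinical data
    "Patients with elevated HbA1c should be scheduled for follow-up.",
    "Telehealth visits require verified patient identity before PHI access.",
    "Discharge summaries must list all active medications.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative embedding model
embeddings = model.encode(documents, normalize_embeddings=True)

# Inner product on normalized vectors is cosine similarity.
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(np.asarray(embeddings, dtype=np.float32))

def retrieve(query: str, k: int = 2) -> list[tuple[float, str]]:
    """Return the top-k (score, document) pairs for the query."""
    q = model.encode([query], normalize_embeddings=True)
    scores, ids = index.search(np.asarray(q, dtype=np.float32), k)
    return [(float(s), documents[i]) for s, i in zip(scores[0], ids[0])]

for score, doc in retrieve("When does a patient need a follow-up visit?"):
    print(f"{score:.3f}  {doc}")
```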
At Capital One, I delivered NLP, automation, and statistical modeling solutions that streamlined document-heavy workflows, improved compliance reporting, and enabled scalable retrieval systems across banking systems.

Key Contributions & Results

NLP Modeling & Pipelines
• Designed and fine-tuned BERT, RoBERTa, and DistilBERT models for classification, entity recognition, and summarization; deployed on Google AI Platform, reducing manual claims review effort by 40%+ for a healthcare client (a fine-tuning sketch follows this section).
• Built reusable end-to-end NLP pipelines with spaCy, scikit-learn, and TensorFlow for document parsing and intent tagging, accelerating delivery across healthcare, insurance, and finance projects.
• Automated OCR-heavy claims ingestion by integrating Tesseract OCR with Dataflow + Pub/Sub, cutting batch processing latency by 35% compared to legacy Spark-only workflows.
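For illustration, a hedged sketch of the DistilBERT fine-tuning loop behind models like these, using the Hugging Face Trainer API; the toy dataset, labels, output path, and hyperparameters are placeholders, not project data:

```python
# Illustrative DistilBERT fine-tune for document classification.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Toy corpus standing in for de-identified claims text.
data = Dataset.from_dict({
    "text": ["claim approved after review", "missing documentation, denied"],
    "label": [1, 0],
})

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

data = data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="claims-classifier",  # hypothetical path
        num_train_epochs=1,
        per_device_train_batch_size=2,
        logging_steps=1,
    ),
    train_dataset=data,
)
trainer.train()
```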
Statistical Reporting & Risk Modeling
• Applied regression and time-series forecasting in R and BigQuery ML to complement NLP outputs with interpretable baselines, enabling risk forecasting under regulatory scrutiny.
• Delivered exploratory analysis and reporting in RMarkdown and ggplot2, improving transparency and communication of findings to auditors and executives.

Vector Search, Indexing & Retrieval
• Implemented hybrid retrieval systems combining FAISS vector indexes with Elasticsearch BM25, boosting top-5 retrieval accuracy from 68% → 84% through SME-driven evaluation (a score-fusion sketch follows this role).
• Added structured filtering via BigQuery and MongoDB, enabling jurisdiction-specific and policy-specific retrieval.
• Benchmarked FAISS vs. Elasticsearch for recall, latency, and scalability, delivering adoption recommendations.

Conversational AI & Virtual Assistants
• Built multilingual chatbots with Rasa, Node.js, and MongoDB for web, mobile, and WhatsApp, supporting thousands of concurrent users.
• Designed fallback workflows, sentiment-based routing, and escalation logic, reducing live-agent escalations by 33% and improving customer satisfaction scores.

MATLAB Simulation & Analysis
• Prototyped classification, forecasting, and topic modeling in MATLAB using the Statistics, Deep Learning, and Text Analytics toolboxes.
• Conducted Monte Carlo simulations and time-series stress testing to evaluate robustness under data drift, improving production reliability.

Cloud & Deployment
• Deployed NLP and retrieval models on Google AI Platform and TensorFlow Serving, exposing REST endpoints for production.
• Containerized pipelines with Docker and integrated Jenkins CI/CD, cutting deployment turnaround from weeks to days.
• Implemented RBAC, encryption, and logging for GCP deployments, ensuring compliance with healthcare and finance standards.

MLOps & Monitoring
• Automated evaluation with Python scripts logging precision/recall metrics to BigQuery, enabling systematic experiment tracking.
• Integrated drift detection into Dataflow ingestion jobs, flagging shifts in claims and document distributions to preserve model accuracy.
• Standardized experiment tracking via TensorBoard + BigQuery logs for reproducibility across teams.

Business Intelligence & Reporting
• Automated regulatory KPI dashboards with Python, R, BigQuery, and Tableau/Power BI, providing real-time AI system monitoring.
• Built audit-ready reporting pipelines consolidating NLP, forecasting, and risk outputs, accelerating compliance sign-offs.

Business Impact
• Delivered 6 production systems and 10+ PoCs across insurance, healthcare, and banking, saving clients 8,000+ labor hours annually.
• Authored reusable scripts and retrieval configs for internal knowledge bases, speeding project delivery and improving cross-team consistency.
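For illustration, a hedged sketch of the score fusion behind that hybrid retrieval; production used Elasticsearch's BM25, so the rank_bm25 package stands in here to keep the example self-contained, and the corpus, weight, and vector scores are placeholders:

```python
# Illustrative hybrid retrieval: blend lexical (BM25) and vector scores.
import numpy as np
from rank_bm25 import BM25Okapi

corpus = [  # placeholder documents
    "policy covers water damage in jurisdiction A",
    "claims must be filed within 30 days",
    "jurisdiction B excludes flood coverage",
]

bm25 = BM25Okapi([doc.split() for doc in corpus])

def minmax(x: np.ndarray) -> np.ndarray:
    """Scale scores to [0, 1] so lexical and vector scores are comparable."""
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)

def hybrid_rank(query: str, vector_scores: np.ndarray, alpha: float = 0.5):
    """Blend BM25 and vector similarity; alpha weights the lexical side."""
    lexical = minmax(bm25.get_scores(query.split()))
    blended = alpha * lexical + (1 - alpha) * minmax(vector_scores)
    return sorted(zip(blended, corpus), reverse=True)

# vector_scores would come from a FAISS search over the same corpus;
# hard-coded here purely for illustration.
for score, doc in hybrid_rank("flood coverage", np.array([0.2, 0.1, 0.9])):
    print(f"{score:.2f}  {doc}")
```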
At LoanDepot, I worked with enterprise clients to modernize analytics systems, automate document-heavy workflows, and deploy scalable, cloud-based solutions. My focus was on full-stack Python development, data pipelines, and cloud integration.
• Architected document intelligence pipelines in Python and Elasticsearch, indexing 50M+ contracts and policies with custom analyzers and sharding strategies to support sub-second retrieval under regulatory workloads.
• Designed forecasting frameworks using ARIMA, regression, and scikit-learn ensembles to predict claims volumes and retail demand, integrating outputs with BI dashboards for real-time operational planning (a forecasting sketch follows this role).
• Engineered ETL pipelines in Python and SQL, orchestrated with Airflow and batch schedulers, handling TB-scale datasets across heterogeneous healthcare and finance systems.
• Piloted containerized deployments with Docker, packaging analytics workloads into reproducible builds that cut onboarding and setup time by 70%.
• Delivered early Azure-based analytics services, wiring in enterprise IAM, encryption at rest, and audit logging for compliance readiness.
• Built CI/CD automation with Git and custom Python test harnesses, reducing deployment cycles from biweekly to daily while maintaining regulatory audit trails.
• Optimized high-volume batch jobs with parallelization, query tuning, and caching strategies, reducing processing time by 40%+ on critical reporting workflows.
• Developed interactive dashboards in JavaScript, React, and D3.js to visualize mortgage pipeline health, enabling executives to drill into loan-level analytics with real-time filters.
• Implemented responsive web forms in AngularJS for borrower data capture, integrating validations and API calls to back-end decision engines, improving user experience and reducing input errors by 25%.
• Built reusable JavaScript utility libraries for client-side data formatting and asynchronous API handling, reducing code duplication across multiple loan origination applications.
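For illustration, a minimal statsmodels sketch of the kind of ARIMA volume forecast those frameworks produced; the synthetic series and (p, d, q) order are placeholders, not tuned project values:

```python
# Illustrative ARIMA forecast over a synthetic monthly volume series.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Synthetic monthly claims volumes with trend + noise.
rng = np.random.default_rng(42)
idx = pd.date_range("2022-01-01", periods=36, freq="MS")
series = pd.Series(1000 + 10 * np.arange(36) + rng.normal(0, 25, 36), index=idx)

model = ARIMA(series, order=(1, 1, 1))  # illustrative order; pick via AIC/ACF
fit = model.fit()

forecast = fit.forecast(steps=6)  # next six months
print(forecast.round(0))
```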
At Amazon, I focused on backend Python development, data pipelines, and automation to improve order processing efficiency and customer analytics.
• Built ETL pipelines in Python and SQL to process multi-TB order datasets, ensuring accurate reporting and timely insights for retail operations.
• Developed automation scripts to streamline reconciliation of order, payment, and shipping data, reducing manual effort by 50%.
• Designed scalable data services using Flask APIs and RESTful endpoints, enabling integration with internal analytics and customer insights tools (a minimal endpoint sketch follows this role).
• Optimized high-volume batch jobs with parallelization and query tuning, cutting processing time by 30%+.
• Contributed to frontend reporting tools with JavaScript and jQuery, adding dynamic filtering and drill-down capabilities for business users.
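For illustration, a minimal sketch of a Flask data service along those lines; the routes, payload shape, and in-memory store are placeholders, and the real services sat behind internal auth and a proper datastore:

```python
# Illustrative Flask order-data service with two simple endpoints.
from flask import Flask, jsonify, request

app = Flask(__name__)

ORDERS = {  # stand-in for a real order database
    "1001": {"status": "shipped", "total": 42.50},
    "1002": {"status": "processing", "total": 18.00},
}

@app.get("/orders/<order_id>")
def get_order(order_id: str):
    order = ORDERS.get(order_id)
    if order is None:
        return jsonify(error="order not found"), 404
    return jsonify(order_id=order_id, **order)

@app.post("/orders/<order_id>/reconcile")
def reconcile(order_id: str):
    payload = request.get_json(silent=True) or {}
    # Real reconciliation compared order, payment, and shipping records;
    # here we just echo what was received.
    return jsonify(order_id=order_id, received=payload, reconciled=True)

if __name__ == "__main__":
    app.run(port=5000)
```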


