Abdul Rahman
AI/ML Engineer · PhD Candidate · IEEE VIS Best Paper Honorable Mention
I build production LLM systems and publish the research behind them. My work spans RAG pipelines, multi-agent workflows, fine-tuned language models, and interactive visual analytics, with 9 peer-reviewed papers, 90+ citations, and prior ML infrastructure experience at Amazon.
AI/ML engineer and researcher completing a PhD in Computer Science at Northern Illinois University (GPA 3.9). My research spans large language models, multimodal learning, and interactive visual analytics, with publications at IEEE VIS, TVCG, JCDL, and Scientometrics.
Before my PhD, I built data pipelines at Amazon that cut query latency by 30% and data quality workflows that validated 500K+ daily records. At NIU, I lead the Visual Analytics Lab and build production systems, from RAG pipelines and multi-agent workflows to computer vision with ViT and SAM. I am currently exploring how LLMs can reshape multi-view data exploration.
Education
Ph.D. Computer Science
Northern Illinois University
2020 – Present
GPA 3.9
M.S. Computer Science
Northern Illinois University
2018 – 2020
GPA 3.9
B.E. Computer Science
Osmania University, Hyderabad
2013 – 2017
Core Stack
ML & Data Science
LLM / GenAI
Data & Infrastructure
Languages & Visualization
Experience
Research and industry, with measurable outcomes.
Researcher
Northern Illinois University · DATA Lab, VA Lab & WASTE Lab
Built LLM-powered pipelines, multi-agent workflows, and computer vision systems across 3 labs. Fine-tuned LLMs with LoRA (−60% GPU memory, −45% training time). Published 9 papers at IEEE VIS, TVCG, JCDL.
Lab Head, Visual Analytics Lab
Northern Illinois University
Leading AI-driven visualization research. Mentored 5 grad students (100% completion, 4 co-authored papers). PC member: CIKM, WWW, JCDL.
Data Analyst
Amazon
Built SQL/MongoDB pipelines that cut query latency by 30%, and data quality workflows that validated 500K+ daily records across e-commerce datasets.
Teaching Assistant
Northern Illinois University · Algorithms, Databases, C/C++, Java
Supported 70+ students across core CS courses; 20% improvement in class performance.
Research
Peer-reviewed work in AI, LLMs, data visualization, and scientometrics.
YouTube and Science: Models for Research Impact
Abdul Rahman Shaikh, Hamed Alhoori, M. Sun
Can YouTube videos predict a paper's real-world influence? This work introduces new datasets linking video content to scholarly articles and trains ML models that forecast citation counts and public engagement using altmetrics signals, measuring scientific impact beyond academia.
iTrace: Interactive Tracing of Cross-View Data Relationships
Abdul Rahman Shaikh, Maoyuan Sun, Xingchen Liu, Hamed Alhoori, David Koop
When dashboards have many linked views, finding connections between distant data points is hard. iTrace introduces smooth focus transitions that guide attention across views, making cross-view relationship tracing faster and less error-prone.
Toward systematic design considerations of organizing multiple views
Abdul Rahman Shaikh, David Koop, Hamed Alhoori, Maoyuan Sun
How should multiple visualization panels be arranged? This paper reviews dozens of multi-view systems and distills layout principles grounded in perception and content, providing a framework for designing dashboards that help users connect information across views.
SightBi: Exploring Cross-View Data Relationships with Biclusters
Maoyuan Sun, Abdul Rahman Shaikh, Hamed Alhoori, Jian Zhao
Exploring linked data across views usually involves tedious trial-and-error. SightBi formalizes cross-view relationships as biclusters and creates dedicated relationship-views that surface hidden connections, turning guesswork into guided exploration. Awarded Best Paper Honorable Mention at IEEE VIS 2021.
Generation, Evaluation, and Explanation of Novelists' Styles with Single-Token Prompts
Mosab Rezaei, Mina Rajaei Moghadam, Abdul Rahman Shaikh, Hamed Alhoori, Reva Freedman
Can you teach an LLM a novelist's writing style with a single token? This work fine-tunes language models to generate 19th-century literary styles using minimal prompts, then evaluates the output with a transformer-based detector and explainable AI analyses.
Projects
Open-source tools, systems, and experiments.
LLMFlow: Scholarly Document Summarization & QA
End-to-end pipeline that chunks scholarly PDFs, generates hierarchical summaries via Llama/Ollama and GPT, and supports multi-turn Q&A with source attribution. Reduced reading time for 30+ page papers by ~60%.
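The chunking step can be illustrated with a minimal sketch. This is not LLMFlow's actual code: the function name `chunk_words`, the word-based splitting, and the default sizes are all assumptions chosen for illustration. It shows one common way to cut extracted PDF text into overlapping windows so that each chunk keeps some context from the one before it.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into word-based chunks; each chunk repeats the last
    `overlap` words of the previous one. (Hypothetical helper.)"""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for chunk in chunk_words(sample, chunk_size=4, overlap=1):
        print(chunk)
```

Each chunk could then be summarized on its own and the summaries merged level by level, which is one plausible reading of "hierarchical summaries" here.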
Rufus: Intelligent Web Data Extraction for LLMs
AI-powered web crawler that navigates sites, extracts relevant content, and synthesizes it into structured documents optimized for RAG ingestion. Handles JS-rendered pages via async Selenium.
GenHealth: Multimodal Medical Report Analysis
Multimodal AI system that fuses clinical text, medical imaging, and structured signals to boost diagnostic extraction accuracy. Combines transformer encoders with domain-specific preprocessing.
Pexos: Safe Python Execution Sandbox
Secure code execution service for running untrusted Python with syscall restrictions, resource limits, and network isolation. Built for safe LLM code generation evaluation.
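As a rough illustration of the resource-limit part only, the sketch below runs a snippet in a child Python process with a CPU-time cap and a wall-clock timeout. It is not Pexos's implementation: `run_untrusted`, `_cap`, and the limit values are hypothetical, it relies on the POSIX-only `resource` module, and it does not attempt the syscall filtering or network isolation described above.

```python
import resource
import subprocess
import sys


def _cap(kind: int, value: int) -> None:
    """Lower one resource limit, never exceeding the existing hard limit."""
    _, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))


def _limit_resources() -> None:
    # Runs in the child process just before the interpreter is exec'd.
    _cap(resource.RLIMIT_CPU, 1)            # about 1 second of CPU time
    _cap(resource.RLIMIT_FSIZE, 1_000_000)  # at most ~1 MB written to any file


def run_untrusted(code: str, timeout: float = 10.0) -> subprocess.CompletedProcess:
    """Run `code` in a separate, isolated-mode Python process (hypothetical helper)."""
    return subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: ignore env vars and user site-packages
        capture_output=True,
        text=True,
        timeout=timeout,                     # wall-clock limit enforced by the parent
        preexec_fn=_limit_resources,
    )


if __name__ == "__main__":
    ok = run_untrusted("print(2 + 2)")
    print(ok.returncode, ok.stdout.strip())
    runaway = run_untrusted("while True: pass")
    print(runaway.returncode)  # negative: the child was killed by a signal
```

Limits set this way are a baseline rather than a security boundary; a real sandbox would typically add a syscall filter and run inside a container or namespace with no network access.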
ChatFit: Personalized Fitness Chatbot
Conversational fitness assistant that collects user goals through dialogue and generates personalized workout and diet plans using GPT with structured output parsing.
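Structured output parsing usually means asking the model for JSON and validating it before use. The sketch below assumes a hypothetical `WorkoutPlan` schema and a hypothetical `parse_plan` helper; the field names are illustrative and not taken from ChatFit. It pulls the text between the first `{` and the last `}` in a reply, decodes it, and checks the fields.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class WorkoutPlan:
    # Hypothetical schema, for illustration only.
    goal: str
    days_per_week: int
    exercises: List[str]


def parse_plan(reply: str) -> WorkoutPlan:
    """Decode the JSON object embedded in a model reply and validate it."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in reply")
    data = json.loads(reply[start:end + 1])  # raises ValueError on malformed JSON
    plan = WorkoutPlan(
        goal=str(data["goal"]),
        days_per_week=int(data["days_per_week"]),
        exercises=[str(e) for e in data["exercises"]],
    )
    if not 1 <= plan.days_per_week <= 7:
        raise ValueError("days_per_week must be between 1 and 7")
    return plan


if __name__ == "__main__":
    reply = (
        "Here is your plan:\n"
        '{"goal": "strength", "days_per_week": 3, "exercises": ["squat", "row"]}\n'
        "Good luck!"
    )
    print(parse_plan(reply))
```

Rejecting a malformed reply early makes it straightforward to re-prompt the model instead of passing a broken plan to the user.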
VoxCore: Voice Authentication System
Voice authentication using OpenAI Whisper for transcription and custom PyTorch models for speaker verification. Achieves 89% accuracy in multi-speaker environments with real-time processing.
Let's build with AI.
© Abdul Rahman Shaikh 2025 · Open to full-time AI/ML roles & research collaboration