Abdul Rahman

AI/ML Engineer · PhD Candidate · IEEE VIS Best Paper

I build production LLM systems and publish the research behind them. My work spans RAG pipelines, multi-agent workflows, fine-tuned language models, and interactive visual analytics, with 9 peer-reviewed papers, 90+ citations, and prior ML infrastructure experience at Amazon.

View Work Get in Touch

About

Open to AI/ML roles · Available now

Years in ML

Publications

90+ Citations

Best Paper · IEEE VIS

AI/ML engineer and researcher completing a PhD in Computer Science at Northern Illinois University (GPA 3.9). My research spans large language models, multimodal learning, and interactive visual analytics, with publications at IEEE VIS, TVCG, JCDL, and Scientometrics.

Before my PhD, I built data pipelines at Amazon that reduced query latency by 30% across 500K+ daily records. At NIU, I lead the Visual Analytics Lab and build production systems, from RAG pipelines and multi-agent workflows to computer vision with ViT and SAM. I am currently exploring how LLMs can reshape multi-view data exploration.

Education

Ph.D. Computer Science

Northern Illinois University

2020 – Present

GPA 3.9

M.S. Computer Science

Northern Illinois University

2018 – 2020

GPA 3.9

B.E. Computer Science

Osmania University, Hyderabad

2013 – 2017

Core Stack

ML & Data Science

PyTorch · HuggingFace · Scikit-learn · XGBoost · LoRA/QLoRA · CLIP · SHAP/LIME · OpenCV · A/B Testing · Feature Engineering

LLM / GenAI

LangChain · LangGraph · AutoGen · CrewAI · FAISS · RAG

Data & Infrastructure

PostgreSQL · MongoDB · Snowflake · Docker · Kubernetes · FastAPI · MLflow · AWS · Kafka

Languages & Visualization

Python · SQL · JavaScript · R · C/C++ · Pandas · NumPy · D3.js · Tableau · Streamlit

Experience

Research and industry, with measurable outcomes.

2018 – Present · DeKalb, IL

Researcher

Northern Illinois University · DATA Lab, VA Lab & WASTE Lab

Built LLM-powered pipelines, multi-agent workflows, and computer vision systems across 3 labs. Fine-tuned LLMs with LoRA (−60% GPU memory, −45% training time). Published 9 papers at IEEE VIS, TVCG, JCDL.

2020 – Present · DeKalb, IL

Lab Head, Visual Analytics Lab

Northern Illinois University

Leading AI-driven visualization research. Mentored 5 grad students (100% completion, 4 co-authored papers). PC member: CIKM, WWW, JCDL.

2016 – 2017 · Hyderabad, India

Data Analyst

Amazon

Built SQL/MongoDB pipelines that cut query latency by 30% and data quality workflows that validated 500K+ daily records across e-commerce datasets.

2018 – 2023 · DeKalb, IL

Teaching Assistant

Northern Illinois University · Algorithms, Databases, C/C++, Java

Supported 70+ students across core CS courses, with a 20% improvement in class performance.

Research

Peer-reviewed work in AI, LLMs, data visualization, and scientometrics.

Publications

90+ Citations

In Review

Featured · Published · Scientometrics 2023

YouTube and Science: Models for Research Impact

Abdul Rahman Shaikh, Hamed Alhoori, M. Sun

Can YouTube videos predict a paper's real-world influence? This work introduces new datasets linking video content to scholarly articles and trains ML models that forecast citation counts and public engagement using altmetrics signals, measuring scientific impact beyond academia.

Paper Code

Published · Graphics Interface 2025

iTrace: Interactive Tracing of Cross-View Data Relationships

Abdul Rahman Shaikh, Maoyuan Sun, Xingchen Liu, Hamed Alhoori, David Koop

When dashboards have many linked views, finding connections between distant data points is hard. iTrace introduces smooth focus transitions that guide attention across views, making cross-view relationship tracing faster and less error-prone.

Paper Code

Published · IEEE VIS 2022

Toward systematic design considerations of organizing multiple views

Abdul Rahman Shaikh, David Koop, Hamed Alhoori, Maoyuan Sun

How should multiple visualization panels be arranged? This paper reviews dozens of multi-view systems and distills layout principles grounded in perception and content, providing a framework for designing dashboards that help users connect information across views.

Paper

Published · TVCG 2021 · Best Paper Honorable Mention

SightBi: Exploring Cross-View Data Relationships with Biclusters

Maoyuan Sun, Abdul Rahman Shaikh, Hamed Alhoori, Jian Zhao

Exploring linked data across views usually involves tedious trial-and-error. SightBi formalizes cross-view relationships as biclusters and creates dedicated relationship-views that surface hidden connections, turning guesswork into guided exploration. Awarded Best Paper Honorable Mention at IEEE VIS 2021.

Paper Code

Published · JCDL 2025

Generation, Evaluation, and Explanation of Novelists' Styles with Single-Token Prompts

Mosab Rezaei, Mina Rajaei Moghadam, Abdul Rahman Shaikh, Hamed Alhoori, Reva Freedman

Can you teach an LLM a novelist's writing style with a single token? This work fine-tunes language models to generate 19th-century literary styles using minimal prompts, then evaluates the output with a transformer-based detector and explainable AI analyses.

Paper Code

View all on Google Scholar

Projects

Open-source tools, systems, and experiments.

@sabdulrahman

LLM RAG

LLMFlow: Scholarly Document Summarization & QA

End-to-end pipeline that chunks scholarly PDFs, generates hierarchical summaries via Llama/Ollama and GPT, and supports multi-turn Q&A with source attribution. Reduced reading time for 30+ page papers by ~60%.

Python · LangChain · FastAPI · React

Code

RAG Crawl

Rufus: Intelligent Web Data Extraction for LLMs

AI-powered web crawler that navigates sites, extracts relevant content, and synthesizes it into structured documents optimized for RAG ingestion. Handles JS-rendered pages via async Selenium.

Python · Asyncio · Selenium · RAG

Code

Health AI Multimodal

GenHealth: Multimodal Medical Report Analysis

Multimodal AI system that fuses clinical text, medical imaging, and structured signals to boost diagnostic extraction accuracy. Combines transformer encoders with domain-specific preprocessing.

Python · PyTorch · Transformers · FastAPI

Code

Sandbox Docker

Pexos: Safe Python Execution Sandbox

Secure code execution service for running untrusted Python with syscall restrictions, resource limits, and network isolation. Built for safe LLM code generation evaluation.

Python · Flask · nsjail · Docker

Code

Chatbot GPT

ChatFit: Personalized Fitness Chatbot

Conversational fitness assistant that collects user goals through dialogue and generates personalized workout and diet plans using GPT with structured output parsing.

Python · OpenAI GPT · Streamlit · Flask

Code

Audio Auth

VoxCore: Voice Authentication System

Voice authentication using OpenAI Whisper for transcription and custom PyTorch models for speaker verification. Achieves 89% accuracy in multi-speaker environments with real-time processing.

Python · Whisper · PyTorch

Code

View all repositories on GitHub

Let's build with AI.

iamsabdurahman@gmail.com

Schedule

Book a call

Social

Research

Quantifying the online long-term interest in research

Examining the Representation of Youth in the US Policy Documents through the Lens of Research

Predicting patent citations to measure economic impact of scholarly research

Modeling the Broader Impact of Science and Health Using Social Media

Boundary Blending: Reconsidering the Design of Multi-View Visualizations
