What is Harshit Dave's expertise in AI and ML?

Harshit Dave specializes in Large Language Models (LLMs), conversational AI, multimodal AI, and production-scale AI systems. He has published research at EMNLP and IAAI (co-located with AAAI) and has hands-on experience with GPT-4 and vector databases.

What AI technologies does Harshit Dave work with?

He works with Large Language Models (LLMs), GPT-4, Whisper ASR, conversational AI, vector databases, multimodal AI, PyTorch, TensorFlow, and production ML deployment systems.

What are Harshit Dave's research contributions?

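He published research at EMNLP 2023 on multimodal fact verification and at IAAI-24 (co-located with AAAI-24) on LLM-based next best agent recommendations. He has also built production systems, including a chatbot that reaches 98.9% precision and a vector database synchronization pipeline that runs 100 times faster than before.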

Harshit Dave

Bridging state-of-the-art AI research with production-grade fintech solutions.

I am a Data Scientist at slice, where I build production-scale LLM-powered agentic systems and multi-agent architectures. These systems bridge cutting-edge research and real-world applications in the fintech space. My work centers on LLMs, multimodal AI, and intelligent conversational systems.

My journey in AI/ML began at IIITM Gwalior, where I researched and engineered the Integrated-Particle Swarm Optimization (i-PSO) technique and analyzed it with various ML algorithms for malware detection, under the supervision of Dr. Saumya Bhadauria and Prof. Aditya Trivedi. This early work taught me how to blend deep theoretical concepts with practical, real-world problem-solving.

Later, as a remote research intern with the Artificial Intelligence Institute of the University of South Carolina (AIISC), I worked with a team on multimodal fact verification. We used Stable Diffusion 2 to enhance the multimodal data and built a more robust dataset (FACTIFY3M) by injecting adversarial attacks in the form of fake news. This research, which was accepted at EMNLP 2023, gave me firsthand experience with multimodal AI.

During my internship at IBM Research, I benchmarked open-source LLMs for enterprise tasks and developed a 3-stage LLM-based approach for NBA (next best agent) recommendation in cold-start settings, work that was accepted at IAAI-24 (co-located with AAAI-24).

My Work at slice

Since joining slice, I've had the opportunity to build several impactful systems.

sliceMCP: I architected a pipeline that combines a Confluence-backed Qdrant vector database with a knowledge graph (KG) to answer complex questions, reducing the onboarding time for new projects from weeks to just a few hours. To make it fast, I designed a data pipeline using Merkle trees that made vector database synchronization 100 times faster.
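The Merkle-tree idea can be illustrated with a small sketch. This is not the sliceMCP implementation: the bucket count, the `docs_to_resync` helper, and the page ids are all hypothetical, chosen only to show why comparing hashes top-down lets a sync job skip unchanged content instead of re-embedding everything.

```python
import hashlib

NUM_BUCKETS = 8  # hypothetical setting; must be a power of two


def _h(data):
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def bucket_of(doc_id):
    # Stable bucket assignment derived from the document id.
    return int(_h(doc_id), 16) % NUM_BUCKETS


def build_tree(docs):
    """Return the tree as a list of levels: bucket hashes first, root last."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for doc_id in sorted(docs):
        buckets[bucket_of(doc_id)].append(doc_id + ":" + _h(docs[doc_id]))
    level = [_h("|".join(entries)) for entries in buckets]
    levels = [level]
    while len(level) > 1:
        # Each parent hashes the concatenation of its two children.
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def changed_buckets(old, new):
    """Descend from the root, visiting only subtrees whose hashes differ."""
    top = len(old) - 1
    frontier = [0] if old[top][0] != new[top][0] else []
    for depth in range(top - 1, -1, -1):
        next_frontier = []
        for parent in frontier:
            for child in (2 * parent, 2 * parent + 1):
                if old[depth][child] != new[depth][child]:
                    next_frontier.append(child)
        frontier = next_frontier
    return frontier


def docs_to_resync(old_docs, new_docs):
    """Ids that were added, removed, or edited between two snapshots."""
    changed = set(changed_buckets(build_tree(old_docs), build_tree(new_docs)))
    ids = set(old_docs) | set(new_docs)
    return sorted(
        d for d in ids
        if bucket_of(d) in changed and old_docs.get(d) != new_docs.get(d)
    )


old_snapshot = {"page-1": "a", "page-2": "b", "page-3": "c"}
new_snapshot = {"page-1": "a", "page-2": "B", "page-4": "d"}
print(docs_to_resync(old_snapshot, new_snapshot))
```

When the two root hashes match, nothing below them is inspected, which is where the savings come from when most pages are unchanged between syncs.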

Pay-Via-Voice: Built a voice-first payment system using Whisper ASR and multi-agent orchestration. We achieved 1.013s average transaction latency with full UPI integration and policy validation.

ConvoBot: Developed a context-aware chatbot using GPT-4o prompt-chaining that achieves 98.9% precision, 96.7% recall, and 37-42% end-to-end resolution rates.
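Prompt chaining in general means feeding the output of one model call into the next prompt. The sketch below shows that pattern only: the three steps, the prompt wording, and the `fake_llm` stub are assumptions made for illustration and do not describe ConvoBot's actual prompts or any GPT-4o API call.

```python
def run_chain(user_message, llm):
    """Run a three-step prompt chain; `llm` is any callable mapping prompt -> text."""
    # Step 1: classify the intent of the message.
    intent = llm(
        "Classify the intent of this message in one word:\n" + user_message
    ).strip().lower()
    # Step 2: draft a reply conditioned on the intent from step 1.
    reply = llm(
        "Intent: " + intent + "\nMessage: " + user_message + "\nWrite a short reply."
    ).strip()
    # Step 3: ask for a self-check on the draft from step 2.
    verdict = llm(
        "Reply: " + reply + "\nDoes this reply address the intent '"
        + intent + "'? Answer YES or NO."
    ).strip().upper()
    return {"intent": intent, "reply": reply, "resolved": verdict.startswith("YES")}


def fake_llm(prompt):
    # Hypothetical stand-in for a real model call, so the sketch runs offline.
    if prompt.startswith("Classify"):
        return "Refund"
    if prompt.startswith("Intent:"):
        return "Your refund is being processed."
    return "yes"


result = run_chain("Where is my refund?", fake_llm)
print(result)
```

Passing the model in as a callable keeps each step testable in isolation, since a stub can replace the real model.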

SMS Processing: Ran LLMs such as Qwen2.5-3B and Llama-3.1-8B on GPU with batch inference to reduce latency, reaching 79% accuracy at 0.55s per 3 SMS. Then trained an HRM (Hierarchical Reasoning Model) using DAPT (Domain-Adaptive Pre-Training) and SFT for SMS categorization, achieving an 86% macro-F1 score with latency of around 2ms per SMS.
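For readers unfamiliar with the metric, macro-F1 is the unweighted mean of the per-class F1 scores, so rare SMS categories count as much as common ones. A minimal sketch follows; the category names and predictions are made up for illustration and are not the project's data.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN), equivalent to the harmonic mean of P and R.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels and predictions.
y_true = ["otp", "otp", "txn", "txn", "promo", "promo"]
y_pred = ["otp", "txn", "txn", "txn", "promo", "otp"]
print(round(macro_f1(y_true, y_pred), 4))
```

In this toy example the per-class F1 scores are 0.5 (otp), 0.8 (txn), and 2/3 (promo), and the macro-F1 is their plain average.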

Research Interests

I am interested in developing AI systems capable of symbolic reasoning and in building efficient collaborative systems that work alongside people in meaningful ways. My current focus is on building collaborative, agentic models that can understand goals, share context, and adapt through interaction, ideas that connect closely with Human-Computer Interaction (HCI) and Human-Centered AI (HCAI).

Currently Exploring

Exploring how reinforcement learning can shape LLM personas for collaborative agent systems, specifically through post-training a Qwen3 MoE model

Currently digging into this paper, which takes an interesting approach: instead of relying on expert-annotated feedback, it uses real human interactions to develop the model's persona

Experimenting with the Muon Optimizer, which has been showing better training efficiency than traditional optimizers in recent benchmarks

On the implementation side, diving into a codebase that claims improved performance over flash-attention, to see where I can squeeze out performance gains

Learning Triton along the way since the performance optimization code is written in it

Research Publications

Research contributions in AI, NLP, and machine learning across prestigious conferences

Multi-Stage Prompting for Next Best Agent Recommendations in Adaptive Workflows

IAAI-24 [Collocated with AAAI-24], 2024

Traditional business processes such as loan processing, order processing, or procurement have a series of steps that are pre-defined at design and executed by enterprise systems. Recent advancements in new-age businesses, however, focus on having adaptive and ad-hoc processes by stitching together a set of functions or steps enabled through autonomous agents. Further, to enable business users to execute a flexible set of steps, there have been works on providing a conversational interface to interact and execute automation. Often, it is necessary to guide the user through the set of possible steps in the process (or workflow). Existing work on recommending the next agent to run relies on historical data. However, with changing workflows and new automation constantly getting added, it is important to provide recommendations without historical data. Additionally, hand-crafted recommendation rules do not scale. The adaptive workflow being a combination of structured and unstructured information, makes it harder to mine. Hence, in this work, we leverage Large Language Models (LLMs) to combine process knowledge with the meta-data of agents to discover NBAs specifically at cold-start. We propose a multi-stage approach that uses existing process knowledge and agent meta-data information to prompt LLM and recommend meaningful next best agent (NBA) based on user utterances.

FACTIFY3M: A benchmark for multimodal fact verification with explainability through 5W Question-Answering

EMNLP 2023

Combating disinformation is one of the burning societal crises - about 67% of the American population believes that disinformation produces a lot of uncertainty, and 10% of them knowingly propagate disinformation. Evidence shows that disinformation can manipulate democratic processes and public opinion, causing disruption in the share market, panic and anxiety in society, and even death during crises. Therefore, disinformation should be identified promptly and, if possible, mitigated. With approximately 3.2 billion images and 720,000 hours of video shared online daily on social media platforms, scalable detection of multimodal disinformation requires efficient fact verification. Despite progress in automatic text-based fact verification (e.g., FEVER, LIAR), the research community lacks substantial effort in multimodal fact verification. To address this gap, we introduce FACTIFY 3M, a dataset of 3 million samples that pushes the boundaries of the domain of fact verification via a multimodal fake news dataset, in addition to offering explainability through the concept of 5W question-answering. Salient features of the dataset include: (i) textual claims, (ii) ChatGPT-generated paraphrased claims, (iii) associated images, (iv) stable diffusion-generated additional images (i.e., visual paraphrases), (v) pixel-level image heatmap to foster image-text explainability of the claim, (vi) 5W QA pairs, and (vii) adversarial fake news stories.

Interested in discussing AI and machine learning? I enjoy connecting with fellow researchers, engineers, and teams working on interesting problems in conversational AI, multimodal systems, applied ML, reinforcement learning and HCAI.

Email: harshitkd@gmail.com

LinkedIn: harshit-dave

GitHub: harshitkd

Google Scholar: harshit

Blog

Insights and explorations in AI, ML, and research

Work in Progress

Currently working on blog posts covering my experiences with LLMs