ARCHIVES
Agentic AI Based Smart Assistant: A Multimodal Visual Question Answering System Using Fast API and GROQ Vision-Language Models
¹ Student, Department of AIML, ADGIPS, FC-²⁶ Shastri Park, Shahdara, New Delhi, India. ² Assistant Professor, Department of AIML, ADGIPS, FC-²⁶ Shastri Park, Shahdara, New Delhi, India.
Published Online: May-August 2026
Pages: 237-240
Cite this article
↗ https://www.doi.org/10.59256/indjcst.20260502026This paper presents the design and implementation of an Agentic AI Based Smart Assistant — a multimodal web application capable of analyzing images and answering natural language queries in real time. The system integrates FastAPI as the backend web framework with the GROQ API, leveraging LLaMA-based Vision-Language Models (VLMs) to interpret both visual and textual data simultaneously. An additional Retrieval-Augmented Generation (RAG) pipeline using TF-IDF vectorization enables document-aware question answering from uploaded PDFs. The system was tested across ten functional scenarios including valid and invalid inputs, large images, concurrent requests, and API fault conditions — all passing successfully. Results demonstrate strong contextual accuracy and low-latency performance suitable for real-world applications in medical imaging, smart education, and automated inspection.
Related Articles
2026
Artificial Intelligence in Learning and Teaching
2026
Admin Assist: An AI – Driven Configuration and Orchestration for Enterprise Application
2026
Enhancing Blood Group Identification using pigeon inspired optimization: An Innovative Approach
2026
Eco-Genius: Power Up Smart, Power Down Waste
2026
Crowd-Sourced Disaster Response and Rescue Assistant
2026
Unveiling Deepfake Detection Using Vision Transformers: A Survey and Experimental Study
2026
A Novel Stateful Orchestration Pattern for Data Affinity and Transactional Integrity in Sharded Backend Architectures
2026
Legal Challenges of Agentic AI Systems in Education and Employment Decision-Making
2026
New-Hybrid Soft Computing Model for Stock Market Predictions
2026
Human Emotion Distribution Learning from Face Images Using CNN


