This project analyzes video content with Vision Language Models (VLMs) and Large Language Models (LLMs), turning large volumes of unstructured footage into searchable logs and summaries. The platform offers real-time monitoring, detailed summaries, and interactive querying.
Key Features:
- Automated Video Logging and Summarization: Efficiently processes and condenses extensive video footage into concise summaries for quick review.
- Specialized Analytical Agents: Includes agents for detecting specific events such as fires, assaults, thefts, and other suspicious activities, enhancing security and operational efficiency.
- Interactive Chat Interface: Allows users to engage with video logs through natural language queries, facilitating intuitive data retrieval and analysis.
- Support for Streaming and Uploaded Videos: Capable of analyzing both live video streams and pre-recorded footage, providing flexibility across various use cases.
Use Cases:
- Public Safety: Assists government agencies in monitoring and responding to criminal activities, thereby enhancing community safety.
- Retail Analytics: Provides insights into customer behavior, supporting targeted marketing strategies and improving customer experiences.
- Industrial Monitoring: Ensures compliance with safety protocols and detects anomalies in operational processes, reducing risks and improving efficiency.
Fig: FrameWise Architecture Diagram
Agents:
- Alert/Email Notifier Agent
- Summarization Agent
- Chat Agent
- Fire Detection Agent
- Assault Detection Agent
- Crime Detection Agent
- Drug Detection Agent
- Theft Detection Agent
- Tamper Detection Agent
- Suspicious Activity Agent
- Customer Behaviour Agent
Tech Stack:
- Nvidia Cosmos 34B VLM
- OpenAI
- OpenAI Embeddings
- LangChain
- Pgvector
- PostgreSQL
- Supabase
- Django
- ReactJS
Workflow (uploaded video with detection agents):
- A video is uploaded.
- The uploaded video is passed through the Nvidia Cosmos 34B VLM, which generates logs every 30 seconds; the logs are stored in a Supabase PostgreSQL database.
- The logs are vectorized and stored in pgvector.
- A summary is generated from the vectorized logs and flags any incident of crime, theft, or assault.
- Automated agents (fire, assault, crime, drug, and so on) analyze the uploaded video and, when their target event is detected, return a specialised response describing its severity.
- Users can also chat with the video through the chat agent to get more detailed and personalised information from the video logs.
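The log, vectorize, and query steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: `describe_window` is a hypothetical stand-in for the VLM call, `embed` is a toy word-count embedding in place of OpenAI Embeddings, and an in-memory list replaces the Supabase/pgvector store. Every name below is invented for the example.

```python
import math
from collections import Counter
from dataclasses import dataclass

WINDOW_SECONDS = 30  # log interval stated in the workflow


@dataclass
class LogEntry:
    start: float
    end: float
    text: str
    vector: Counter


def windows(duration, step=WINDOW_SECONDS):
    """Yield (start, end) pairs that cover the video in fixed-size windows."""
    start = 0.0
    while start < duration:
        yield start, min(start + step, duration)
        start += step


def embed(text):
    """Toy embedding: lowercase word counts (placeholder for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_logs(duration, describe_window):
    """Describe each window (hypothetical VLM call) and store text plus vector."""
    logs = []
    for start, end in windows(duration):
        text = describe_window(start, end)
        logs.append(LogEntry(start, end, text, embed(text)))
    return logs


def search(logs, query, top_k=1):
    """Return the top_k log entries most similar to the query."""
    q = embed(query)
    return sorted(logs, key=lambda e: cosine(q, e.vector), reverse=True)[:top_k]


# Usage with a fake captioner instead of a real VLM.
def fake_vlm(start, end):
    captions = {
        0.0: "a person walks into the store",
        30.0: "smoke and fire near the shelf",
        60.0: "the store is empty",
    }
    return captions[start]


logs = build_logs(75, fake_vlm)
best = search(logs, "fire near shelf")[0]
print(best.start, best.end, best.text)
```

In a real deployment the similarity search would presumably run inside PostgreSQL via pgvector rather than in Python; the in-memory version only shows the shape of the data flow.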
Workflow (custom alerts):
- A video is uploaded.
- The uploaded video is passed through the Nvidia Cosmos 34B VLM, which generates logs every 30 seconds; the logs are stored in a Supabase PostgreSQL database.
- The logs are vectorized and stored in pgvector.
- A summary is generated from the vectorized logs and flags any incident of crime, theft, or assault.
- Users can set custom alerts by defining the specific criteria they want to monitor.
- The Alert/Email agent keeps checking for those criteria and emails the user when a match is detected.
- Users can also chat with the video through the chat agent to get more detailed and personalised information from the video logs.
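The custom-alert steps above could look roughly like the following. This is a sketch under stated assumptions: alert criteria are modelled as lowercase keyword sets matched against each new log line, and `notify` is a hypothetical callback standing in for the email sender. The real agent may judge matches differently (for example with an LLM).

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    user_email: str
    keywords: set  # assumption: criteria are expressed as lowercase keywords
    fired_for: set = field(default_factory=set)  # log ids already reported


def matches(alert, log_text):
    """True when every keyword of the alert appears in the log line."""
    words = set(log_text.lower().split())
    return alert.keywords <= words


def check_alerts(alerts, new_logs, notify):
    """Scan (log_id, text) pairs and notify each alert at most once per log."""
    sent = 0
    for log_id, text in new_logs:
        for alert in alerts:
            if log_id in alert.fired_for:
                continue  # this log was already reported for this alert
            if matches(alert, text):
                notify(alert.user_email, f"Alert matched at log {log_id}: {text}")
                alert.fired_for.add(log_id)
                sent += 1
    return sent


# Usage: collect messages in a list instead of sending real email.
outbox = []
alerts = [Alert("ops@example.com", {"fire"})]
new_logs = [(1, "a person walks in"), (2, "fire near the shelf")]

first_pass = check_alerts(alerts, new_logs, lambda to, body: outbox.append((to, body)))
second_pass = check_alerts(alerts, new_logs, lambda to, body: outbox.append((to, body)))
```

Tracking `fired_for` per alert keeps a polling loop from mailing the same incident on every pass, which matters when the agent checks the logs repeatedly.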

