Patent on CBVR Published by Dr Jatindra Dash

The Department of Computer Science and Engineering is delighted to announce the publication of a patent titled “A Content-Based Video Retrieval (CBVR) System and Method Thereof,” with application number 202541004749. The invention, developed by Dr Jatindra Kumar Dash, Associate Professor in the Department, along with his PhD scholar, Mr Farooq Shaik, offers an efficient and accurate approach to content-based video retrieval.

Brief Abstract:

Imagine searching for a specific scene in a movie, not by remembering its title or actors, but by describing the action itself. This is the essence of content-based video retrieval (CBVR), a technique that searches for a video based on what’s inside it rather than relying solely on manually assigned labels. Traditional label-based methods can be time-consuming and error-prone, and they struggle with vast datasets; CBVR offers a more efficient and accurate alternative. Our proposed system leverages deep learning, a subset of artificial intelligence, to analyse videos and extract their key characteristics.

This process occurs in two stages: offline and online. In the offline stage, important features are extracted from all videos in the dataset and stored for future use. In the online stage, when a user submits a query video, its features are extracted in real time and compared with the stored features of all videos. The videos whose features are most similar to the query, essentially the “closest matches”, are then presented to the user.

To capture the full essence of a video, our system employs a two-stream neural network architecture. This approach allows us to extract both temporal features, which capture the changes and motion patterns within the video (think: someone running or jumping), and spatial features, which describe the static visual content of each individual frame (think: the objects and scene depicted). By utilizing a pre-trained neural network called ResNet-50, our system benefits from existing knowledge and can efficiently extract meaningful features from videos. To evaluate its effectiveness, we tested our system on the UCF101 dataset, a widely used benchmark of videos spanning 101 action categories. Our approach achieved an accuracy of 93.7% for top-5 retrieval and 95.95% for top-10 retrieval. These results indicate that our approach achieves higher accuracy than other state-of-the-art video retrieval methods.
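The offline and online stages described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the short hand-written vectors stand in for deep features, the function names (build_index, retrieve) and video ids are hypothetical, and cosine similarity is an assumed choice, since the abstract does not say which similarity measure the system uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def build_index(video_features):
    """Offline stage: store one feature vector per video id."""
    return {video_id: list(vec) for video_id, vec in video_features.items()}


def retrieve(index, query_vec, k=5):
    """Online stage: rank stored videos by similarity to the query features."""
    scored = [(cosine_similarity(query_vec, vec), video_id)
              for video_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [video_id for _, video_id in scored[:k]]


# Toy 3-dimensional vectors stand in for features from a deep network
# (assumption: real features would have far more dimensions).
index = build_index({
    "running_01": [0.9, 0.1, 0.0],
    "running_02": [0.8, 0.2, 0.1],
    "piano_01": [0.0, 0.1, 0.9],
})
top2 = retrieve(index, [1.0, 0.1, 0.0], k=2)
print(top2)
```

Because the stored features are computed once, only the query has to pass through the network at search time; the remaining work is a comparison against the stored vectors.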

Explanation in Layperson’s Terms:

Most video search platforms rely on metadata attached to a video to search for and retrieve it. For example, YouTube uses the title and description supplied when a video is uploaded. However, this approach is time-consuming and error-prone, and it needs human intervention. Our proposed CBVR system aims to retrieve videos based on the similarity of their content rather than on meta tags. The proposed approach uses a pre-trained deep neural network, specifically ResNet-50, a convolutional neural network with residual skip connections, to learn video representations from a local binary pattern (LBP) representation and a temporal map of the video.
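To give a feel for what an LBP representation encodes, here is a minimal sketch of the basic 3x3 local binary pattern operator on a single grayscale frame. This is an assumption for illustration only: the text does not say which LBP variant the system uses or how the LBP output is passed to ResNet-50, and the function name lbp_image is hypothetical.

```python
def lbp_image(gray):
    """Basic 3x3 LBP codes for the interior pixels of a 2D list of intensities.

    Each neighbour that is at least as bright as the centre pixel sets one
    bit of an 8-bit code, so every code lies between 0 and 255.
    """
    # Neighbour offsets, clockwise starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(gray), len(gray[0])
    codes = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            centre = gray[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if gray[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


# A tiny 3x3 "frame": only the centre pixel has a full neighbourhood.
frame = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
codes = lbp_image(frame)
print(codes)
```

The codes depend only on whether each neighbour is brighter or darker than the centre, so they describe local texture and do not change under uniform brightness shifts.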

Practical Implementation and Social Implications

The research focuses on CBVR, a technique that enables users to search for videos based on content rather than meta tags. It has practical applications in many industries, such as surveillance and security (for example, searching a large surveillance feed for a particular incident), healthcare and medical imaging (where doctors retrieve similar medical videos for faster diagnosis), education, and entertainment.

The research has significant social implications, such as improved accessibility to information, enhanced public safety, the advancement of AI in daily life, and the use of this system in smart cities and digital systems.

Collaborations

Experiments were conducted on a publicly available dataset using a DGX-1 server available on our university premises. In the future, we may collaborate with local authorities to obtain real-time video feeds and enhance the capabilities of the proposed method.

Future Research Plans

Going forward, our plan is to propose a robust system that can be scaled and applied to videos from any domain, be it medical or educational. Furthermore, the proposed method is a supervised approach; we want to explore unsupervised methods to generalize video retrieval.
