Object recognition and tracking is one of the most significant and challenging areas of computer vision, and it is widely used in applications such as health care monitoring, autonomous driving, and anomaly detection. The tracking of moving objects in video has been actively researched over the past two decades because of its practical applications in fields such as event analysis, human-computer interaction, crowd analysis, video surveillance, and behaviour analysis. The effectiveness of object detectors and trackers has increased significantly with the rapid advancement of deep learning (DL) networks and GPU processing capability, and extensive study in this field has produced new methods for object recognition and tracking in video. To give a comprehensive view of the key advancements in the object detection and tracking pipeline, this article addresses the main stages of object tracking in video sequences: object detection, object classification, and object tracking. We also examine in detail the various approaches available for object recognition, categorisation, and tracking. © 2022 IEEE.