Retrieving images and video by their semantic content has been an important and challenging task for years, because it ultimately requires bridging the symbolic/subsymbolic gap. Recent successes in deep learning have enabled the detection of objects from many classes, greatly outperforming traditional computer vision techniques. However, deep learning solutions capable of executing retrieval queries are still not available. We propose a hybrid solution consisting of a deep neural network for object detection and a cognitive architecture for query execution. Specifically, we use YOLOv2 and OpenCog. We implement queries that retrieve video frames containing objects of specified classes in a specified spatial arrangement.
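To make the kind of query concrete, the following is a minimal sketch of retrieving frames by object class and spatial arrangement. It is an illustration only and uses neither the YOLOv2 nor the OpenCog API: the `Detection` record, the `left_of` relation, and `find_frames` are hypothetical names, and the bounding boxes are made-up values standing in for detector output.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Detection:
    """Hypothetical stand-in for one detector output: a class label and a box."""
    label: str
    x: float  # left edge of the bounding box, in pixels
    y: float  # top edge of the bounding box, in pixels
    w: float  # box width
    h: float  # box height

    @property
    def cx(self) -> float:
        """Horizontal center of the bounding box."""
        return self.x + self.w / 2


def left_of(a: Detection, b: Detection) -> bool:
    """Assumed spatial relation: a's box center lies to the left of b's."""
    return a.cx < b.cx


def find_frames(
    frames: List[List[Detection]],
    label_a: str,
    relation: Callable[[Detection, Detection], bool],
    label_b: str,
) -> List[int]:
    """Return indices of frames holding an (a, b) pair that satisfies the relation."""
    hits = []
    for idx, detections in enumerate(frames):
        found = any(
            relation(a, b)
            for a in detections if a.label == label_a
            for b in detections if b.label == label_b and b is not a
        )
        if found:
            hits.append(idx)
    return hits


# Illustrative data: one list of detections per video frame.
frames = [
    [Detection("person", 10, 10, 50, 100), Detection("dog", 200, 60, 80, 50)],
    [Detection("dog", 10, 60, 80, 50), Detection("person", 300, 10, 50, 100)],
    [Detection("person", 10, 10, 50, 100)],
]

# Query: frames in which a person is to the left of a dog.
matching = find_frames(frames, "person", left_of, "dog")
```

In this sketch the detector's role ends at producing labelled boxes, and the query is a symbolic pattern over those boxes. In the proposed system that pattern matching is delegated to the cognitive architecture rather than hand-written as above.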