Depth Estimation Module
We implemented a depth estimation module using Convolutional Neural Networks (CNNs) with specialized configurations, including deep layers, optimized kernel sizes, and pooling layers. This module captured essential patterns such as the apparent size of known objects, perspective, distance, and scale, enabling accurate identification of the spatial layout and the distance between objects (e.g., a human and the camera).