The construction industry is among the most hazardous industries, with a high on-site accident rate. Many safety hazards arise from the dynamic activities of construction workers and equipment. Tracking the location and motion of workers and equipment, and identifying the interactions between them, is therefore crucial to preventing safety hazards on construction sites. With surveillance cameras now widely installed, computer vision techniques can be applied to the videos and images captured on site to monitor safety and identify potential hazards. To predict and prevent safety hazards involving workers and equipment, this paper proposes a methodology that monitors and analyses the interaction between workers and equipment by detecting their locations and trajectories and identifying danger zones using computer vision and deep learning techniques. First, workers and construction equipment are automatically located in camera footage and classified by a deep region-based convolutional neural network (R-CNN) model. Then, the location and classification results are processed by another CNN-based model to obtain the trajectories of those objects. Based on the detections and trajectories, the spatial-temporal relationship between workers and equipment is analysed, from which the danger zones for workers are identified and the corresponding safety alarms are generated. Experiments are conducted to demonstrate the capability of the proposed methodology to identify and predict safety hazards among construction workers and equipment accurately, which can help improve safety conditions on construction sites.
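The final step of the pipeline, turning detections and trajectories into a danger-zone alarm, can be illustrated with a minimal sketch. The abstract does not specify how danger zones are defined, so everything below is an assumption made for illustration: bounding boxes are axis-aligned and in pixel coordinates, motion is extrapolated linearly from the last two frames, and the danger zone is a circle around the equipment whose radius is a hypothetical `margin` times the half-diagonal of its box. The names `Box`, `predict_center`, and `in_danger_zone` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Box:
    """Axis-aligned bounding box in pixel coordinates (assumed format)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def center(self):
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)

    @property
    def half_diagonal(self):
        return hypot(self.x2 - self.x1, self.y2 - self.y1) / 2


def predict_center(track, steps_ahead=1):
    """Linearly extrapolate the box centre from the last two frames.

    `track` is a list of Box objects for one tracked object, oldest first.
    With a single observation the current centre is returned unchanged.
    """
    cx, cy = track[-1].center
    if len(track) < 2:
        return (cx, cy)
    px, py = track[-2].center
    return (cx + (cx - px) * steps_ahead, cy + (cy - py) * steps_ahead)


def in_danger_zone(worker_track, equipment_track, margin=1.5, steps_ahead=1):
    """Return True if the predicted worker position falls inside the
    assumed circular danger zone around the predicted equipment position."""
    wx, wy = predict_center(worker_track, steps_ahead)
    ex, ey = predict_center(equipment_track, steps_ahead)
    radius = margin * equipment_track[-1].half_diagonal  # hypothetical rule
    return hypot(wx - ex, wy - ey) <= radius
```

In such a sketch, an alarm would be raised for every worker-equipment pair for which `in_danger_zone` returns `True`. A real system would also need camera calibration to convert pixel distances into site distances, which this example deliberately leaves out.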