This study explores how to detect smoke plumes effectively as early signs of forest fire. Although convolutional neural networks (CNNs) have been widely used for forest fire detection, they have not been customized or optimized for smoke characteristics. This paper proposes a CNN-based forest smoke detection model featuring a novel backbone architecture that can increase detection accuracy and reduce computational load. The proposed backbone views the smoke plume through kernels of different sizes, which lets it better detect smoke plumes of different sizes. The conventional convolution with square kernels is decomposed into depth-wise convolutions with coordinate kernels, which not only better extracts the features of smoke plumes spreading along the vertical dimension but also reduces the computational load. An attention mechanism is applied so that the model focuses on important information while suppressing less relevant information. Experiments show that our model outperforms other popular models, achieving a detection accuracy of up to 52.9 average precision (AP) while significantly reducing the number of parameters and giga floating-point operations (GFLOPs).
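To illustrate why decomposing a square kernel can reduce the computational load, the following sketch compares weight counts for a single layer. It assumes that "coordinate kernels" means a pair of one-dimensional kernels of shapes k x 1 and 1 x k applied depth-wise; the function names, the channel count, and the kernel size are hypothetical and are not taken from the proposed model.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_square_params(channels: int, k: int) -> int:
    """Weights in a depth-wise k x k convolution: one k x k filter per channel."""
    return channels * k * k


def depthwise_coordinate_params(channels: int, k: int) -> int:
    """Weights in a depth-wise k x 1 convolution followed by a depth-wise
    1 x k convolution (assumed form of the coordinate kernels)."""
    vertical = channels * k    # k x 1 filter per channel
    horizontal = channels * k  # 1 x k filter per channel
    return vertical + horizontal


# Hypothetical layer size, chosen only for illustration.
channels, k = 64, 7

print("standard k x k:       ", standard_conv_params(channels, channels, k))
print("depth-wise k x k:     ", depthwise_square_params(channels, k))
print("depth-wise coordinate:", depthwise_coordinate_params(channels, k))
```

Under these assumptions the weight count per channel grows linearly in k rather than quadratically. A depth-wise layer does not mix channels, so a full block would typically need an additional channel-mixing step, which this count leaves out.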