Understanding how transcription factors (TFs) function in plants is essential for studying gene regulation and other biological processes. However, because these proteins are so diverse and complex, both detecting TFs and assigning them to families remain difficult. Traditional methods such as BLAST often perform poorly on less common TF families and carry a high computational cost.
The authors present MegaPlantTF, the first comprehensive machine learning and deep learning framework for both the prediction (TF versus non-TF) and the family-level classification of plant TFs. Their approach uses a two-stage design that combines a deep feed-forward neural network with a stacking ensemble classifier, with proteins represented by k-mer-based features. To evaluate both frequent and underrepresented TF families thoroughly, the authors report micro-, macro-, and weighted-average performance metrics. They also use threshold-based evaluation to calibrate confidence in TF detection. The results show that MegaPlantTF achieves good accuracy and precision, especially with a k-mer size of 3 and a classification threshold of 0.5, and that its performance remains stable even under strict thresholds.
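To make the k-mer representation and the thresholded two-stage decision concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the use of normalized frequencies over the 20 standard amino acids, the skipping of non-standard residues, and the way the threshold gates the first stage are all assumptions made for illustration. The probabilities passed to `two_stage_decision` stand in for the outputs of trained models, which are not shown here.

```python
from collections import Counter
from itertools import product

# Assumption: features are built over the 20 standard amino acids only.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_frequencies(sequence, k=3):
    """Return normalized frequencies of overlapping k-mers in a protein sequence."""
    sequence = sequence.upper()
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    # Skip k-mers containing non-standard residues (an illustrative choice).
    kmers = [km for km in kmers if all(c in AMINO_ACIDS for c in km)]
    counts = Counter(kmers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {km: n / total for km, n in counts.items()}


def to_vector(freqs, k=3):
    """Map a k-mer frequency dict onto a fixed-length vector (20**k entries)."""
    vocabulary = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    return [freqs.get(km, 0.0) for km in vocabulary]


def two_stage_decision(tf_probability, family_probabilities, threshold=0.5):
    """Hypothetical two-stage rule: gate on the TF probability, then pick a family.

    tf_probability: stage-one model output, P(sequence is a TF).
    family_probabilities: stage-two model output, mapping family name to probability.
    """
    if tf_probability < threshold:
        return "non-TF"
    return max(family_probabilities, key=family_probabilities.get)


# Usage with a toy sequence and made-up model outputs.
features = to_vector(kmer_frequencies("MKMKM", k=3), k=3)
label = two_stage_decision(0.8, {"bHLH": 0.7, "MYB": 0.3}, threshold=0.5)
```

With k = 3 the feature vector has 20^3 = 8000 entries, most of them zero for any single protein. Raising `threshold` in this sketch makes the first stage stricter, so fewer sequences are passed on to family assignment.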
The software is available at https://bioinformatics.um6p.ma/MegaPlantTF.

Reference:
Genereux Akotenou et al. (2026) MegaPlantTF: a machine learning framework for comprehensive identification and classification of plant transcription factors. Bioinformatics 42(1): btaf678
