Analyzing the Effects of Combining Gradient Conflict Mitigation Methods in Multi-Task Learning




Keywords: Gradient Conflict Mitigation Methods, Multi-Task Learning, Project Conflicting Gradients (PCGrad), Modulation Module, Language-Specific Subnetworks (LaSS)


Multi-task machine learning approaches train a single model on multiple tasks at once to improve performance and efficiency over multiple single-task models trained individually on each task. When such a multi-task model is trained to perform multiple unrelated tasks, performance can degrade significantly because unrelated tasks often produce gradients that differ widely in direction. These conflicting gradients may interfere destructively with each other, causing weights learned during the training of some tasks to become unlearned during the training of others. The research selects three existing methods to mitigate this problem: Project Conflicting Gradients (PCGrad), Modulation Module, and Language-Specific Subnetworks (LaSS). It explores how different combinations of these methods affect the performance of a convolutional neural network on a multi-task image classification problem. The benchmark problem uses a dataset of 4,503 leaf images to define two separate tasks: the classification of plants and the detection of disease from leaf images. Experimental results on this problem show performance benefits over singular mitigation methods, with a combination of PCGrad and LaSS obtaining a task-averaged F1 score of 0.84686. This combination outperforms the individual mitigation approaches by 0.01870, 0.02682, and 0.02434 in F1 score for PCGrad, Modulation Module, and LaSS, respectively.
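The gradient-projection idea behind PCGrad can be illustrated with a minimal sketch: when two task gradients point in conflicting directions (negative inner product), each one is projected onto the plane orthogonal to the other before the updates are combined. The snippet below is a simplified NumPy illustration under that assumption, not the implementation used in the research; the function name `pcgrad` and the toy gradients are hypothetical.

```python
import numpy as np

def pcgrad(grads):
    """Project each task gradient away from components that conflict
    (negative inner product) with the other tasks' gradients, then
    sum the projected gradients into one combined update direction."""
    projected = [g.copy() for g in grads]
    for i, g_i in enumerate(projected):
        for j, g_j in enumerate(grads):
            if i == j:
                continue
            dot = g_i @ g_j
            if dot < 0:  # conflicting direction: remove the conflicting component
                g_i -= dot / (g_j @ g_j) * g_j
    return sum(projected)

# Two toy task gradients with a negative inner product (a conflict).
g1 = np.array([1.0, 1.0])
g2 = np.array([-1.0, 0.5])
update = pcgrad([g1, g2])  # conflict-free combined update
```

After projection, neither task's gradient retains a component that opposes the other, so one task's update no longer directly undoes what the other task has learned.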



Author Biographies

Richard Alison, Bina Nusantara University

Computer Science Department, School of Computer Science

Welly Jonathan, Bina Nusantara University

Computer Science Department, School of Computer Science

Derwin Suhartono, Bina Nusantara University

Computer Science Department, School of Computer Science



