Current Issue: Vol. 12, No. 01 (January 2026)


Paper Title :: A Weighted Self-Attention Optimization Framework for Transformer-Based Text Classification
Author Name :: Sambit Ray
Country :: India
Page Number :: 01-03
Transformer models have significantly advanced Natural Language Processing (NLP) by replacing recurrence with self-attention mechanisms [1]. Standard Transformers compute attention uniformly across heads and tokens, which may dilute task-specific importance in classification problems [2, 8]. This paper proposes a Weighted Self-Attention Optimization (WSAO) framework that introduces adaptive token-level weighting into the Transformer encoder to enhance discriminative feature learning. On a public benchmark dataset, the proposed formulation improves classification performance over a baseline Transformer [1] and LSTM models [3]. The mathematical formulation, experimental evaluation, and comparative analysis are presented to highlight the effectiveness of the approach; an illustrative sketch of token-weighted attention is given after the keywords below.
Keywords: Transformer Models, Self-Attention Optimization, Text Classification, Cross-Entropy Loss, NLP
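The exact WSAO formulation is not reproduced in this listing, so the following is only a minimal NumPy sketch of one plausible reading of "adaptive token-level weighting": a per-token gate in (0, 1) is folded into the scaled dot-product attention logits before the softmax. The function names, the sigmoid gating scheme, and the toy random inputs are illustrative assumptions, not the authors' implementation.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def weighted_self_attention(X, Wq, Wk, Wv, token_weights):
    # X: (seq_len, d_model); token_weights: (seq_len,) gate values in (0, 1)
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # standard scaled dot-product logits
    scores = scores + np.log(token_weights)[None, :]    # bias logits by each key token's weight
    attn = softmax(scores, axis=-1)                     # rows still sum to 1
    return attn @ V

rng = np.random.default_rng(0)
seq_len, d_model = 6, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (0.1 * rng.normal(size=(d_model, d_model)) for _ in range(3))
w = 1.0 / (1.0 + np.exp(-rng.normal(size=seq_len)))     # assumed sigmoid gate per token
print(weighted_self_attention(X, Wq, Wk, Wv, w).shape)  # (6, 8)

Adding log(w_j) to the logits for key token j is equivalent to scaling that token's unnormalized attention weight by w_j and renormalizing, so each row of the attention matrix still sums to one while low-weight tokens are down-weighted.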
[1]. Vaswani, A., et al. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
[2]. Devlin, J., et al. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. NAACL-HLT.
[3]. Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.
[4]. Jurafsky, D., & Martin, J. H. (2023). Speech and Language Processing (3rd ed.). Pearson.
[5]. Goldberg, Y. (2017). Neural network methods in natural language processing. Morgan & Claypool.