Big GCVAE: decision-making with adaptive transformer model for failure root cause analysis in semiconductor industry
Abstract
Pre-trained large language models (LLMs) have gained significant attention in natural language processing (NLP), especially for tasks such as text summarization, generation, and question answering. Their success can largely be attributed to the attention mechanism introduced in Transformer models, which have outperformed traditional recurrent neural networks (e.g., LSTMs) in modeling sequential data. In this paper, we leverage pre-trained causal language models for the downstream task of failure analysis triplet generation (FATG), which involves generating a sequence of failure analysis decision steps for identifying failure root causes in the semiconductor industry. In particular, we conduct an extensive comparative analysis of various Transformer models for the FATG task and find that the BERT-GPT-2 Transformer (Big GCVAE), fine-tuned with the proposed Generalized-Controllable Variational AutoEncoder (GCVAE) loss, exhibits superior performance, producing an informative latent space by promoting the disentanglement of latent factors. Specifically, we observe that fine-tuning the Transformer-style BERT-GPT-2 model with the GCVAE loss yields an optimal representation by reducing the trade-off between reconstruction loss and KL divergence, promoting meaningful, diverse, and coherent failure analysis triplets (FATs) consistent with expert expectations.
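To illustrate the reconstruction/KL trade-off referred to above, the following is a minimal PyTorch-style sketch of a VAE objective whose KL weight is adapted toward a target KL value rather than fixed. The function names, controller gains, and update rule here are illustrative assumptions only; the actual GCVAE loss and its adaptive weighting are defined in the body of the paper.

```python
# Sketch of a VAE-style objective with an adaptively weighted KL term,
# in the spirit of the controllable trade-off described above.
# NOTE: the PI-style weight update below is an assumption for illustration,
# not the paper's exact GCVAE formulation.
import torch

def reconstruction_loss(logits, targets, pad_id=0):
    """Token-level cross-entropy over the decoder (GPT-2-like) outputs."""
    return torch.nn.functional.cross_entropy(
        logits.view(-1, logits.size(-1)),
        targets.view(-1),
        ignore_index=pad_id,
    )

def kl_divergence(mu, logvar):
    """KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior."""
    return -0.5 * torch.mean(
        torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
    )

class ControllableKLWeight:
    """Adjusts the KL weight toward a target KL value (PI-controller style).

    Captures the idea of balancing reconstruction vs. KL instead of using
    a single fixed beta; it is not the paper's exact update rule.
    """
    def __init__(self, target_kl=3.0, kp=0.01, ki=0.001, beta_min=0.0, beta_max=1.0):
        self.target_kl, self.kp, self.ki = target_kl, kp, ki
        self.beta_min, self.beta_max = beta_min, beta_max
        self.integral = 0.0

    def step(self, kl_value):
        error = kl_value - self.target_kl      # positive when KL is too large
        self.integral += error
        beta = self.kp * error + self.ki * self.integral
        return float(min(max(beta, self.beta_min), self.beta_max))

def gcvae_style_loss(logits, targets, mu, logvar, controller):
    """Total loss = reconstruction + adaptively weighted KL."""
    rec = reconstruction_loss(logits, targets)
    kl = kl_divergence(mu, logvar)
    beta = controller.step(kl.item())
    return rec + beta * kl, {"rec": rec.item(), "kl": kl.item(), "beta": beta}
```

In such a setup, the encoder (BERT-like) would supply `mu` and `logvar`, and the decoder (GPT-2-like) would supply `logits` over the target failure analysis triplet tokens; the weight `beta` is then adjusted during fine-tuning so that neither the reconstruction term nor the KL term dominates.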