Parameter Efficient Instruction Tuning of LLMs for Financial Applications

Subhendu Khatuya

The financial domain presents unique challenges for natural language processing (NLP) due to the presence of domain-specific terminology, structured–unstructured data integration, and the requirement for high factual accuracy. This dissertation addresses three critical problems at the intersection of financial text understanding and generative modeling: (i) extreme financial numeral labeling, (ii) bullet-point summarization of long earnings call transcripts, and (iii) financial numerical reasoning.
First, we study the automatic tagging of financial numerals with their corresponding eXtensible Business Reporting Language (XBRL) labels, a task mandated by regulatory bodies such as the U.S. Securities and Exchange Commission (SEC). The recently introduced FNXL dataset exemplifies the scale of this problem with 2,794 possible labels, making it an extreme multi-label classification challenge. Existing discriminative approaches handle unseen or rarely occurring labels poorly. To address this, we propose FLAN-FinXC, a generative framework based on parameter-efficient instruction tuning of FLAN-T5 models. Our approach uses task-specific prompts to generate XBRL tag descriptions, which an unsupervised Tag Matcher module then maps to final tags. Experiments show that our method sets a new state of the art on the FNXL and FiNER datasets, excelling particularly in zero-shot scenarios and on rare labels. Notably, even when a prediction is wrong, the generated description exhibits high semantic overlap with the ground-truth label, underscoring the robustness of the generative paradigm.
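To make the pipeline concrete, the following is a minimal sketch of how an unsupervised Tag Matcher might map a generated description to an official tag, using bag-of-words cosine similarity as a simple stand-in for whatever similarity measure the actual module uses; the tag names and descriptions below are illustrative, not taken from the FNXL taxonomy.

```python
from collections import Counter
from math import sqrt

def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def match_tag(generated_description: str, tag_descriptions: dict) -> str:
    """Return the tag whose official description is closest to the
    model-generated description."""
    gen = Counter(generated_description.lower().split())
    return max(
        tag_descriptions,
        key=lambda tag: _cosine(gen, Counter(tag_descriptions[tag].lower().split())),
    )

# Illustrative tag inventory (not the real XBRL taxonomy).
tags = {
    "us-gaap:CashAndCashEquivalents": "cash and cash equivalents at carrying value",
    "us-gaap:Revenues": "total revenues from contracts with customers",
}
print(match_tag("cash and cash equivalents held by the company", tags))
```

Because matching operates on descriptions rather than on a fixed label index, a tag never seen during training can still be recovered as long as its official description is available, which is what makes the generative formulation attractive for rare and zero-shot labels.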
Second, we investigate the problem of summarizing long Earnings Call Transcripts (ECTs): unstructured documents that provide critical insights into corporate performance and strategy. Manual summarization is time-intensive, and automatic methods struggle with the high compression ratios and dense financial content involved. Using the ECTSum dataset, we propose FLAN-FinBPS, a novel two-stage generative framework for producing concise bullet-point summaries. Unlike prior methods that rely entirely on supervised fine-tuning, our approach first employs an unsupervised question-based context generator to create extractive summaries, then passes them to a supervised, instruction-tuned abstractive summarizer. Empirical results demonstrate significant improvements over the strongest baseline, including a 14.88% increase in average ROUGE score, a 16.36% rise in BERTScore, and notable gains in factual consistency and numerical precision.
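The first, unsupervised stage can be sketched as question-guided sentence selection: score each transcript sentence against a set of template questions and keep the best matches as the extractive context for the second-stage summarizer. The questions and the bag-of-words scoring below are illustrative placeholders; the actual framework derives its questions and similarity differently.

```python
from collections import Counter
from math import sqrt

# Hypothetical question templates standing in for the unsupervised
# question-based context generator.
QUESTIONS = [
    "what was the revenue and earnings per share this quarter",
    "what guidance did management give for the next quarter",
]

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def extract_context(transcript_sentences, top_k=2):
    """Stage 1: keep the sentences most similar to any template question.
    The returned sentences would then be fed to the stage-2
    instruction-tuned abstractive summarizer."""
    q_vecs = [Counter(q.split()) for q in QUESTIONS]
    scored = [
        (max(_cosine(Counter(s.lower().split()), q) for q in q_vecs), s)
        for s in transcript_sentences
    ]
    return [s for _, s in sorted(scored, reverse=True)[:top_k]]
```

Filtering a multi-thousand-word transcript down to a short, question-relevant context before abstraction is what lets the second stage cope with the extreme compression ratios that defeat end-to-end summarizers.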
Finally, we address the persistent challenge of financial numerical reasoning, where large language models (LLMs) often falter despite advanced prompting strategies such as chain-of-thought and program-of-thought prompting. We introduce FINDER, a two-step generative framework designed to enhance LLM reasoning capabilities. FINDER first employs a generative retriever to extract relevant facts from both text and tabular data, then applies dynamic in-context example selection to guide program-of-thought reasoning. On the FinQA and ConvFinQA datasets, FINDER achieves new state-of-the-art execution accuracies, surpassing prior benchmarks by 5.98% and 4.05%, respectively.
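The dynamic in-context example selection step can be sketched as follows: rank a pool of (question, program) training pairs by similarity to the current question and prepend the top matches, together with the retrieved facts, to the program-of-thought prompt. The pool contents and the bag-of-words ranking are illustrative assumptions, not FINDER's actual retriever or selection model.

```python
from collections import Counter
from math import sqrt

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical exemplar pool: each entry pairs a question with a
# program-of-thought solution.
POOL = [
    ("what is the percentage change in revenue from 2019 to 2020",
     "change = rev_2020 - rev_2019\nans = change / rev_2019"),
    ("what is the sum of total assets across both years",
     "ans = assets_y1 + assets_y2"),
]

def build_prompt(question: str, retrieved_facts: str, k: int = 1) -> str:
    """Select the k pool questions most similar to the input question
    and assemble a program-of-thought prompt around the retrieved facts."""
    qv = Counter(question.lower().split())
    ranked = sorted(
        POOL,
        key=lambda ex: _cosine(qv, Counter(ex[0].split())),
        reverse=True,
    )
    exemplars = "\n\n".join(f"Q: {q}\nProgram:\n{p}" for q, p in ranked[:k])
    return f"{exemplars}\n\nFacts:\n{retrieved_facts}\nQ: {question}\nProgram:"
```

Choosing exemplars per question, rather than using a fixed few-shot set, means each prompt demonstrates a reasoning program structurally close to the one the model must now emit, which is the intuition behind the execution-accuracy gains.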
Collectively, this dissertation demonstrates the power of generative paradigms for tackling financial NLP tasks involving extreme label spaces, lengthy unstructured inputs, and complex numerical reasoning. By integrating parameter-efficient instruction tuning, unsupervised context generation, and generative retrieval, our contributions advance the state of the art and pave the way for scalable, accurate, domain-adapted technologies for financial applications.

©2025 by Plaksha Academic Conference.
