Generative AI in the Financial Domain: Specialized Models or Fine-Tuned General Models?
BloombergGPT is a 50-billion-parameter LLM trained by Bloomberg, the company it is named after.
BloombergGPT outperforms similarly-sized open models on financial NLP tasks by significant margins — without sacrificing performance on general LLM benchmarks
LLM development has already become something of an arms race, with all the major companies constantly releasing newer and better models. BloombergGPT could have sparked a similar race among the financial behemoths to launch specialized LLMs. That has not happened yet.
Part of the reason is that companies need to be convinced that a specialized LLM is meaningfully better than general-purpose LLMs. The research on this topic, however, suggests the opposite.
It appears that general-purpose models like ChatGPT or Llama can do everything BloombergGPT does with fairly high accuracy, and at significantly lower cost for everyone else.
Research [20] has reported that fine-tuned models based on open-source models like LLaMA can outperform BloombergGPT on finance-related tasks even with a modest 50K training samples. However, such models were fine-tuned for very specific types of financial tasks. Current research [20] suggests that the availability of high-quality human-labeled data is critical for fine-tuning. In the finance domain, producing human-labeled data is harder because of challenges around regulation, privacy, and the availability of people with such specialized knowledge. Additionally, the same research [20] predicts that the cost of fine-tuning an existing open-source model could be more than $30,000.
Source: https://arxiv.org/pdf/2410.15653
Applying LLMs in finance is hard because mistakes can mean direct financial loss, so accuracy is of very high importance. But many tasks, such as analyzing data, answering questions about data, and formatting and presenting data, can be done better with LLMs than with current tools like spreadsheets.
It turns out that generic LLMs are better at these tasks than narrow models like BloombergGPT.
If any organization does build a specialized model that is better than open-source models, chances are it will remain a trade secret, since it gives the institution a competitive advantage over others.