Telecommunications is a field involving the transmission of information over distances to facilitate communication. It uses various technologies such as radio, television, satellite, and the internet for voice, data, and video transmission and plays a fundamental role in societal and economic functions.
However, the general-purpose Large Language Models (LLMs) typically used in the field lack specialised telecommunications knowledge. This limitation makes them poorly suited to industry tasks that demand precision, such as network optimisation, protocol development, and complex data analysis. Models like GPT-4, Llama, and Mistral demonstrate substantial capabilities in natural language processing, but they are not optimised for telecom-specific tasks. The absence of telecom-specific datasets and evaluation benchmarks compounds the problem and limits these models’ effectiveness in real-world telecommunications scenarios.
To address this gap, researchers from the Technology Innovation Institute and Khalifa University have introduced a new model, TelecomGPT. It adapts general-purpose LLMs to the telecom domain through a structured approach of continual pre-training, instruction tuning, and alignment tuning. The researchers also developed comprehensive telecom-specific datasets and proposed new benchmarks to evaluate the model’s capabilities. TelecomGPT is intended to handle a range of telecom tasks efficiently and accurately.
The development of TelecomGPT involved several steps. First, telecom-specific data was collected from 3GPP technical specifications, IEEE standards, patents, and research papers. After preprocessing to ensure relevance, continual pre-training was conducted to enhance the model’s domain-specific knowledge. Subsequently, instruction tuning and alignment tuning through Direct Preference Optimisation (DPO) were applied to improve how the model follows instructions and to align its responses with preferred outputs. Finally, benchmarks were used to measure the model’s performance.
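To make the alignment step more concrete, here is a minimal sketch of the standard DPO objective for a single preference pair. It is not the researchers' training code: the function name `dpo_loss`, the default `beta=0.1`, and the log-probability values in the usage lines are assumptions chosen for illustration. The loss compares how much the model being tuned (the policy) favours the preferred answer over the rejected one, relative to a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) response pair.

    Inputs are summed token log-probabilities of each response under the
    policy and under the frozen reference model. beta (assumed 0.1 here)
    scales how strongly deviations from the reference are weighted.
    """
    # How much more (or less) likely each response is under the policy
    # than under the reference model.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)

    # -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Hypothetical log-probabilities, for illustration only.
neutral = dpo_loss(-2.0, -2.0, -2.0, -2.0)   # policy identical to reference
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)  # policy now favours the chosen answer
print(neutral, improved)
```

When the policy matches the reference, the loss equals log 2; it falls as the policy shifts probability towards the preferred response and rises if it shifts towards the rejected one. In practice this quantity would be averaged over a batch of preference pairs and minimised with gradient descent.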
Testing showed that TelecomGPT outscored GPT-4 on several benchmarks. It scored 81.2% in Telecom Math Modeling and 78.5% in the Telecom Open QnA benchmark, against GPT-4’s 75.3% and 70.1%, respectively. For code generation tasks, TelecomGPT scored 85.7% compared to GPT-4’s 77.4%. These results indicate that TelecomGPT is effective at telecom-specific applications and could contribute positively to the telecom industry.
In conclusion, TelecomGPT, developed through a collaboration between the Technology Innovation Institute and Khalifa University, helps to address the lack of domain-specific LLMs in telecommunications. The model delivers stronger performance and relevance on telecom tasks, highlights the importance of domain-specific models, and represents a promising advancement in the field. The research also demonstrates the potential benefits of combining academic and industry expertise to solve intricate, real-world problems in telecommunications.