Business data analysis is an essential tool in modern companies: it extracts actionable insights from large datasets and supports the informed decision-making that helps maintain a competitive edge. However, both traditional rule-based systems and AI models present challenges when used on their own, often leading to inefficiencies and inaccuracies.
Rule-based systems are recognized for their reliability and precision, but they struggle to adapt to complex and dynamic data environments. Large Language Models (LLMs), a type of AI model, are proficient at identifying patterns and making predictions but often lack the precision needed for specific business applications.
Researchers at Narrative BI have proposed a hybrid approach that combines the robustness of rule-based systems with the adaptive capabilities of LLMs. The methodology aims to generate actionable business insights from complex datasets while using each component to offset the shortcomings of the other.
The hybrid approach integrates Local Interpretable Model-agnostic Explanations (LIME), an interpretable AI technique, with rule-based systems and supervised document classification.
The study used LLMs for natural language understanding and rule-based systems for data processing and analysis, drawing on two years of data from corporate Google Analytics 4 and Google Ads accounts. After data cleaning, transformation, and normalization, insights were generated using an LLM. The hybrid approach outperformed the standalone systems: recall of business insights rose from 67% (LLM only) and 71% (rule-based only) to 82%.
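The general shape of such a pipeline, with a deterministic rule-based step feeding a text-generation step, can be sketched as follows. This is only an illustration under stated assumptions, not the researchers' implementation: the `Finding` class, the 20% change threshold, the row layout, and the `template_writer` function (a stand-in for a real LLM call) are all hypothetical. The `recall` helper shows how a recall figure like those above is computed from a set of found insights and a set of relevant ones.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Finding:
    """A metric whose value changed between two periods (hypothetical structure)."""
    metric: str
    previous: float
    current: float

    @property
    def change_pct(self) -> float:
        return (self.current - self.previous) / self.previous * 100.0


def rule_based_findings(rows: Iterable[dict], threshold_pct: float = 20.0) -> list[Finding]:
    """Deterministic step: clean the rows and keep only large changes."""
    findings = []
    for row in rows:
        prev, cur = row.get("previous"), row.get("current")
        # Cleaning: skip rows with missing values or a zero baseline.
        if prev is None or cur is None or prev == 0:
            continue
        finding = Finding(row["metric"], float(prev), float(cur))
        # Rule: report a metric only if it moved by at least the threshold.
        if abs(finding.change_pct) >= threshold_pct:
            findings.append(finding)
    return findings


def template_writer(finding: Finding) -> str:
    """Stand-in for an LLM call; a real system would prompt a model here."""
    direction = "rose" if finding.change_pct > 0 else "fell"
    return f"{finding.metric} {direction} {abs(finding.change_pct):.0f}% versus the previous period."


def generate_insights(rows: Iterable[dict],
                      writer: Callable[[Finding], str] = template_writer) -> list[str]:
    """Hybrid flow: rules select what to say, the writer decides how to say it."""
    return [writer(f) for f in rule_based_findings(rows)]


def recall(found: set, relevant: set) -> float:
    """Share of the relevant insights that were actually found."""
    return len(found & relevant) / len(relevant) if relevant else 0.0
```

Keeping the selection of findings in plain rules means the numbers in each insight come from the data rather than from the model, while the writer function can be swapped for an LLM without touching the rules.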
The hybrid model increased trustworthiness and transparency in data extraction processes, mitigated the risk of biases, and reduced inaccuracies. Processing efficiency reached 100% with rule-based preprocessing algorithms and 87% for the hybrid model, compared to 63% for standalone LLMs. Proper-name hallucinations also fell, from an error rate of 12% for standalone LLMs to 3% for the hybrid model.
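One way a rule-based check could catch proper-name hallucinations is to compare the names in a generated insight against the names that actually occur in the source data. The sketch below is an assumption for illustration only; the article does not describe how the researchers detected or counted these errors. The function names and the capitalization heuristic are hypothetical, and the heuristic is deliberately crude: it skips the first word of each sentence, so a name in that position is missed.

```python
import re


def extract_proper_names(text: str) -> set[str]:
    """Collect runs of capitalized words, ignoring each sentence's first word."""
    names = set()
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"[A-Za-z0-9][\w'-]*", sentence)
        run = []
        # The empty-string sentinel flushes a run that ends the sentence.
        for word in words[1:] + [""]:
            if word[:1].isupper():
                run.append(word)
            elif run:
                names.add(" ".join(run))
                run = []
    return names


def unsupported_names(insight: str, known_names: set[str]) -> set[str]:
    """Names mentioned in the insight that do not occur in the source data."""
    known = {name.lower() for name in known_names}
    return {n for n in extract_proper_names(insight) if n.lower() not in known}


# Hypothetical example: "Berlin" is not among the names found in the data.
insight = "Traffic from Google Ads rose 30%. The Spring Sale campaign drove most clicks in Berlin."
flagged = unsupported_names(insight, {"Google Ads", "Spring Sale"})
```

An insight with a non-empty `flagged` set could be rejected or regenerated before it reaches the user.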
User satisfaction was also highest for the hybrid model, with a like-to-dislike ratio of 4.60, compared to 3.82 for LLMs and 1.79 for rule-based systems. Together with the recall and hallucination results, this points to a model that balances precision, efficiency, and user satisfaction.
The research suggests that a hybrid methodology combining rule-based systems and LLMs is a promising way to extract and analyze complex business data, addressing the limitations each approach has on its own. It can improve both the processing and the analysis of business data, leading to more insightful and actionable intelligence.