Artificial Intelligence has made significant progress with Large Language Models (LLMs), but their ability to process complex, structured graph data remains limited. Much real-world data, such as the web, e-commerce systems, and knowledge graphs, has an inherent graph structure. While attempts have been made to combine technologies like Graph Neural Networks (GNNs) with LLMs, these efforts have focused primarily on conventional graph tasks or simple questions on small graphs. This research aims to create a flexible framework for complex real-world graphs that enables a unified conversational interface.
Efforts to combine graph-based techniques with LLMs span general graph models, multi-modal architectures, and practical applications such as graph reasoning, node classification, and graph classification and regression. Retrieval-augmented generation (RAG) has emerged as a promising approach for improving the trustworthiness of LLMs. However, the application of these techniques to graph-specific LLMs is still in its early stages.
Researchers from several universities and Meta AI have introduced G-Retriever, a novel architecture that integrates the strengths of GNNs, LLMs, and RAG. The framework enables efficient fine-tuning while preserving the LLM’s pre-trained language capabilities: the LLM is kept frozen, and the output of a GNN is fed to it as a soft prompt. G-Retriever mitigates hallucination by retrieving graph information directly, and its adaptation of RAG to graphs improves explainability, since the retrieved subgraph can be returned alongside the answer.
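To make the soft-prompting idea concrete, here is a minimal PyTorch-style sketch of how such a setup could look: every LLM parameter is frozen, and a trainable projection maps the graph encoder’s output into the LLM’s embedding space, where it is prepended to the question tokens. The class and argument names, and the Hugging Face-style `inputs_embeds` call, are illustrative assumptions rather than the authors’ actual code.

```python
# Minimal sketch of frozen-LLM soft prompting; names are illustrative.
import torch
import torch.nn as nn

class GRetrieverStub(nn.Module):
    def __init__(self, llm, graph_encoder, gnn_dim, llm_dim):
        super().__init__()
        self.llm = llm                      # pre-trained LLM, kept frozen
        self.graph_encoder = graph_encoder  # GNN over the retrieved subgraph
        # Trainable projection from GNN space into the LLM embedding space
        self.projector = nn.Linear(gnn_dim, llm_dim)
        # Freeze every LLM parameter; only the GNN and projector are trained
        for p in self.llm.parameters():
            p.requires_grad = False

    def forward(self, subgraph, question_token_embeds):
        # Encode the retrieved subgraph into a single graph token
        g = self.graph_encoder(subgraph)              # (batch, gnn_dim)
        soft_prompt = self.projector(g).unsqueeze(1)  # (batch, 1, llm_dim)
        # Prepend the graph token to the question embeddings (soft prompt)
        inputs = torch.cat([soft_prompt, question_token_embeds], dim=1)
        return self.llm(inputs_embeds=inputs)
```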
G-Retriever’s architecture consists of four main steps: indexing, retrieval, subgraph construction, and generation. The node and edge embeddings generated in the indexing stage are stored in a nearest-neighbor data structure. The retrieval stage uses k-nearest-neighbor search to identify the nodes and edges most relevant to a given query. Subgraph construction employs the Prize-Collecting Steiner Tree (PCST) algorithm to assemble these elements into a compact, connected subgraph. The generation step encodes the subgraph with a Graph Attention Network, maps the encoding into the LLM’s embedding space through a projection layer, and also flattens the subgraph into a textual format; the LLM then generates an answer conditioned on this combined input.
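A hedged sketch of the retrieval and subgraph-construction stages might look like the following: nodes are scored by cosine similarity to the query embedding, the top-k receive prizes, and a PCST solver (here the open-source `pcst_fast` package) connects them into a compact subgraph. The prize and edge-cost choices are illustrative assumptions, not the paper’s exact settings.

```python
# Sketch of k-NN retrieval plus PCST subgraph construction; values are
# illustrative, not the paper's exact configuration.
import numpy as np
from pcst_fast import pcst_fast  # pip install pcst_fast

def retrieve_subgraph(query_emb, node_embs, edge_index, k=5, edge_cost=0.5):
    # Cosine similarity between the query and every node embedding
    sims = node_embs @ query_emb / (
        np.linalg.norm(node_embs, axis=1) * np.linalg.norm(query_emb) + 1e-8
    )
    topk = np.argsort(-sims)[:k]

    # Prizes: relevance scores on the retrieved nodes, zero elsewhere;
    # a uniform edge cost discourages sprawling trees.
    prizes = np.zeros(len(node_embs))
    prizes[topk] = sims[topk]
    costs = np.full(len(edge_index), edge_cost)

    # PCST keeps high-prize nodes while paying edge costs to stay connected.
    # Arguments: edges, prizes, costs, root (-1 = unrooted), num_clusters,
    # pruning strategy, verbosity.
    nodes, edges = pcst_fast(edge_index, prizes, costs, -1, 1, "gw", 0)
    return nodes, edges
```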
G-Retriever has demonstrated superior performance across multiple datasets, outperforming baselines in inference-only settings and showing significant further gains with prompt tuning and LoRA fine-tuning. It also reduces hallucination by 54% compared to the baseline.
This work introduces GraphQA, a new benchmark for graph question answering, together with the G-Retriever model designed for complex graph queries. Unlike previous methods that focus on conventional tasks or simple queries, G-Retriever targets real-world textual graphs across multiple applications. It adapts RAG to general textual graphs, using Prize-Collecting Steiner Tree optimization to retrieve relevant subgraphs, which strengthens resistance to hallucination and allows it to scale to large graphs. Experimental results demonstrate G-Retriever’s superior performance over baselines on various textual graph tasks, effective scaling to larger graphs, and a significant reduction in hallucination.
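As a closing illustration of what RAG over a textual graph can look like in practice, the snippet below flattens a retrieved subgraph into plain text for the LLM prompt. The attribute names and textualization format are assumptions for illustration; G-Retriever’s exact scheme may differ.

```python
# Hypothetical subgraph textualization; format and attribute names are
# assumptions, not G-Retriever's exact scheme.
def textualize(nodes, edges):
    node_lines = [f"node {n['id']}: {n['text']}" for n in nodes]
    edge_lines = [f"{e['src']} -[{e['relation']}]-> {e['dst']}" for e in edges]
    return "\n".join(node_lines + edge_lines)

nodes = [{"id": 0, "text": "wireless earbuds"},
         {"id": 1, "text": "charging case"}]
edges = [{"src": 0, "relation": "sold_with", "dst": 1}]

prompt = ("Answer the question using the graph below.\n"
          f"Graph:\n{textualize(nodes, edges)}\n"
          "Question: What accessory comes with the wireless earbuds?")
print(prompt)
```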