
Building a Better Generative AI—How Knowledge Graphs Will Improve Our RAG-Powered Chatbot

Around this time last year, POWER published an article on how we intended to use generative AI in our industry. Since then, we have developed an internal, retrieval-augmented generation (RAG)-powered chatbot, conducted a pilot, made some adjustments, and rolled it out to full production. POWER can now use our tool to help create proposals faster, streamline document analysis, summarize meeting notes, and more. In other words, we met our KPIs and reduced the time it takes to produce some of our deliverables. Even better, our development costs came in significantly under budget.

Throughout the pilot, our Digital Transformation Team learned a lot about the strengths and weaknesses of our chatbot. With the potholes clearly identified, we’re now working out how to steer around them. A key to this next step in our generative AI development process is adding knowledge graphs to improve the results of our RAG-powered chatbot.

The knowledge graphs will be built from the same documentation we use for RAG. With a properly structured graph of that data, querying the graph yields a more refined set of content to feed into our RAG search, which should produce much better results and better overall accuracy.

In our previous article, we defined RAG as “a framework applied within your generative AI system that improves LLM-generated responses by guiding the model with additional, relevant data sources.” Our team has been quite successful at using RAG to guide our chatbot. But we have also discovered that for all its strengths, RAG does not solve all the problems we need it to. There are still cases where the LLM hallucinates, though the use of RAG reduces both the frequency and severity of that. 

Much of the hallucination we are seeing comes from how RAG works: it looks up content that is “close to” what the user is asking, based on vector math. But there are words, phrases, and even whole paragraphs that can be mathematically close to what the user wants without actually being very pertinent to the question. This taints the overall context provided to the LLM and produces less-than-optimal results.
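As a minimal illustration of that failure mode, the sketch below ranks a few document chunks against a question by cosine similarity of their embeddings. It assumes the sentence-transformers package; the model name and the sample chunks are placeholders of our own, not pieces of our production pipeline. The point is simply that a chunk can score “close” in vector space without actually answering the question.

```python
# Minimal sketch of vector-similarity retrieval (illustrative only).
# Assumes the sentence-transformers package; model and chunks are placeholders.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # example embedding model

question = "What is the required clearance for 138 kV breaker terminals?"
chunks = [
    "Minimum phase-to-ground clearances for 138 kV equipment are listed in Table 4.",
    "The 138 kV breaker was shipped to the site and set on its foundation in March.",
    "Clearance requirements for substation fences are defined in a separate standard.",
]

# Embed the question and the candidate chunks, then rank chunks by cosine similarity.
scores = util.cos_sim(model.encode(question), model.encode(chunks))[0]
for score, chunk in sorted(zip(scores.tolist(), chunks), reverse=True):
    print(f"{score:.3f}  {chunk}")
```

Chunks two and three share vocabulary with the question, so they can rank well even though neither one answers it.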

In our pilot, we saw better than 75% accuracy for RAG search based on our internal metrics, which is good enough to satisfy our user base for the current use cases. But that is not good enough for engineering work. For that, we would want to be into the 90% range. We’re preparing to close that gap with knowledge graphs.

The Power of Knowledge Graphs

A knowledge graph is a structured representation of information that captures relationships between data points in a way that mimics how humans understand and connect concepts. It consists of nodes (representing entities such as equipment, standards, calculations, etc.) and edges (representing the relationships between these entities). This interconnected structure allows for more intuitive data retrieval and analysis, making it a powerful tool for the electrical utility industry.
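To make that structure concrete, here is a toy graph built with networkx. The equipment, standard, and study names are invented for illustration and are not drawn from our documentation.

```python
# A toy knowledge graph using networkx (illustrative node and edge names only).
import networkx as nx

kg = nx.MultiDiGraph()

# Nodes: entities such as equipment, standards, and calculations.
kg.add_node("Transformer T-101", kind="equipment")
kg.add_node("IEEE C57.12.00", kind="standard")
kg.add_node("Short-circuit study SC-42", kind="calculation")

# Edges: the relationships that connect those entities.
kg.add_edge("Transformer T-101", "IEEE C57.12.00", relation="specified_per")
kg.add_edge("Short-circuit study SC-42", "Transformer T-101", relation="analyzes")

for source, target, data in kg.edges(data=True):
    print(f"{source} --{data['relation']}--> {target}")
```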

Here’s how knowledge graphs can improve generative AI results:

  • Contextual Understanding: Knowledge graphs provide a rich context by connecting related data points, which helps generative AI models understand the broader context of a query or task. This leads to more accurate and relevant responses.
  • Data Integration: By integrating various data sources into a unified framework, knowledge graphs ensure that generative AI models have access to comprehensive and up-to-date information. This reduces the chances of errors and improves the reliability of the generated outputs.
  • Dynamic Updates: Knowledge graphs can be continuously updated with new data, ensuring that the generative AI models are always working with the latest information. This is particularly important in the electrical utility industry, where real-time data can significantly impact decision-making processes.
  • Enhanced Querying: The structured nature of knowledge graphs allows for more sophisticated querying capabilities. Generative AI models can leverage these capabilities to retrieve specific information more efficiently, leading to faster and more accurate responses.
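To make the enhanced-querying point concrete, here is one way a graph traversal could narrow the context handed to the RAG search. It reuses the toy graph from the earlier sketch; the entity names and the simple neighborhood lookup are illustrative assumptions rather than a description of our production design.

```python
# Sketch: use the graph neighborhood of an entity mentioned in the question
# to build a refined context set before (or instead of) a broad vector search.
import networkx as nx

kg = nx.MultiDiGraph()
kg.add_edge("Transformer T-101", "IEEE C57.12.00", relation="specified_per")
kg.add_edge("Short-circuit study SC-42", "Transformer T-101", relation="analyzes")
kg.add_edge("Short-circuit study SC-42", "Bus 4A", relation="covers")

def graph_context(graph: nx.MultiDiGraph, entity: str) -> list[str]:
    """Collect the facts directly connected to an entity as plain-text statements."""
    facts = []
    for source, target, data in graph.in_edges(entity, data=True):
        facts.append(f"{source} {data['relation']} {target}")
    for source, target, data in graph.out_edges(entity, data=True):
        facts.append(f"{source} {data['relation']} {target}")
    return facts

# An entity-linking step (not shown) would map the user's question to "Transformer T-101".
print(graph_context(kg, "Transformer T-101"))
```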

Once we have a knowledge graph, we can more easily walk back from a result to the information used to generate it, giving our users concrete, verifiable information. The citations we receive from our RAG data today confirm that the RAG search sometimes returns mathematically close results that are not semantically relevant to the user’s question or intent. That same citation data gives us an immediate way to measure how much our knowledge graph improves the results of our RAG-powered chatbot.
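One way to get that walk-back, sketched below, is to stamp each relationship with the document and chunk it was extracted from, so every fact the chatbot uses carries its own citation. The attribute names here are our own illustrative choice, not a description of our production schema.

```python
# Sketch: attach provenance to each relationship so results can cite their source.
import networkx as nx

kg = nx.MultiDiGraph()
kg.add_edge(
    "Transformer T-101",
    "IEEE C57.12.00",
    relation="specified_per",
    source_doc="T-101_datasheet.pdf",  # hypothetical document name
    chunk_id=17,                       # chunk the relationship was extracted from
)

# Walk back from a fact used in an answer to the document chunk that supports it.
for source, target, data in kg.edges(data=True):
    print(f"'{source} {data['relation']} {target}' cited from "
          f"{data['source_doc']}, chunk {data['chunk_id']}")
```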

How to Make a Knowledge Graph

There are a few methods for turning unstructured data into a knowledge graph. One of them is to use an LLM to extract the nodes and edges (i.e., things and relationships) from the text in the documents. By instructing the LLM with prompts written specifically to extract “things” and “relationships” from chunks of text, we can produce structured outputs that can be assembled into a graph.
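Here is a minimal sketch of that extraction step, assuming the openai Python package; the model name, prompt wording, and sample sentence are illustrative choices, not our production prompts.

```python
# Sketch: prompt an LLM to extract (thing, relationship, thing) triples from a text chunk.
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

EXTRACTION_PROMPT = (
    "Extract the entities and relationships from the text below. Respond with a JSON "
    'object whose "triples" key holds a list of objects with keys "source", '
    '"relation", and "target".\n\nText: {chunk}'
)

def extract_triples(chunk: str) -> list[dict]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model choice
        messages=[{"role": "user", "content": EXTRACTION_PROMPT.format(chunk=chunk)}],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content).get("triples", [])

print(extract_triples(
    "Transformer T-101 is rated 25 MVA and was tested per IEEE C57.12.00."
))
```

Each returned triple becomes a pair of nodes and an edge in the graph.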

These graphs can then be either stored directly as files in a storage account or stored in a graph database like Graphwise's Ontotext GraphDB or Neo4j. Ontotext deserves some special recognition here, as they have done some fantastic work in the knowledge graph space around electrical data. I highly recommend that you visit the Transparency Energy Knowledge Graph and see how they have leveraged graphs to present a huge amount of valuable electrical data captured by our peers in ENTSO-E.
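Below is a minimal sketch of both storage routes, assuming networkx for the file option and the official Neo4j Python driver for the database option. The connection details, node labels, and Cypher statement are placeholders, not a description of our deployment.

```python
# Sketch: persist extracted triples either as a file or in a graph database.
import networkx as nx
from neo4j import GraphDatabase

triples = [("Transformer T-101", "specified_per", "IEEE C57.12.00")]

# Option 1: store the graph as a file (e.g., GraphML dropped into a storage account).
kg = nx.MultiDiGraph()
for source, relation, target in triples:
    kg.add_edge(source, target, relation=relation)
nx.write_graphml(kg, "knowledge_graph.graphml")

# Option 2: load the triples into a graph database such as Neo4j.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    for source, relation, target in triples:
        session.run(
            "MERGE (a:Entity {name: $source}) "
            "MERGE (b:Entity {name: $target}) "
            "MERGE (a)-[:RELATED {relation: $relation}]->(b)",
            source=source, target=target, relation=relation,
        )
driver.close()
```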

We are currently experimenting with Microsoft GraphRAG and it is showing some value with a more automated extraction process. There are some other open-source tools available as well, and we will use them if GraphRAG does not provide us with what we need at a price we are willing to pay.

Continuously Improving Your Generative AI

We have realized excellent value from the use of RAG, LLMs, and generative AI at a reasonable price. In most of our use cases, we are easily covering the cost of implementation through the reduction in hours our users previously spent on these tasks. However, we know there are always ways to improve the responses from our generative AI.

As a community, those of us experimenting with generative AI for the electric utility industry need to keep pushing ourselves and the new technologies we are using, and keep sharing results and lessons learned with each other. For us, that next step is developing knowledge graphs to improve accuracy and output quality. We look forward to sharing our results soon. What does that next step look like for you?