Similarity and Ambiguity Index

Before You Begin

Make sure that you have added more than two content sources and that they have finished syncing before you trigger the similarity or ambiguity report. You can find more about adding content sources in this article.

Overview

Similarity measures how closely two or more pieces of text resemble each other in content. It is determined by comparing text documents and scoring their degree of resemblance. Highly similar content may confuse the LLM and lead it to hallucinate a response.
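To make the idea of a similarity score concrete, here is a minimal Python sketch that compares two texts by the cosine similarity of their word counts. This is only an illustration of one common technique; the product's actual scoring method is not documented here, and the function names `tokenize` and `cosine_similarity` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Return the cosine similarity of two texts' word-count vectors (0.0 to 1.0).

    Illustrative only: an assumption about how similarity could be scored,
    not the method used by the similarity report.
    """
    counts_a, counts_b = Counter(tokenize(a)), Counter(tokenize(b))
    # Dot product over the words the two texts share.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text is treated as sharing nothing.
    return dot / (norm_a * norm_b)
```

A score near 1 means the two texts use almost the same words in the same proportions, while a score of 0 means they share no words at all.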

Ambiguity measures the extent to which two or more pieces of text give conflicting or unclear information for the same context. It is determined by comparing one paragraph of text from a content source against all other content sources. Ambiguous content can lead to different answers for the same query.

Generating the Similarity and Ambiguity Reports

You can generate Similarity and Ambiguity Reports for content sources using the options beneath the summary bar. The date and time of the last generated report are also shown there.

Similarity Report

The similarity report compares the content of one source with the other content sources in your bot's knowledge base. High similarity is marked by a red icon, and for any content source with high similarity you can view the report to check which content is similar.

To generate the similarity report for all content sources, click Generate Similarity Report at the top. Once generated, access the report by clicking the View Report icon next to the index. The report highlights content from different sources that share similarities with the selected source. Additionally, a red icon will appear to indicate high similarity within the report.
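The comparison of each source against every other source, and the flagging of high similarity, can be pictured with the following Python sketch. It is an illustration under stated assumptions, not the product's implementation: the word-overlap (Jaccard) score, the `HIGH_SIMILARITY` threshold of 0.8, and all function names are hypothetical.

```python
import re

# Hypothetical cutoff above which a source would be flagged (the "red icon").
HIGH_SIMILARITY = 0.8


def word_set(text: str) -> set[str]:
    """Return the set of lowercase alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Return the share of distinct words that two texts have in common."""
    words_a, words_b = word_set(a), word_set(b)
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def similarity_index(sources: dict[str, str]) -> dict[str, float]:
    """Map each source name to its highest similarity with any other source."""
    index = {}
    for name, text in sources.items():
        scores = [jaccard(text, other) for n, other in sources.items() if n != name]
        index[name] = max(scores, default=0.0)
    return index


def flag_high_similarity(sources: dict[str, str],
                         threshold: float = HIGH_SIMILARITY) -> list[str]:
    """Return the names of sources whose index meets or exceeds the threshold."""
    return [name for name, score in similarity_index(sources).items()
            if score >= threshold]
```

For example, if two sources named `faq` and `policy` contain the same sentence and a third source named `shipping` covers an unrelated topic, `flag_high_similarity` would return only `faq` and `policy`.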

Identifying and addressing high similarity instances is crucial, as redundant information can strain the model's processing efficiency.

Ambiguity Report

Similarly, you can generate an ambiguity report to assess the ambiguity index, highlighting contradictory information between content sources. To initiate the ambiguity report for all content sources, click Generate Ambiguity Report at the top. Once generated, access the report by clicking the View Report icon next to the index. This report identifies and highlights ambiguity across various sources.

Identifying and addressing high ambiguity instances is crucial, as it helps prevent the bot from sharing incorrect information. Additionally, you'll notice a red icon indicating high ambiguity within the report.