Negative ratings are a way for users of the AI chatbot to voice criticism. Categorizing them correctly depends on several factors.
User ratings express whether - from the user's perspective - the chatbot's response satisfactorily answers their request. This article documents how these insights can be used to optimize response content.
User ratings are not an assessment of the AI's recognition, nor an evaluation of the chatbot as a whole.
Negative ratings in particular, however, can be used as indicators of where optimization is needed.
Feedback forms are suitable for collecting general feedback. More detailed information on this is documented in the corresponding article in our helpcenter.
Causes
The causes of negative ratings fall into three case groups, two of which lend themselves to constructive analysis:
- The answer does not fit, e.g. because the information displayed is out of date
- The wrong topic was played out in response to the user's enquiry
The third case group is not conducive to constructive analysis: negative ratings occur here even though the correct intent was played out and a suitable answer was given.
Our experience shows that this third case group occurs in particular when the appropriate response has a negative connotation, e.g. a service disruption is reported, but the response does not offer a direct solution or states that the disruption cannot be resolved immediately.
Analysis
Negative ratings can be reached in two ways. On the one hand, the negative ratings received on individual days can be checked. This is done via the conversations:
On the other hand, negative ratings received over a longer period of time can be checked. This is done via the user requests focus report in the statistics:
The period and feedback filters can be set there. To set the feedback filter, click the thumb twice.
Selecting the intent button shows all negative feedback received for the selected intent in the chosen time period:
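If the rated conversations are exported for offline analysis, the same filtering can be reproduced in a few lines. The sketch below is purely illustrative: the CSV layout, the column names and the `negative_feedback` helper are assumptions, not part of the moinAI Hub or its export format.

```python
# Illustrative only: filter negatively rated conversations from a hypothetical
# CSV export with "timestamp", "rating" and "intent" columns. The column names
# and file layout are assumptions for this sketch, not the moinAI export format.
import csv
from datetime import datetime

def negative_feedback(path, start, end, intent=None):
    """Return rows rated negatively within [start, end], optionally for one intent."""
    rows = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            ts = datetime.fromisoformat(row["timestamp"])
            if not (start <= ts <= end):
                continue
            if row["rating"] != "negative":
                continue
            if intent and row["intent"] != intent:
                continue
            rows.append(row)
    return rows

# Example: all negative ratings for a hypothetical "delivery_status" intent in March
results = negative_feedback(
    "conversations.csv",
    start=datetime(2024, 3, 1),
    end=datetime(2024, 3, 31, 23, 59),
    intent="delivery_status",
)
print(len(results), "negatively rated conversations")
```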
As soon as the negatively rated conversations have been collected, the analysis can start.
The aim is to determine which of the case groups described above applies. The following three steps are advisable:
- Analyse the title and the scenario → What is covered and what is not?
- Analyse the query → What does the user want to know?
- Assignment → Does the enquiry match the topic or not?
The third question determines which case group applies; from that, the appropriate consequence can be derived.
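The assignment step can also be written down as a small decision rule. The following sketch is a hypothetical illustration of that logic - the `CaseGroup` names and the `assign_case_group` helper are assumptions, not part of the moinAI Hub.

```python
# Illustrative sketch of the assignment step: the answers from the analysis
# (does the enquiry match the intent? was the answer content adequate?) map a
# negatively rated conversation to one of the three case groups.
from enum import Enum

class CaseGroup(Enum):
    ANSWER_DOES_NOT_FIT = "Answer does not fit (e.g. outdated or missing information)"
    WRONG_INTENT = "Wrong topic/intent was played out"
    NO_FACTUAL_CAUSE = "Correct intent and suitable answer, negative rating anyway"

def assign_case_group(enquiry_matches_intent: bool, answer_is_adequate: bool) -> CaseGroup:
    if not enquiry_matches_intent:
        return CaseGroup.WRONG_INTENT
    if not answer_is_adequate:
        return CaseGroup.ANSWER_DOES_NOT_FIT
    return CaseGroup.NO_FACTUAL_CAUSE

# Example: the enquiry fits the intent, but the answer content is outdated
print(assign_case_group(enquiry_matches_intent=True, answer_is_adequate=False))
# -> CaseGroup.ANSWER_DOES_NOT_FIT
```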
Consequence
The consequences for the case groups differ fundamentally.
Enquiry fits the intent
In this case, information is often missing, or the answer is worded too vaguely or otherwise inappropriately.
The content of the response should therefore be revised. Our best practices can be found in the helpcenter.
Wrong intent was played out
If the analysis concludes that the wrong intent was played out, the AI suggestions should be checked and suitable ones activated promptly. If there are no suitable AI suggestions, a topic should be created manually (see article). If a suitable intent already exists that should have been played out at this point, the AI feedback tool should be used.
Negative ratings, positive ratings and other key figures
Many key figures in the moinAI Hub provide information about the performance of the AI chatbot. Each key figure covers only a single sub-area, which means it can only be interpreted meaningfully as part of an overall view.
A single key figure read outside this overall context says little about performance; it merely points to what deserves attention.
This means in particular that negative ratings alone do not allow any conclusions to be drawn about the general functionality of the chatbot. There is no such causality!
Instead, the negative ratings should be used to check whether the response content created is good and whether there is a need for further intents.
In particular, it is important to contextualize the negative ratings with the automation rate. If the automation rate is good and the negative ratings are high, this is a strong indication that the response content needs to be improved - not that the chatbot is performing poorly.
The same applies to positive ratings. Used as a benchmark, they show whether users rate negatively or positively more often. If negative ratings predominate, the consequence is comparable to the one described above. If positive ratings predominate, case group 3 in particular - negative ratings without a factual cause - should be kept in mind.
The automation rate can never be 100%. Anything over 60% is considered very good - a well-deployed teaser is particularly helpful for further increases. The time since going live must also be taken into account. In the first three months, 25%-35% is good. After six months, 50% should be the target.
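To make this contextualization concrete, the rules of thumb from the last few paragraphs can be sketched as a small check. The benchmark values (25-35% in the first three months, 50% after six months, over 60% very good) come from this article; the 30% cut-off for a "high" share of negative ratings and all function names are assumptions for illustration only.

```python
# Illustrative sketch: interpret negative ratings in the context of the
# automation rate. The benchmark values come from this article; the 30%
# "high negative share" cut-off and all names are assumptions for illustration.

def automation_rate_benchmark(months_live: float) -> float:
    """Rough benchmark for the automation rate depending on time since go-live."""
    if months_live < 3:
        return 0.25   # 25-35% is good in the first three months
    if months_live < 6:
        return 0.50   # assumption: already work toward the six-month target of 50%
    return 0.60       # anything over 60% is considered very good

def interpret(negative: int, positive: int, automation_rate: float, months_live: float) -> str:
    total = negative + positive
    negative_share = negative / total if total else 0.0
    rate_ok = automation_rate >= automation_rate_benchmark(months_live)
    if rate_ok and negative_share > 0.30:   # assumed threshold for "high"
        return "Automation is fine - revise the response content."
    if not rate_ok:
        return "Automation rate below benchmark - check intents and AI suggestions first."
    return "Ratings and automation look healthy - keep monitoring."

# Example: eight months after go-live, 65% automation, but many negative ratings
print(interpret(negative=40, positive=60, automation_rate=0.65, months_live=8))
```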