Introduction
Currently, the number of users of intelligent assistants (now commonly referred to as «artificial intelligence» or AI, or «neural networks») is growing exponentially. A significant portion of these users, having neither mathematical training nor programming experience, declare themselves to be advanced AI specialists. This creates problems due to attempts to misuse available intelligent tools, questionable interpretation and practical application of the results obtained, as well as misleading, grandiose statements without real content.
This makes it important to create user support tools for working with intelligent assistants (IA), and first and foremost, it is important to develop tools for automating the work of prompt engineers who prepare prompts for IA.
In general, the result of a subject's interaction with an IA is determined by two factors:
-
the semantic content of prompts or other texts presented to the IA;
-
the intellectual capabilities of the IA itself, which can vary widely.
The uncertainty of interpreting the semantic content of prompts or other relevant texts, as well as the known unpredictability of the AI's response to prompts, complicate the application of formalization, requiring its significant adaptation to the new context of application. This work is one of the first attempts at such adaptation. A special notation has been developed to ensure a compact description of the algorithms used.
The generation of plausible but incorrect information, known as «hallucination», remains a problem accompanying the practical use of AI. In particular, it is well known that AI can justify mutually contradictory statements if it receives a corresponding prompt. The only exceptions are strictly substantiated or obvious observable facts. Therefore, it is becoming urgent to find tools that can objectively evaluate the correctness of formulations calculated using AI.
The issues discussed in this paper have become particularly relevant after 2020 (Nikolenko et al., 2020), so there are relatively few relevant publications on the automation of AI and the elimination of the problem of «hallucinations». Among the approaches that inspire moderate optimism are dialogical methods, including the «Debate Game» (Irving et al., 2025) and the Chain-of-Verification Method (Shehzaad et al., 2025), which eliminates «hallucinations» by asking AI to reflect on its own answers and self-correct. However, these approaches are not based on a significant formalization and are hopelessly far from useful practical application.
This paper presents algorithms for solving two problems:
-
generating prompts for which the annotations (brief descriptions of the answers to the prompt) are closest to the given description (the solution is provided by an evolutionary algorithm for selecting prompts);
-
verifying the correctness of the intelligent assistant's answers (the solution is provided by an evolutionary algorithm for verifying the correctness of IA answers, or the «pendulum algorithm»).
The main components of the above algorithms that determine the calculated result are a newly developed quasi-genetic algorithm that ensures the expansion of the set of prompts, and a method of multidimensional metric scaling, a rigorous description of which is rarely found in publications. The quasi-genetic algorithm is constructed by analogy with the well-known genetic algorithm (Emelyanov et al., 2003), used to solve optimization problems and to train neural networks, with crossover and mutation operations replaced by pseudo-crossover and variation operations performed by AI, which are similar in context but fundamentally different in content.
The main principle implemented in the approach to solving problems is that the intelligent assistant performs all substantive operations related to extracting quantitative estimates from the material under study, followed by analysis of these estimates using methods of multidimensional statistical analysis, statistical hypothesis testing techniques, and other mathematical tools.
The tools, which operate based on the algorithms described below, are implemented in software based on the OpenAI API. These tools have been tested in pilot mode with psychological texts, demonstrating convincing results.
The most obvious prospects for practical application of the presented algorithms are in areas where concepts with significant variability in interpretation are used: in psychology, sociology, art history, and other humanities (Shoham et al., 2009; Nikolenko et al., 2020).
This article is intended for programmers who create tools for working with large language models and mathematicians who develop methods for the practical use of artificial intelligence capabilities.
Notation and basic concepts
X, Y, and Z are also used as scaling spaces. Distances are understood to be Euclidean distances. Quasi-distances are represented by values from the numerical interval [0;1] and are calculated for given pairs of elements of metric spaces X, Y, and Z using AI as the results of prompts for comparing elements included in given pairs. A value of 1 corresponds to a complete match between the contents of the compared elements, while a value of 0 corresponds to a complete mismatch (obviously, the comparison result is ambiguous and is determined by the characteristics of the AI used). AI prompts explicitly specify the requirement to compare the presented elements, expressing the result as a real number from the interval.
Operations in metric space X:
i++ and j++ — increase the indices i and j by one;
Evolutionary algorithm for prompt selection for AI
The task is formulated as follows.
Solution algorithm:
-
Set a set of basic prompts . . and for all elements of the sets and . .
-
If , then compute the IA-mappings and for all elements of the sets and .
-
Find the median of Kemeny and the images of its mappings and .
-
Compute the extension using a quasi-genetic algorithm.
-
Compute the matrices , and .
-
Compute , , and , marking the resulting mutual distances between the elements of the sets , , and .
-
Find the Kameni median .
-
Determine the neighborhood of the prompt using the mutual distances defined by the matrix .
-
Compute IA mappings for all elements of the prompt neighborhood .
-
Check the condition . If the condition is satisfied, then , delete the specified relative part of the elements and their preimages and and proceed to step 11, otherwise proceed to step 4.
-
Calculate .
-
If , then and stop, otherwise go to step 13.
-
.
-
i++.
-
Proceed to step 2.
Quasi-genetic algorithm for performing set expansion of prompts when performing operations
-
Set the set of basic prompts . j=0.
-
Check the condition . If the condition is met, go to step 3, otherwise stop.
-
Quasi-genetic selection of elements of the set at iteration j using the «roulette rule» with the distance as the quality function, where .
-
Formation of the complement to the set by applying pseudo-crossover operations and , as well as pseudo-mutation , to the elements of the set .
-
Combining the set of prompts corresponding to iteration j and the complement : .
-
j++.
-
Go to step 2.
The convergence of the computational procedure is determined by the condition specified in step 10 of its description, the result of which, in turn, depends on the semantic content of the prompts submitted by the AI and the intellectual capabilities of the AI itself.
Evolutionary algorithm for checking the correctness of AI responses
Main thesis: AI responses that are correct in content are significantly more consistent with each other than responses that are incorrect in content.
-
Obtain the content material necessary for analysis in response to the generating request or ask directly. i=0. . .
-
i++.
-
Calculate .
-
If i is even, then and , otherwise and .
-
Calculate .
-
Calculate the variances and .
-
Assuming that the distributions of distances to centroids and are normal, test the null hypothesis of equality of variances and using the F-test for the statistic if , or for the statistic if . If the null hypothesis of the F-test on the equality of variances is not rejected and , then proceed to step 2, otherwise proceed to step 8.
-
If the null hypothesis of equality of variances is rejected, then if , conclude that the content material under investigation is correct , or if , conclude that the content material under investigation is incorrect , otherwise consider the correctness of the specified material to be undetermined.
The convergence of the computational procedure under consideration is determined by the condition specified in step 7 of its description, the result of which, in turn, depends on the semantic content of the prompt generated to the AI (or the original content being studied) and the intellectual capabilities of the AI itself.
In particular, events can be considered as finding sets of points representing statements and their negations in certain areas of the scaling space or the distribution of certain configurations of points corresponding to the classes of features under study in specified areas of this space (Kuravsky, 2014; Kuravsky, Greshnikov et al., 2024; Kuravsky, Orishchenko et al., 2025; Kuravsky, Yuryev et al., 2024).
Torgerson's multidimensional metric scaling method
Torgerson's multidimensional metric scaling algorithm
-
Calculate the matrix of quasi-distance squares
-
Calculate the matrix of mutual scalar products (Gram matrix) , where is the double centering matrix, is the unit matrix of size is a column vector of units, is a row vector of units (multiplication by the matrix centers the matrix by subtracting the row mean and column mean from each of its elements and adding the overall mean).
-
Solve the algebraic eigenvalue problem by computing the spectral decomposition , where ) is a diagonal matrix of eigenvalues in descending order ( ), and is a matrix of corresponding eigenvectors arranged in columns.
-
Compute the coordinate representation of the elements of the given set in a linear Euclidean space of dimension by defining the matrix , where ) — diagonal matrix of square roots of largest eigenvalues of the matrix , ordered in descending order, — matrix of the first coordinates of the eigenvectors of the matrix , arranged in columns (i.e., first terms of the matrix ; the coordinates of the elements of the set in the matrix of size are also arranged in columns).
It should be noted that:
-
The Gram matrix is symmetric and positive semidefinite, which allows it to be represented as ; one of the methods for calculating the coordinate matrix is discussed above;
-
It has been proven that the given algorithm provides a solution that ensures the smallest value of the criterion ;
-
The value is zero when
-
is an orthogonal matrix of size , and is a certain k-coordinate solution to the problem that provides the smallest value of the criterion . Then is an equivalent solution that provides the same smallest value of the criterion .
Results
Generation of a prompt for which the annotation is closest to the given description
The prompts below retain the individual style of wording prepared by an experienced prompt engineer.
-
What is the behavioral approach?
-
How is behavior studied in psychology?
-
What methods are used to analyze behavior?
-
What is objective behavior study?
ChatGPT [gpt-4o-mini] was used to solve the problem.
«Create a new prompt combining ideas from the following two prompts.»
«Create a SHORT new prompt, highlighting the common key ideas from the following two prompts. The prompt should be concise (no more than 2-3 sentences).»
«Create a SHORT new prompt, supplementing the first prompt with elements from the second. The prompt should be concise.»
«Rephrase the following prompt BRIEFLY, preserving its meaning.»
«Assess the semantic distance between two texts on a scale from 0.00 to 1.00 as a measure of the difference between their main statements. Be strict: do not underestimate the distance, consider differences in details, level of abstraction, examples, and conclusions. If one text provides a general description and the other provides a specific theory or example, the distance should be increased, even if the topics are similar. Return only one number with two decimal places, without any additional text.»
Verifying the correctness of a thesis using the pendulum algorithm
The prompts below retain the individual style of wording prepared by an experienced prompt engineer.
The phrase «Cultural-historical psychology is based on Marxist theory» was used as the thesis under investigation.
«You are an expert in philosophy, psychology, and scientific methodology. Formulate a clear, well-argued thesis H on a given topic that can be either confirmed or refuted; its plausibility will then be verified by the compactness of clusters of confirming and refuting arguments in the space of meanings. The answer should have a logical structure (thesis, justification, conclusion), be specific and concise (3–5 meaningful sentences).»
Result 1
«Cultural-historical psychology is based on Marxist theory.
Justification: The founder of cultural-historical psychology, Lev Vygotsky uses Marx's ideas about the social context in the development of the individual, emphasizing that psychological processes are formed as a result of social interaction and historical conditions. He emphasizes the role of culture and language in cognitive development, which is consistent with the Marxist concept of the influence of the social environment on the individual.
Conclusion: Thus, it can be argued that cultural-historical psychology is largely based on Marxist ideas, as confirmed by an analysis of the key concepts and principles underlying both approaches.»
«You are working with a chain of theoretical statements.
Formulate a new self-sufficient statement (1–2 sentences) that contradicts the meaning of the given text but remains within the original topic and subject area, preserving the key concepts and changing only the relationships between them.
Do not use metalanguage (such as "argument," "thesis," "this statement/text"), do not paraphrase or quote the original text, and do not offer compromises such as "on the one hand... on the other hand."
Return only the text of the new statement, without explanation.»
Result 2
«Cultural-historical psychology cannot be reduced to Marxist ideas, as its basic principles, developed by Lev Vygotsky, emphasize the uniqueness of cultural factors and individual development, significantly differing from the Marxist emphasis on class struggle and economic conditions.
In fact, the understanding of cognitive development stems directly from the specific cultural context, rather than from Marxist theory about the influence of the social environment.»
The text of request 3 coincides with the text of request 2.
Result 3
«Cultural-historical psychology is closely linked to Marxist ideas, as the emphasis on social conditions and interaction emphasizes that both individual development and cognitive processes cannot be understood without taking into account the economic and social factors described in Marxism. Thus, understanding these processes really depends on analyzing social structures and class relations, the importance of which significantly overlaps the role of unique cultural factors.»
The text of prompt 4 is identical to the text of prompt 2.
Result 4
«Cultural-historical psychology can be understood outside of Marxist theory, as it focuses on unique cultural factors, forms of communication, and linguistic practices that shape cognitive processes independently of economic and social conditions. Thus, individual development and psychology can be studied without relying on class relations or social structures, which highlights the diversity of influences on personality and thinking styles.»
«Estimate the semantic distance between two texts on a scale from 0.00 to 1.00 as a measure of the difference between their main statements. Be strict: do not underestimate the distance, consider differences in details, level of abstraction, examples, and conclusions. If one text provides a general description and the other provides a specific theory or example, the distance should be increased, even if the topics are similar. Return only one number with two decimal places, without any additional text.»
The duration of calculations on a medium-speed computer (processor base frequency — 2.70 GHz) was 71 seconds, with an average duration of 2.34 seconds per prompt to the AI.
Additional observation: narrowing of the chain to the semantic core of the contradiction
-
«Individual experience and unique cultural forms are considered a significant indicator of cognitive development.»
-
«Individual experience is interpreted as a subjective basis, insufficient without analysis of social and structural conditions.»
This observation allows us to consider the pendulum algorithm as a tool that automatically identifies the semantic basis around which a dispute on a given topic takes place.
Conclusion
-
The result of interaction with the IA is determined by two factors: the semantic content of prompts or other texts presented to the IA, and the intellectual capabilities of the IA itself, which can vary widely.
-
The main principle implemented in the applied approach to solving problems is that the intelligent assistant performs all content-related operations associated with extracting quantitative estimates from the material under study, followed by analysis of these estimates using methods of multidimensional statistical analysis, statistical hypothesis testing techniques, and other mathematical tools.
-
An evolutionary algorithm has been developed for generating prompts for the IA, the annotations of which are closest to the given descriptions, as well as an evolutionary algorithm for verifying the correctness of the intelligent assistant's responses.
-
The basis of the evolutionary algorithm for generating prompts is a quasi-genetic algorithm, which ensures the expansion of the set of prompts. The quasi-genetic algorithm is constructed by analogy with the well-known genetic algorithm used to solve optimization problems and, in particular, for training neural networks, with the replacement of crossover and mutation operations with pseudo-crossover and variation operations performed by the IA, which are similar in context of application but fundamentally different in content.
-
The pendulum algorithm allows identifying the semantic basis around which the debate on a given topic takes place.
-
A special notation has been developed to ensure a compact description of evolutionary algorithms.
-
The convergence of the evolutionary algorithm for generating prompts under certain conditions (presented in step 10 of the algorithm description) has been proven, the result of which is determined by the semantic content of the prompts submitted by the AI and the intellectual capabilities of the AI itself.
-
The convergence of the evolutionary algorithm for checking the correctness of answers is determined by the condition specified in step 7 of its description, the result of which depends on the semantic content of the generating prompt to the AI (or the original content material being studied) and the intellectual capabilities of the AI itself.
-
It has been proven that with a sufficiently large number of events, even a single manifestation of a certain pattern actually indicates its existence. Such events can be considered as finding sets of points in certain areas of the scaling space.
-
The pilot application of the developed algorithms for solving psychological problems has demonstrated their effectiveness and semantic correctness.


