Evolutionary algorithms to generate prompts and verify responses of intelligent assistants

 

Abstract

Tools for working with intelligent assistants (in particular, ChatGPT and DeepSeek) have been developed. A new mathematical framework, specialized algorithms, and software are presented that validate the content of intelligent assistant responses and generate prompts from annotations (brief descriptions of the responses to prompts). These tools make it possible to replace a prompt engineer or, at least, to automate the prompt engineer's work. The solution is based on evolutionary algorithms that generate a sequence of prompts organized according to a specific logical scheme. They include a quasi-genetic algorithm with pseudo-crossover and pseudo-mutation operations, followed by analysis of the intelligent assistant's responses using methods of multivariate statistical analysis. The search for an acceptable result, in which the intelligent assistant itself is actively involved, is an iterative process converging toward a given solution. The approach is justified and illustrated by applying it to psychological problems. The article is intended for programmers and mathematicians working with large language models.

General Information

Keywords: intelligent assistant, large language model, artificial intelligence, evolutionary algorithm, quasi-genetic algorithm, prompt engineering, metric multidimensional scaling, psychology

Journal rubric: Data Analysis

Article type: scientific article

DOI: https://doi.org/10.17759/mda.2025150401

Received 01.12.2025

Revised 03.12.2025

Accepted

Published

For citation: Kuravsky, L.S., Odintsov, D.A., Mikhailovsky, M.A. (2025). Evolutionary algorithms to generate prompts and verify responses of intelligent assistants. Modelling and Data Analysis, 15(4), 7–26. https://doi.org/10.17759/mda.2025150401

© Kuravsky L.S., Odintsov D.A., Mikhailovsky M.A., 2025

License: CC BY-NC 4.0

Full text

Introduction

Currently, the number of users of intelligent assistants (now commonly referred to as «artificial intelligence» or AI, or «neural networks») is growing exponentially. A significant portion of these users, having neither mathematical training nor programming experience, declare themselves to be advanced AI specialists. This creates problems due to attempts to misuse available intelligent tools, questionable interpretation and practical application of the results obtained, as well as misleading, grandiose statements without real content.

This makes it important to create user support tools for working with intelligent assistants (IA), and first and foremost, it is important to develop tools for automating the work of prompt engineers who prepare prompts for IA.

In general, the result of a subject's interaction with an IA is determined by two factors:

  • the semantic content of prompts or other texts presented to the IA;

  • the intellectual capabilities of the IA itself, which can vary widely. 

The uncertainty of interpreting the semantic content of prompts or other relevant texts, as well as the known unpredictability of the AI's response to prompts, complicate the application of formalization, requiring its significant adaptation to the new context of application. This work is one of the first attempts at such adaptation. A special notation has been developed to ensure a compact description of the algorithms used.

The generation of plausible but incorrect information, known as «hallucination», remains a problem accompanying the practical use of AI. In particular, it is well known that AI can justify mutually contradictory statements if it receives a corresponding prompt. The only exceptions are strictly substantiated or obvious observable facts. Therefore, it is becoming urgent to find tools that can objectively evaluate the correctness of formulations calculated using AI.

The issues discussed in this paper have become particularly relevant after 2020 (Nikolenko et al., 2020), so there are relatively few relevant publications on the automation of AI and the elimination of the problem of «hallucinations». Among the approaches that inspire moderate optimism are dialogical methods, including the «Debate Game» (Irving et al., 2025) and the Chain-of-Verification Method (Shehzaad et al., 2025), which eliminates «hallucinations» by asking AI to reflect on its own answers and self-correct. However, these approaches are not based on a significant formalization and are hopelessly far from useful practical application.

This paper presents algorithms for solving two problems:

  • generating prompts for which the annotations (brief descriptions of the answers to the prompt) are closest to the given description (the solution is provided by an evolutionary algorithm for selecting prompts);

  • verifying the correctness of the intelligent assistant's answers (the solution is provided by an evolutionary algorithm for verifying the correctness of IA answers, or the «pendulum algorithm»).

The main components of the above algorithms that determine the calculated result are a newly developed quasi-genetic algorithm that ensures the expansion of the set of prompts, and a method of multidimensional metric scaling, a rigorous description of which is rarely found in publications. The quasi-genetic algorithm is constructed by analogy with the well-known genetic algorithm (Emelyanov et al., 2003), used to solve optimization problems and to train neural networks, with crossover and mutation operations replaced by pseudo-crossover and variation operations performed by AI, which are similar in context but fundamentally different in content.

The main principle implemented in the approach to solving problems is that the intelligent assistant performs all substantive operations related to extracting quantitative estimates from the material under study, followed by analysis of these estimates using methods of multidimensional statistical analysis, statistical hypothesis testing techniques, and other mathematical tools.

The tools, which operate based on the algorithms described below, are implemented in software based on the OpenAI API. These tools have been tested in pilot mode with psychological texts, demonstrating convincing results.

The most obvious prospects for practical application of the presented algorithms are in areas where concepts with significant variability in interpretation are used: in psychology, sociology, art history, and other humanities (Shoham et al., 2009; Nikolenko et al., 2020).

This article is intended for programmers who create tools for working with large language models and mathematicians who develop methods for the practical use of artificial intelligence capabilities.

Notation and basic concepts

Fig. 1. Evolutionary prompt-selection algorithm for an intelligent assistant: operations on elements of metric spaces.

 

$X$ is the set of prompts for the IA, $Y$ is the set of IA responses to prompts from $X$, and $Z$ is the set of IA-generated annotations of responses from $Y$. $X$, $Y$, and $Z$ are metric spaces with quasi-distances $\pi_x$, $\pi_y$, $\pi_z$ (calculated for pairs of elements of $X$, $Y$, and $Z$, respectively, by means of prompts to the IA and represented by matrices $\chi(\cdot)$) and Euclidean distances $\rho_x$, $\rho_y$, $\rho_z$ (obtained by multidimensional scaling of the matrices of pairwise quasi-distances and represented by matrices $\vartheta(\cdot)$); see Fig. 1. The quasi-distances $\pi_x$, $\pi_y$, $\pi_z$ may not satisfy the distance axioms (Kolmogorov, Fomin, 2023), while the distances $\rho_x$, $\rho_y$, $\rho_z$ do satisfy them. Thus, the metric of the spaces $X$, $Y$, and $Z$ is determined by Euclidean distances calculated as a result of multidimensional scaling of quasi-distances between non-numerical objects, which are estimated using AI. Multidimensional scaling filters out inconsistencies in the quasi-distance estimates and provides a transition from quasi-distances to mutually consistent distances (Borg, Groenen, 2005; Cox T., Cox M., 2001; Morrison, 1976; Rao, 1973).
Considering the capabilities of AI to generate prompts $x \in X$ and the AI mappings $\{\varphi(x)\}_{x \in X}$ and $\{\psi(y)\}_{y \in Y}$ to be virtually unlimited, the metric spaces $X$, $Y$, and $Z$ can be considered complete.

$X$, $Y$, and $Z$ are also used as scaling spaces. Distances are understood to be Euclidean distances. Quasi-distances take values in the numerical interval [0;1] and are calculated for given pairs of elements of the metric spaces $X$, $Y$, and $Z$ using AI, as the results of prompts that compare the elements of each pair. Since a quasi-distance measures the difference between the compared elements, a value of 0 corresponds to a complete match between their contents, while a value of 1 corresponds to a complete mismatch (obviously, the comparison result is ambiguous and depends on the characteristics of the AI used). The AI prompts explicitly require that the presented elements be compared and that the result be expressed as a real number from this interval.
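The following minimal sketch illustrates how a matrix of pairwise quasi-distances could be assembled from textual replies of an assistant. It is not the authors' implementation: the function names, the averaging of the two query orders to force symmetry, and the stand-in comparator `toy_ask` (a word-overlap dissimilarity used instead of a real assistant call) are all assumptions made for illustration. The comparator follows the wording of the comparison prompt quoted in the Results section, which asks for a measure of difference.

```python
import re
from typing import Callable, List, Sequence


def parse_quasi_distance(reply: str) -> float:
    """Extract the first number from a model reply and clamp it to [0, 1]."""
    match = re.search(r"\d+(?:[.,]\d+)?", reply)
    if match is None:
        raise ValueError(f"no number found in reply: {reply!r}")
    value = float(match.group(0).replace(",", "."))
    return min(1.0, max(0.0, value))


def quasi_distance_matrix(
    items: Sequence[str], ask: Callable[[str, str], str]
) -> List[List[float]]:
    """Build a symmetric matrix of pairwise quasi-distances with a zero diagonal."""
    n = len(items)
    pi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: the two query orders are averaged to force symmetry.
            forward = parse_quasi_distance(ask(items[i], items[j]))
            backward = parse_quasi_distance(ask(items[j], items[i]))
            pi[i][j] = pi[j][i] = 0.5 * (forward + backward)
    return pi


def toy_ask(a: str, b: str) -> str:
    """Hypothetical stand-in for the assistant: Jaccard word dissimilarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return f"{1.0 - len(wa & wb) / len(wa | wb):.2f}"


if __name__ == "__main__":
    prompts = [
        "What is the behavioral approach?",
        "How is behavior studied in psychology?",
    ]
    print(quasi_distance_matrix(prompts, toy_ask))
```

In practice `ask` would wrap a call to the assistant with a comparison prompt; parsing defensively is useful because replies may contain extra text or a decimal comma.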

AI images: $y = \varphi(x)$ and $z = \psi(y)$, where $x \in X$, $y \in Y$. The functions $\varphi(x)$ and $\psi(y)$ are implemented by AI on elements of the spaces $X$ and $Y$. The mappings $y = \varphi(x)$ and $z = \psi(y)$ compute, respectively, the response to the prompt $x$ and the annotation of the response $y$ using AI.
$x = \varphi^{-1}(y)$ and $y = \psi^{-1}(z)$: preimages of $y \in Y$ and $z \in Z$, respectively, which are defined only for already calculated images (the preimages are stored when the images are calculated).
$z^+ \in Z$: the specified annotation (brief description of the response to the prompt).
$z \in Z$: the current approximation to the annotation.

Operations in metric space X:

$\tau_X(x)$: the neighborhood of the prompt $x$, composed of elements of the set $X$;
$\eta(M)$: the Kemeny median of the specified set $M \subset X$, $\eta(M) = \arg\min_{x \in M} \sum_{r \in M} \rho_x(x, r)$ (i.e., the average element of this set);
$\chi(M)$: the matrix $\pi_{ij} = \pi(m_i, m_j)$, where $m_i, m_j \in M$, $i, j = 1, \dots, n$, of pairwise quasi-distances for the $n$ elements of the given set $M \subset X$, calculated using IA while preserving the already formed mutual distances for previously marked elements of $M$;
$\chi(M)$: the matrix $\pi_{ij} = \pi(m_i, m_j)$, where $m_i, m_j \in M$, $i, j = 1, \dots, n$, of pairwise quasi-distances for the $n$ elements of the given set $M \subset X$, calculated using IA without preserving the already formed mutual distances for previously marked elements of $M$;
$\mu_i(G)$: the result of expanding the set $G$, with the matrix of pairwise quasi-distances $\pi_{jk} = \pi(g_j, g_k)$, where $g_j, g_k \in G$, by applying the quasi-genetic algorithm at its $i$-th iteration ($i = 1, 2, \dots$); the elements of the set $G$ are analogous to "chromosomes" in classical genetic algorithms;
$\vartheta(\Pi(M))$: the matrix of mutual distances $\rho_{ij} = \rho(m_i, m_j)$, where $m_i, m_j \in M$, $i, j = 1, \dots, n$, for the $n$ elements of the given set $M$, obtained by multidimensional scaling of the matrix of pairwise quasi-distances $\Pi = [\pi_{ij}] = [\pi(m_i, m_j)]$; multidimensional scaling filters out inconsistencies in the estimates of the mutual quasi-distances $\pi_x$, $\pi_y$, $\pi_z$ for pairs of elements of the spaces $X$, $Y$, $Z$ and provides a transition from quasi-distances to the corresponding distances $\rho_x$, $\rho_y$, $\rho_z$; after this operation, the elements that have undergone multidimensional scaling are marked in order to fix the distances between them in the subsequent steps of the algorithm;
$x_k = U(x_i, x_j)$, $x_k = V(x_i, x_j)$, $x_k = N(x_i)$: pseudo-crossover operations performed using AI in the space $X$ ($x_i, x_j, x_k \in X$), where $U$ is a generalization of the content of two given texts, $V$ is the selection of matching content from two given texts, and $N$ is the negation of the content of a given text;
$x_k = \delta(x)$: the pseudo-mutation (variation) operation on a given text;
$X_i \subset X$: a subset of the space $X$, calculated at the $i$-th iteration of the prompt generation algorithm;
$\gamma(X_i)$: the result of applying the pseudo-crossover operations $U$, $V$, and $N$, as well as the pseudo-mutation $\delta$, to the set $X_i$;
$\omega(X_i) = X_i \cup \gamma(X_i)$: the extension of the set of prompts $X_i$ obtained by uniting this set with its complement $\gamma(X_i)$, together with the corresponding extensions of the sets $Y_i = \varphi(X_i)$ and $Z_i = \psi(Y_i)$ under the mappings $\varphi$ and $\psi$;
$X_0$: a set of basic prompts on a given topic;

i++ and j++: increase the indices $i$ and $j$ by one;

$\varepsilon$: a small positive real number;
$\alpha < 1$: a positive real number;
$I$: the generating prompt for the pendulum algorithm;
$t_i$: the content material (text or text with illustrations) studied by the AI at the $i$-th iteration of the pendulum algorithm ($i = 0, 1, 2, \dots$);
$T_i^+$: the set of content materials consistent in content with $t_0$ at the $i$-th iteration of the pendulum algorithm ($i = 0, 1, 2, \dots$);
$T_i^-$: the set of content materials that contradict the content of $t_0$ at the $i$-th iteration of the pendulum algorithm ($i = 0, 1, 2, \dots$);
$\zeta(R)$: the centroid of the objects whose mutual distances are represented by the mutual distance matrix $R$ (the centroid coordinates are obtained by averaging the coordinates of these objects along each axis of the scaling space);
$\Omega(R)$: the dispersion (variance) of the distances from the objects whose mutual distances are represented by the mutual distance matrix $R$ to the centroid $\zeta(R)$;
$N_{\max}$: a natural number (the maximum number of iterations of the pendulum algorithm).
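The Kemeny median $\eta(M)$ defined above can be illustrated with a short sketch. It assumes that the mutual distances are already available as a square NumPy array; the function name is hypothetical, and ties are resolved in favor of the first minimizing element, which is an assumption rather than something the definition prescribes.

```python
import numpy as np


def kemeny_median(rho) -> int:
    """Return the index of the element with the smallest sum of distances to all others."""
    rho = np.asarray(rho, dtype=float)
    if rho.ndim != 2 or rho.shape[0] != rho.shape[1]:
        raise ValueError("rho must be a square matrix of mutual distances")
    # Row sums give, for each candidate x, the total distance to every r in M.
    return int(np.argmin(rho.sum(axis=1)))


if __name__ == "__main__":
    # Five points on a line; the middle point (index 2) minimises the total distance.
    points = np.array([0.0, 1.0, 2.0, 5.0, 10.0])
    rho = np.abs(points[:, None] - points[None, :])
    print(kemeny_median(rho))
```

Because the median is restricted to elements of $M$, the result is always an existing prompt rather than a synthetic "average" text.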

Evolutionary algorithm for prompt selection for AI

The task is formulated as follows.

Given: $X_0$, a set of basic prompts on a given topic; $z^+$, a given annotation.
Find: $x^* = \arg\min_{x \in X} \rho_z(\psi(\varphi(x)), z^+)$
(i.e., find the prompt $x^*$ for which the annotation of the answer is closest to the given description $z^+$).

Solution algorithm:

  1. Set the set of basic prompts $X_0$; $i = 0$. Compute $Y_0 = \{\varphi(x)\}_{x \in X_0} = \varphi(X_0)$ and $Z_0 = \{\psi(y)\}_{y \in Y_0} = \psi(Y_0)$ for all elements of the sets $X_0$ and $Y_0$; $z_0 = \eta(Z_0)$.
  2. If $i > 0$, compute the IA mappings $Y_i = \{\varphi(x)\}_{x \in X_i} = \varphi(X_i)$ and $Z_i = \{\psi(y)\}_{y \in Y_i} = \psi(Y_i)$ for all elements of the sets $X_i$ and $Y_i$.
  3. Find the Kemeny median $x_i = \eta(X_i)$ and its images $y_i = \varphi(x_i)$ and $z_i = \psi(y_i)$.
  4. Compute the extension $\omega(X_i) = \mu_i(X_i)$ using the quasi-genetic algorithm.
  5. Compute the matrices $\chi(X_i)$, $\chi(Y_i)$, and $\chi(Z_i)$.
  6. Compute $\vartheta(\chi(X_i))$, $\vartheta(\chi(Y_i))$, and $\vartheta(\chi(Z_i))$, marking the resulting mutual distances between the elements of the sets $X_i$, $Y_i$, and $Z_i$.
  7. Find the Kemeny median $\eta$.
  8. Determine the neighborhood of the prompt $\tau_X(x_i)$ using the mutual distances defined by the matrix $\vartheta(\chi(X_i))$.
  9. Compute the IA mappings $Y_i = \varphi(\tau_X(x_i))$ for all elements of the prompt neighborhood $\tau_X(x_i)$.
  10. Check the condition $(\exists z \in Z_i)\,(\rho_z(z, z^+) \le \alpha\,\rho_z(z_i, z^+))$. If the condition is satisfied, then set $z_{i+1} = \arg\min_{z \in Z_i} \rho_z(z, z^+)$, delete the specified relative part $Z_d$ of the elements of $Z_i$ together with their preimages, so that the retained sets are $Y_{nd} = \psi^{-1}(Z_i \setminus Z_d)$ and $X_{nd} = \varphi^{-1}(Y_{nd})$, and proceed to step 11; otherwise proceed to step 4.
  11. Calculate $x_{i+1} = \varphi^{-1}(\psi^{-1}(z_{i+1}))$.
  12. If $\rho_z(z_{i+1}, z^+) < \varepsilon$, then $x^* = x_{i+1}$ and stop; otherwise go to step 13.
  13. $X_{i+1} = \tau_X(x_{i+1})$.
  14. i++.

  15. Proceed to step 2.

Quasi-genetic algorithm for expanding the set of prompts in the operation $\mu_i(G)$

  1. Set the set of basic prompts $X_0$; $j = 0$.
  2. Check the condition on the elements $x \in X_j$. If the condition is met, go to step 3; otherwise stop.
  3. Perform quasi-genetic selection of elements of the set $X_j$ at iteration $j$ using the «roulette rule» with the distance $\rho_z(\psi(\varphi(x)), z^+)$, where $x \in X_j$, as the quality function.
  4. Form the complement $\gamma(X_j)$ to the set $X_j$ by applying the pseudo-crossover operations $U$, $V$, and $N$, as well as the pseudo-mutation $\delta$, to the elements of the set $X_j$.
  5. Combine the set of prompts $X_j$ corresponding to iteration $j$ and the complement $\gamma(X_j)$: $\omega(X_j) = X_j \cup \gamma(X_j)$.
  6. j++.

  7. Go to step 2.
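One expansion pass of the quasi-genetic algorithm (selection, formation of the complement, union) might be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the selection weight is taken as inversely proportional to the distance (the text only names the «roulette rule»), the number of selected pairs `n_pairs` is arbitrary, and the operations `toy_u`, `toy_v`, `toy_n`, `toy_delta` are hypothetical string stand-ins for the AI-performed $U$, $V$, $N$, and $\delta$.

```python
import random
from typing import Callable, List, Sequence


def roulette_select(items: Sequence[str], distances: Sequence[float],
                    k: int, rng: random.Random) -> List[str]:
    """Draw k items; a smaller distance to the target gives a larger weight."""
    # Assumption: weight = 1 / distance (small constant avoids division by zero).
    weights = [1.0 / (d + 1e-9) for d in distances]
    return rng.choices(list(items), weights=weights, k=k)


def expand_prompts(prompts: Sequence[str], distances: Sequence[float],
                   binary_ops: Sequence[Callable[[str, str], str]],
                   unary_ops: Sequence[Callable[[str], str]],
                   rng: random.Random, n_pairs: int = 2) -> List[str]:
    """Return the union of the prompts and their complement, without duplicates."""
    complement: List[str] = []
    for _ in range(n_pairs):
        a, b = roulette_select(prompts, distances, 2, rng)
        complement.extend(op(a, b) for op in binary_ops)   # U, V
        complement.extend(op(a) for op in unary_ops)       # N, delta
    # dict.fromkeys keeps first occurrences in order, so originals come first.
    return list(dict.fromkeys(list(prompts) + complement))


# Hypothetical string stand-ins for the operations performed by the assistant.
def toy_u(a: str, b: str) -> str:
    return f"{a} {b}"

def toy_v(a: str, b: str) -> str:
    return " ".join(w for w in a.split() if w in b.split())

def toy_n(a: str) -> str:
    return f"not {a}"

def toy_delta(a: str) -> str:
    return a.lower()


if __name__ == "__main__":
    base = ["Alpha beta", "Beta gamma", "Gamma delta"]
    print(expand_prompts(base, [0.2, 0.5, 0.9], [toy_u, toy_v],
                         [toy_n, toy_delta], random.Random(0)))
```

Passing the operations as callables keeps the selection logic independent of how the assistant is queried, so the toy functions can be replaced by real prompt calls.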

 

Consider the transformation $z_{i+1} = \Xi(z_i)$ defined by step 10 of the evolutionary algorithm above. According to the selection condition specified in the description of step 10, $\rho_z(z_{i+1}, z^+) \le \alpha\,\rho_z(z_i, z^+)$. It follows that the sequence $\{z_j\}_{j=0}^{\infty}$ converges to $z^+$ in the space $Z$. Indeed, $\rho_z(z_j, z^+) \le \alpha^j \rho_z(z_0, z^+)$. Since $\alpha < 1$, the value $\rho_z(z_j, z^+)$ is arbitrarily small for sufficiently large $j$ when $\rho_z(z_0, z^+) > 0$ and equal to zero when $\rho_z(z_0, z^+) = 0$. Therefore, in the metric space $Z$, $z^+ = \lim_{j \to \infty} z_j$.
The presented version of the evolutionary algorithm demonstrated good convergence to the desired result. However, in a small neighborhood of the given description $z^+$, the distance $\rho_z(z_i, z^+)$ may converge not to zero but to a small positive number, which is due to the limited variability of the results of the pseudo-crossover given by the operations $U$, $V$, and $N$. In this case, it is necessary either to expand the set of pseudo-crossover operations, increasing the variability of its results, or to replace the stopping condition $\rho_z(z_{i+1}, z^+) < \varepsilon$ in step 12 of the evolutionary algorithm with a special case of the Cauchy condition for the convergence of numerical sequences: $\rho_z(z_{i+1}, z_i) < \varepsilon$.
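The geometric bound used in the convergence argument can be checked numerically. The sketch below is only an illustration: the function names and the sample values of $\alpha$, $\varepsilon$, and the initial distance are assumptions, not values reported in the article. The first function counts how many contraction steps the bound $\alpha^j \rho_z(z_0, z^+)$ needs to fall below $\varepsilon$; the second checks whether a recorded sequence of distances actually satisfies the contraction condition.

```python
from typing import Sequence


def iterations_to_tolerance(rho0: float, alpha: float, eps: float) -> int:
    """Smallest j for which the bound alpha**j * rho0 drops below eps."""
    if rho0 < 0 or not 0 < alpha < 1 or eps <= 0:
        raise ValueError("require rho0 >= 0, 0 < alpha < 1 and eps > 0")
    j, bound = 0, rho0
    while bound >= eps:
        bound *= alpha  # one application of the contraction condition
        j += 1
    return j


def satisfies_contraction(distances: Sequence[float], alpha: float) -> bool:
    """True if every successive distance is at most alpha times the previous one."""
    return all(nxt <= alpha * cur for cur, nxt in zip(distances, distances[1:]))


if __name__ == "__main__":
    print(iterations_to_tolerance(rho0=0.8, alpha=0.5, eps=0.06))
    print(satisfies_contraction([0.8, 0.4, 0.1], alpha=0.5))
```

If a recorded run fails the second check near the target, that is the situation in which the Cauchy-type stopping rule on successive approximations becomes the practical alternative.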

The convergence of the computational procedure is determined by the condition specified in step 10 of its description, the result of which, in turn, depends on the semantic content of the prompts submitted to the AI and the intellectual capabilities of the AI itself.

Evolutionary algorithm for checking the correctness of AI responses

Main thesis: AI responses that are correct in content are significantly more consistent with each other than responses that are incorrect in content.

Avoiding philosophical discussions on the topic "What is truth?", we will assume that statements that are confirmed in practice by the results of observations are more consistent with each other than their negations. In other words, an indirect assessment of correctness is applied: it is assumed that formulations that are more consistent in content are more plausible than those that are less consistent. We will call this statement the thesis of consistency. The quantitative measure of consistency is the dispersion of the IA images of responses in metric space. The significance of differences in dispersion is established by testing the standard null hypothesis using the F-test. The content material $t_0$ required for analysis is either obtained in response to a generating prompt $I$ or specified directly.
  1. Obtain the content material $t_0$ necessary for analysis in response to the generating prompt $I$, or specify $t_0$ directly. $i = 0$. $T_0^+ = \{t_0\}$. $T_0^- = \varnothing$.
  2. i++.

  3. Calculate $t_i = N(t_{i-1})$.
  4. If $i$ is even, then $T_i^+ = T_{i-2}^+ \cup \{t_i\}$ and $T_i^- = T_{i-1}^-$; otherwise $T_i^- = T_{i-2}^- \cup \{t_i\}$ (taking $T_{-1}^- = \varnothing$) and $T_i^+ = T_{i-1}^+$.
  5. Calculate $\vartheta(\chi(T_i^+ \cup T_i^-))$.
  6. Calculate the variances $\Omega^+$ and $\Omega^-$ of the distances to the centroids for the elements of $T_i^+$ and $T_i^-$, respectively.
  7. Assuming that the distributions of distances to the centroids $\zeta^+$ and $\zeta^-$ of the sets $T_i^+$ and $T_i^-$ are normal, test the null hypothesis of equality of the variances $\Omega^+$ and $\Omega^-$ using the F-test for the statistic $\Omega^- / \Omega^+$ if $\Omega^- > \Omega^+$, or for the statistic $\Omega^+ / \Omega^-$ if $\Omega^- < \Omega^+$. If the null hypothesis of equality of variances is not rejected and $i < N_{\max}$, proceed to step 2; otherwise proceed to step 8.
  8. If the null hypothesis of equality of variances is rejected, then, if $\Omega^- > \Omega^+$, conclude that the content material $t_0$ under investigation is correct, or, if $\Omega^- < \Omega^+$, conclude that the content material $t_0$ under investigation is incorrect; otherwise consider the correctness of the specified material to be undetermined.
For sets of correct and incorrect content materials, Wilks' statistic and the associated F statistics (Borg, Groenen, 2005; Cox T., Cox M., 2001; Cramer, 1999; Morrison, 1976; Rao, 1973) are used to assess the degree of discrimination between them. If this discrimination is significant ($p < 0.05$ for the specified statistics), then the conclusion about the correctness of the content material $t_0$ under study is considered reliable; otherwise, additional analysis of the content material $t_0$ should be performed.
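The bookkeeping of the pendulum algorithm (alternating assignment to the two sets, dispersion of distances to a centroid, and the variance ratio) can be sketched as follows. The sketch assumes that the coordinates of the materials in the scaling space are already available, and it uses the sample variance (`ddof=1`) of the distances to the centroid as the dispersion; both the function names and this particular estimator are assumptions rather than details confirmed by the text.

```python
import numpy as np


def split_pendulum(n_items: int):
    """Indices of t_0, t_1, ...: even indices agree with t_0, odd ones contradict it."""
    plus = [i for i in range(n_items) if i % 2 == 0]
    minus = [i for i in range(n_items) if i % 2 == 1]
    return plus, minus


def dispersion_to_centroid(coords) -> float:
    """Sample variance of the distances from the points (rows) to their centroid."""
    coords = np.asarray(coords, dtype=float)
    centroid = coords.mean(axis=0)  # average along each scaling-space axis
    dist = np.linalg.norm(coords - centroid, axis=1)
    return float(dist.var(ddof=1))


def f_statistic(omega_plus: float, omega_minus: float):
    """Ratio of the larger variance to the smaller one, and which set is larger."""
    if min(omega_plus, omega_minus) <= 0:
        raise ValueError("both variances must be positive")
    if omega_minus > omega_plus:
        return omega_minus / omega_plus, "minus"
    return omega_plus / omega_minus, "plus"


if __name__ == "__main__":
    plus_idx, minus_idx = split_pendulum(6)
    coords = np.array([[0.0], [0.0], [0.0], [0.0], [3.0], [6.0]])  # toy 1-D layout
    omega_p = dispersion_to_centroid(coords[plus_idx])
    omega_m = dispersion_to_centroid(coords[minus_idx])
    print(f_statistic(omega_p, omega_m))
```

If SciPy is available, a p-value for the ratio could be obtained with `scipy.stats.f.sf(F, dfn, dfd)`, taking each number of degrees of freedom as the corresponding set size minus one.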

The convergence of the computational procedure under consideration is determined by the condition specified in step 7 of its description, the result of which, in turn, depends on the semantic content of the generating prompt sent to the AI (or of the original content being studied) and the intellectual capabilities of the AI itself.

 

Fig. 2. Pendulum algorithm: operations on elements of the set of statements and the set of negations of statements.
As part of the procedure for generating the sets $T_i^+$ and $T_i^-$ in the scaling space, let us consider the set of events generating statements and their negations, $\Gamma = \{\Gamma_i\}_{i=1}^{T_\Gamma}$. We will say that the subsets of events $O = \{O_i\}_{i=1}^{T_O} \subset \Gamma$ and $\Phi = \{\Phi_i\}_{i=1}^{T_\Phi} \subset \Gamma$ are logically related if the conditional probabilities $\{P(O_i \mid \Phi_1 \dots \Phi_{T_\Phi})\}_{i=1}^{T_O}$ are sufficiently large, namely $P(O_i \mid \Phi_1 \dots \Phi_{T_\Phi}) \ge 1 - \delta$, where $\delta \ll 1$.
If we assume that the probabilities of the events from the subsets $O$ and $\Phi$, considered on their own, satisfy for all $i$ the inequalities $P(O_i) \le \varepsilon$ and $P(\Phi_i) \le \varepsilon$, where $\varepsilon \ll 1$, then the probability of the sequential occurrence of the subsets $\Phi$ and $O$ is equal to $\prod_{i=1}^{T_O} P(O_i \mid \Phi_1 \dots \Phi_{T_\Phi}) \prod_{i=1}^{T_\Phi} P(\Phi_i)$ in the case of a regular connection and to $\prod_{i=1}^{T_O} P(O_i) \prod_{i=1}^{T_\Phi} P(\Phi_i)$ in the case of its absence. The ratio of the second of these probabilities to the first, which can be called the regularity index, is a quantity of order of smallness no lower than $\varepsilon^{T_O}$.
Thus, the following statement about the balance of probabilities is valid: the probability of the appearance of the subsets of events $O$ and $\Phi$ in the case of a regular connection exceeds the probability of the appearance of the same subsets in the absence of such a connection by a factor of order $\varepsilon^{-T_O}$.
This statement can be interpreted as follows: when $T_O$ is sufficiently large, even a single manifestation of a certain regularity actually indicates its presence, and the value of $T_O$ can serve as a measure of the reliability of such a conclusion. For example, if $T_O$ is equal to 3 and $\varepsilon$ is equal to 0.1, then the probability of a regular connection is approximately $10^3$ times greater than the probability of its absence.
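The numerical example above can be reproduced with a few lines of code. The sketch computes the upper bound on the regularity index that follows from the two assumptions in the text: each unconditional probability is at most $\varepsilon$ and each conditional probability is at least $1 - \delta$. The function name and the default $\delta = 0$ are illustrative assumptions.

```python
def regularity_index_bound(eps: float, t_o: int, delta: float = 0.0) -> float:
    """Upper bound (eps / (1 - delta)) ** t_o on P(no connection) / P(connection)."""
    if not 0 < eps < 1 or not 0 <= delta < 1 or t_o < 1:
        raise ValueError("require 0 < eps < 1, 0 <= delta < 1 and t_o >= 1")
    # The common factor (product of P(Phi_i)) cancels in the ratio, leaving
    # prod P(O_i) / prod P(O_i | Phi), which is bounded term by term.
    return (eps / (1.0 - delta)) ** t_o


if __name__ == "__main__":
    bound = regularity_index_bound(eps=0.1, t_o=3)
    print(bound, 1.0 / bound)  # the inverse is the factor in favour of a connection
```

With a nonzero $\delta$ the bound grows slightly, for instance by the factor $(1 - \delta)^{-T_O}$, which does not change the order of magnitude when $\delta \ll 1$.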

In particular, events can be considered as finding sets of points representing statements and their negations in certain areas of the scaling space or the distribution of certain configurations of points corresponding to the classes of features under study in specified areas of this space (Kuravsky, 2014; Kuravsky, Greshnikov et al., 2024; Kuravsky, Orishchenko et al., 2025; Kuravsky, Yuryev et al., 2024).

Torgerson's multidimensional metric scaling method 

Since the description of this method in the above terms is not widely presented in publications, below is a description of one of its common variants, the classical multidimensional scaling method of Torgerson (Borg, Groenen, 2005; Cox T., Cox M., 2001; Morrison, 1976). This method solves the problem of placing a set of points representing the elements of a certain set $M$ in a linear Euclidean space of a certain dimension, according to a given matrix of pairwise quasi-distances between these elements.

The task, which generally does not have a unique solution, is as follows: given the quasi-distance matrix $\pi_{ij} = \pi(m_i, m_j)$, where $m_i, m_j \in M$, $i, j = 1, \dots, n$, for the $n$ elements of a given set $M$, find their coordinates in a linear Euclidean space of dimension $m < n$ such that the corresponding Euclidean distance matrix $\rho_{ij} = \rho(m_i, m_j)$, where $m_i, m_j \in M$, $i, j = 1, \dots, n$, provides the minimum value of the criterion $S = \sum_{i=1}^{n} \sum_{j=1}^{n} |\pi_{ij} - \rho_{ij}|^2$.

Torgerson's multidimensional metric scaling algorithm

  1. Calculate the matrix of squared quasi-distances $D = [d_{ij}] = [\pi_{ij}^2(m_i, m_j)]$, where $m_i, m_j \in M$, $i, j = 1, \dots, n$.
  2. Calculate the matrix of mutual scalar products (Gram matrix) $B = -\frac{1}{2} J D J$, where $J = I - \frac{1}{n} O O^T$ is the double centering matrix, $I$ is the identity matrix of size $n \times n$, $O$ is a column vector of $n$ ones, and $O^T$ is a row vector of $n$ ones (multiplication by the matrix $J$ on both sides centers the matrix by subtracting the row mean and the column mean from each of its elements and adding the overall mean).
  3. Solve the algebraic eigenvalue problem by computing the spectral decomposition $B = E \Lambda E^T$, where $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ is the diagonal matrix of eigenvalues in descending order ($\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge 0$), and $E$ is the matrix of the corresponding eigenvectors arranged in columns.
  4. Compute the coordinate representation of the $n$ elements of the given set $M$ in a linear Euclidean space of dimension $m < n$ by defining the matrix $X = \Lambda_m^{1/2} E_m^T$, where $\Lambda_m^{1/2} = \mathrm{diag}(\sqrt{\lambda_1}, \dots, \sqrt{\lambda_m})$ is the diagonal matrix of the square roots of the $m$ largest eigenvalues of the matrix $B$, ordered in descending order, and $E_m$ is the matrix of the first $m$ eigenvectors of the matrix $B$, arranged in columns (i.e., the first $m$ columns of the matrix $E$); the coordinates of the $n$ elements of the set $M$ are arranged in the columns of the matrix $X$ of size $m \times n$.
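The four steps above can be sketched with NumPy as follows. This is a minimal illustration, not the authors' software: the function names are hypothetical, and clipping small negative eigenvalues to zero is an added assumption to handle quasi-distances that are not exactly Euclidean. The result places the coordinates of the $n$ elements in the columns of an $m \times n$ array.

```python
import numpy as np


def torgerson_mds(pi, m: int) -> np.ndarray:
    """Classical (Torgerson) scaling: return an m x n array of coordinates."""
    pi = np.asarray(pi, dtype=float)
    n = pi.shape[0]
    D = pi ** 2                                  # step 1: squared quasi-distances
    J = np.eye(n) - np.ones((n, n)) / n          # double centering matrix
    B = -0.5 * J @ D @ J                         # step 2: Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)         # step 3: eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:m]        # indices of the m largest eigenvalues
    lam = np.clip(eigvals[order], 0.0, None)     # assumption: clip negative eigenvalues
    return np.sqrt(lam)[:, None] * eigvecs[:, order].T   # step 4: coordinates


def pairwise_distances(X: np.ndarray) -> np.ndarray:
    """Euclidean distances between the columns of an m x n coordinate array."""
    diff = X[:, :, None] - X[:, None, :]
    return np.sqrt((diff ** 2).sum(axis=0))


if __name__ == "__main__":
    # Distances of a 3-4-5 right triangle are exactly Euclidean in two dimensions.
    pi = np.array([[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]])
    X = torgerson_mds(pi, m=2)
    print(np.round(pairwise_distances(X), 6))
```

`numpy.linalg.eigh` is used because the Gram matrix is symmetric; the recovered configuration is determined only up to rotation and reflection, so the coordinates themselves may differ between runs or platforms while the distances do not.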

It should be noted that:

  • The Gram matrix $B$ is symmetric and positive semidefinite, which allows it to be represented as $B = X^T X$; one of the methods for calculating the coordinate matrix $X$ is discussed above;
  • It has been proven that the given algorithm provides a solution that ensures the smallest value of the criterion $S$;
  • The value of $S$ is zero when $m = n - 1$;
  • Let $Q$ be an orthogonal matrix of size $m \times m$, and let $X$ be a coordinate solution to the problem that provides the smallest value of the criterion $S$. Then $QX$ is an equivalent solution that provides the same smallest value of the criterion $S$.

Results

Generation of a prompt for which the annotation is closest to the given description

The prompts below retain the individual style of wording prepared by an experienced prompt engineer.

The text used as the given annotation $z^+$ was: «An approach in psychology based on the study of observable behavior and its quantitative analysis through objective measurement methods.»
The set $X_0$, obtained as a result of a real dialogue with AI, included the following basic prompts.
  • What is the behavioral approach?

  • How is behavior studied in psychology?

  • What methods are used to analyze behavior?

  • What is objective behavior study?

ChatGPT [gpt-4o-mini] was used to solve the problem.

As a result of the iterative process implementing the evolutionary algorithm, the following prompt $x^*$, for which the annotation of the answer is closest to the given description $z^+$, was obtained: «What are the advantages of behaviorism over other psychological schools?»
The corresponding AI response to the prompt $x^*$ ($\varphi(x^*)$ in the notation used) is the following text: «Behaviorism is a psychological approach that focuses on the objective study of observable behavior and ignores internal psychological processes. It is distinguished by its use of experimental methods and its emphasis on the interaction of behavior with the environment.» Seven iterations of the procedure were performed. Figure 3 shows the dependence of the distance $\rho_z$ on the number of iterations obtained in the calculation process.
To implement the pseudo-crossover operations $U$, $V$, $N$, the pseudo-mutation $\delta$, and the quasi-distance estimation, the following prompts were sent to the AI.
Prompt U

«Create a new prompt combining ideas from the following two prompts.»

Prompt V

«Create a SHORT new prompt, highlighting the common key ideas from the following two prompts. The prompt should be concise (no more than 2-3 sentences).»

Prompt N

«Create a SHORT new prompt, supplementing the first prompt with elements from the second. The prompt should be concise.»

Prompt δ

«Rephrase the following prompt BRIEFLY, preserving its meaning.»

Prompt Π

«Assess the semantic distance between two texts on a scale from 0.00 to 1.00 as a measure of the difference between their main statements. Be strict: do not underestimate the distance, consider differences in details, level of abstraction, examples, and conclusions. If one text provides a general description and the other provides a specific theory or example, the distance should be increased, even if the topics are similar. Return only one number with two decimal places, without any additional text.»

 

Fig. 3. Distance $\rho_z$ as a function of the number of iterations

Verifying the correctness of a thesis using the pendulum algorithm

The prompts below retain the individual style of wording prepared by an experienced prompt engineer.

The phrase «Cultural-historical psychology is based on Marxist theory» was used as the thesis under investigation.

An excerpt from the ChatGPT communication log containing the prompts and the first elements of the sequence of meaningful responses $\{t_i\}_i$ is provided below.
Prompt 1 (generating prompt $I$)

«You are an expert in philosophy, psychology, and scientific methodology. Formulate a clear, well-argued thesis H on a given topic that can be either confirmed or refuted; its plausibility will then be verified by the compactness of clusters of confirming and refuting arguments in the space of meanings. The answer should have a logical structure (thesis, justification, conclusion), be specific and concise (3–5 meaningful sentences).»

Result 1

$t_0$: content material for analysis (received by IA)

«Cultural-historical psychology is based on Marxist theory.

Justification: The founder of cultural-historical psychology, Lev Vygotsky uses Marx's ideas about the social context in the development of the individual, emphasizing that psychological processes are formed as a result of social interaction and historical conditions. He emphasizes the role of culture and language in cognitive development, which is consistent with the Marxist concept of the influence of the social environment on the individual.

Conclusion: Thus, it can be argued that cultural-historical psychology is largely based on Marxist ideas, as confirmed by an analysis of the key concepts and principles underlying both approaches.»

Prompt 2 ($N(t_0)$): refutation of the content material $t_0$

«You are working with a chain of theoretical statements.

Formulate a new self-sufficient statement (1–2 sentences) that contradicts the meaning of the given text but remains within the original topic and subject area, preserving the key concepts and changing only the relationships between them.

Do not use metalanguage (such as "argument," "thesis," "this statement/text"), do not paraphrase or quote the original text, and do not offer compromises such as "on the one hand... on the other hand."

Return only the text of the new statement, without explanation.»

Result 2

$t_1$: negation of the content material $t_0$ (received by IA)

«Cultural-historical psychology cannot be reduced to Marxist ideas, as its basic principles, developed by Lev Vygotsky, emphasize the uniqueness of cultural factors and individual development, significantly differing from the Marxist emphasis on class struggle and economic conditions.

In fact, the understanding of cognitive development stems directly from the specific cultural context, rather than from Marxist theory about the influence of the social environment.»

Prompt 3 ($N(t_1)$): refutation of content $t_1$ (confirmation of content $t_0$)

The text of prompt 3 is identical to the text of prompt 2.

Result 3

$t_2$: confirmation of the content of $t_0$ (received by IA)

«Cultural-historical psychology is closely linked to Marxist ideas, as the emphasis on social conditions and interaction emphasizes that both individual development and cognitive processes cannot be understood without taking into account the economic and social factors described in Marxism. Thus, understanding these processes really depends on analyzing social structures and class relations, the importance of which significantly overlaps the role of unique cultural factors.»

Prompt 4 ($N(t_2)$): refutation of content $t_2$ (confirmation of content $t_1$)

The text of prompt 4 is identical to the text of prompt 2.

Result 4

$t_3$: negation of the content material $t_0$ (received by IA)

«Cultural-historical psychology can be understood outside of Marxist theory, as it focuses on unique cultural factors, forms of communication, and linguistic practices that shape cognitive processes independently of economic and social conditions. Thus, individual development and psychology can be studied without relying on class relations or social structures, which highlights the diversity of influences on personality and thinking styles.»

Pairwise quasi-distances between the confirmations and denials of content material t 0, which form the matrix χ, were calculated using the following prompt.

«Estimate the semantic distance between two texts on a scale from 0.00 to 1.00 as a measure of the difference between their main statements. Be strict: do not underestimate the distance, consider differences in details, level of abstraction, examples, and conclusions. If one text provides a general description and the other provides a specific theory or example, the distance should be increased, even if the topics are similar. Return only one number with two decimal places, without any additional text.»
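A minimal sketch of how such replies could be collected into a matrix is given below. It rests on several assumptions that are not part of the original text: the helper names `parse_distance` and `quasi_distance_matrix` are hypothetical, `ask_ia` stands for whatever function sends a message to the IA and returns its reply, each pair is queried once, and the estimate is treated as symmetric.

```python
import re
from typing import Callable, List, Sequence


def parse_distance(reply: str) -> float:
    """Extract the first number from the reply and clamp it to [0, 1]."""
    match = re.search(r"\d+(?:[.,]\d+)?", reply)
    if match is None:
        raise ValueError(f"no number found in reply: {reply!r}")
    value = float(match.group().replace(",", "."))
    return min(max(value, 0.0), 1.0)


def quasi_distance_matrix(
    texts: Sequence[str], ask_ia: Callable[[str], str], prompt: str
) -> List[List[float]]:
    """Fill a symmetric matrix of pairwise estimates (zero diagonal)."""
    n = len(texts)
    chi = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1, n):
            message = f"{prompt}\n\nText 1:\n{texts[r]}\n\nText 2:\n{texts[c]}"
            chi[r][c] = chi[c][r] = parse_distance(ask_ia(message))
    return chi


def stub_ia(message: str) -> str:
    """Toy stand-in for the IA (assumption): far apart if polarity differs."""
    return "0.90" if "cannot" in message and "closely" in message else "0.10"


if __name__ == "__main__":
    sample = ["A cannot be reduced to B", "A is closely linked to B",
              "A cannot follow from B"]
    for row in quasi_distance_matrix(sample, stub_ia, "PROMPT"):
        print(row)
```

Clamping and the tolerant number pattern are defensive choices: they protect against replies that add stray text or use a decimal comma despite the instruction to return only one number.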

The results of multidimensional scaling, obtained in the form of a matrix of mutual distances ϑ at i = 8 and i = 30, are presented as scatter plots in Figures 4 and 5. The complete separability of the sets T i + and T i − in the scaling space confirms the correctness of the results obtained.
 
Fig. 4. Results of metric multidimensional scaling for i = 8

Fig. 5. Results of metric multidimensional scaling for i = 30
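For readers who want to reproduce plots of this kind, the sketch below shows classical (Torgerson) metric scaling in pure Python. It is an illustration under stated assumptions rather than the authors' implementation: the function name `classical_mds` is hypothetical, the input matrix is assumed symmetric, and the leading eigenvectors are found by simple power iteration with deflation, which assumes the largest eigenvalues of the centered matrix are positive.

```python
import math
import random
from typing import List, Sequence


def classical_mds(dist: Sequence[Sequence[float]], dims: int = 2,
                  iters: int = 500, seed: int = 0) -> List[List[float]]:
    """Embed n points in `dims` dimensions from a symmetric distance matrix."""
    n = len(dist)
    # Double centering of squared distances: B = -1/2 * J * D^2 * J.
    sq = [[dist[r][c] ** 2 for c in range(n)] for r in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[r][c] - row_mean[r] - row_mean[c] + total_mean)
          for c in range(n)] for r in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):  # power iteration for the leading eigenvector
            w = [sum(b[r][c] * v[c] for c in range(n)) for r in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        bv = [sum(b[r][c] * v[c] for c in range(n)) for r in range(n)]
        eigval = sum(v[r] * bv[r] for r in range(n))  # Rayleigh quotient
        scale = math.sqrt(max(eigval, 0.0))  # negative eigenvalues are dropped
        for r in range(n):
            coords[r][d] = scale * v[r]
        for r in range(n):  # deflation: remove the component just found
            for c in range(n):
                b[r][c] -= eigval * v[r] * v[c]
    return coords


if __name__ == "__main__":
    # Corners of a 3-by-4 rectangle; the embedding should preserve distances.
    rect = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
    matrix = [[math.dist(p, q) for q in rect] for p in rect]
    for point in classical_mds(matrix):
        print([round(x, 3) for x in point])
```

The embedding is determined only up to rotation and reflection, so the printed coordinates need not match the original corners. Only the mutual distances are preserved. In practice a numerical library would replace the hand-written eigenvector search.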
 
For a sample including 4 confirmations and 4 denials of the content material t 0 (i = 8), the F-test for the statistic Ω − / Ω + gives a value of 13.66 (p < 0.03), which, based on the accepted thesis of consistency, allows us to conclude that the content material under study t 0 is correct (i.e., cultural-historical psychology is indeed based on Marxist theory). For a sample including 15 confirmations and 15 denials of the content material t 0 (i = 30), the F-test for the statistic Ω − / Ω + gives a value of 6.17 (p < 0.001), which leads to the same conclusion. At a significance level of p = 0.05 for testing the null hypothesis, the pendulum algorithm completes its work as early as the 8th iteration. The result obtained is semantically correct.
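The reported significance levels can be checked with a short calculation. The sketch below assumes that the test statistic is a ratio of two variance-type estimates, each with one degree of freedom fewer than the number of texts in its group, that is (3, 3) for i = 8 and (14, 14) for i = 30. This section does not state the degrees of freedom, so that choice is an assumption. The function name `f_upper_tail` is hypothetical. The upper tail of the F distribution is obtained from the regularized incomplete beta function, integrated numerically with Simpson's rule.

```python
import math


def f_upper_tail(f_value: float, df1: int, df2: int, steps: int = 20000) -> float:
    """Return P(F >= f_value) for an F(df1, df2) random variable.

    Uses P(F >= f) = I_z(df2 / 2, df1 / 2) with z = df2 / (df2 + df1 * f),
    where I_z is the regularized incomplete beta function.
    Requires f_value > 0, df1 >= 1, df2 >= 2 and an even `steps`.
    """
    if f_value <= 0 or df1 < 1 or df2 < 2 or steps % 2:
        raise ValueError("need f_value > 0, df1 >= 1, df2 >= 2, even steps")
    a, b = df2 / 2.0, df1 / 2.0
    z = df2 / (df2 + df1 * f_value)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t: float) -> float:
        # Unnormalized beta density; finite on [0, z] because a >= 1 and z < 1.
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    h = z / steps
    total = density(0.0) + density(z)
    for k in range(1, steps):  # composite Simpson's rule on [0, z]
        total += (4.0 if k % 2 else 2.0) * density(k * h)
    return total * h / 3.0 / math.exp(log_beta)


if __name__ == "__main__":
    print("i = 8: ", f_upper_tail(13.66, 3, 3))
    print("i = 30:", f_upper_tail(6.17, 14, 14))
```

Under these assumed degrees of freedom, both tail probabilities fall below the bounds quoted in the text (0.03 and 0.001 respectively), which suggests the assumption is at least consistent with the reported figures.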

The calculations took 71 seconds on a medium-speed computer (processor base frequency 2.70 GHz), with an average of 2.34 seconds per prompt to the IA.

Additional observation: narrowing of the chain { t i } to the semantic core of the contradiction

During the testing phase of the pendulum algorithm, an early version of the prompt specifying the operation N ( t i ) was used. In this formulation, the model was instructed to formulate a «strict refutation» of the previous text, with no requirement to preserve the structure of the original thesis and no restrictions on how the relations between key concepts could be transformed. This led to an unforeseen phenomenon.

In the early stages (iterations 1–6), the IA's statements operate directly on the content of t 0 , varying the arguments for and against the connection between cultural-historical psychology and Marxist theory.

However, starting from approximately the 15th–18th iteration, and most evidently in the final iterations, the chain { t i } demonstrates a persistent narrowing of the semantic range. Instead of referring to a wide range of factors (historical materialism, the role of culture, the specificity of psychological mechanisms), the statements begin to organize themselves around a single stable semantic opposition:

  1. «Individual experience and unique cultural forms are considered a significant indicator of cognitive development.»

  2. «Individual experience is interpreted as a subjective basis, insufficient without analysis of social and structural conditions.»

Thus, the operation N ( t i ) , defined as the generation of a statement that contradicts the previous one, leads not only to an alternation of confirmations and denials, but also to the identification of a semantic attractor, understood in this context as a pair of statements that form an internal contradiction of the discourse assimilated by the IA.

This observation allows us to consider the pendulum algorithm as a tool that automatically identifies the semantic basis around which a dispute on a given topic takes place.

Conclusion

  1. The result of interaction with the IA is determined by two factors: the semantic content of prompts or other texts presented to the IA, and the intellectual capabilities of the IA itself, which can vary widely.

  2. The main principle implemented in the applied approach to solving problems is that the intelligent assistant performs all content-related operations associated with extracting quantitative estimates from the material under study, followed by analysis of these estimates using methods of multidimensional statistical analysis, statistical hypothesis testing techniques, and other mathematical tools.

  3. An evolutionary algorithm has been developed for generating prompts for the IA, the annotations of which are closest to the given descriptions, as well as an evolutionary algorithm for verifying the correctness of the intelligent assistant's responses.

  4. The basis of the evolutionary algorithm for generating prompts is a quasi-genetic algorithm, which ensures the expansion of the set of prompts. It is constructed by analogy with the well-known genetic algorithm used to solve optimization problems and, in particular, to train neural networks. The crossover and mutation operations are replaced with pseudo-crossover and variation operations performed by the IA, which are similar in their context of application but fundamentally different in content.

  5. The pendulum algorithm allows identifying the semantic basis around which the debate on a given topic takes place.

  6. A special notation has been developed to ensure a compact description of evolutionary algorithms.

  7. The convergence of the evolutionary algorithm for generating prompts has been proven under certain conditions (presented in step 10 of the algorithm description). The result of the algorithm is determined by the semantic content of the prompts submitted to the IA and the intellectual capabilities of the IA itself.

  8. The convergence of the evolutionary algorithm for checking the correctness of answers is determined by the condition specified in step 7 of its description. The result of the algorithm depends on the semantic content of the generating prompt to the IA (or the original content material being studied) and the intellectual capabilities of the IA itself.

  9. It has been proven that with a sufficiently large number of events, even a single manifestation of a certain pattern actually indicates its existence. Such events can be considered as finding sets of points in certain areas of the scaling space.

  10. The pilot application of the developed algorithms for solving psychological problems has demonstrated their effectiveness and semantic correctness.

References

  1. Emelianov V.V., Kureichik V.V., Kureichik V.M. Theory and practice of evolutionary modeling. Moscow: FIZMATLIT, 2003. 432 p. (In Russ.)
  2. Kolmogorov A.N., Fomin S.V. Elements of the theory of functions and functional analysis. Moscow: URSS, 2023. 572 p. (In Russ.)
  3. Kuravsky L.S., Yuryev G.A., Mikhailovsky M.A., Nesimova A.O., Yuryeva N.E., Polyakov B.Yu. Formation of teamwork skills and their objective quantitative assessment based on quantum representations. Experimental Psychology, 2024. 17(2). P. 154-177. (In Russ.) DOI: https://doi.org/10.17759/exppsy.2024170210.
  4. Nikolenko S.I., Kadurin A.A., Arkhangelskaya E.O. Deep learning. St. Petersburg: Piter, 2020. 480 p. (In Russ.)
  5. Borg I., Groenen P.J.F. Modern Multidimensional Scaling: Theory and Applications. New York: Springer, 2005.
  6. Cox T.F., Cox M.A.A. Multidimensional Scaling, 2nd ed. Boca Raton: Chapman and Hall/CRC, 2001.
  7. Cramer H. Mathematical Methods of Statistics. Princeton University Press, 1999. 575 p.
  8. Irving G., Christiano P., Amodei D. AI safety via debate. OpenAI, 2025.
  9. Kuravsky L.S. Quantum Representations and Their Applications in Diagnostics. Moscow: De Libri, 2024. 128 p.
  10. Kuravsky L.S., Greshnikov I.I., Kozyrev A.D., Kosachevsky S.G., Frolova L.I., Zakharcheva A.A. A mathematical model for representing the related operator professional activities and its relevant diagnostic assessment based on the quantum representations. Lobachevskii J. Math., 45 (6), 2534-2551 (2024).
  11. Kuravsky L.S., Greshnikov I.I., Orishchenko V.A. Quantum Representation of the Civil Aircraft Pilot Activity. Lobachevskii J. Math., 46 (6), 2609-2621 (2025).
  12. Morrison D.F. Multivariate Statistical Methods, 2nd ed. New York: McGraw-Hill, 1976.
  13. Rao C.R. Linear Statistical Inference and its Applications. Hoboken: Wiley, 1973.
  14. Dhuliawala S., Komeili M., Xu J., Raileanu R., Li X., Celikyilmaz A., Weston J. Chain-of-Verification Reduces Hallucination in Large Language Models. Meta AI, 2025.
  15. Shoham Y., Leyton-Brown K. Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. Cambridge University Press, 2009.

Information About the Authors

Lev S. Kuravsky, Doctor of Engineering, professor, Dean of the Computer Science Faculty, Moscow State University of Psychology and Education, Moscow, Russian Federation, ORCID: https://orcid.org/0000-0002-3375-8446, e-mail: l.s.kuravsky@gmail.com

Dmitrii A. Odintsov, student, Computer Science Faculty, Moscow State University of Psychology and Education, Moscow, Russian Federation, ORCID: https://orcid.org/0009-0008-7082-700X, e-mail: dmitriyodintsov101@gmail.com

Michael A. Mikhailovsky, Research Assistant, Youth Laboratory Information Technologies for Psychological Diagnostics, Moscow State University of Psychology and Education, Moscow, Russian Federation, ORCID: https://orcid.org/0000-0002-7399-2800, e-mail: muxa172002@yandex.ru

Contribution of the authors

All authors participated in the discussion of the results and approved the final text of the manuscript.

Conflict of interest

The authors declare no conflict of interest.
