Modelling and Data Analysis
2024. Vol. 14, no. 1, 27–40
doi:10.17759/mda.2024140102
ISSN: 2219-3758 / 2311-9454 (online)
Extracting Scientific and Technical Facts from Industry Documents Based on Methods of Their Semantic-syntactic and Conceptual Analysis
Abstract
Extracting scientific and technical facts is difficult because the extracted information must be correct. The proposed fact extraction model relies on an explicit representation of the semantic structure of the text, expressed as a hierarchy of syntactic constructions built from meaning units, which makes it possible to identify interphrase relations between adjacent sentences. The meaning units are individual words and word combinations that are characteristic of a particular subject area and form its conceptual composition. The source text is processed by procedures of phraseological, conceptual and semantic-syntactic analysis.
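As a rough illustration of the idea, the sketch below finds meaning units in a sentence by dictionary lookup and assembles them into a semantic triad. It is a minimal sketch under stated assumptions, not the authors' method: the abstract does not define the triad structure or the dictionaries, so the concept list, the relation list, the `Triad` fields and the function names are all hypothetical, and a simple longest-match lookup stands in for the phraseological and conceptual analysis.

```python
import re
from dataclasses import dataclass

# Hypothetical toy dictionaries; a real system would use large domain dictionaries.
CONCEPTS = {"copper", "thermal conductivity", "alloy"}   # meaning units (may be multiword)
RELATIONS = {"increases", "reduces", "contains"}         # words linking two concepts


@dataclass(frozen=True)
class Triad:
    """Assumed shape of a semantic triad: two concepts joined by a relation."""
    subject: str
    relation: str
    obj: str


def find_units(sentence):
    """Return (kind, text) pairs for concepts and relations, in text order."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    units = []
    i = 0
    while i < len(tokens):
        for n in (3, 2, 1):  # prefer the longest matching phrase
            chunk = tokens[i:i + n]
            phrase = " ".join(chunk)
            if phrase in CONCEPTS:
                units.append(("concept", phrase))
                i += len(chunk)
                break
            if n == 1 and phrase in RELATIONS:
                units.append(("relation", phrase))
                i += 1
                break
        else:
            i += 1  # token belongs to no dictionary entry; skip it
    return units


def extract_triads(sentence):
    """Collect every concept-relation-concept run among consecutive units."""
    units = find_units(sentence)
    triads = []
    for (k1, a), (k2, r), (k3, b) in zip(units, units[1:], units[2:]):
        if (k1, k2, k3) == ("concept", "relation", "concept"):
            triads.append(Triad(a, r, b))
    return triads


if __name__ == "__main__":
    print(extract_triads("Copper increases the thermal conductivity of the alloy."))
```

The sketch treats each sentence in isolation. Identifying interphrase relations between adjacent sentences, as the abstract describes, would require additional analysis that is not attempted here.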
General Information
Keywords: fact extraction, semantic-syntactic analysis, conceptual analysis, semantic triad
Journal rubric: Data Analysis
Article type: scientific article
DOI: https://doi.org/10.17759/mda.2024140102
Received: 04.03.2024
Accepted:
For citation: Kan A.V., Kozlovskaya Y.D., Tokolova A.A. Extracting Scientific and Technical Facts from Industry Documents Based on Methods of Their Semantic-syntactic and Conceptual Analysis. Modelirovanie i analiz dannikh = Modelling and Data Analysis, 2024. Vol. 14, no. 1, pp. 27–40. DOI: 10.17759/mda.2024140102. (In Russ., abstr. in Engl.)