Mechanism of analysis of similarity short texts, based on the Levenshtein distance

Artur Niewiarowski, Marek Stanuszek


This paper proposes a text mining mechanism based on the Levenshtein Distance Algorithm (LDA), which effectively detects the similarity of words of different lengths. The algorithm is applied to the similarity analysis of sentences, where it successfully detects similarities between single sentences. The mechanism is characterized by fast data analysis and simplicity of implementation.


Natural Language Processing (NLP); Natural Language Understanding (NLU); Data Mining; Text Mining; Levenshtein Distance Algorithm
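
As a rough illustration of the distance named in the abstract, the following minimal Python sketch computes the classic Levenshtein distance between two strings using dynamic programming. The function names `levenshtein` and `similarity`, and the normalization by the longer string's length, are assumptions made here for illustration; they are not necessarily the mechanism proposed in the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows as short as possible
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical normalization: 1.0 for identical strings, 0.0 for maximally different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

Under this assumed normalization, strings of different lengths can still be compared on a common scale from 0 to 1, and the same functions can be applied to short sentences treated as character sequences.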

