Representation and retrieval of text documents in database systems

Jakub Cieślewicz, Adam Pelikant


The article addresses the problem of retrieving documents based on their actual content. It describes a model that represents continuous text as vectors, presents mechanisms for assigning weights to individual document features, and gives an algorithm that compares documents with a query using the cosine measure between their vector representations.
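The approach summarized in the abstract can be illustrated with a minimal sketch. The abstract does not name the weighting scheme, so TF-IDF weights are assumed here, along with plain whitespace tokenization; the function names (`build_idf`, `vectorize`, `cosine`, `rank`) are hypothetical and are not taken from the article.

```python
import math
from collections import Counter


def build_idf(docs):
    """Compute an inverse document frequency weight for every term in docs.

    Assumption: smoothed IDF (log(n / df) + 1), so that terms occurring in
    every document still keep a non-zero weight.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc.lower().split()))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def vectorize(text, idf):
    """Represent a text as a sparse {term: weight} vector (TF * IDF).

    Terms that do not occur in the document collection are ignored.
    """
    tf = Counter(text.lower().split())
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def cosine(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(docs, query):
    """Return (score, document index) pairs, most similar document first."""
    idf = build_idf(docs)
    q = vectorize(query, idf)
    scores = [(cosine(vectorize(doc, idf), q), i) for i, doc in enumerate(docs)]
    return sorted(scores, reverse=True)


if __name__ == "__main__":
    documents = [
        "oracle database text index",
        "cooking pasta recipes",
        "text mining in oracle",
    ]
    for score, index in rank(documents, "oracle text"):
        print(f"{score:.3f}  {documents[index]}")
```

Because the cosine measure depends only on the angle between vectors, a long document and a short query can still score highly when they share heavily weighted terms, while a document with no terms in common with the query scores zero.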


business intelligence; text mining; data mining; Oracle

DOI: http://dx.doi.org/10.21936/si2009_v30.n2A.489