Decision rules and databases in web pages classification

Krzysztof Czajkowski

Abstract


This paper concerns the application of rough set theory to web page classification. It proposes an approach that integrates elements of rough set theory with databases, with the aim of improving efficiency and of processing the data in the place where it is stored. The paper describes implementations of algorithms for selecting core attributes and reducts and for determining decision rules. The goal is to classify web pages on the basis of decision rules and a set of features describing individual web pages.
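To illustrate what computing core attributes inside a database can look like, here is a minimal sketch in Python using SQLite. It is not the paper's implementation. It assumes a consistent decision table and uses a cardinality test known from database-oriented rough set models: a condition attribute belongs to the core if dropping it changes the number of distinct rows once the decision attribute is added to the projection. The table `pages`, its columns (`links`, `forms`, `images`, `category`), and the sample rows are all hypothetical.

```python
import sqlite3


def count_distinct(conn, table, columns):
    """Count distinct value combinations over the given columns."""
    cols = ", ".join(columns)
    query = f"SELECT COUNT(*) FROM (SELECT DISTINCT {cols} FROM {table})"
    return conn.execute(query).fetchone()[0]


def core_attributes(conn, table, condition_attrs, decision_attr):
    """Return condition attributes that cannot be dropped.

    Assumes a consistent decision table: identical condition values
    always lead to the same decision.
    """
    core = []
    for attr in condition_attrs:
        rest = [a for a in condition_attrs if a != attr]
        # With no remaining attributes, a non-empty table has one (empty) tuple.
        without_decision = count_distinct(conn, table, rest) if rest else 1
        with_decision = count_distinct(conn, table, rest + [decision_attr])
        # Different counts mean some rows agree on `rest` but differ in the
        # decision, so `attr` is needed to tell them apart.
        if without_decision != with_decision:
            core.append(attr)
    return core


# Hypothetical sample data: a few web pages described by three features.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (links TEXT, forms TEXT, images TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?, ?)",
    [
        ("high", "yes", "low", "shop"),
        ("high", "no", "low", "news"),
        ("low", "yes", "low", "shop"),
        ("low", "no", "high", "blog"),
        ("high", "no", "high", "news"),
    ],
)

core = core_attributes(conn, "pages", ["links", "forms", "images"], "category")
print(core)
```

In this sample, `images` is not in the core, because `links` and `forms` alone already distinguish every decision. The counting is done by the database engine through `SELECT DISTINCT`, so the rows themselves never leave the database.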

Keywords


rough sets; attributes reduction; reducts; decision rules; web pages classification

Full Text:

PDF (Polish)




DOI: http://dx.doi.org/10.21936/si2010_v31.n2A.367