Facebook crawler as software agent for business intelligence system

Wojciech Kijas


The article describes attempts to use data collected from an online social network in an enterprise data warehouse. In the course of our research, we designed and developed a sample standalone system that can operate in the Data Staging Area of a Data Warehouse, complementing customer data with data from Facebook by means of the FOAF ontology.


Business Intelligence; Online Social Network; Facebook; software agent; dimensional modeling; FOAF ontology




DOI: http://dx.doi.org/10.21936/si2014_v35.n4.708