Document-oriented RDF graph store

Łukasz Szeremeta, Dominik Tomaszuk


Document-oriented NoSQL databases are not commonly used in Semantic Web and Linked Data environments. This article describes the idea of a document-oriented RDF graph store. We present an alternative RDF serialization that allows graph data to be processed efficiently in a NoSQL graph store. This means that a database such as RethinkDB can serve as an RDF graph store. Moreover, our proposal supports various caching techniques, which is a novelty for an RDF/JSON serialization.


RDF; NoSQL; Semantic Web; semi-structured data
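
As a rough illustration of the general idea in the abstract, the sketch below groups RDF triples by subject into JSON-ready documents of the kind a document store such as RethinkDB could hold. It is not the serialization proposed in the article, whose exact layout is not given here. The object encoding (`type` and `value` keys) loosely follows the W3C RDF/JSON convention, while the function name `triples_to_documents`, the `id` and `properties` keys, and the example IRIs are assumptions made only for this example.

```python
import json
from collections import defaultdict


def triples_to_documents(triples):
    """Group (subject, predicate, object) triples into one document per subject.

    Hypothetical helper: each object is a dict with "type" and "value" keys,
    loosely following the W3C RDF/JSON object encoding.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        grouped[subject][predicate].append(obj)
    # One document per subject; "id" and "properties" are illustrative key names.
    return [
        {"id": subject, "properties": dict(predicates)}
        for subject, predicates in grouped.items()
    ]


# Example data (made up for illustration).
triples = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name",
     {"type": "literal", "value": "Alice"}),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows",
     {"type": "uri", "value": "http://example.org/bob"}),
    ("http://example.org/bob", "http://xmlns.com/foaf/0.1/name",
     {"type": "literal", "value": "Bob"}),
]

docs = triples_to_documents(triples)
print(json.dumps(docs, indent=2))
```

Grouping by subject keeps all statements about one resource in a single document, so a lookup by resource IRI needs one read. Other groupings (for example, by named graph) are equally possible, and the article may use a different one.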



