Wikidata for everyone as a local service

From The QA Company
Revision as of 11:04, 6 March 2023 by 172.24.0.1 (talk) (Blog post for The QA Company's website)


Introduction

In a previous blog post we talked about Wikidata, one of the largest existing knowledge graphs. Today we are glad to announce that we are making Wikidata more accessible to the whole community and to everyone interested in open linked data, from researchers to engineers.

Contribution

As many readers know, Wikidata provides a public query service that receives millions of queries every day. To avoid overloading the public service with numerous requests, one can download the dataset and load it into the triple store they provide. However, indexing the data may take up to 12 days before you can start running queries, and you would probably need an enormous machine with 200GB of memory.

What we are offering today is a Docker image that you simply pull and start on a small machine (16GB of memory could suffice). It downloads a compressed version of Wikidata (~65GB), and you can run SPARQL queries as soon as the download has finished. This service is built on top of HDT, a compact data structure and binary serialization format for RDF that keeps big datasets compressed to save space while still supporting search and browse operations without prior decompression.
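To give an idea of what querying the local service could look like, here is a minimal Python sketch that sends a SPARQL query over HTTP using only the standard library. The endpoint address (`http://localhost:1234/api/endpoint/sparql`) is an assumption made for illustration, and the helper names `build_request`, `extract_values` and `run_query` are hypothetical; check the Docker Hub page for the port and path the container actually exposes. The request follows the standard SPARQL 1.1 Protocol (a `query` parameter and a JSON `Accept` header), so it should work against any compliant endpoint.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the running container exposes a SPARQL endpoint here.
# The real port and path may differ; see the image documentation.
ENDPOINT = "http://localhost:1234/api/endpoint/sparql"

# Five items that are an instance of (P31) house cat (Q146).
QUERY = """
SELECT ?item WHERE {
  ?item <http://www.wikidata.org/prop/direct/P31> <http://www.wikidata.org/entity/Q146> .
} LIMIT 5
"""


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


def extract_values(results, variable):
    """Return the values bound to one variable in a SPARQL JSON result."""
    bindings = results["results"]["bindings"]
    return [b[variable]["value"] for b in bindings if variable in b]


def run_query(endpoint, query, timeout=60):
    """Send the query and decode the JSON response."""
    request = build_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    for uri in extract_values(run_query(ENDPOINT, QUERY), "item"):
        print(uri)
```

Because the query runs against your own machine, it adds no load to the public Wikidata query service.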

This project is fully maintained by The QA Company. Instructions for using the service are available at the link below:

https://hub.docker.com/r/qacompany/hdt-query-service

If you run into any problems or issues, we would be more than happy to hear about them and try to fix them (contact us).

Conclusion

We have used Wikidata for a long time, and we feel it is now time to give back by providing its community with an alternative to the public query service.
