Question of the Week: Kadaster!
Today, for the “question of the week”, I'm going to describe a use case of question answering technologies: querying the Dutch Kadaster, i.e. the land registry for real estate in the Netherlands. We were asked to build a question answering system on top of this data, and I'm going to present the results.
Some context: Kadaster publishes several of its datasets (e.g. topography, addresses and buildings) as a Knowledge Graph, using Triply. It is available at https://data.labs.kadaster.nl/kadaster/kg/. The graph is pretty large, containing a bit more than 800 million triples. Most of the information is encoded in Dutch, using different vocabularies such as schema.org properties. It mainly contains information about cities, their boundaries, their population, the real estate they contain, addresses, and points of interest (those that are relevant for the Kadaster). This data is not indexed by search engines like Google or by digital assistants like Siri.
We indexed the dataset as is and trained QAnswer on top of it. We were able to do this even though we do not speak Dutch ;). These are some of the questions we could answer on top of the dataset:
- Asking for “amersfoort”, a city in the province of Utrecht, Netherlands.
- Asking for the population of a city: “Wat is the bevolking of Amsterdam?”
- Searching for the suburbs of a city: “buurt en amsterdam?”
- Searching for a specific address: “Rotterdam Adres van 1e Middellandstraat 30B-01, 3014BE Rotterdam”. This is not ideal right now. The problem is that addresses in the Knowledge Graph are encoded as one big string. Refactoring the data so that street, house number, postcode and city are stored separately would allow us to ask for specific streets and street numbers in a more flexible way.
- Searching for buildings of a specific type, in this case buildings that can be used for living: “sinderen woonfunctie”. One can see that there is a quite dense city center in the south, with many other houses spread out across the countryside. This contrasts with the buildings that can be used for industrial purposes: “sinderen industriefunctie”.
- Finally, we can search for specific types of objects, such as transformer stations: “transformatorstation”. You can see how they are distributed over the country. Or for the petrol stations in a specific city: “tankstation en amsterdam”.
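To make the address limitation from the list above more concrete, here is a minimal sketch of what splitting such a single-string address into separate fields could look like. It is not how the Kadaster graph or QAnswer handles addresses; the `Address` type, the `parse_address` function and the regular expression are hypothetical, and the pattern only assumes the layout seen in the example ("street number[suffix], postcode city").

```python
import re
from typing import NamedTuple, Optional


class Address(NamedTuple):
    """Hypothetical structured form of a single-string address."""
    street: str
    number: int
    suffix: str      # e.g. "B-01"; empty string when absent
    postcode: str    # four digits followed by two letters, no space
    city: str


# Assumed layout: "<street> <number>[letter][-addition], <postcode> <city>".
# The street part is matched lazily so that names starting with a digit
# (like "1e Middellandstraat") are not mistaken for the house number.
ADDRESS_RE = re.compile(
    r"^(?P<street>.+?)\s+"
    r"(?P<number>\d+)"
    r"(?P<suffix>[A-Za-z]?(?:-\w+)?)"
    r",\s*(?P<postcode>\d{4}\s?[A-Z]{2})\s+"
    r"(?P<city>.+)$"
)


def parse_address(text: str) -> Optional[Address]:
    """Split an address string into fields, or return None if it does not fit."""
    match = ADDRESS_RE.match(text.strip())
    if match is None:
        return None
    return Address(
        street=match.group("street"),
        number=int(match.group("number")),
        suffix=match.group("suffix"),
        postcode=match.group("postcode").replace(" ", ""),
        city=match.group("city"),
    )


if __name__ == "__main__":
    print(parse_address("1e Middellandstraat 30B-01, 3014BE Rotterdam"))
```

With fields like these stored as separate properties, a question about a street or a house number could match on the relevant field alone instead of on the whole string. Real address data will contain variants that this pattern does not cover, so it should be read as an illustration only.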
Our conclusion was that this looks a bit like Google Maps, just on open government data. We enjoyed this contract and working on this geographical dataset, and our Dutch got slightly better too ;)
That's it for today!
See you next week!
The QA Company