Our colleagues at the spatio-temporal modelling lab offer an MSc thesis on “A Linked Open Data portal for annotating statistical datasets”:

Statistical datasets currently available on the Web often lack important information, such as:

  • a description of where the spatial coordinates can be found,
  • which spatial coordinate system is used,
  • whether the data represent objects or a continuous phenomenon in space and time (a field),
  • what the observation window for a point pattern variable is.

To enable automated integration of this information into statistical software, and hence to ensure meaningful analysis, the missing information should be obtained from data providers or users and made accessible on the Web in a structured way.

The student will develop a website that allows users to submit links to datasets available on the Web and to add descriptions to these datasets. Useful description items need to be identified, and existing methods for annotating spatio-temporal datasets need to be analysed. The dataset descriptions will be made accessible on the Web as Linked Open Data. To illustrate the use of the descriptions in statistical software, the SPARQL R package will be used to retrieve the descriptions and automatically import the annotated dataset into R.
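To give a rough idea of what such a structured description could look like, here is a minimal sketch in Python (the thesis itself targets R and the SPARQL R package). It records the four description items listed above for one dataset, serialises them as N-Triples-style statements, and builds a SPARQL query string that would retrieve them. The namespace `http://example.org/stdata#`, all property names, the example dataset URL and the helper names are assumptions made up for illustration. They are not an existing vocabulary or the portal's design.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical namespace; the property names below are invented for this sketch.
VOCAB = "http://example.org/stdata#"


def literal(value: object) -> str:
    """Render a value as a quoted N-Triples string literal."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


@dataclass
class DatasetDescription:
    dataset_url: str                      # link to the dataset on the Web
    coordinate_columns: Tuple[str, str]   # where the coordinates can be found
    crs: str                              # spatial coordinate system in use
    phenomenon: str                       # "object" or "field"
    # Observation window as (xmin, ymin, xmax, ymax); only for point patterns.
    observation_window: Optional[Tuple[float, float, float, float]] = None

    def __post_init__(self) -> None:
        if self.phenomenon not in ("object", "field"):
            raise ValueError("phenomenon must be 'object' or 'field'")

    def to_ntriples(self) -> List[str]:
        """Serialise the description as one statement per line."""
        subject = f"<{self.dataset_url}>"
        triples = [
            (subject, f"<{VOCAB}xColumn>", literal(self.coordinate_columns[0])),
            (subject, f"<{VOCAB}yColumn>", literal(self.coordinate_columns[1])),
            (subject, f"<{VOCAB}crs>", literal(self.crs)),
            (subject, f"<{VOCAB}phenomenon>", literal(self.phenomenon)),
        ]
        if self.observation_window is not None:
            window = " ".join(str(v) for v in self.observation_window)
            triples.append(
                (subject, f"<{VOCAB}observationWindow>", literal(window))
            )
        return [f"{s} {p} {o} ." for s, p, o in triples]


def build_query(dataset_url: str) -> str:
    """Build a SPARQL query asking for every property of one dataset."""
    return (
        "SELECT ?property ?value WHERE { "
        f"<{dataset_url}> ?property ?value . "
        "}"
    )


# Example description of a (hypothetical) point pattern dataset.
desc = DatasetDescription(
    dataset_url="http://example.org/data/trees.csv",
    coordinate_columns=("x", "y"),
    crs="EPSG:4326",
    phenomenon="object",
    observation_window=(0.0, 0.0, 10.0, 10.0),
)
lines = desc.to_ntriples()
print("\n".join(lines))
print(build_query(desc.dataset_url))
```

In a workflow like the one described, a client would send such a query to the portal's SPARQL endpoint and use the returned values (coordinate columns, coordinate system, observation window) to decide how to import the dataset. Which vocabulary and endpoint are actually used is part of the thesis work.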

Required skills: Interest in spatial statistical analysis. Knowledge of common spatio-temporal data formats and of Web technologies such as JavaScript and/or PHP. Knowledge of Linked Data technologies is an advantage, but can also be acquired during the thesis.

Contact: Christoph Stasch, Simon Scheider, Edzer Pebesma.

