Friday, September 7, 2018

Google Launches New Search Engine For Scientists And Journalists To Help Find Datasets

Google has announced a new search engine that helps scientists, journalists, policymakers, and other groups find the datasets they need for their work.


The platform, launched on Wednesday, trawls the millions of open data repositories on the web for desired datasets. It looks at digital libraries, publisher websites, and authors’ personal web pages, among other places. However, it relies on dataset publishers to correctly label their datasets with the appropriate information, or metadata tags, as they’re otherwise known.

In a blog post, Google AI research scientist Natasha Noy wrote: “To create Dataset Search, we developed guidelines for dataset providers to describe their data in such a way that it helps Google and other search engines better understand the content of their pages. These guidelines include salient information about datasets, like who created the dataset, when it was published, how the data was collected, what the terms are for using the data, and so on.”

She added: “We then collect and link this information, analyze where different versions of the same dataset might be, and find publications that may be describing or discussing the dataset. Our approach is based on an open standard for describing this information and whoever publishes data can describe their dataset this way. We encourage dataset providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem.”
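
To give a rough idea of what such a description can look like, here is a minimal sketch in Python. It assumes the open standard in question is the schema.org Dataset vocabulary expressed as JSON-LD, which the article itself does not name. The function name build_dataset_markup and every value in the example (dataset name, institute, date, license URL) are hypothetical and used only for illustration.

```python
import json


def build_dataset_markup(name, description, creator, date_published,
                         license_url, method):
    """Return a schema.org Dataset description as a JSON-LD string.

    The fields mirror the information mentioned in the quote above:
    who created the dataset, when it was published, how the data was
    collected, and the terms for using it.
    """
    markup = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # Who created the dataset (assumed here to be an organization)
        "creator": {"@type": "Organization", "name": creator},
        # When it was published (ISO 8601 date)
        "datePublished": date_published,
        # Terms for using the data
        "license": license_url,
        # How the data was collected
        "measurementTechnique": method,
    }
    return json.dumps(markup, indent=2)


# Hypothetical example values, not taken from any real dataset.
example = build_dataset_markup(
    name="Daily Rainfall Measurements 2017",
    description="Daily rainfall totals from a network of ground stations.",
    creator="Example Weather Institute",
    date_published="2018-01-15",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    method="Automated rain gauge readings",
)

# A publisher would typically embed the result in the dataset's web page.
print(f'<script type="application/ld+json">\n{example}\n</script>')
```

A page carrying a block like this gives a crawler the metadata in machine-readable form, so it does not have to be inferred from the surrounding text.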

This new Google Dataset Search Engine is available in multiple languages and can be found here.

According to Noy, “Simply enter what you are looking for and we will guide you to the published dataset on the repository provider’s website.”

