26-28 November, 2019, Vilnius

Gerard Toonstra

Coolblue B.V., The Netherlands


Gerard Toonstra is a technical lead for the data systems team at Coolblue in the Netherlands, a Kaggle Master, and an artist.


Data Discovery with Amundsen

Company data is increasingly widespread, and it has always been difficult to understand who is using which tables or columns, how often, and how a dataset was produced. Commercial tools can help with some of these questions, but there is now also an open-source tool called “Amundsen”, which is already in use at several larger companies. Amundsen helps you answer questions such as how a dataset was produced, who else uses it, and who owns it; people using the tool can also update table and column descriptions. Amundsen has not been around for very long, but data lineage and data discovery are already significant problems, or at least challenges, for companies. Amundsen is a strong answer to these problems because it focuses not just on the data itself, but also on the people and communities that use it.

Session Keywords

Data Discovery
Data Lineage