• Day 1
  • Day 2
  • Day 3
    • 9:00 - 10:00  -  Registration / Networking
    • Time
      Main Stage
    • 10:00 - 10:10
      Welcoming Remarks
    • 10:10 - 10:30
      Valentine Gogichashvili
      What data can do for your business - Zalando's case
      Data has become the magic word across the world. Data is the new oil. But when we talk about data we should not forget that, like oil, it must first be harvested and then refined; only then can it fuel the engines of statistical and machine learning systems that generate insights, help us understand what our customers want, and let us make correct decisions faster than others. We at Zalando, one of the biggest online fashion retailers in Europe, are working hard to make data that oil for us. I will cover the problems of harvesting and understanding data, and will talk about where we at Zalando succeed and where we still need to make more progress on making data the fuel of our future.
    • 10:30 - 10:50
      Michael Crawford
      How not to spend $1,000,000 in 12 months for no return
      Setting up a data science function in a business is hard. Over the past four years, Michael has experienced, and heard of, many successful and unsuccessful attempts to set up data science operations within various businesses. This talk sets out observations on which criteria need to be met and which decisions need to be made to reduce the chances of failure, and presents a template for the successful implementation of a data science operation within a business.
    • 10:50 - 11:10
      Mick Cooney
      Insurance: A New Frontier for Data Science
      Insurance has a long-standing and well-deserved reputation for being boring. While insurance is still culturally conservative, it also is full of fascinating data problems from all perspectives. In this talk the speaker will discuss the industry, describe some of the problems he finds interesting, and predict potential changes the industry will face in the coming decade.
    • 11:10 - 11:30
      Dachi Choladze
      Building the first AI startup in Georgia
      Building a tech startup is a very challenging process. Every day you come across barriers you have never encountered before. However, the majority of those challenges are similar for all startups in the tech sector, and this is where AI startups differ: lack of knowledge about what this particular technology can do, doubts about whether it really "exists" or not, and understanding and expectations based on "Hollywood" are the challenges you come across if you are in an AI startup. In this talk, we will discuss what happens differently when you are building an AI startup.
    • 11:30 - 12:00
      Coffee Break
    • 12:00 - 12:20
      Stefan Turkheimer
      Cutting through the noise: harnessing data to win political races
      Political campaigns compete with all other messaging for your attention, but a close look at social and demographic data can help a campaign convert voters to its cause, as Stefan Turkheimer explains.
    • 12:20 - 12:40
      Alexey Natekin
      Open Data Science: beyond traditional scientific communities
      The speaker will share their experience of how to foster Data Science within local communities across the globe, and how those communities can consistently develop world-leading expertise in Data Science. These local communities are all enthusiasm-driven, and their core value is free, open scientific and engineering knowledge for everyone. In particular, the speaker will talk about the different types of events one can set up beyond traditional meetups. One such event series, called ML trainings, helps them regularly beat everyone on Kaggle: 15 members of Open Data Science are in Kaggle’s top 100. Science, Drinking, Rock-n-roll.
    • 12:40 - 13:00
      Christiaan Triebert
      The Power of Open Source Investigation
      What can journalists and regular citizens do to investigate governments and armed groups that provide little or no information about incidents, bombings, torture or corruption? A growing number of citizens are pursuing the facts themselves. Bellingcat, an international investigative collective, uses online open source information in combination with digital tools to uncover those facts. How do they work, and which tools and methods do they use? In this short talk, the audience will be shown the power of open source investigation.
    • 13:00 - 13:20
      Jason Addie
      Open data: for everyone by everyone
      NGOs around the world are demanding that governments make their data open and available to the public. How can NGOs abide by the same standards and make all of their valuable data open and available to the public too?
    • 13:20 - 13:40
      Jelena Vasic
      Behind the award-winning open data portal
    • 13:40 - 15:00
      Lunch Break
    • 15:00 - 15:15
      Jaume Portell
      Active Retail Intelligence: when Artificial Intelligence meets Digital Signage
      Digital Signage technology has been developing powerful capabilities to amaze customers in the retail space: visuals, formats, resolutions and more. Sensing technologies have added eyes and ears to retail spaces and can now understand who, where and when in real time, with high precision. When those two technologies meet in the retail space, a new discipline arises with a strong voice: Active Retail Intelligence.
    • 15:15 - 15:30
      Jakub Gornicki
      Stories vs Narratives. Using data for good.
      For a long time we focused on providing more and more data(sets) in the hope that citizens would go to the source and magic would happen. It didn't. We left a space open for narratives that use data only to serve the point they want to prove rather than to seek the truth. What should data people do? How do we combine stories with data? Why care not only about the source but also about the outcome? And how can you be not a narrative giver but a storyteller?
    • 15:30 - 15:45
      Thomas Burns
      Feeling the data: how to build stories that people care about
      Data can be powerful building blocks for storytelling, but facts and figures alone cannot make a good story. How can we transform data into stories that give life and longevity to the ideas we are trying to communicate? How can we turn hard science into engaging, emotional experiences for our audiences? Why is this important? Join story producer Thomas Burns as he walks us through the elements of strong storytelling and describes why building strong narrative is critical for realizing the full potential of our message.
    • 15:45 - 16:00
      Mariam Kobuladze
      What design adds to data
      There is plenty of data in the world, but it’s not always easy to make sense of it. That’s the problem data visualization tries to solve: to see the stories behind the data and make them visible. Information design translates those stories into a more universal, visual language that can be perceived and felt easily. What challenges do information designers encounter when creating visualizations that enable people to understand, feel and care about the issues and stories hidden in the numbers? How do they combine the principles of graphic design with data understanding skills to enrich the data with visualizations?
    • 16:00 - 16:20
      Miriam Quick
      Oddityviz: Visualizing Bowie
      After David Bowie died in early 2016, London-based designer Valentina D’Efilippo and British data journalist Miriam Quick decided to pay tribute by turning one of his best-known songs, Space Oddity, into data visualization. The result was Oddityviz, a collection of ten engraved records visualizing data from the song. Each 12-inch disc deconstructs the track in a different way: melodies, harmonies, lyrics, structure and story are transformed into new visual systems. The records are accompanied by a series of matching posters and a moving image piece that visualizes the music in real time. Oddityviz was exhibited at W+K London in January 2017. Miriam will talk about how they created the project and what they learned along the way.
    • 16:20 - 17:00
      Coffee Break
    • 17:00 - 17:20
      Andriy Gazin
      Developing data visualization: from A to Z
      The goal of this talk is to present the data visualization process from the starting point (data) to the end (visualization), to explain the design decision behind every single element of a dataviz, such as visual encodings, grid, annotations and colors, and to give a list of criteria for evaluating data visualizations.
    • 17:20 - 17:40
      Anastasia Valeeva
      Losing my favourite game: how journalists are not catching up with open data
      Slowly dying open data portals and apps, mistakes in data interpretation and collaborations that never happened: I now have a sad collection of missed opportunities for open data in journalism, based on my research into open data investigative journalism in Russia and my experience of teaching data journalism in the Western Balkans and Central Asia. Let’s talk about the global trends and regional specifics of not using open data to its full potential, and discuss the ways forward to overcome the barriers. Growing a healthy community of open data users in the region and engaging with global data communities are surely first on the list.
    • 17:40 - 18:00
      Rayna Breuer
      Can data journalism save us from fake news?
      Fake news is spreading with the speed of a bushfire. How are journalists tackling it, and what can data journalism do against it? Is data-driven journalism the panacea to combat fake news? Or just another way to produce fake news?
    • 18:00 - 18:20
      Emma Lacey-Bordeaux
      Ready, set, fact check!
      These days we’ve got greater access than ever to information. Videos, photos, listicles, memes and articles all seek our attention, all competing to tell us something about the world we live in. Both the volume of information and the speed make it more important than ever for fact checking to infuse the sensibility of journalists and newsrooms the world over. Emma Lacey-Bordeaux looks at how this effort can become faster, more reflexive and done in service of telling good stories that people want to read, watch and share.
    • 18:20 - 18:40
      Closing Remarks
    • 18:40 - 20:00
      Reception / Networking
      • 9:00 - 10:00  -  Registration / Networking
      • 10:00 - 11:00  -  Welcome & Panel hosted by Eric Barrett (Main Stage)
      • 11:00 - 19:00  -  Ministry of Data Hackathon (Library)
      • Time
        Library Room
        Room 113
        Room 223
        Room 303
        Room 416
        Small Conference Room
      • 11:00 - 12:30
        Library Room
        Visual Storytelling: The good, the bad and the ugly

        (Mariam Gamkharashvili, ForSet)
        Mariam will cover the important principles and key elements of visual storytelling to keep in mind when creating your story. She will provide examples of good and bad visualizations to show why sound visual communication matters. Participants will then be split into groups to review example visualizations and suggest ways to improve them.
        Room 113
        DDJ quick and easy

        (Rayna Breuer, DW)
        What can you do in 100 seconds? Miss the train. Eat an ice cream, or two. Fall in love and then be disappointed. Read the title, subtitle and teaser of an article. Ask Siri if it is going to rain and then curse Siri because it is obviously going to rain. In 100 seconds you can do a lot of things. You can even visualize data. You don't even need to know how to program. Tools like Datawrapper let you quickly create visualizations of your data.
        Room 223
        Finding stories in data

        (Miriam Quick, UK)

        Stories are at the heart of journalism. A growing branch of journalism – data journalism – uses data to uncover original stories or explore fresh angles on existing stories.

        But telling a data-driven story is a skill. After you’ve gathered, analyzed and visualized your data, how do you bring out the story from a dataset and communicate it to your readers in the most effective way?

        This workshop will cover the top-level principles of writing about data in English for a broad audience, including the types of data-driven narrative, crafting effective concepts and writing on and around charts and visualizations. We’ll take a simple dataset and explore how to communicate key insights from it in an engaging – and accurate – way.

        Room 303
        Teaching Probability and Statistics from a Simulation Perspective

        (Mick Cooney, Dublin Data Science)
        Simulation and probability are natural partners: the frequentist interpretation has at its heart the idea of replicating events over and over. In many cases, though, these ideas are hard to understand at a deep level because they are abstract. In this workshop we solve that by programming simulations to illustrate core ideas of statistics and probability such as the Law of Large Numbers, Bayes' Rule and the Central Limit Theorem.
        Room 416
        Neural networks, types, comparisons, algorithms

        (Soso Khutsishvili, Deep Learning Tbilisi)
        We will tell you about neural networks, their history and trends. We will talk about neural network types, the differences between them and the tasks they fit and solve, and discuss in detail the back-propagation algorithm, which plays the main role. We will also say a few words about convolutional neural networks, recurrent neural networks, autoencoders and so on, with real-life examples and modern trends. Interesting references, papers and source code examples are included, of course.
        Small Conference Room
        Open legislative data: what and how

        (Nadiia Babynska, Open Data Lab Ukraine)
        We are going to talk about open legislative data. Why is it important? What kind of data should be published, and in what structure and format? How should open legislative data be communicated? We are interested to hear stories from your countries too. And we are going to work in groups to develop the best messages for opening legislative data as open data.
      • 12:30 - 13:00
      • 13:00 - 14:30
        Library Room
        Data boot camp: find, clean, analyze

        (Anastasia Valeeva, Kyrgyzstan)
        What is open data and where can you find it? How and why should you use open data to back up your point of view? Together with the participants we will outline a data story, find a dataset on the topic, prepare it for work, and carry out a basic data analysis using spreadsheet software (Google Spreadsheets).
        Room 113
        DataViz design

        (Mariam Kobuladze, ForSet)
        Mariam will cover the basic principles of information design, such as layout, visual hierarchy, colors, typography, and chart types. She will take you through the process of creating a data visualization, such as static infographics and interactive visualizations.
        Room 223
        (re)Using datasets for more effective fact-checking

        (Elena Calistru, Funky Citizens)
        Even though not all organizations deal with both data-driven projects and fact-checking projects, many countries have both fact-checking platforms and, for example, public spending monitoring platforms. The workshop builds on the Funky Citizens experience in fact-checking by (re)using the data we gather in other data-driven projects. The work will explore ways to foster partnerships between organisations in our countries and to give a new communication twist to not-so-sexy topics like budgets or complicated datasets through fact-checking. The workshop will be very hands-on and interactive, and its aim is to identify together ways to bring more data into the everyday life of our public.
        Room 303
        High-Quality Interactive Dashboards or “CEO Porn” and how to make it

        (Michael Crawford, Barnett Waddingham)
        Data science can seem forbidding and opaque to the executives running a business. It is our experience that exceptional data visualisations are often the most effective way of communicating with the people paying your wages. This workshop sets out to build a web-based interactive dashboard using open source libraries such as DC.js, leaflet.js and crossfilter. The workshop will be a whistle-stop tour of how we learnt to build dashboards and will include all the code demonstrated via GitHub. If you wish to code along, I recommend having either Brackets or Atom installed on your laptop.
        Room 416
        Image Analysis with Deep learning

        (Levan Tsinadze, Pulsar AI)
        Convolutional neural networks are the workhorses of modern computer vision tasks such as object recognition, object detection, semantic segmentation etc. In this workshop we will cover the basics of deep convolutional neural networks and their applications in image analysis. We will cover training vanilla convolutional nets from scratch, as well as fine-tuning pre-trained models in Python.
        Small Conference Room
        Open contracting in action
        Public procurement monitoring - you are not alone!

        (Karolis Granickas, Open Contracting Partnership; Viktor Nestulia, TI Ukraine, ProZorro)
        Every third dollar spent by governments globally is spent through public procurement. Every minor shift in savings, transparency, and efficiency of these funds can mean tremendous benefits for everyone. This is where open contracting steps in - to help make sure procurement is done fairly and creates value for many.
        You might have heard about open contracting. The workshop is your opportunity to learn more and see how you can use it in your work through a quick introduction and practical hands-on exercise on Open Contracting Data Standard (OCDS).
        Many countries are now implementing e-procurement systems. More transparency, more data and more tools increase the demand for monitoring. But how are you monitoring tenders? How many of us are there in Ukraine? Georgia? Romania? Other countries? We are not alone, so let’s think about how collective action can unveil corruption and improve the public procurement system.
      • 14:30 - 15:30
      • 15:30 - 17:00
        Library Room
        Data visualization tools

        (Andriy Gazin,
        The aim of this workshop is to help participants choose the best data visualization tool for their project. It gives an overview of data visualization tools for different formats (static/interactive visualization, cartography, graph visualization, timelines, etc.), tasks (data exploration/presentation), and levels of difficulty (from simple web services to programming languages).
        Room 113
        Don’t worry, be creative

        (Rayna Breuer, DW)
        There are plenty of sophisticated and complicated methods to visualize data which big media outlets use to tell their stories. For normal people there are open source tools that make our lives easier and give us the opportunity to do line charts and treemaps. But what about using non-digital tools to do your data story? So, prepare for a creative hands-on workshop.
        Room 223
        Building REST API for serving Data with Zalando Connexion

        (Amil Osmanli, Zalando)
        Many of you have created a useful data model, such as a recommendation system, a natural language processing or sentiment analysis model, or perhaps an image color recognition model, and wanted to expose its output to the world quickly and easily. This workshop aims to introduce how to build a RESTful application and how you can rapidly serve any data with minimal effort. In this interactive workshop we will cover how to build a RESTful API using Zalando Connexion. Connexion is a lightweight framework that uses Swagger specifications and Python Flask to define and build APIs very rapidly.
        By the end of this workshop you will be able to build a RESTful API for any data that can serve high load and scale on demand. If you are familiar with Python, this workshop will show you how to use basic coding skills to develop a production-ready project. Bring your laptop and code along, or, if you have any questions regarding REST, Swagger specs and Python Connexion, you can join the workshop too. Before the session, I highly recommend visiting the workshop web page and following the setup instructions. This will ensure that you have everything installed to code during this workshop. If you have any problems with the setup, create a GitHub issue and I will answer it.
        Room 303
        Visual Storytelling: The good, the bad and the ugly

        (Jason Addie, ForSet)
        Jason will go over important requirements of visual storytelling to keep in mind when creating your story. He will provide examples of good and bad visuals to emphasize why these requirements are important. Participants will then be split into groups and critique visuals and come up with suggestions on how to improve the visuals.
        Room 416
        Data Engineering with Apache Spark

        (Giorgi Jvaridze, Zalando, Data Science Tbilisi, Dublin Data Science)
        There are multiple steps involved before one can use data to train a Machine Learning model, including data ingestion, data cleaning, validation, aggregation, feature engineering etc. Those steps are usually referred to as Data Engineering. As the volume of data we collect every day rises, the complexity of data management increases dramatically. In this workshop we are going to cover Data Engineering and some of the tools used there, especially the distributed computation engine Apache Spark.
        Small Conference Room
        Identifying a foreign political person

        (Andrew Jvirblis, Transparency International Russia)
        The speaker will show how information about Russian PEPs can be uncovered using different sources, including the project, a unified database of asset and income declarations, as well as other methods and sources. Identifying foreign politically exposed persons has become a crucial issue for different actors fighting corruption and illicit financial flows: financial institutions, journalists, NGO activists and law enforcement authorities. A high-ranking employee who wishes to spend illicit money abroad should not become a private person just by crossing the border, but how can their identity be revealed? The workshop participants will discuss the possible sources of information about politically exposed persons in their jurisdictions of origin, the requirements for such information, its weaknesses and the legal background behind it.
      • 17:00 - 17:30
      • 17:30 - 19:00
        Library Room
        DataFest Mini-Hackathon challenge presentation & team formation

        Room 113
        Fake news detection: a collaborative challenge

        (Vincenzo Lagani, Gnosis Data Analysis)
        Can data analysis and predictive analytics help us in automatically detecting fake-news? Let's try to answer together!
        The workshop will guide the attendees in analyzing a large dataset containing fake and real news that actually appeared on communication media. Different text-mining and predictive modelling techniques will be applied, while their respective advantages and disadvantages with respect to the problem at hand will be thoroughly discussed.
        During the workshop the attendees will be encouraged to submit their own solutions through the RAMP platform. RAMP is an initiative promoting collaborative data-analysis challenges, where participants can help each other in solving complex machine-learning problems. In the weeks following the workshop, the participants will have the opportunity to keep collaborating on RAMP to optimize their predictive models for fake-news detection.
        The workshop is open for everybody that is interested in learning how to address a complex data analysis problem using machine-learning techniques. A basic understanding of machine-learning and an initial knowledge of Python are suggested for actively contributing to the workshop and the collaborative challenge.
        Room 223
        Ensuring Data Needs by Engaging with the Internet Numbers Community

        (Marco Schmidt, RIPE)
        Data needs an open Internet and easy access to Internet number resources such as IPv4, IPv6 and AS Numbers.
        This workshop will explain the global and regional Internet Registry system, including the relevant policies and processes. It will explain how you can participate in policy development to make and support policy changes that are good for your business. The workshop will also highlight how to get Internet number resources.
        Room 303
        Digital Forensics: A Walk-Through of Bellingcat's Investigations

        (Christiaan Triebert, Bellingcat)
        In 1.5 hours, Christiaan Triebert will provide participants with a detailed walk-through of how open source investigations were conducted, showing the tools, resources and methods used. Examples will include the downing of Flight MH17 above eastern Ukraine, the failed coup attempt in Turkey, Libya's social media executioner, and airstrikes in the Middle East.
        Room 416
        Scalable algorithms for recommender systems in one-class collaborative filtering case

        (Dmitriy Selivanov, Majid Al Futtaim)
        We will define the problem of one-class collaborative filtering and introduce common evaluation metrics. After that we will discuss two approaches: Matrix Factorization (Weighted Regularized Matrix Factorization) and the Linear Method. The first looks for user and item embeddings; the second learns an item-item similarity matrix from the data. We will show that both methods are efficient and make it possible to solve large-scale real-world problems on moderate hardware.
        Small Conference Room
        How we built Database of Assets of Serbian Politicians;
        DIY liberating asset declarations

        (Jelena Vasic, KRIK; Attila Juhász, K-Monitor)
        Jelena Vasic will show the participants how the KRIK team built its award-winning online Database of Assets of Serbian Politicians. The entire research process will be presented step by step, explaining the resources and methods used. The database was created to help citizens in Serbia better understand who the people running their country are.
        In the same workshop Attila from K-Monitor will talk about a data liberation campaign called the Dawn of Asset Declarations. It is an overnight event where volunteers help to transform MPs' annual asset declarations into a transparent and machine-readable format. After cleaning the data we release it in a simple sheet so that information can be compared between MPs and over time. It also serves as an update to our partner's web directory of MPs' assets. Our event is not only about data; it is an important advocacy tool to shape the political agenda and enforce better regulation. Moreover, participants learn a lot about their representatives.
      • 19:00 - 20:00  -  Reception / Networking
      • 10:00 - 17:00  -  Ministry of Data Hackathon (Library)
      • Time
        Library Room
        Room 223
        Room 303
        Room 322
        Room 416
      • 10:00 - 11:30
        Library Room
        Ministry of Data & DataFest Hackathon Mentorship sessions

        Room 223
        Digital Security in Action

        (Nikolai Kvantaliani, Belarus)
        We live at a time when technology is growing fast, when security and privacy should be delivered by design and not by licence agreement. It is a time when zero-day attacks happen every month and every day we receive tons of spam and phishing emails, and when governments use their resources to attack civil activists. But we still have hope in technology for good and for humanity. The workshop is a space to talk about the digital threats around us and the solutions we can use to stay safe in everyday life.
        Room 303
        Sifting for gold, in mountains of data

        (Emma Lacey-Bordeaux, CNN)
        Sometimes the problem we have isn’t a lack of data but trying to sort out what a mountain of data actually means. This workshop will focus on techniques for sifting through mountains of data to find the main themes and story lines. Emma Lacey-Bordeaux will bring examples of how data gets harnessed to tell good stories and fact-check powerful people.
        Room 322
        An Introduction to Bayesian Regression Modelling

        (Mick Cooney, Dublin Data Science)
        Linear models are one of the keystones of statistical and predictive modelling and are a natural fit for the Bayesian framework. In this workshop we first build a linear model using OLS/MLE optimisation to fit the parameters and progress to the equivalent Bayesian fit using prebuilt models provided by the R package 'rstanarm'. We use this approach to explain the core concepts of Bayesian data analysis including the concept of prior and posterior distributions and discuss how the output can give us a more robust understanding of our analysis.
        Room 416
        Natural Language Processing

        (Rudolf Eremian, Pulsar AI)
        The objective of this workshop is to show how natural language processing is applied in modern applications such as Google Search, Apple Siri, Bing Translator etc. During the workshop we will go through the history of natural language processing, talk about typical problems, consider classical approaches and methods, and compare them with state-of-the-art deep learning techniques.
      • 11:30 - 12:00
      • 12:00 - 13:30
        Library Room
        Ministry of Data & DataFest Hackathon Mentorship sessions

        Room 223
        DataViz design

        (Mariam Kobuladze, ForSet)
        Mariam will cover the basic principles of information design, such as layout, visual hierarchy, colors, typography, and chart types. She will take you through the process of creating a data visualization, such as static infographics and interactive visualizations.
        Room 303
        Data-driven Objective and Key Results (OKR) for High-Performing Teams

        (Oana Calugar, Performance+)
        This interactive workshop will demonstrate the power of the Objectives and Key Results framework, and how it can be used to align people and improve employee performance and engagement in a transparent and data-driven manner. I will share examples from various domains and provide hands-on opportunities to experience OKRs in a risk-free environment. The workshop aims to equip the attendees with practical knowledge that can be used in their daily work.
        Room 322
        A/B testing vs Multi-armed bandit

        (Pavel Nesterov,
        The goal of the talk is to compare two different approaches to web experiments. The first part of the talk will cover the theory behind A/B/n testing and its pros and cons. The second part will cover the basics of Bayesian inference, which will be used in the third part of the talk. The last part will cover the theory behind multi-armed bandits and their pros and cons. You will also be provided with a Jupyter notebook, which you can use to follow the speaker and run the same experiments as he does.
        Room 416
        Attention-based methods in deep learning

        (Hrant Khachatrian, YerevaNN lab)
        Attention is one of the most successful concepts in deep learning. It has been applied to image caption generation, question answering, speech synthesis, translation and many other problems. In this workshop we will investigate real-life examples of attention-based deep learning architectures for images and for text.
      • 13:30 - 15:00
      • 15:00 - 16:30
        Library Room
        Ministry of Data & DataFest Hackathon Mentorship sessions

        Room 223
        Teaching DDJ at Universities

        (Anastasia Valeeva, Kyrgyzstan)
        Experienced and aspiring trainers in data journalism will come together to discuss how to introduce the discipline into the academic curriculum. What are the strategies for teaching data journalism at the university level, the common challenges and the ways to overcome them? The discussion will include: an overview of academic programs for data journalism and the experience in the room; manuals, tutorials and other teaching resources, including a presentation of the 'Data Journalism Manual' by ODECA/UNDP, available in English/Russian; what exactly we teach when we teach data journalism: skills and tools; students' profiles and how best to deal with them; going academic: how academic should data journalism be at the university?; and the gaps, challenges and failures we have had.
        Room 303
        Ready, set, target!

        (Stefan Turkheimer, USA)
        Stefan Turkheimer will lead a deep dive into crunching all kinds of data, from polling to social stats, to fine-tune messaging.
        Room 322
        Personal data protection in theory and practice for startups

        (Salome Bakhsoliani, Dimitri Gugunava, Office of the Personal Data Protection Inspector)
        Have you registered a company? Are you creating a new product? Planning to launch an innovative service? Congratulations, you are a data processor. Whatever you produce, sell or create, in the course of your work you will handle the personal data of many people. That brings certain responsibilities towards them and before the law.
        If you already know well what you need to do to protect the personal data of your users, employees, clients or partners, then this workshop is not for you. If you think you need practical advice on this matter, register.
        Room 416
        Exploring Reinforcement Learning

        (Vitalii Duk, Dubizzle)
        The talk will cover the basics of RL, such as Markov Decision Processes and Q-Learning, and provide examples along with Python code.
      • 16:30 - 17:00
      • 17:00 - 18:30
        Library Room

        Room 223

        Room 303
        Modern Gradient Boosting

        (Alexey Natekin, DM Labs, Arktur)
        Gradient boosting machines have terrorized both Kagglers and apologists of no-free-lunch for over a decade. Countless companies have integrated GBMs into their production systems, including the world's largest search companies. But how do GBMs work and what can you do with them? In this workshop we will cover the basics of GBMs: how they are more than just xgboost/lightgbm imports and are actually a flexible, general-purpose machine learning modeling framework. Then we will discuss how to cook these models properly: hows, dos and don'ts. And finally we will talk about modern libraries and implementations: what your options are for casual boosting at your workplace.
        Room 322
        Billion Node Graph Processing

        (Viktor Cherkaskyi, Zalando)
        Graphs or Networks can be seen everywhere: from the social network of Facebook friends and computer networks to the thoughts in one's head. A graph as a model representation can be crucial not only in social studies, but also in advertising, fraud investigation and even terrorism prevention. Graph theory is a very developed area of computer science, and lots of excellent algorithms for various tasks were created during the last century, but when your graph is too big for one machine to process, everything is different. During this workshop we are going to explore various aspects of distributed processing of a graph with more than a billion nodes, starting from graph processing in general and leading to building a cross-device graph.
        Room 416

      • 17:00 - 18:30  -  Ministry of Data Hackathon presentations (Main Stage)
      • 18:30 - 19:30  -  Reception / Networking
      • 19:30 - 20:00  -  Awarding and Closing (Main Stage)
      • 21:00 - 24:00  -  After Party