EuroPython 2018

Good features beat algorithms

Speaker(s) Pietro Mascolo

In Machine Learning, and in Data Science in general, understanding the data is paramount. This understanding can come from many sources and techniques: domain expertise, exploratory analysis, subject-matter experts (SMEs), specific Machine Learning techniques, and feature engineering. In fact, most Machine Learning and statistical analyses depend strongly on how the data is prepared, which makes feature engineering essential for any serious Machine Learning effort.

“Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved model accuracy on unseen data.”

In this talk we will discuss what feature engineering and feature selection are, how to select important features in a real-world dataset, and how to develop a simple but powerful ensemble to measure feature importance and perform feature selection.
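As a rough illustration of the ensemble idea (not the speaker's actual implementation, which this abstract does not show), the sketch below scores each feature with two simple, independent measures and averages the resulting ranks. The scorers (`pearson_score`, `stump_score`), the helper names, and the synthetic data are all assumptions chosen for illustration. It uses only the Python standard library.

```python
import random
from statistics import mean, median, pvariance


def pearson_score(xs, ys):
    """Absolute Pearson correlation between one feature and the target."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def stump_score(xs, ys):
    """Fraction of target variance removed by one split at the feature median."""
    cut = median(xs)
    left = [y for x, y in zip(xs, ys) if x <= cut]
    right = [y for x, y in zip(xs, ys) if x > cut]
    total = pvariance(ys)
    if not left or not right or total == 0:
        return 0.0
    within = (len(left) * pvariance(left) + len(right) * pvariance(right)) / len(ys)
    return 1.0 - within / total


def rank_features(columns, target, scorers=(pearson_score, stump_score)):
    """Order feature names by their average rank across all scorers."""
    names = list(columns)
    avg_rank = dict.fromkeys(names, 0.0)
    for scorer in scorers:
        scores = {name: scorer(columns[name], target) for name in names}
        ordered = sorted(names, key=lambda n: scores[n], reverse=True)
        for rank, name in enumerate(ordered, start=1):
            avg_rank[name] += rank / len(scorers)
    # Lowest average rank first, i.e. most important feature first.
    return sorted(names, key=lambda n: avg_rank[n])


def select_features(columns, target, k):
    """Keep the k features with the best average rank."""
    return rank_features(columns, target)[:k]


# Hypothetical synthetic data: the target depends strongly on "signal",
# moderately on "weak", and not at all on "noise".
rng = random.Random(0)
n = 200
signal = [rng.gauss(0, 1) for _ in range(n)]
weak = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]
target = [3 * s + 1.5 * w + rng.gauss(0, 0.1) for s, w in zip(signal, weak)]

columns = {"signal": signal, "weak": weak, "noise": noise}
ranking = rank_features(columns, target)
print(ranking)
```

Averaging ranks rather than raw scores keeps scorers with different scales comparable. In practice the scorers would more likely be model-based importances from a library, which can be passed in through the `scorers` argument.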

Familiarity with intermediate concepts of the Python programming language is required to follow the implementation steps. General knowledge of the basic concepts of Machine Learning and data cleaning will be useful, but not strictly necessary, to follow the discussion on feature selection and feature engineering.

On Friday 27 July at 11:20


  1. It was a great talk, thanks! Would you mind sharing the link to the slides again, please? I couldn't make it work.

    — Esther Marmol-Queralto,
  2. Hi! Thanks!

    Slides are here:
    You need to install the Go present tool to view them (a link to the instructions is in the README).
    — Pietro Mascolo,
