EuroPython 2018

Data Wrangling & Visualisation with Pandas & Jupyter

Speaker(s) Alexander Hendorf

One of the best tools around for data wrangling and analysis in Python is Pandas. With Pandas, data analysis is easy and simple, but there are some concepts you need to get your head around first, such as DataFrames and Series.

After this tutorial you will be able to work with Pandas and carry out simple data analytics, including visualisations. Pandas is not only useful in data science; it’s also a great tool for creating sales reports or any other data-driven report required in business. It’s easy to make fancy analytics while integrating with co-workers who are used to working with Excel.

Setup

Please do come prepared and follow these simple installation instructions

Outline

Part one: The Basics

  • working with Pandas and Jupyter notebooks
  • reading and writing data across multiple formats (CSV, Excel, JSON, SQL, HTML,…)
  • selecting and accessing data
  • inner mechanics of Pandas: DataFrames and Series
  • boolean indexes
  • summary and Q&A
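
As a rough illustration of the topics in part one, the following minimal sketch reads a small CSV, selects data and applies a boolean index. The sales data, column names and variable names are made up for this example and are not taken from the workshop material.

```python
import io

import pandas as pd

# Hypothetical sales data, inlined so the example runs without external files.
csv_text = """region,product,units,price
North,apples,10,0.5
South,pears,4,0.8
North,pears,7,0.8
East,apples,3,0.5
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# A single column of a DataFrame is a Series.
units = df["units"]

# Label-based selection with .loc; note that the row slice 0:1 includes both ends.
first_two = df.loc[0:1, ["region", "units"]]

# A boolean index: a Series of True/False values used as a row filter.
mask = (df["region"] == "North") & (df["units"] > 5)
north_big = df[mask]

# Writing to another format; to_json returns a string when no path is given.
as_json = north_big.to_json(orient="records")
print(as_json)
```

The same DataFrame could be written with to_csv or to_excel instead; to_excel needs an additional Excel engine to be installed.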

Part two: Visualisation

Pandas comes with powerful visualisation features that are directly accessible.

  • data visualisation basics
  • enhance visualisations / inner mechanics
  • summary and Q&A
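
To give an idea of what "directly accessible" can look like, here is a minimal sketch that plots a DataFrame through its plot method and then adjusts the result via the returned matplotlib Axes object. It assumes matplotlib is installed; the data and labels are invented for illustration.

```python
import matplotlib

# Non-interactive backend so the sketch also runs outside Jupyter.
# Inside a notebook this line is usually not needed.
matplotlib.use("Agg")

import pandas as pd

# Hypothetical monthly sales figures.
df = pd.DataFrame(
    {"month": ["Jan", "Feb", "Mar"], "units": [120, 95, 143]}
).set_index("month")

# DataFrame.plot delegates to matplotlib and returns an Axes object.
ax = df.plot(kind="bar", title="Units sold per month", legend=False)

# The Axes object exposes the inner mechanics for further tweaking.
ax.set_ylabel("units")
```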

Part three: Data Analytics and Aggregation

  • statistical data analysis and aggregation
  • indexing
  • data grouping and aggregation
  • summary and Q&A
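
The topics of part three could be sketched as follows: summary statistics, setting an index, and grouping with aggregation. The data and all names are assumptions made for this example, and the named-aggregation form of agg requires a reasonably recent Pandas version (0.25 or later).

```python
import pandas as pd

# Hypothetical sales data.
df = pd.DataFrame(
    {
        "region": ["North", "South", "North", "East"],
        "product": ["apples", "pears", "pears", "apples"],
        "units": [10, 4, 7, 3],
    }
)

# Basic statistics (count, mean, std, min, quartiles, max) for one column.
stats = df["units"].describe()

# Indexing: use a column as the row index to look rows up by label.
by_product = df.set_index("product")
apple_units = by_product.loc["apples", "units"]

# Grouping and aggregation: total units per region.
totals = df.groupby("region")["units"].sum()

# Several aggregations at once, using named aggregation.
summary = df.groupby("region").agg(
    total_units=("units", "sum"),
    mean_units=("units", "mean"),
)
print(summary)
```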

Part four: Data formats and scaling

  • Limits of Pandas and how to scale with e.g. Dask
  • Speeding up by using optimised file formats such as Parquet

The workshop will be provided as a Jupyter notebook for the attendees to follow along.

Tuesday 24 July at 09:30
Tuesday 24 July at 11:15
