HiPerCH 11 - Module 4: Scientific Data Processing with Python

Topics

This course introduces the scientific Python ecosystem to participants with a particular interest in Python for data science. It aims to enable participants to work efficiently with scientific data in Python. The following topics will be covered:

Introduction to the Scientific Python stack
NumPy & SciPy
Pandas
scikit-learn

NumPy introduces the concept of array-oriented programming, which makes it possible to operate on array-like datasets quickly and in only a few lines of code.
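As a minimal sketch of what array-oriented programming looks like in practice, the example below applies arithmetic and a condition to a whole array at once instead of looping over its elements. The temperature values and variable names are hypothetical and serve only as an illustration.

```python
import numpy as np

# Hypothetical sample data: temperatures in degrees Celsius.
celsius = np.array([12.5, 18.0, 21.3, 9.8])

# A single expression converts every element; no explicit loop is needed.
fahrenheit = celsius * 9.0 / 5.0 + 32.0

# A boolean mask selects only the elements that satisfy a condition.
warm = celsius[celsius > 15.0]

# Reductions such as the mean operate on the whole array.
mean_celsius = celsius.mean()

print(fahrenheit)
print(warm)
print(mean_celsius)
```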

SciPy is a powerful library that provides most of the tools needed for scientific work: linear algebra, FFT, numerical optimisation and many more.
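The three areas named above can be sketched with one short call each. The matrix, the test signal and the function to minimise are made-up illustrations, not course material; the calls used are scipy.linalg.solve, scipy.fft.rfft with scipy.fft.rfftfreq, and scipy.optimize.minimize_scalar.

```python
import numpy as np
from scipy import fft, linalg, optimize

# Linear algebra: solve the linear system A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

# FFT: find the dominant frequency of a sampled sine wave.
# Assumed setup: a 5 Hz sine sampled at 100 Hz for one second.
n_samples = 100
sample_spacing = 0.01
t = np.arange(n_samples) * sample_spacing
signal = np.sin(2.0 * np.pi * 5.0 * t)
spectrum = fft.rfft(signal)
freqs = fft.rfftfreq(n_samples, d=sample_spacing)
peak_frequency = freqs[np.argmax(np.abs(spectrum))]

# Numerical optimisation: minimise a simple one-dimensional function.
result = optimize.minimize_scalar(lambda v: (v - 2.0) ** 2 + 1.0)

print(x)
print(peak_frequency)
print(result.x)
```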

Pandas is a tool for efficiently handling large sets of tabular data (clustering, aggregating, visualising) that relies heavily on NumPy, among other tools.
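A small sketch of aggregating tabular data with pandas follows. The table, its column names ("site", "value") and its values are hypothetical. The last lines show the link to NumPy: a column can be handed over as a NumPy array.

```python
import numpy as np
import pandas as pd

# Hypothetical table of measurements taken at two sites.
measurements = pd.DataFrame(
    {
        "site": ["A", "A", "B", "B", "B"],
        "value": [1.0, 3.0, 2.0, 4.0, 6.0],
    }
)

# Group the rows by site and aggregate each group.
summary = measurements.groupby("site")["value"].agg(["mean", "max"])

# The column data can be accessed as a NumPy array.
values = measurements["value"].to_numpy()

print(summary)
print(type(values))
```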

scikit-learn is a set of tools for data analysis and data mining, some of which use techniques from machine learning. Its capabilities include classification, regression and dimensionality reduction.
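Two of these capabilities, dimensionality reduction and classification, can be sketched on the Iris dataset that ships with scikit-learn. The choice of dataset, of PCA and logistic regression as methods, and of the parameters shown is an assumption made for illustration only; regression follows the same fit/predict pattern with a different estimator.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small dataset bundled with scikit-learn (no download needed).
X, y = load_iris(return_X_y=True)

# Dimensionality reduction: project the 4 features onto 2 principal components.
X_reduced = PCA(n_components=2).fit_transform(X)

# Classification: hold out a test set, fit a model, and score it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
classifier = LogisticRegression(max_iter=1000)
classifier.fit(X_train, y_train)
accuracy = classifier.score(X_test, y_test)

print(X_reduced.shape)
print(accuracy)
```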

Method

Lectures and hands-on workshop

Target group and requirements

Basic Python knowledge is needed.
Participants are expected to bring their own laptop with a working installation of Python.

Trainers

David Palao, Marcel Giar (HKHLR)

Date

Thursday, September 26, 9:00-18:00

Location

TU Darmstadt, Alexanderstraße 2, Karl-Plagge-Haus S1|22, Room 403 "New York"

Attendance fee

Students (Bachelor/Master): €5.-
PhD students and members of universities or public research institutes: €20.-
All others: €200.-

The fee includes coffee breaks, lunch, and the evening event.

HKHLR - HPC Hessen