Data Science - Course Notes and Materials

The content of this class is designed for the dark theme of this website. If you have the bright theme activated, click the button furthest to the right of the navigation bar to change it.

General

This is a practical and skill-focused introduction to using open-source programming software (R, RStudio, and R Markdown) in several aspects of Business Analytics.

The course covers basic scripting and coding in R, data wrangling, and advanced graphing. You will learn how to load data, assemble and disassemble data objects, navigate R’s environment system, write your own functions, and use R’s programming tools to apply advanced data science techniques. No prior knowledge of R programming or data science is required. The chapters in this course are arranged around 5 practical projects with concrete examples. These examples are short and easy to understand, cover the essentials, and give you immediate practice. Learning to program is like learning to speak another language: you progress faster when you practice.

Course objectives

After completing this module, you will be able to:

  • Obtain large amounts of data via APIs or web scraping from the Internet
  • Clean and transform data
  • Explore and visualize data in a goal-oriented way

Course structure

Over the course of 5 sessions you will complete 4 assignments. Each session will involve a small amount of lecturing on R concepts, and a large amount of time for students to complete coding and analysis problems.

DataCamp

Once you have RStudio working and your GitHub page set up (explained in detail in the corresponding chapter), you can get started with online tutorials from DataCamp and begin experimenting in R. To do so, join the TUHH data science team on DataCamp via the following link (please register with your TUHH email address):

These tutorials are optional, and you can choose whichever courses you want: R, Python, or SQL. At the end of each session, I will recommend tutorials that match the session's content.

Schedule

Session  Date                 Topic
1        23.11.20 - 25.11.20  Introduction to R, RStudio IDE & GitHub
2        23.11.20 - 25.11.20  Introduction to the tidyverse
3        23.11.20 - 25.11.20  Data Acquisition
4        23.11.20 - 25.11.20  Data Wrangling
5        23.11.20 - 25.11.20  Data Visualization