Data Science - Course Notes and Materials

The content of this class is designed for the dark theme of this website. If you have the bright theme activated, click the rightmost button in the navigation bar to switch themes.

General

This is a practical, skill-focused introduction to using open-source programming software (R, RStudio, and R Markdown) in several areas of Business Analytics. The course covers basic scripting in R, data wrangling, advanced graphing, and machine learning. You will learn how to load data, assemble and disassemble data objects, navigate R’s environment system, write your own functions, and use R’s programming tools to apply machine learning techniques. No prior knowledge of R programming, data science, or machine learning is required. The chapters in this course are organized around 14 practical projects with concrete examples. These examples are short and easy to understand, cover the essentials of each topic, and give you immediate practice. Learning to program is like learning to speak another language: you progress faster when you practice.

Course objectives

After completing this module, you will be able to:

  • Obtain large amounts of data from the Internet via APIs or web scraping
  • Clean and transform data
  • Explore and visualize data in a goal-oriented way
  • Model data using modern machine learning techniques for classification and prediction
  • Communicate data and results in the form of products and applications

Course structure

Over the course of seven days, you will complete 14 sessions. Each session combines a short lecture on R concepts with a large amount of time for you to work through coding and analysis problems.

DataCamp

Once you have RStudio working and your GitHub page set up (explained in detail in the corresponding chapter), you can get started with online tutorials from DataCamp and begin experimenting in R. To do so, join the NIT data science team on DataCamp via the following link (please register with your TUHH email address):

These tutorials are optional, and you can choose whichever courses you want. At the end of each session, I will recommend tutorials that match the session's content.

Schedule

Session  Date       Topic
1        June 12th  Reporting with Shiny