Introduction to R, RStudio IDE & GitHub

If you already participated in the first course (“Business Data Science Basics”) you can skip the first part and jump right into “Mattermost”.

Programming language R

Integrated development environment (IDE) RStudio

This chapter provides a broad overview of the R language that will get you programming right away. The first project will be creating a decision tool used in cost accounting. This project will make it easier to study these things by teaching you the basics of R. The term “R” is used to refer to both the programming language and the software that interprets the scripts written using it. Your first mission is simple: assemble R code that will calculate the ideal quantity of inventory to order for a given product (Economic Order Quantity).

In this project, you will learn how to:

Use the R and RStudio interfaces
Run R commands
Create R objects
Write your own R functions and scripts

Don’t worry if you’ve never programmed before and it seems like we cover a lot of ground fast. The chapter will teach you everything you need to know and gives you a concise overview of the R language. You will return to many of the concepts we meet here in the next projects, where you will examine the concepts in depth.

Before you can ask your computer to save some numbers, you’ll need to know how to talk to it. RStudio gives you a way to talk to your computer. R gives you a language to speak in. To start you will need to have both R and RStudio installed on your computer before you can use them. R and RStudio are separate downloads and installations. Both are free (under the Affero General Public License (AGPL) v3) and easy to download. R is the underlying statistical computing environment, but using R alone is no fun. RStudio is a graphical integrated development environment (IDE) that makes using R much easier and more interactive. You need to install R before you install RStudio. In the following sections you’ll find information about how to install the language R and the IDE RStudio.

Installing R & RStudio IDE

Download and install R from CRAN, the Comprehensive R Archive Network.

Video instructions to install R (Download link below)

https://cloud.r-project.org

Steps

Click “Download R for Mac/Windows”
Download the appropriate file:
- Windows users click Base, and download the installer for the latest R version
- Mac users select the file R-4.X.X.pkg that aligns with your OS version
Follow the instructions of the installer.

Install RStudio’s IDE

Video instructions to install RStudio (Download link below)

https://rstudio.com/products/rstudio/download/

Steps

Select the install file for your OS.
Follow the instructions of the installer.

Understanding the RStudio IDE

Let’s start by learning about RStudio, the Integrated Development Environment (IDE) for working with R. We will use RStudio IDE to write code, navigate the files on our computer, inspect the variables we are going to create, and visualize the plots we will generate. RStudio can also be used for other things (e.g., version control, developing packages, etc.) that we will not cover during this class.

One of the advantages of using RStudio is that all the information you need to write code is available in a single window. Additionally, with many shortcuts, autocompletion and highlighting for the major file types you use while developing in R, RStudio will make typing easier and less error-prone. To get started, open RStudio just as you would open any other application on your computer. When you do, a window should appear in your screen like the one shown in the Figure below. RStudio is divided into 4 Panes.

The placement of these panes and their content can be customized `Tools -> Global Options -> Pane Layout`. To change the appearence of RStudio go to `Tools --> Global options --> appearences`.

At first startup you will not see an open script as seen in the green frame. You can create one by clicking first on (new file) and then on (new R script) in the upper left corner. Alternatively you can use the shortcut Shift+⌘+N on macOS or Shift+Ctrl+N on Windows & Linux.

RStudio Integrated Development Environment (IDE)

The basis of programming is that we write down instructions for the computer to follow, and then we tell the computer to follow those instructions. We write, or code, instructions in R because it is a common language that both the computer and we can understand. We call the instructions commands and we tell the computer to follow the instructions by executing (also called running) those commands. There are two main ways of interacting with R: by using the console or by using script files.

Console

The console appears in the bottom left panel. Simply put, the Console is where you type commands and visually see the outputs. This is the place where commands written in the R language can be typed and executed immediately by hitting Enter. The code you type is called a command, because it will command your computer to do something for you. The line you type it into is called the command line. This pane does not save your code. Those commands will be forgotten when you close the session. It only displays the output and the history of commands that have been executed. When you type a command at the prompt and hit Enter, your computer executes the command and shows you the results. If R is ready to accept a new command, RStudio displays a fresh > prompt for your next command. For example, if you type 10 * 10 and hit Enter, RStudio will display:

> 10 * 10
[1] 100
>

You’ll notice that a [1] appears next to your result. R is just letting you know that this line begins with the first value in your result. Some commands return more than one value, and their results may fill up multiple lines. For example, the command 1:100 returns 100 integer (numbers that have no digits after the decimal point). It creates a sequence of integers from 1 to 100. Notice that new bracketed numbers appear at the start of the second and third lines of output. These numbers just mean that the second line begins with the 18th value in the result, and the third line begins with the 35th value and so on. You can mostly ignore the numbers that appear in brackets:

  [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
 [18]  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34
 [35]  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51
 [52]  52  53  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68
 [69]  69  70  71  72  73  74  75  76  77  78  79  80  81  82  83  84  85
 [86]  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100

The colon operator : returns every integer between two integers. It is an easy way to create a sequence of numbers.

If you type an incomplete command and press Enter, R will display a + prompt, which means R is waiting for you to type the rest of your command. This is because you have not closed a parenthesis or quotation, i.e. you do not have the same number of left-parentheses as right-parentheses, or the same number of opening and closing quotation marks. Either finish the command or hit Esc to start over:

> 99 +
+
+ 1
[1] 100

If you type a command that R doesn’t recognize, R will return an error message. If you ever see an error message, don’t panic. R is just telling you that your computer couldn’t understand or do what you asked it to do. You can then try a different command at the next prompt:

> 100 % 2
Error: unexpected input in "100 % 2"
>

Once you get the hang of the command line, you can easily do anything in R that you would do with a calculator. For example, you could do some basic arithmetic:

2 * 3   
## 6

4 - 1   
## 3

6 / (4 - 1)   
## 2

Did you notice something different about this code? I’ve left out the >’s and [1]’s. This will make the code easier to copy and paste if you want to put it in your own console.

R treats the hashtag character # in a special way; R will not run anything that follows a hashtag on a line. This makes hashtags very useful for adding comments and annotations to your code. Humans will be able to read the comments, but your computer will pass over them. The hashtag is known as the commenting symbol in R.

For the remainder of the book, we’ll use hashtags to display the output of R code. We’ll use a single hashtag to add my own comments and a double hashtag ## to display the results of code. We’ll avoid showing > and [1] unless we want you to look at them.

Exercise

That’s the basic interface for executing R code in RStudio. Think you have it? If so, try doing these simple tasks to calculate the economic order quantity step by step with the given formula (Hint: Use sqrt(x) to calculate the square root).

$$Q = \sqrt{\frac{2DK}{h}}$$

Calulate the Q where:

$D = 1000$, $K = 5$, $h = 0.25$

Solution

#⁠ Step by step
2 * 1000 * 5
#⁠#⁠ 10000
10000 / 0.25
#⁠#⁠ 40000
sqrt(40000)
#⁠#⁠ 200

#⁠ Or as an one-liner
sqrt((2 * 1000 * 5) / 0.25)
#⁠#⁠ 200

Throughout the course, I’ll put exercises in chunks, like the one above. I’ll follow each exercise with a hidden model answer, like the one above.

Source

Script files/ Source

Because we want our code and workflow to be reproducible and generally we will want to write programs longer than a few lines, it is better to type the commands we want in the script editor, and save the script. In the source tab you can create a draft of your code as you go by using an R script. An R script is just a plain text file (saved with a .R or .r extension) that you save R code in. I strongly encourage you to write and edit all of your R code in a script before you run it in the console. This habit creates a reproducible record of your work. When you’re finished for the day, you can save your script and then use it to rerun your entire analysis the next day. Scripts are also very handy for editing and proofreading your code, and they make a nice copy of your work to share with others. This way, there is a complete record of what we did, and anyone (including our future selves!) can easily replicate the results on their computer. Moreover, the source editor in RStudio has the advantage of providing syntax highlighting, code completion, and smart indentation. You can see the different colours for numbers and there is also highlighting to help you count brackets (click your cursor next to a bracket and you will see its partner bracket highlighted).To save a script, click

or use the shortcut Cmd+S on macOS or Ctrl+S on Windows & Linux.

RStudio allows you to execute commands directly from the script editor by using the Ctrl + Enter shortcut (on Macs, Cmd + Return will work, too). The command on the current line in the script (indicated by the cursor) or all of the commands in the currently selected text will be sent to the console and executed when you press the shortcut.

Exercise

Write your calculations for the EOQ in the script editor and …

… execute a single line (click the Run icon or hit Cmd+Return). Note that the cursor can be anywhere on the line and one does not need to highlight anything (unless you want to define a function)
… execute multiple lines (Highlight lines with the cursor, then click Run or press Cmd+ Return
… execute the whole script (Source icon or Cmd + Shift + S)

Note that running code via source differs in a few respects from entering it at the R command line. Sourcing a file does not print command outputs. To see your output click the arrow next to the source icon and click Source with Echo or use Cmd + Shift + Return.

Objects / Assignment operator

You can get now output from R simply by typing math in the console or by running a script. You generated different numbers for you to see, but it didn’t save that values anywhere in your computer’s memory. If you want to use those numbers again, you’ll have to ask your computer to save them somewhere. You can do that by creating R objects. R lets you save data by storing it inside an R object. What is an object? Just a name that you can use to call up stored data. For example, you can save data into an object like a or b. Wherever R encounters the object, it will replace it with the data saved inside.

To create an object, we need to give it a name followed by the assignment operator <- (less-than symbol, followed by a minus sign), and the value we want to give it next to it. Typing ⌥ + - (mac) will write <- in a single keystroke. R ignores spaces and line breaks and executes one complete expression at a time. But it makes the code easier for you and me to read.

Q <- 200
die <- 1:6

For historical reasons, you can also use = for assignments, but not in every context. Because of the slight differences in syntax, it is good practice to always use <- for assignments.

To see what is stored in an object, just type the object’s name by itself.

Q <- 200
Q
## 200

die <- 1:6
die
## 1 2 3 4 5 6

You can name an object in R almost anything you want such as x, current_temperature, or subject_id, but there are a few rules. First, a name cannot start with a number (2x is not valid, but x2 is). Second, a name cannot use some special symbols, like ^, !, $, @, +, -, /, or *. You want your object names to be explicit and not too long. There are some names that cannot be used because they are the names of fundamental functions in R (e.g. if, else, for, etc.). In general, even if it’s allowed, it’s best to not use other function names (e.g., c, T, mean, data, df, weights). It’s also best to avoid dots (.) within an object name as in my.dataset. There are many functions in R with dots in their names for historical reasons, but because dots have a special meaning in R (for methods) and other programming languages, it’s best to avoid them. It is also recommended to use nouns for object names, and verbs for function names. It’s important to be consistent in the styling of your code (where you put spaces, how you name objects, etc.). Using a consistent coding style makes your code clearer to read for your future self and your collaborators.

Exercise

Another important question here is, is R case sensitive? Is A the same as a? Figure out a way to check for yourself.

Environment

Environment/ History

When you create an object (i.e., dataframes, arrays, values and functions), the object will appear in the index tab Environment. If an object with the same name already exists in the active environment, R will overwrite it without asking you for permission. This pane will show you all of the objects you’ve created since opening RStudio. You can see the values for objects with a single value and for those that are longer R will tell you their class. When you have data in your environment that have two dimensions (rows and columns) you may click on them (in the value column) and they will appear in the Source Editor pane like a spreadsheet. As an example you assign mtcars (this is a exemplary built in dataframe) to a name of your choice and take a look. Also in this pane is the History tab, where you can see all of the code executed for the session. You can also run history() to open it up. If you double-click a line or highlight a block of lines and then double-click those, you can send it to the Console (i.e., run them). The history will not be saved if we did not choose to save the environment after a session. The Build Tab will be useful later when you create your own website to hand in your code. We will come back to this shortly.

Exercise

Now that we know how to put values into memory, we can do arithmetic with it and calculate EOQ again. Assign all values to R objects. Replicate the formula with those ojects, so that you just can run the line again even if another value changes.

Solution

D <- 1000
K <- 5
h <- 0.25
sqrt(2 * D * K / h)
#⁠#⁠ 200
D <- 4000
sqrt(2 * D * K / h)
#⁠#⁠ 400

Functions

R comes with many functions that you can use to do sophisticated tasks like random sampling. For example, you can round a number with the round() function, or as you have already done, calculate its square root with the sqrt() function. Using a function is pretty simple. Just write the name of the function and then the data you want the function to operate on in parentheses:

round(3.1415)
## 4

sqrt(16)
## 4

The data that you pass into the function is called the function’s argument. The argument can be raw data, an R object, or even the results of another R function. In this last case, R will work from the innermost function to the outermost. Here R first looks up die, then calculates the mean of one through six, then rounds the mean:

mean(1:6)
## 3.5

die <- 1:6
mean(die)
## 3.5

round(mean(die))
## 4

Many R functions take multiple arguments that help them do their job. You can give a function as many arguments as you like as long as you separate each argument with a comma.

round(3.1415)
## 3

round(3.1415, digits = 2)
## 3.14

sample(die, size = 1)
## 3

With the sample function you can simulate rolling a pair of dice, if you set the argument size to the value 2. But additonally, you have to set the value replacement to TRUE. Otherwise rolling pairs would not be possible. If you want to add up the dice, you can feed your result straight into the sum function:

dice <- sample(die, size = 2, replace = TRUE)
dice
## 3 4

sum(dice)
## 7

Writing your own function

You can now run this chunk again and again to simulate rolling the dice. However, this is an awkward way to work with the code. It would be easier to use your code if you wrapped it into its own function.

Every function in R has three basic parts: a name, a body of code, and a set of arguments. To make your own function, you need to replicate these parts and store them in an R object, which you can do with the function() function. To do this, call function() and follow it with a pair of braces, {}:

my_function <- function() {}

function will build a function out of whatever R code you place between the braces. For example, you can turn your dice code into a function by calling:

roll <- function() {
  die <- 1:6
  dice <- sample(die, size = 2, replace = TRUE)
  sum(dice)
}

Notice that I indented each line of code between the braces. This makes the code easier for you and me to read but has no impact on how the code runs. R ignores spaces and line breaks and executes one complete expression at a time.

To use it, write the object’s name followed by an open and closed parenthesis:

roll()
## 6

The code that you place inside your function is known as the body of the function. When you run a function in R, R will execute all of the code in the body and then return the result of the last line of code. If the last line of code doesn’t return a value, neither will your function, so you want to ensure that your final line of code returns a value.

What if you want to choose the kind of dice (number of sides of the die) each time you roll them? For that, you have to replace the object die, which indicates six faces, with another object:

roll2 <- function() {
  dice <- sample(faces, size = 2, replace = TRUE)
  sum(dice)
}

roll2()
## Error in sample(faces, size = 2, replace = TRUE) : 
##   object 'faces' not found

Now you’ll get an error when you run the function. The function needs an object faces to do its job, but there is no object named faces to be found. You can supply faces when you call roll2() if you make faces an argument of the function. To do this, put the name faces in the parentheses that follow function when you define roll2:

roll2 <- function(faces) {
  dice <- sample(faces, size = 2, replace = TRUE)
  sum(dice)
}

roll2(faces = 1:6)
## 7

roll2(faces = 1:10)
## 13

Notice that roll2() will still give an error if you do not supply a value for the faces argument when you call roll2(). You can prevent this error by giving the faces argument a default value. To do this, set faces equal to a value when you define roll2():

roll2 <- function(faces = 1:6) {
  dice <- sample(faces, size = 2, replace = TRUE)
  sum(dice)
}

roll2()
## 9

Exercise

What if you also want to choose the number of dice each time you roll them? Try to alter the function accordingly. Set the default value to 2 dice.

Solution

roll2 <- function(faces = 1:6, number_of_dice = 2) {
  dice <- sample(x = faces, size = number_of_dice, replace = TRUE)
  sum(dice)
}
roll2()
#⁠#⁠ 10

#⁠ Four Tetrahedron shaped dice (Four faces)
roll2(faces = 1:4, number_of_dice = 4)
#⁠#⁠ 11

Exercise

Now transfer that approach to the EOQ formula and create a function, which calculates the EOQ and takes the variable D as an argument (set the default value of D to 1000)!

Solution

calc_EOQ <- function(D = 1000) {
  K <- 5
  h <- 0.25
  Q <- sqrt(2*D*K/h)
  Q
}

calc_EOQ()
#⁠#⁠ 200

calc_EOQ(D = 4000)
#⁠#⁠ 400

Miscellaneous

Misc. Displays

Finally, the bottom right panel is the miscellaneous panel containing five separate tabs. The Files tab has a navigable file manager, just like the file system on your operating system. It shows all the files and folders in the default workspace. The Plot tab is where graphics you create will appear. Here we can zoom, export, configure and inspect graphs/figures. Run hist(mtcars$mpg) for an example. The Packages tab shows you the packages that are installed and those that can be installed (more on this later). The Help tab allows you to search the R documentation for help and is where the help appears when you ask for it from the Console.

RStudio’s default preferences generally work well, but saving a workspace to .RData can be cumbersome, especially if you are working with larger datasets. To turn that off, go to Tools –> ‘Global Options’ and select the ‘Never’ option for ‘Save workspace to .RData’ on exit.’ This also helps to save your entire work in your code instead of saving it into variables.

Project & working directory

It is good practice to keep a set of related data, analyses, and text self-contained in a single folder, called the working directory. It is the place from where R will be looking for and saving the files. All of the scripts within this folder can then use relative paths to files that indicate where inside the project a file is located (as opposed to absolute paths, which point to where a file is on a specific computer). Working this way makes it a lot easier to move your project around on your computer and share it with others without worrying about whether or not the underlying scripts will still work.

RStudio provides a helpful set of tools to do this through its Projects interface, which not only creates a working directory for you, but also remembers its location (allowing you to quickly navigate to it) and optionally preserves custom settings and open files to make it easier to resume work after a break. Using RStudio projects makes this easy and ensures that your working directory is set properly.

When you accept the assignment in one of the next steps you will download a folder already including a project. So in this case, you do not have to set it up by yourself. The working directory will also be automatically set to the folder the project is located in.

You can see your current working directory in the path given above the console or you can just run the command getwd(). If for some reason your working directory is not what it should be, you can change it in the RStudio interface by navigating in the file browser where your working directory should be, and clicking on the blue gear icon More in the files tab and select Set As Working Directory. Alternatively you can use setwd("/path/to/working/directory") to reset your working directory. However, your scripts should not include this line because it will fail on someone else’s computer.

Help & other resources

Within RStudio

There are over 1,000 functions at the core of R, and new R functions are created all of the time. This can be a lot of material to memorize and learn! Luckily, each R function comes with its own help page, which you can access by typing the function’s name after a question mark. For example, each of these commands will open a help page. Look for the pages to appear in the Help tab of RStudio’s bottom-right pane:

?sqrt
?log10
?sample

Use ?function_name (or help(function_name)) to get the help documentation on a function.

Help pages contain useful information about what each function does. These help pages also serve as code documentation, so reading them can be bittersweet. They often seem to be written for people who already understand the function and do not need help.

Don’t let this bother you—you can gain a lot from a help page by scanning it for information that makes sense and glossing over the rest. This technique will inevitably bring you to the most helpful part of each help page: the bottom. Here, almost every help page includes some example code that puts the function in action. Running this code is a great way to learn by example.

Parts of a Help Page

Each help page is divided into sections. Which sections appear can vary from help page to help page, but you can usually expect to find these useful topics:

Description - A short summary of what the function does.

Usage - An example of how you would type the function. Each argument of the function will appear in the order R expects you to supply it (if you don’t use argument names).

Arguments - A list of each argument the function takes, what type of information R expects you to supply for the argument, and what the function will do with the information.

Details - A more in-depth description of the function and how it operates. The details section also gives the function author a chance to alert you to anything you might want to know when using the function.

Value - A description of what the function returns when you run it.

See Also - A short list of related R functions.

Examples - Example code that uses the function and is guaranteed to work. The examples section of a help page usually demonstrates a couple different ways to use a function. This helps give you an idea of what the function is capable of. You can also type example(function_name)to get examples for said function.

If you’d like to look up the help page for a function but have forgotten the function’s name, you can search by keyword. To do this, type two question marks followed by a keyword in R’s command line. R will pull up a list of links to help pages related to the keyword. You can think of this as the help page for the help page:

??log

External to RStudio

R also comes with a super active community of users that you can turn to for help. Stack Overflow, a website that allows programmers to answer questions and users to rank answers based on helpfulness, is the best place to find an answer to your question. You can submit your own question or search through Stack Overflow’s previously answered questions related to R, because there’s a great chance that your question has already been asked and answered. You’re more likely to get a useful answer if you provide a reproducible example with your question. This means pasting in a short snippet of code that users can run to arrive at the bug or question you have in mind.

Addtionally to Stackoverflow you can just use google.com or rseek.org to search for an r related problem.

Exercise

Find a way to alter the probabilities involved in the sampling process, if your roll one die. Let’s change the probability of rolling a 6 to 50% (Because the probabilites need to sum one, the other faces have a probability of 10%). Start by opening the help page of the function sample() and take a look at the arguments.

Hint: You can create a vector with the c() function. For example:

probabilities_vector <- c(1/6, 1/6, 1/6, 1/6, 1/6, 1/6)
probabilities_vector
## 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667

Solution

roll3 <- function(faces = 1:6, number_of_dice = 1) {
  dice <- sample(x = faces, size = number_of_dice, 
                            replace = TRUE, 
                            prob = c(0.1, 0.1, 0.1, 0.1, 0.1, 0.5))
  sum(dice)
}

# You can run the function 100 times, store the results and plot a histogram to varify your function
results <- replicate(n = 100, expr = roll3(), simplify=TRUE)
hist(results)

1. Introduction to R
2. Intermediate R

Mattermost

In the course of the next chapters, we will do a lot of coding and it is very likely that errors will occur. That can have several reasons but dealing with errors is an elementary component in programming in data science and machine learning. As already mentioned, in the most cases other people from around the world have had similar problems and you will find the right solution to your problem performing a query on the internet. Please try to do that as a first step when you run into an error.

If you don’t find a solution and feel you are stuck, don’t hesitate to use the relevant Mattermost channel. For each chapter there will be a channel where we try to help you as much as possible. But in order to keep it efficient and manageable it is necessary that we all follow some basic rules:

Use the right channel: for each chapter there is a correct channel as well as a general channel for questions that do not relate to a specific chapter.
Post error message: if you run into an error it is necessary that we know what the error is. Often reading the error message very carefully can also help you to understand where the problem comes from.
Post the code that caused the error: in order to reproduce the error we need the last command that caused the error. If we need more context we will ask you for that.
Use the formatting guidelines of Mattermost when you post code. That makes a huge difference in terms of readability. They will also be linked in the channel description. Most important is that using ``` one line above and one line below your code will make it easy to read.
Use thread function to reply to a discussion. This way a discussion can be easier read. You find the reply button on the right side of a message.
Use the search function: It will allow you to search across channels for similar problems as it is very likely that the you encounter the same problems as the other students.

Playing by these rules makes it a lot easier for everyone to follow the discussion and learn from similar problems and everyone can benefit from the discussions.

 
```r
x %>% sum()
```

**Error:**
Error in x %>% sum() : could not find function "%>%"

Minimalistic example of how you could post an error in Mattermost. See how little formatting makes the code very nice to read.

GitHub

GitHub Desktop

We will use GitHub to store your data and your code files. Your final submissions will be URLs to your web applications but for writing the required code we recommend you to use GitHub as it also makes it easier for us to help you if you get stuck at some point. If you don’t have a GitHub account already, please complete the following steps:

Create a free github account (if you do not have one already).

https://github.com

Download, install github desktop and connect it with your account. GitHub Desktop is a graphical user interface, which allows you to sync your local code changes with your online repository (if you are already familiar with git and github you might not need it).

https://desktop.github.com

Accept the assignment on GitHub Classroom:

Building Business Data Products - GitHub Classroom

Now you just have to follow the steps explained in the following screenshots.

Please note, that there can be minor deviations from the screenshots, e.g. the name of the classroom. The next steps serve as a general instruction.

You might have to refresh the page after a while. But then you’re ready to go and able to access your template by clicking on the bottom link (you don’t need to accept the invitation to TUHHStartupEngineers). That means you have all of the files you need to compile a website with Quarto.

If the message “owmork invited you to collaborate” pops up, press accept invitation so that all teaching assistants are able to track your progress and give feedback if needed.

The repo was created in the organization Account TUHH Startup Engineers. It is not assigned to your account, but you have full admin rights to change everything as you wish.

Running applications on your local computer

Make sure the rmarkdown package and all it’s dependencies are installed in RStudio: Open RStudio, click the packages tab in the lower right hand corner, click install packages, type in rmarkdown, make sure “install dependencies” is clicked on, then press install (alternatively run install.packages("rmarkdown", dependencies = TRUE) in the console).
Only necessary if you do not have the most recent version of RStudio installed: Install Quarto

What is Quarto? Quarto is a scientific publishing tool built on Pandoc that allows R, Python, Julia, and ObservableJS users to create dynamic documents, websites, books and more.

Quarto is now included with RStudio v2022.07.1+ so no need for a separate download/install if you have the latest version of RStudio!

Quarto will work with RStudio v2022.02+, but you’ll need to install Quarto separately here (we recommend just installing the most up to date version of RStudio).

macOS only: Install the Apple Xcode developer tools.

Option 1: Run xcode-select --install in the terminal.

Terminals/command shells only accept text commands and are a useful way to run task on a computer.

You can copy commands directly in your terminal (Terminal on MacOS, Command Prompt on Windows).

Option 2: You can install Xcode for free from the App Store.

https://apps.apple.com/de/app/xcode/

Windows only: Download and install the latest version of Rtools.

https://cran.r-project.org/bin/windows/Rtools/

Open GitHub Desktop, make sure it is connected to your account and clone your assignment repository to your computer so that you have a local copy of that data that was provided to you:

2.3 Configure Git (you can just use the suggested values and click `Continue`)

2.5 Click `Clone a Repository from the Internet...` or `Add` and then `Clone Repository ...` and select your `TUHHStartupEngineers/ss23-bdsb-YourUserName` repo. Choose a local path where you want to save the local copy and press `Clone`.

Navigate to the folder you just cloned/downloaded, open the lab_journal.Rproj file. This should automatically open RStudio, and your current working environment will be inside this project. That means everything you save will be auto saved to this folder (unless you tell RStudio to save something somewhere else. Have a look at the files tab in the bottom right hand corner. Most files you click will be opened up as text files in the RStudio editor. For each chapter there is a journal you should open and edit. If you save the file your website changes as well.

To upload any changes to your content you have to commit your changes to github using github dektop (if you are familiar with git you are free to use another method to commit your changes). As shown in this Figure you see that GitHub Desktop detects the changes automatically after saving your files.

To upload any changes to your content (e.g. the author name or the date in the `index.qmd` file) you have to *commit* your changes to github using github dektop (if you are familiar with git you are free to use another method to commit your changes). As shown in this Figure you see that GitHub Desktop detects the changes automatically after saving your files (don’t forget to build your website again after any changes. Otherwise the html files won’t be affected by any changes to your .qmd-files). Select all of the files that you want to commit on the left panel. Write a short note to describe the changes in the box at the bottom left. Press `Commit to main.`

Press `Push origin` and wait a couple of minutes. Your changes should now be served to GitHub and the teaching assistants are able to see your code.

Submission

In one of the following chapters you will be asked for your name, student number, GitHub name and URLs to your web applications.

Further resources

R Markdown Cheat Sheet

Getting Started with R Markdown

Last updated on Jun 13, 2023