Question 1

Use the code that you worked on in Homework #8 (creating fake data sets), and re-organize it following the principles of structured programming. Do all the work in a single chunk in your R markdown file, just as if you were writing a single R script. Start with all of your annotated functions, preliminary calls, and global variables. The program body should be only a few lines of code that call the appropriate functions and run them in the correct order. Make sure that the output from one function serves as the input to the next. You can either daisy-chain the functions or write separate lines of code to hold elements in temporary variables and pass them along.

#Description-----------------------------------------
#Exercise 09
# 23 Mar 2022
#RCS

#Initialize -----------------------------------------
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.1.3
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.6     v dplyr   1.0.8
## v tidyr   1.2.0     v stringr 1.4.0
## v readr   2.1.2     v forcats 0.5.1
## Warning: package 'ggplot2' was built under R version 4.1.3
## Warning: package 'tibble' was built under R version 4.1.3
## Warning: package 'tidyr' was built under R version 4.1.3
## Warning: package 'readr' was built under R version 4.1.3
## Warning: package 'purrr' was built under R version 4.1.3
## Warning: package 'dplyr' was built under R version 4.1.3
## Warning: package 'stringr' was built under R version 4.1.3
## Warning: package 'forcats' was built under R version 4.1.3
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(ggplot2)
# set.seed(1234)

#library(TeachingDemos)
#char2seed('espress withdrawl')
#char2seed('espress withdrawl', set=FALSE)

# Load functions--------------------------------------
#######################################################
# FUNCTION: sim_df
# Purpose: Make a simulated data frame for BD salt tolerance
# input: sample size per group, mean and standard deviation for low salinity,
#        mean and standard deviation for high salinity
# output: data frame of salinity level + BD load
#-----------------------------------------------------
sim_df <- function(sample = 50, mLow = 600, mHigh = 200, sdLow = 200, sdHigh = 50){
  salinity <- c(rep('low',sample),rep('high',sample))
  lowsal.load <- rnorm(n = sample, mean = mLow, sd = sdLow)
  highsal.load <- rnorm(n = sample, mean = mHigh, sd = sdHigh)
  bd.load <- append(lowsal.load,highsal.load)
  df1 <- data.frame(salinity,bd.load)
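  # Unassigned expressions are silently discarded inside a function, so the
  # t-test is better run on the returned data frame, e.g.
  # t.test(bd.load ~ salinity, data = simDF)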
  return(df1)
}
  
#######################################################
# FUNCTION: salinity_plot
# Purpose: create a bar plot of mean BD load vs. salinity
# input: data frame, vector of salinity levels, vector of BD loads
# output: ggplot bar chart of mean BD load for each salinity level
#-----------------------------------------------------
salinity_plot <- function(df, salinity, bd.load) {
  salinityPlot <- ggplot(data = df, mapping = aes(x = salinity, y = bd.load, fill = salinity))
  salinityPlot +
    stat_summary(geom = 'bar', fun = 'mean')+
    scale_fill_manual(values = c(high ='grey30', low = 'gray70'),
                      labels = c('High', 'Low'))
}


# Global Variables-------------------------------------
#mock "Real" Data
sal <- c(rep('low',5),rep('high',5))
BDload.low <- rnorm(n = 5, mean = 600, sd = 200)
BDload.high <- rnorm(n = 5, mean = 200, sd = 50)
bd.load <- append(BDload.low, BDload.high)
RealDF <- data.frame(sal,bd.load)
# Program Body------------------------------------------

simDF <- sim_df() #using random values made in function
salinity_plot(df = simDF, salinity = simDF$salinity, bd.load = simDF$bd.load)

#running salinity_plot() using values made outside sim_df#
salinity_plot(df = RealDF, salinity = RealDF$sal, bd.load = RealDF$bd.load)
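
The question mentions two ways of passing output from one function to the next. The sketch below contrasts them using base R only. The helpers `make_data()` and `group_means()` are hypothetical stand-ins written for this illustration (they are not part of the homework code), and the seed is reset before each call only so that both styles see the same random draws.

```r
# Hypothetical stand-in for sim_df(): two groups of simulated loads
make_data <- function(n = 50, mLow = 600, mHigh = 200, sdLow = 200, sdHigh = 50) {
  data.frame(
    salinity = rep(c("low", "high"), each = n),
    bd.load  = c(rnorm(n, mean = mLow, sd = sdLow),
                 rnorm(n, mean = mHigh, sd = sdHigh))
  )
}

# Hypothetical downstream step: mean load for each salinity level
group_means <- function(df) {
  tapply(df$bd.load, df$salinity, mean)
}

# Style 1: hold the intermediate result in a temporary variable
set.seed(1234)
tmpDF <- make_data()
means_tmp <- group_means(tmpDF)

# Style 2: daisy-chain the calls so the output feeds straight into the input
set.seed(1234)
means_chain <- group_means(make_data())

# Both styles give the same answer when the random draws are the same
identical(means_tmp, means_chain)
```

The temporary-variable style keeps `tmpDF` available for later steps such as plotting, while the daisy-chained style is shorter when the intermediate object is not needed again.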

Question 2

Once your code is up and working, modify your program to do something else: record a new summary variable, code a new statistical analysis, or create a different set of random variables or output graph. Do not rewrite any of your existing functions. Instead, copy them, rename them, and then modify them to do new things. Once your new functions are written, add some more lines of program code, calling a mixture of your previous functions and your new functions to get the job done.

#######################################################
# FUNCTION: simdfPois
# Purpose: Make a simulated data frame using a Poisson distribution for the data
# input: sample size per group, hLambda (rate used for the low-salinity group),
#        lLambda (rate used for the high-salinity group)
# output: data frame of salinity level + BD load
#-----------------------------------------------------
simdfPois <- function(sample = 50, hLambda = 600, lLambda = 200){
  salinity <- c(rep('low',sample),rep('high',sample))
  # hLambda (the larger rate) drives the low-salinity group, matching sim_df()
  lowsal.load <- rpois(n = sample, lambda = hLambda)
  highsal.load <- rpois(n = sample, lambda = lLambda)
  bd.load <- append(lowsal.load,highsal.load)
  poisDF <- data.frame(salinity,bd.load)
  return(poisDF)
}

# Simulate Poisson-distributed data and plot it
simDF_pois <- simdfPois()
salinity_plot(simDF_pois, salinity = simDF_pois$salinity, bd.load = simDF_pois$bd.load)
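
The question also suggests recording a new summary variable. As one possible sketch, the hypothetical helper `summarize_load()` below (a name introduced here, not an existing function in this program) returns the mean and standard deviation of `bd.load` for each salinity level. To stay self-contained it builds a small Poisson data frame inline with the same default rates as `simdfPois()`, and the fixed seed is an assumption added only for reproducibility.

```r
# Hypothetical helper: per-group mean and SD of bd.load
summarize_load <- function(df) {
  group.mean <- tapply(df$bd.load, df$salinity, mean)
  group.sd   <- tapply(df$bd.load, df$salinity, sd)
  data.frame(
    salinity  = names(group.mean),       # levels come back in sorted order
    mean.load = as.vector(group.mean),
    sd.load   = as.vector(group.sd)
  )
}

set.seed(1234)  # assumption: fixed seed only so the sketch is reproducible

# Inline stand-in for the output of simdfPois() with its default arguments
poisDF <- data.frame(
  salinity = rep(c("low", "high"), each = 50),
  bd.load  = c(rpois(50, lambda = 600), rpois(50, lambda = 200))
)

loadSummary <- summarize_load(poisDF)
loadSummary
```

Because a Poisson variable has variance equal to its mean, the standard deviations should land near the square root of each lambda (roughly 24.5 and 14.1), which is a quick sanity check on the simulated data.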

Question 3

Optional. If time permits and you have the skills, try putting your program inside of a for loop and repeat the analysis with a different stochastic data set (each time you call a function that invokes the random number generator, it will create a new set of data for you to process). Can you create a data structure to store the summary statistics created in each pass through the loop? If not, your program will work, but it will only show the results from the final replicate (the previous results will be written over each time you traverse the loop).

for(i in 1:3){
  loopDF <- sim_df()
  p1 <- salinity_plot(df = loopDF, salinity = loopDF$salinity, bd.load = loopDF$bd.load)
  print(p1)
}
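
The loop above prints a plot on each pass but keeps no summary statistics. A minimal sketch of the data structure the question asks about is a data frame preallocated with one row per replicate and filled in as the loop runs. The column names, the inline simulation (a stand-in for `sim_df()` with its default parameters), and the fixed seed are all assumptions made for this illustration.

```r
set.seed(1234)  # assumption: fixed seed only so the sketch is reproducible
nReps <- 3

# Preallocate one row per replicate so no pass overwrites an earlier one
results <- data.frame(iter      = seq_len(nReps),
                      mean.low  = NA_real_,
                      mean.high = NA_real_,
                      p.value   = NA_real_)

for (i in seq_len(nReps)) {
  # Inline stand-in for sim_df(): a fresh random data set on every pass
  loopDF <- data.frame(
    salinity = rep(c("low", "high"), each = 50),
    bd.load  = c(rnorm(50, mean = 600, sd = 200),
                 rnorm(50, mean = 200, sd = 50))
  )

  # Welch two-sample t-test comparing load between salinity levels
  tt <- t.test(bd.load ~ salinity, data = loopDF)

  # Store this replicate's summary statistics in row i
  results$mean.low[i]  <- mean(loopDF$bd.load[loopDF$salinity == "low"])
  results$mean.high[i] <- mean(loopDF$bd.load[loopDF$salinity == "high"])
  results$p.value[i]   <- tt$p.value
}

results
```

Indexing by `i` is what preserves every replicate. Assigning to a single variable inside the loop would leave only the final pass, as the question warns.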