BSB123 Data Analysis

Assessment Item 2: Research Report (2018 S1)

Background

An American college conducted a study in the early 2000s to examine whether there were any gender pay gaps[1] in its four schools: Business, Health, Liberal Studies and Sciences. Data were collected from a sample of 199 academics on their annual salary, years of service, rank, school, gender and age. The data file, Faculty Salary (Research Report Dataset).xlsx, is available on Blackboard. A small portion of the data is shown below:

Faculty ID   Age   Years of Service   Rank   School     Gender (M/F)   Salary ($/year)
1            49    22                 ASST   BUSINESS   F              106632
2            31    0                  ASST   BUSINESS   F              80000
3            34    2                  ASST   BUSINESS   F              114666

Legend: There are three academic ranks in the dataset: ASST, ASSO and PROF, which stand for Assistant Professor, Associate Professor and Professor respectively.


Task 1 (Boxplots and t-tests)

  1. (a) Construct separate boxplots of salaries for male and female academics, and compare their distributions (central location, spread and skewness).

(3 marks)

(b)  Test, at the 1% level of significance, whether male academics earn more on average than their female counterparts.

(1 mark)
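The three features compared in part (a), central location, spread and skewness, can all be read from a five-number summary. The following is a minimal sketch only, not the required Excel method: the salary values are invented for illustration, and the "inclusive" quartile rule is an assumption that may differ from the rule your Excel chart uses.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

# Hypothetical salaries ($/year), invented for illustration only.
sample = [80000, 90000, 100000, 110000, 150000]

low, q1, median, q3, high = five_number_summary(sample)
iqr = q3 - q1                 # spread of the middle 50% (the box length)
upper_whisker = high - q3     # distance from the box to the maximum
lower_whisker = q1 - low      # distance from the box to the minimum

print("median:", median, "IQR:", iqr)
print("upper whisker:", upper_whisker, "lower whisker:", lower_whisker)
```

A noticeably longer upper whisker than lower whisker, as in this invented sample, is one informal sign of right skew.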

  2. (a) Considering assistant professors only, test, at the 1% level of significance, whether male assistant professors earn more on average than female assistant professors.

(1 mark)

(b)  Considering associate professors only, test, at the 1% level of significance, whether male associate professors earn more on average than female associate professors.

(1 mark)

(c)   Considering professors only, test, at the 1% level of significance, whether male professors earn more on average than female professors.

(1 mark)

Note: In conducting a test, you should briefly discuss whether it is a one- or two-tailed test, state the test statistic and any assumptions made, and draw a conclusion based on the Excel output.
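To make the test statistic in the note above concrete, here is a minimal sketch of a two-sample t statistic for the upper-tail alternative "male mean > female mean". It assumes unequal variances (the Welch form), which is only one possible choice; the equal-variance form is the other, and you should state whichever assumption you make. The function name and the salary values are hypothetical.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for mean_a - mean_b,
    assuming the two populations may have unequal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Each term is a sample variance divided by its sample size.
    v_a = statistics.variance(sample_a) / n_a
    v_b = statistics.variance(sample_b) / n_b
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(v_a + v_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t_stat, df

# Hypothetical salaries ($/year), invented for illustration only.
male_salaries = [104000, 106000, 108000]
female_salaries = [101000, 102000, 103000]

t_stat, df = welch_t(male_salaries, female_salaries)
print(f"t = {t_stat:.4f}, df = {df:.2f}")
```

For an upper-tail test at 1%, the null hypothesis is rejected only if the t statistic is positive and exceeds the one-tail critical value for the relevant degrees of freedom, which is the value reported in the Excel output.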

(Bonus Question)

Simpson’s Paradox is a type of association paradox. Conduct a Google search to find out more about this topic. Think about the gender gaps observed in Task 1: What is the size of the gender gap in the whole sample? What is the size of the gender gap for each Rank? Discuss if there is any association paradox here.

(3 marks)
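As a rough illustration of how an association paradox can arise, the sketch below uses a small invented dataset with two hypothetical groups (JUNIOR and SENIOR). The numbers are made up and say nothing about the Faculty Salary dataset; they only show that an overall gap can have the opposite sign to the gap within every group when the groups differ in size and pay level.

```python
from statistics import mean

# (group, gender, salary in $000) records, invented for illustration only.
records = [
    ("JUNIOR", "M", 50),
    ("JUNIOR", "F", 52), ("JUNIOR", "F", 52), ("JUNIOR", "F", 52),
    ("SENIOR", "M", 100), ("SENIOR", "M", 100), ("SENIOR", "M", 100),
    ("SENIOR", "F", 102),
]

def gender_gap(rows):
    """Mean male salary minus mean female salary for the given rows."""
    male = [salary for _, gender, salary in rows if gender == "M"]
    female = [salary for _, gender, salary in rows if gender == "F"]
    return mean(male) - mean(female)

overall_gap = gender_gap(records)
gap_by_group = {
    group: gender_gap([r for r in records if r[0] == group])
    for group in ("JUNIOR", "SENIOR")
}

print("overall gap:", overall_gap)
print("gap within each group:", gap_by_group)
```

In this invented example men earn less than women within each group, yet more overall, because most of the men sit in the higher-paid group.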


Task 2 (Regression Analysis)

You plan to develop a regression model to investigate how various factors influence academic salaries.

  3. Before you conduct any regression analysis, you use Excel to construct a correlation matrix of all the quantitative variables in the dataset. Based on the correlation matrix, comment briefly on the associations between Salary and other quantitative variables.

(2 marks)
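For reference, each entry of a correlation matrix is a Pearson correlation coefficient between two quantitative variables. The following is a minimal sketch, assuming the variables are held as equal-length lists; the function names and the values are hypothetical and are not taken from the dataset.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical values, invented for illustration only.
columns = {
    "Age": [30, 40, 50, 60],
    "Years of Service": [1, 5, 12, 20],
    "Salary": [80000, 95000, 100000, 130000],
}

matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```

The matrix is symmetric with ones on the diagonal, so only the entries below (or above) the diagonal need to be discussed.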

  4. You conduct a stepwise regression according to the following procedure:

Step 1:         Gender only

Step 2:         Gender and School

Step 3:         Gender, School and Rank

Step 4:         Gender, School, Rank and Years of Service

Step 5:         Gender, School, Rank, Years of Service and Age

Choice of reference category: It is recommended that you choose Health and ASSO as the reference categories for the categorical variables School and Rank, respectively.

Present the regression output for each of the five steps.

(4 marks)
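The recommended reference categories determine how School and Rank enter the model: a categorical variable with k categories is represented by k − 1 dummy (0/1) variables, and the omitted reference category is absorbed into the intercept. The sketch below shows one way to code a Step 4 row. The category labels other than BUSINESS, ASST, ASSO and PROF, the column names, and the choice of coding male as 1 are assumptions made for illustration; check the labels actually used in the data file.

```python
# Category labels are assumed; only BUSINESS appears in the sample rows shown.
SCHOOLS = ["BUSINESS", "HEALTH", "LIBERAL STUDIES", "SCIENCES"]
RANKS = ["ASST", "ASSO", "PROF"]

def dummy_code(value, categories, reference):
    """Return one 0/1 indicator per non-reference category."""
    return [1 if value == category else 0
            for category in categories if category != reference]

def encode_row(gender, school, rank, years_of_service):
    """Encode one academic as the Step 4 predictors (intercept not included)."""
    gender_male = 1 if gender == "M" else 0           # assumed coding: M = 1, F = 0
    return ([gender_male]
            + dummy_code(school, SCHOOLS, "HEALTH")   # HEALTH is the reference school
            + dummy_code(rank, RANKS, "ASSO")         # ASSO is the reference rank
            + [years_of_service])

COLUMNS = ["Gender_M", "School_BUSINESS", "School_LIBERAL STUDIES",
           "School_SCIENCES", "Rank_ASST", "Rank_PROF", "Years of Service"]

# First sample row from the table above: F, BUSINESS, ASST, 22 years of service.
print(dict(zip(COLUMNS, encode_row("F", "BUSINESS", "ASST", 22))))
```

With this coding, the coefficient on each dummy is read as the estimated salary difference relative to the reference category, holding the other variables constant.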

  5. Based on the regression output obtained in Step 4, answer the following:
  • Which summary measure in the regression output is used to assess the overall adequacy of the model? Comment on the overall adequacy of the model obtained in Step 4.

(2 marks)

  • For each of the four independent variables, fully interpret the regression coefficients and comment on their statistical significance. (In discussing the statistical significance of a regression coefficient, you have to justify your choice of a one- or two-tailed test.)

(6 marks)

  6. Based on the correlation between Age and Salary, did you expect Age to have a statistically significant effect on Salary? In Step 5, is the statistical significance of the regression coefficient of Age as expected? Discuss fully.

(3 marks)

Task 3 (Summary Report)

Observing the changes to the regression coefficient of Gender and its statistical significance when School, Rank, Years of Service and Age are progressively added to the model in Steps 2 to 5 in Q4 above, discuss in a summary report (word limit 300) if there are any gender pay gaps at this College. In your report, you should integrate all relevant findings from Tasks 1, 2 and 3, and present the final model you would recommend.

In this report, you can include findings from other analyses than Tasks 1, 2 and 3 (for example, Question 2 Tute 1 and the Bonus Question).

(6 marks)



  • Use 1.5 line spacing and a font size of 11.
  • You can and are encouraged to include relevant charts and Excel objects in your summary report (Task 3).
  • No referencing is required in your summary report. However, if you wish to include, and refer to, additional information, you can use any referencing system as long as it is used consistently.
  • There is no word limit for Tasks 1 and 2.
  • The word limit of 300 (with a tolerance of 10%) applies only to the summary report, and is exclusive of words in tables, appendices and reference list (if any).



  • You should submit your response to all three tasks as a single pdf document saved in the format:

BSB123 Report_StudentName.pdf 

  • After uploading your research report on Blackboard, it is your responsibility to go back to the Assignment Upload page to check that your report was properly uploaded.
  • Due: 11:59 pm May 27 (Sunday) 2018 via Blackboard

[1] If you wish to know more about the issue of gender pay gaps in Australian universities, read a special report in The Australian (March 8 2018): “Great progress, but gaps remain”.

