I need help with a Computer Science question. All explanations and answers will be used to help me learn.
I’d like to see larger font on all of your labels and titles. In some cases the labels are so blurry that I can’t even make out what they say. In general, you’ll need more discussion of your charts. I’ve also asked you to include some descriptive statistics and a discussion of each variable you choose. This means I’m looking for you to calculate some means, standard deviations, etc., and discuss what they mean for people living in the mountain region of the US.
Graph 1/2 I like them, mostly. I would recommend swapping the x- and y-axes. We typically put the independent variable on the x-axis and the dependent variable on the y-axis. Since the total household income depends on the combo of individual incomes of the household, it makes more sense the other way. I would also recommend pre-filtering out people under 16 (or even 18). In the US, the legal working age is 16. I’d be curious how this affects your N/A or No Schooling and Less than HS Diploma graphs. Right now it looks like there are a great many people in these groups with no individual income, but high household income. That’s not really what I’d expect since jobs at this education level don’t pay well, so I’d expect most individuals to be working. I’d also recommend forcing the axes to have the same scale (since they’re measuring the same thing). You’ll need some more interpretation. Why is there a hard line at the boundary? Discuss the fact that so many of the blue points are lower. Discuss why household income would be so different from individual income, and what trends you see between education levels and sex.
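The pre-filtering and descriptive statistics asked for above can be sketched roughly as follows. This is only an illustration in plain Python with the standard library; the records, the field names `age` and `income`, and the helper names are all made up, and the same logic would carry over to whatever tool the charts were built in.

```python
from statistics import mean, stdev

# Hypothetical records: field names and values are invented for illustration.
people = [
    {"age": 12, "income": 0},
    {"age": 17, "income": 8000},
    {"age": 34, "income": 52000},
    {"age": 51, "income": 61000},
    {"age": 8, "income": 0},
    {"age": 45, "income": 39000},
]

MIN_WORKING_AGE = 16  # the feedback suggests 16, or even 18


def working_age(records, min_age=MIN_WORKING_AGE):
    """Keep only people at or above the minimum working age."""
    return [r for r in records if r["age"] >= min_age]


def describe(values):
    """Return the count, mean, and sample standard deviation of a list."""
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}


all_incomes = [r["income"] for r in people]
adult_incomes = [r["income"] for r in working_age(people)]

# Comparing the two summaries shows how much the children with zero
# individual income pull the mean down before filtering.
print("everyone:   ", describe(all_incomes))
print("16 and over:", describe(adult_incomes))
```

Running the summary before and after the filter is one way to check whether the cluster of people with no individual income but high household income is mostly children.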
Graph 3 You should first filter out the people with no recorded age; there’s no need to show a graph of non-data. Have you messed around with the number of bins to make sure you’re getting the right amount of detail in these histograms? I’m having a hard time seeing which bars are which. I would suggest not only doing “color” by employment status, but also “fill”. And I would also consider whether a density curve might be more useful than a histogram. Again, you’ll need some more discussion and interpretation of the graph.
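To make the two suggestions about missing ages and bin choice concrete, here is a small sketch in plain Python. The age values, the use of `None` for a missing age, and the helper names are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical ages; None stands in for "no recorded age".
ages = [23, None, 35, 41, None, 19, 67, 35, 52, 28, 44, None, 61, 38]


def drop_missing(values):
    """Remove entries with no recorded value."""
    return [v for v in values if v is not None]


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


known_ages = drop_missing(ages)

# Print the same data at several bin widths to judge how much detail
# each choice keeps or hides.
for width in (5, 10, 20):
    print(f"bin width {width}: {histogram(known_ages, width)}")
```

Very narrow bins tend to produce a noisy, hard-to-read chart and very wide bins hide the shape of the distribution, so it is worth looking at a few widths before settling on one.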
Graph 4 What is this even? You’ve made a scatter plot with categorical vars on both axes? That’s… not a thing. If you want frequency of observations (count), I’d go with a column chart. If you want it to include info for each race, separated by education level and sex, I’d put race on the x-axis, facet by education level, and make it a stacked chart for sex (more or less what you’ve done in graph 5). I also think it’s really weird that you’ve only got “Less than HS” and “Master’s Degree” included. If you want to scale it down to 2 categories, you’ll typically want it to be “Less than HS” and “HS or higher” or “Less than Master’s” and “Master’s or higher”.
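A column chart of this kind is built from counts per category combination, not from individual points. The sketch below, in plain Python, recodes education into the two suggested levels and counts observations by education, race, and sex. The records and the detailed education level names are hypothetical, not taken from the dataset.

```python
from collections import Counter

# Hypothetical (education, race, sex) records for illustration only.
records = [
    ("No schooling", "White", "Female"),
    ("Bachelor's degree", "White", "Male"),
    ("HS diploma", "Black", "Female"),
    ("Master's degree", "White", "Female"),
    ("Less than HS diploma", "Black", "Male"),
    ("HS diploma", "White", "Male"),
]

# Assumed names of the levels that fall below a high school diploma.
BELOW_HS = {"No schooling", "Less than HS diploma"}


def two_level_education(level):
    """Collapse detailed education levels into two complementary groups."""
    return "Less than HS" if level in BELOW_HS else "HS or higher"


# One count per (facet, x-axis, stack) combination.
counts = Counter(
    (two_level_education(edu), race, sex) for edu, race, sex in records
)

for (edu, race, sex), n in sorted(counts.items()):
    print(f"{edu:13} | {race:6} | {sex:6} | {n}")
```

Because the two groups are complementary, every observation lands in exactly one facet, which is what makes the two-category version easier to interpret than keeping only two of the original levels.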
Graph 5 Generally pretty cool. I’d be careful assuming that the dataset is biased. The US is just… kind of white, in general (though you’re right; you’d need to include the survey sample weights for individuals and households to get truly representative numbers, but I thought that would be a bit tough for this project, so I left it out). If what you’re interested in is the proportion of each bar across race, you should consider doing a 100% stacked bar chart (position="fill"). Since it’s hard to distinguish the colors here, I might consider combining Chinese, Japanese, and Other Asian into a single race level.
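What a 100% stacked bar shows is each group's share within its bar, so every bar sums to 1. The plain-Python sketch below computes those shares after merging the three Asian levels into one. The rows, the merged label "Asian", and the helper names are assumptions for illustration, and no survey weights are applied.

```python
from collections import Counter, defaultdict

# Hypothetical (education, race) rows for illustration only.
rows = [
    ("Less than HS", "White"),
    ("Less than HS", "White"),
    ("Less than HS", "Chinese"),
    ("Less than HS", "Japanese"),
    ("HS or higher", "White"),
    ("HS or higher", "White"),
    ("HS or higher", "White"),
    ("HS or higher", "Other Asian"),
    ("HS or higher", "Black"),
]

ASIAN_LEVELS = {"Chinese", "Japanese", "Other Asian"}


def collapse_race(race):
    """Merge the three Asian levels into a single level."""
    return "Asian" if race in ASIAN_LEVELS else race


def shares_by_bar(rows):
    """Return each race's share within every education bar."""
    counts = defaultdict(Counter)
    for education, race in rows:
        counts[education][collapse_race(race)] += 1
    shares = {}
    for education, race_counts in counts.items():
        total = sum(race_counts.values())
        shares[education] = {r: n / total for r, n in race_counts.items()}
    return shares


shares = shares_by_bar(rows)
for education, by_race in shares.items():
    print(education, by_race)
```

Merging the levels before computing shares also cuts the number of colors the legend needs, which addresses the readability concern at the same time.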
Graph 6 Also interesting.
Should you use a box plot?