class: center, middle, inverse, title-slide

.title[
# ISA 401: Business Intelligence & Data Visualization
]
.subtitle[
## 17: Charts Used for Comparisons, Relationships, Distributions and Correlations
]
.author[
### Fadel M. Megahed, PhD
Professor of Information Systems and Business Analytics
Farmer School of Business
Miami University
@FadelMegahed
fmegahed
fmegahed@miamioh.edu
Automated Scheduler for Office Hours
]
.date[
### Fall 2024
]

---

# Learning Objectives for Today's Class

- Identify strengths & weaknesses of basic charts
- Use appropriate charts based on objective
- Avoid using pie charts (never use pie charts)
- Avoid 3D graphs (unless VR changes their utility)

---

# A Catalog of Commonly Used Graph Types

<iframe width="1200" height="500" src="https://datastudio.google.com/embed/reporting/eb2fea55-8eeb-440f-9c56-e8278266a368/page/vZWQB" frameborder="0" style="border:0" allowfullscreen></iframe>

---

# Chart Suggestions

<img src="data:image/png;base64,#../../figures/chart-chooser-2020.png" width="693" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Source:** [Chart Chooser](https://extremepresentation.typepad.com/blog/2006/09/choosing_a_good.html), created by Andrew Abela in 2006 and last updated on Sept 06, 2020.
]

---
class: center, inverse, middle

# Charts Used for Comparing Data
# (Unit of Analysis is Based on a Nominal Categorical Variable)

---

# A Literal Bar Chart
.panelset[
.panel[.panel-name[Activity]

.pull-left[
<img src="data:image/png;base64,#../../figures/bar1.png" width="75%" style="display: block; margin: auto;" />
]

.pull-right[
> _Answer the following questions:_

(1) How many variables do we have in this graph?
(2) How many observations?
(3) What type is each variable in the graph (e.g., nominal, ordinal)?
(4) How is the data encoded in the graph?
(5) Any other comments/observations?
]
]

.panel[.panel-name[Your Solution]

.can-edit.key-activity2[
- Q1: ________________
- Q2: ________________
- Q3: ________________
- Q4: ________________
- Q5: ________________
]
]
]

.footnote[
<html>
<hr>
</html>

**Source:** The image is from [How Much Does Beer Consumption Vary by Country?](https://snippets.com/how-much-does-beer-consumption-vary-by-country.htm), and the data seems to be based on a 2004 report from the Kirin Holdings Company.
]

---

# Using a Bar Chart to Visualize R Code

<img src="data:image/png;base64,#https://github.com/brodieG/watcher/raw/master/extra/sort-2.gif" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Source:** See the amazing [watcher](https://github.com/brodieG/watcher) package, which allows you to record the state of an R function during evaluation. For more details, please click on the link.
]

---

# Non-graded activity: Two Bar Charts
.panelset[
.panel[.panel-name[Activity]

> _Over the next five minutes, identify **3-4 differences that make the graph on the right better**, and suggest **how you can further improve the graph on the right**_

.pull-left[
<img src="data:image/png;base64,#../../figures/bar3a.jpg" width="85%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="data:image/png;base64,#../../figures/bar3b.jpg" width="85%" style="display: block; margin: auto;" />
]
]

.panel[.panel-name[Your Solution]

.can-edit.key-activity3[
Insert your differences and suggestions for improvement below.
]
]
]

---

# Issues with the Interpretation of Bar Charts

<img src="data:image/png;base64,#../../figures/ddog.png" width="50%" style="display: block; margin: auto;" />

.center[.font70[The Draw Datapoints on Graph (DDoG) measure maintains the graph as a consistent reference frame across its three stages.]]

.footnote[
<html>
<hr>
</html>

**Source:** Kerns and Wilmer (2021). *Journal of Vision*. DOI: [https://doi.org/10.1167/jov.21.12.17](https://doi.org/10.1167/jov.21.12.17)
]

---
count:false

# Issues with the Interpretation of Bar Charts

<img src="data:image/png;base64,#../../figures/mean_count_bar.png" width="30%" style="display: block; margin: auto;" />

.center[.font70[Data distribution differs categorically between mean and count graphs. (a) Mean bar graphs and (c) count bar graphs do not differ in basic appearance, but they do depict categorically different data distributions.]]

.footnote[
<html>
<hr>
</html>

**Source:** Kerns and Wilmer (2021). *Journal of Vision*. DOI: [https://doi.org/10.1167/jov.21.12.17](https://doi.org/10.1167/jov.21.12.17)
]

---

# Key Takeaway 1

> .font125[The commonly used **bar** chart should not be used to depict means across the levels of a categorical variable.]

---

# Waterfall Charts
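A waterfall chart draws each bar as a floating segment that starts where the running total left off. The sketch below shows one way to compute those segments. It uses made-up income and expense figures, not the data behind the two charts in this activity, and `waterfall_segments` is a hypothetical helper name.

```python
def waterfall_segments(items):
    """Return (label, bottom, top) for each floating bar in a waterfall chart.

    `items` is a list of (label, signed_change) pairs. A final "Net" bar
    anchored at zero is appended to show the ending total.
    """
    segments = []
    running = 0
    for label, change in items:
        new_total = running + change
        # Each bar spans from the previous running total to the new one.
        segments.append((label, min(running, new_total), max(running, new_total)))
        running = new_total
    segments.append(("Net", min(0, running), max(0, running)))
    return segments


# Hypothetical monthly budget, for illustration only.
budget = [("Salary", 5000), ("Rent", -1500), ("Food", -600), ("Savings", -400)]

for label, bottom, top in waterfall_segments(budget):
    print(f"{label:<8} bar spans {bottom:>5} to {top:>5}")
```

In this sketch only the first and last bars sit on the zero baseline; the bars in between float, which is worth keeping in mind when weighing the two charts.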
.panelset[
.panel[.panel-name[Activity]

> _What are the advantages and disadvantages of these two charts? They are using the same exact data. Please try to list 2-4 in each category for each chart._

.pull-left[
<img src="data:image/png;base64,#https://2.bp.blogspot.com/-B9n30Ev5kvg/TrLKEc6Nr-I/AAAAAAAADHc/KPhx6mnvsu8/s400/Income+%2526+Expenses+-+AFTER.jpg" width="85%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="data:image/png;base64,#https://lh5.ggpht.com/-D9ui-i_gH4A/Trf9zScsXiI/AAAAAAAAWrQ/6kKFFMHH4XE/Storytelling%252520with%252520Data%252520Waterfall_thumb%25255B7%25255D.png?imgmax=800" width="85%" style="display: block; margin: auto;" />
]
]

.panel[.panel-name[Your Solution]

.can-edit.key-activity4[
Insert your advantages and disadvantages below
]
]
]

---

# 3D Bar Charts are Awful

<center>
<blockquote class="twitter-tweet" data-height='320' ><p lang="en" dir="ltr">When our status is secure, we don't emphasize it. When it's ambiguous, we do.<br><br>Penn students are more likely than Harvard students to mention that they go to an Ivy League school...<a href="https://t.co/0ipuArIRUn">https://t.co/0ipuArIRUn</a> <a href="https://t.co/BDusRYJm34">pic.twitter.com/BDusRYJm34</a></p>— Adam Grant (@AdamMGrant) <a href="https://twitter.com/AdamMGrant/status/1339925914327638017?ref_src=twsrc%5Etfw">December 18, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
</center>

---
count: false

# 3D Charts are Awful

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#https://pbs.twimg.com/media/Ephdbz5WwAINqIY?format=jpg&name=small" alt="Adam Grant's Plot of the Penn and Harvard Bar Chart" />
<p class="caption">Adam Grant's Plot of the Penn and Harvard Bar Chart</p>
</div>

---
count: false

# 3D Bar Charts are Awful

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#17_commonly_used_charts_files/figure-html/harvard_penn2-1.png" alt="A remake of the plot with colorblind-friendly colors and a 2D bar representation to avoid distorting the data" />
<p class="caption">A remake of the plot with colorblind-friendly colors and a 2D bar representation to avoid distorting the data</p>
</div>

---
count: false

# 3D Charts are Awful: Even This

<img src="data:image/png;base64,#../../figures/codepen.png" width="80%" height="80%" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Source:** See the [interactive version of the chart by clicking here](https://codepen.io/zingchart/full/ePxQmd/)
]

---

# Dot Charts: Recall the Playfair Example
---
class: inverse, center, middle

# Proportions

---

# My Favorite Pie Chart

<img src="data:image/png;base64,#17_commonly_used_charts_files/figure-html/egyptian_pie-1.png" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Note:** Humor aside, pie charts are almost always awful!!
]

---

# Pie Charts are Awful By Design

<img src="data:image/png;base64,#17_commonly_used_charts_files/figure-html/pie_charts1-1.png" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Source:** The pie charts generated in this slide are based on the R code provided in [From-Data-to-Viz: The Issue with Pie Chart](https://www.data-to-viz.com/caveat/pie.html).
]

---
count:false

# Pie Charts are Awful By Design

<img src="data:image/png;base64,#17_commonly_used_charts_files/figure-html/pie_charts2-1.png" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>
]

---

# And often made even worse: 3D

<img src="data:image/png;base64,#17_commonly_used_charts_files/figure-html/3dpie-1.png" style="display: block; margin: auto;" />

---

# And often made even worse: Side Legend

<img src="data:image/png;base64,#17_commonly_used_charts_files/figure-html/pie_side_legend-1.png" style="display: block; margin: auto;" />

---

# And often made even worse: Exploded Pie

<img src="data:image/png;base64,#17_commonly_used_charts_files/figure-html/pie_explode-1.png" style="display: block; margin: auto;" />

---

# And often made even worse: SUM(%) != 100%

<img src="data:image/png;base64,#https://flowingdata.com/wp-content/uploads/2009/11/Fox-News-pie-chart-620x465.png" style="display: block; margin: auto;" />

---

# And often made even worse: Many Levels

<img src="data:image/png;base64,#https://online.hbs.edu/online/PublishingImages/blog/posts/HBS_Too_Many_Variables_Pie_Chart.jpg" height="500px" style="display: block; margin: auto;" />

---

# Key Takeaway 2

> .font125[Please do **NOT** use pie charts.]

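As a minimal sketch of the alternative, the snippet below prints a sorted text bar chart using only the Python standard library. The five shares are made-up values chosen to be close together, the situation where pie slices are hardest to rank, and `text_bar_chart` is a hypothetical helper name.

```python
def text_bar_chart(shares, chars_per_unit=2):
    """Return one text line per category, largest first, with bars on a common baseline."""
    ordered = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)
    return [
        f"{name} | {'#' * (value * chars_per_unit)} {value}%"
        for name, value in ordered
    ]


# Hypothetical shares (they sum to 100), for illustration only.
shares = {"A": 17, "B": 18, "C": 20, "D": 22, "E": 23}

for line in text_bar_chart(shares):
    print(line)
```

Because every bar starts at the same baseline, a one-point difference appears as a difference in length rather than a small change in angle.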
<details>
<summary>If you need any further evidence, please check <i>?pie()</i> in R. Even statistical software recommends against using pie charts!</summary>
<b>Note</b> <br><br> Pie charts are <b>a very bad way of displaying information</b>. The eye is good at judging linear measures and bad at judging relative areas. A bar chart or dot chart is a preferable way of displaying this type of data. <br><br> Cleveland (1985), page 264: <br>“Data that can be shown by pie charts always can be shown by a dot chart. <b>This means that judgements of position along a common scale can be made instead of the less accurate angle judgements</b>.” <br> This statement is based on the empirical investigations of Cleveland and McGill as well as investigations by perceptual psychologists.
</details>

<img src="data:image/png;base64,#17_commonly_used_charts_files/figure-html/pie_summary-1.png" style="display: block; margin: auto;" />

---

# Stacked Bar Charts
.panelset[
.panel[.panel-name[Activity]

> _When is it best to use each of the four charts below? They all use exactly the same data._

<img src="data:image/png;base64,#../../figures/stacked1.png" width="50%" style="display: block; margin: auto;" />
]

.panel[.panel-name[Your Solution]

.can-edit.key-activity5[
Insert the best usage scenario for each chart below
]
]
]

---

# A Note on Stacked Bar Charts

<img src="data:image/png;base64,#https://images.squarespace-cdn.com/content/v1/55b6a6dce4b089e11621d3ed/1438044543242-WFT20H8JQ33L392ATHAP/image-asset.jpeg?format=750w" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Source:** [Storytelling with Data: To Stack or Not to Stack](https://www.storytellingwithdata.com/blog/2012/11/to-stack-or-not-to-stack)
]

---
count:false

# A Note on Stacked Bar Charts

<img src="data:image/png;base64,#https://images.squarespace-cdn.com/content/v1/55b6a6dce4b089e11621d3ed/1438044493349-X77ZPDNES1BOOWZW9OR2/image-asset.jpeg?format=750w" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Source:** [Storytelling with Data: To Stack or Not to Stack](https://www.storytellingwithdata.com/blog/2012/11/to-stack-or-not-to-stack)
]

---
count:false

# A Note on Stacked Bar Charts

<img src="data:image/png;base64,#https://images.squarespace-cdn.com/content/v1/55b6a6dce4b089e11621d3ed/1438044496347-Q55Q8K06I2GCWFY1DRHV/image-asset.jpeg?format=750w" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Source:** [Storytelling with Data: To Stack or Not to Stack](https://www.storytellingwithdata.com/blog/2012/11/to-stack-or-not-to-stack)
]

---
class: inverse, center, middle

# Distributions and Correlations

---

# Issues with Histograms

<img src="data:image/png;base64,#../../figures/animated_histogram.gif" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Source:** In a quality improvement project with a major manufacturer of electronic devices, we discovered an issue in the
customer's seemingly 'normal' data.
]

---

# Issues with Box Plots

<img src="data:image/png;base64,#https://i0.wp.com/nightingaledvs.com/wp-content/uploads/2021/11/box-plot-vs-histogram-w-callouts.png?resize=1920%2C1010&ssl=1" width="80%" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Source:** [Nick Desbarats (2021) I've Stopped Using Box Plots. Should You?](https://nightingaledvs.com/ive-stopped-using-box-plots-should-you/)
]

---
count: false

# Issues with Box Plots

<img src="data:image/png;base64,#https://i0.wp.com/nightingaledvs.com/wp-content/uploads/2021/11/box-plot-vs-strip-plots.png?resize=550%2C381&ssl=1" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Source:** [Nick Desbarats (2021) I've Stopped Using Box Plots. Should You?](https://nightingaledvs.com/ive-stopped-using-box-plots-should-you/)
]

---

# Additional Issues with Box Plots

<img src="data:image/png;base64,#../../figures/albert_rapp_box.gif" width="65%" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Source:** [Albert Rapp (2024) Issues with Box Plots](https://www.linkedin.com/posts/dr-albert-rapp-9a5b9b28b_datavisualization-activity-7255217415628746752-YcAG?utm_source=combined_share_message&utm_medium=member_desktop)
]

---

# Key Takeaway 3

> .font125[Box plots **may not be** appropriate for capturing the variability in a dataset!]

---

# Scatter Plots

<img src="data:image/png;base64,#https://pbs.twimg.com/media/FOsiu3-VQAUihO1?format=jpg&name=large" width="45%" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Source:** Chart created by [@EnthusiastFpl](https://twitter.com/EnthusiastFpl/status/1507336696999817216?s=20&t=jmvxlOhpwFWSOCcVMVpZiA) and shared on March 25, 2022.
]

---
class: inverse, center, middle

# Recap

---

# Summary of Main Points

- Identify strengths & weaknesses of basic charts
- Use appropriate charts based on objective
- Avoid using pie charts (never use pie charts)
- Avoid 3D graphs (unless VR changes their utility)
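
To make Key Takeaway 3 concrete, here is a small sketch with two made-up samples that share the same five-number summary, so their box plots would look the same even though one is evenly spread and the other clusters around two values. It assumes the inclusive quartile method of Python's `statistics.quantiles`; other quartile conventions may give slightly different numbers.

```python
from collections import Counter
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


# Hypothetical samples, for illustration only.
evenly_spread = [0, 1, 2, 3, 4, 5, 6, 7, 8]
two_clusters = [0, 2, 2, 2, 4, 6, 6, 6, 8]

print(five_number_summary(evenly_spread))
print(five_number_summary(two_clusters))

# Counting repeated values reveals the clustering that the summary hides.
print(Counter(two_clusters).most_common(2))
```

Plotting the raw points (for example, a strip plot or histogram) would show the difference that the matching summaries conceal.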