class: center, middle, inverse, title-slide .title[ # ISA 401/501: Business Intelligence & Data Visualization ] .subtitle[ ## 20: Charts for High Dimensional Data ] .author[ ###
Fadel M. Megahed, PhD
Professor of Information Systems and Business Analytics
Farmer School of Business
Miami University
@FadelMegahed
fmegahed
fmegahed@miamioh.edu
Automated Scheduler for Office Hours
]
.date[
### Fall 2024
]

---

# Learning Objectives for Today's Class

- Describe what high dimensional data is.
- Provide some examples of graphs used for high dimensional datasets.
- Construct these graphs using software.

---
class: inverse, center, middle

# High Dimensional Data

---

# What Do We Mean by High Dimensional Data?
.panelset[
.panel[.panel-name[Activity]

> In 3 minutes, define the terms in the next tab in the context of this table.

.font70[
]
.panel[.panel-name[Your Solution]

.can-edit.key-activity3[
**Data Types:** .font70[(Edit below)]

- Multivariate data: ________________
- Big Data: ________________
- Tall Data: ________________
- Wide Data: ________________
- High Dimensional Data: ________________
]
]
]
]

---

# Taxonomy

**Based on the number of attributes:**

- 1: Univariate
- 2: Bivariate
- 3: Trivariate
- 4+: Multivariate

**Things to think about:**

- .can-edit.key-activity4a[What is the problem with visualizing multivariate data, especially when the number of dimensions `\(p\)` exceeds 6 to 7? ______________]
- .can-edit.key-activity4b[Any ideas about what to do? ______________]

---
class: inverse, center, middle

# Examples of High Dimensional Charts

---

# Hans Rosling: The Best Stats You've Ever Seen

.panelset[
.panel[.panel-name[Activity]

> While watching this video, please answer the questions in the next tab.

<center>
<div style="max-width:600px"><div style="position:relative;height:0;padding-bottom:56.25%"><iframe src="https://embed.ted.com/talks/lang/en/hans_rosling_the_best_stats_you_ve_ever_seen" width="600" height="350" style="position:absolute;left:0;top:0;width:100%;height:100%" frameborder="0" scrolling="no" allowfullscreen></iframe></div></div>
</center>
]
.panel[.panel-name[Your Solution]

- .can-edit.key-activity5a[What data is represented in this visualization? Be specific. ____]
- .can-edit.key-activity5b[How is each data type visually encoded? ___________]
- .can-edit.key-activity5c[Do you think the encodings are appropriate? _________]
]
]

---

# So What Is the Motion Bubble Chart?

> Motion charts are essentially **animated bubble charts**. A bubble chart shows data using the .bold[x-axis], .bold[y-axis], and the .bold[size] and .bold[color] of the bubble. A motion chart displays .red[.bold[changes over time by showing movement within the two-dimensional space and changes in the size and color of the bubbles]].
-- [Juice Analytics](https://www.juiceanalytics.com/writing/better-know-visualization-motion-charts)

**Encoding mechanisms:**

- .bold[x-axis] is typically used to encode a **numeric variable**
- .bold[y-axis] is also used to encode a **numeric variable**
- .bold[area] is used to encode a **numeric/ordinal variable**
- .bold[color] is typically used to encode a **nominal variable**
- .bold[motion/animation] is typically used to encode **time**

---

# Live Demo: Creating Bubble Charts in Power BI

Let us use Power BI to create a chart similar to the one created by Hans Rosling. We will be using [gapminder.csv](https://raw.githubusercontent.com/fmegahed/isa401/main/data/gapminder.csv).

<iframe src="https://www.gapminder.org/tools/?embedded=true#$chart-type=bubbles&url=v1" style="width: 100%; height: 400px; margin: 0 0 0 0; border: 1px solid grey;" allowfullscreen></iframe>

---

# Small Multiples

> Illustrations of postage-stamp size are indexed by category or a label, sequenced over time like the frames of a movie, or ordered by a quantitative variable not used in the single image itself -- [Tufte, E.R.: Envisioning Information, Graphics Press, 1990](https://www.amazon.com/exec/obidos/tg/detail/-/0961392118/)

.center[]

---

# Small Multiples in Power BI

Let us use Power BI to create a chart similar to the one below. We will be using [mpg_2023_large.csv](https://raw.githubusercontent.com/fmegahed/isa401/main/data/mpg_2023_large.csv).

<img src="data:image/png;base64,#20_high_dimensional_charts_files/figure-html/small_multiples-1.png" style="display: block; margin: auto;" />

**Note this is easier to make in Tableau and/or . Outside of class, please give it a shot in Tableau as well.**

---

# Parallel Coordinates

> Parallel coordinates is a visualization technique used to plot individual data elements across many performance measures. Each of the measures corresponds to a vertical axis and each data element is displayed as a series of connected points along the measure/axes -- [Juice Analytics' Definition](https://www.juiceanalytics.com/writing/writing/parallel-coordinates)

<img src="data:image/png;base64,#https://images.squarespace-cdn.com/content/v1/52f42657e4b0b3416ff6b831/1614951876498-9TIQE0EHNCGUFQSNEP6J/Parallel_Coordinates_Plot_-_Learn_about_this_chart_and_tools.jpg?format=1000w" height="300" style="display: block; margin: auto;" />

---

# Parallel Coordinates in Power BI

Let us visualize [mpg_2023_sample.csv](https://raw.githubusercontent.com/fmegahed/isa401/main/data/mpg_2023_sample.csv) using a parallel coordinates plot in Power BI.

<img src="data:image/png;base64,#20_high_dimensional_charts_files/figure-html/ggparcoord-1.png" style="display: block; margin: auto;" />

---

# Radar Charts

> Charts show how individual things perform across multiple measures

<img src="data:image/png;base64,#https://images.squarespace-cdn.com/content/v1/52f42657e4b0b3416ff6b831/1633047594333-7OEOYF4YZU4TMI2VMJSR/A_Night_Under_The_Stars___Jordan_Vincent.png?format=1000w" height="420" style="display: block; margin: auto;" />

---

# Radar Charts in Power BI

Let us add the [Radar Chart App](https://appsource.microsoft.com/en-us/marketplace/apps?product=power-bi-visuals&page=1&src=office&search=radar) to our Power BI and use it to visualize [mpg_2023_sample.csv](https://raw.githubusercontent.com/fmegahed/isa401/main/data/mpg_2023_sample.csv).
<img src="data:image/png;base64,#20_high_dimensional_charts_files/figure-html/ggradar-1.png" style="display: block; margin: auto;" />

---

# Other Charts: HeatMap

.left-column[
- Each column is a variable.
- Each observation is a row.
- Each square is a value; the closer to yellow, the higher the value.
]

.right-column[
<img src="data:image/png;base64,#20_high_dimensional_charts_files/figure-html/heatmap-1.png" style="display: block; margin: auto;" />
]

---

# Other Charts: TreeMaps

> Treemaps simultaneously show the big picture, comparisons of related items, and allow easy navigation to the details. [Juice Analytics](https://www.juiceanalytics.com/writing/10-lessons-treemap-design)

**Encoding mechanisms:** Each box in a treemap can show two measures:

- .bold[Area of the boxes] should be a **quantity measure**. The measures should sum up along the hierarchical structure of the data: the elements in one branch need to sum to the value of the branch as a whole.
- .bold[Color of the boxes] is best suited to a **measure of performance or change** such as growth over time, average conversion rate, or customer satisfaction.

---
count: false

# Other Charts: TreeMaps

> Treemaps simultaneously show the big picture, comparisons of related items, and allow easy navigation to the details. [Juice Analytics](https://www.juiceanalytics.com/writing/10-lessons-treemap-design)

.center[]

---

# Other Charts: Chernoff Faces

.center[]

---
class: inverse, center, middle

# Recap

---

# Summary of Main Points

- Describe what high dimensional data is.
- Provide some examples of graphs used for high dimensional datasets.
- Construct these graphs using software.

---

# Non-graded Activity: Kahoot

Let us go to Kahoot and compete for a $10 Starbucks gift card. To evaluate your understanding of the material, please answer the questions correctly and as quickly as possible to get the most points.
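---

# Appendix: Checking the Treemap Area Rule

The treemap slide states that the area measure should sum up along the hierarchy, so the elements in one branch add up to the value of the branch as a whole. The snippet below is a minimal Python sketch of how one might check that rule before building a treemap. The nested-dictionary layout, the `branch_total` helper, and the sales numbers are all hypothetical and are not taken from the course datasets.

```python
from math import isclose


def branch_total(node):
    """Return a node's total, verifying every branch equals the sum of its children."""
    children = node.get("children", [])
    if not children:
        # Leaf node: its own value is the quantity encoded by the box area.
        return node["value"]
    total = sum(branch_total(child) for child in children)
    # If the branch states its own value, it must match the sum of its children.
    if "value" in node and not isclose(node["value"], total):
        raise ValueError(
            f"{node['name']}: stated {node['value']}, children sum to {total}"
        )
    return total


# Hypothetical sales hierarchy (illustrative numbers only).
sales = {
    "name": "All regions",
    "value": 100,
    "children": [
        {
            "name": "East",
            "value": 60,
            "children": [
                {"name": "Ohio", "value": 35},
                {"name": "New York", "value": 25},
            ],
        },
        {"name": "West", "value": 40},
    ],
}

print(branch_total(sales))  # prints 100
```

If a branch's stated value disagrees with its children, the helper raises a `ValueError` naming the offending branch, which is a cheap way to catch measures (such as averages or rates) that are unsuitable for the area encoding.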