IF 4061 · Data and Information Visualization · STEI ITB

Data Visualization
Complete Midterm Notes

Intro · Key Principles · Methodology · Data Preparation

Representation & Presentation · 4 Key Principles · Fry & Kirk Methodology · Viz Function & Tone · Data Types · Data Preparation · Editorial Focus · User-Centered Design

References: Andy Kirk (2012) · Colin Ware (2004) · Fry (2008) · Lecture Decks IF4061 Sem 2 2025/2026 · Dessi Puji Lestari

01 · What Is Data Visualization?

Definitions, Core Elements & Key Concepts

Before you can design anything, you need to understand what visualization actually is and why it works. Start here.

The Core Definition

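Two closely related terms, Data Visualization and Information Visualization, have widely accepted definitions. The definitions sound almost identical, but they differ in one important way: the kind of data involved.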

Data Visualization

The use of computer-supported, interactive, visual representations of data to amplify cognition.

→ Focus: raw data (structured numbers, tables, etc.)

Information Visualization

The use of computer-supported, interactive, visual representations of abstract data to amplify cognition.

→ Focus: abstract/conceptual data (relationships, hierarchies, etc.)

Simplified Definition (Andy Kirk)

DataViz = the representation and presentation of data that exploits our visual perception abilities in order to amplify cognition. This definition breaks into three inseparable ideas: Representation, Presentation, and Amplify Cognition.

The Three Pillars of the Definition

Every visualization you ever make is built on these three ideas. Understand each one deeply.

1. Representation

Taking data as the raw material and creating a visual form to best portray its attributes. It is the choice of physical forms (shapes, lines, colors, positions) used to encode the data. Think of it as answering: "What shape does my data take?"

2. Presentation

Presentation goes beyond just showing the data. It concerns how you integrate the data representation into the overall communicated work. It includes decisions about:

Colors and color palettes
Layout and composition
Annotations, labels, and titles
Interactive features (hover tooltips, filters, etc.)

Think of it as: "How does my visual look and feel as a complete piece of work?"

3. Amplify Cognition

This is the why. Amplifying cognition means maximizing how efficiently and effectively we process information into thought, insights, and knowledge. A visualization that looks beautiful but confuses the viewer has failed. The goal is always to make the reader think better, faster, or more accurately.

The Simple Mental Model

DataViz = Representation + Presentation → Amplify Cognition

Every design decision must serve the goal of helping the reader understand the data faster and more correctly.

Art or Science?

Data visualization is both, but it leans more toward science than most people think. Doing it well requires knowledge from several traditionally separate fields:

Cognitive Science

How humans perceive, process, and remember visual information. Foundation of why certain charts work and others don't.

Statistics

Understanding what the data actually means, choosing the right summary measures, and avoiding misleading aggregations.

Graphic Design

Visual hierarchy, typography, color theory, and layout — making the chart readable and aesthetically coherent.

Cartography

For spatial/geographic data, applying principles developed over centuries of map-making.

Computer Science

Tools, algorithms, interaction design, and performance — implementing the visualization in software.

Key Quote — Stephen Few

"Getting visualization right is much more a science than an art, which we can only achieve by studying human perception."

Gestalt Laws (Theoretical Ancestry)

The Gestalt laws are psychological principles that explain how humans naturally group and perceive visual elements. They are the scientific backbone of why certain visual arrangements feel intuitive. You don't need to memorize all of them for the midterm, but know they exist and why they matter for DataViz design.

Proximity

Elements placed close together are perceived as a group. Use spacing to signal groupings in your chart.

Similarity

Elements that look alike (same color, shape, size) are seen as belonging together. The basis of color encoding in legends.

Continuity

The eye naturally follows lines and curves. Line charts exploit this to show trends over time.

Closure

The brain fills in gaps to perceive complete shapes. Partially drawn outlines are still recognized.

Figure/Ground

We automatically distinguish a foreground object from its background. Important for contrast and readability.

How to Make Good Visualization

Three things must be understood and balanced:

Properties of the data and information — what type of data is it? What story does it hold?
Properties of pictures — what visual encodings (position, length, color, area) are most accurately perceived by humans?
Rules to map data into pictures — the grammar of graphics, the design principles, the methodology.
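
As one minimal illustration of a rule that maps data into pictures, the sketch below implements a linear scale that turns a data value into a pixel position. The function name `linear_scale` and the domain and range numbers are assumptions made up for this example; they do not come from the lectures.

```python
def linear_scale(value, domain, pixel_range):
    """Map a data value from `domain` (data units) to `pixel_range` (pixels)."""
    d0, d1 = domain
    r0, r1 = pixel_range
    if d1 == d0:
        raise ValueError("domain must span a non-zero interval")
    t = (value - d0) / (d1 - d0)  # 0.0 at the low end of the domain, 1.0 at the high end
    return r0 + t * (r1 - r0)

# Hypothetical example: sales between 0 and 200 drawn on a 400-pixel-wide axis.
x_positions = [linear_scale(v, (0, 200), (0, 400)) for v in (0, 50, 200)]
print(x_positions)
```

Position along a common scale like this is only one possible encoding; the same idea applies when the output range is a bar length or a color ramp.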

02 · Purpose of Data Visualization

Why Do We Visualize? The Two Core Purposes

Every visualization project is created for one of two reasons — or a blend of both. Know the difference.

Purpose 1 — Data Analysis

Using visualization to understand data and extract comprehensive information from it. The chart is a tool for you (the analyst), not necessarily for a general audience. When you visualize data to analyze it, you are exploring — looking for patterns, outliers, and hypotheses.

Famous Quote — John W. Tukey

"The greatest value of a picture is when it forces us to notice what we never expected to see." — This is the essence of exploratory data analysis (EDA).

Advantages of Visualization for Data Analysis

Understand large datasets faster — patterns that are invisible in a spreadsheet become obvious in a chart.
Capture important properties — distribution shape, outliers, trends, clusters.
Capture problems — visualization is a tool for quality control. Dirty data often shows up visually before you find it programmatically.
Facilitate new hypotheses — a chart can suggest relationships you had not thought to test.

Purpose 2 — Communication

Using visualization to communicate information to an audience. The emphasis here is on clarity, simplicity, and emotional tone. Visualization for communication incorporates simplification (removing noise) and tonal intent (the feeling you want to create in the reader).

Key Quote — Edward Tufte

"Overload, clutter, and confusion are not attributes of information — they are failures of design." If the reader is confused, the designer is at fault, not the data.

The Ultimate Goal

Regardless of purpose (analysis or communication), the ultimate goal of any visualization is to make readers feel like they have become better informed about a subject.

Mackinlay's Principle of Effectiveness

"Visualization A is more effective than B if the information conveyed by A is more readily perceived than the information in B." — Jock Mackinlay

Effectiveness is not about beauty. It is about perceptual efficiency — how fast and accurately a reader extracts the information.

03 · History & Milestones

A Brief History of Data Visualization

DataViz is not a new trend — it has existed for centuries. Understanding its history helps you appreciate how current practice evolved.

Historical Timeline

Visualization Milestones by Era

pre-1600

Maps & Diagrams. The earliest visualizations were geographic maps. Humans have been encoding spatial information visually for millennia — from ancient cave paintings to medieval cartography.

1600–1799

Theory & Metrics. The development of formal measurement systems, coordinate systems, and early statistical graphs. Scientists began plotting data points on axes to understand astronomical and physical phenomena.

1800–1974

Modern Infographics Begin. Many chart types we use today date from around the start of this era: William Playfair introduced the line chart and bar chart in 1786 and the pie chart in 1801. John Snow's famous 1854 cholera map is a landmark of data-driven visual reasoning. Florence Nightingale's polar area diagrams influenced public health policy.

1975–Now

Computer-Aided Visualization. Catalyzed by powerful computing and a cultural shift toward transparency and data accessibility. The internet made data and visualizations broadly accessible. "Data is the new oil" is a phrase generally credited to Clive Humby (2006) and popularized by a Michael Palmer blog post from the same year. Tools like Tableau, D3.js, and Python libraries democratized visualization creation.

Why is DataViz So Important Now?

Catalyzed by two forces: (1) powerful new technological capabilities — cheap computing, cloud storage, open data; and (2) a cultural shift toward transparency and accessibility of data. As Hal Varian (Google Chief Economist) said: "The ability to take data, understand it, process it, extract value from it, visualize it, communicate it — that's going to be a hugely important skill in the next decades."

04 · Key Principles

The Four Key Principles of Data Visualization

These are the non-negotiable rules that separate good visualization from bad. Know them, apply them, and be able to explain each with an example.

Overview of the 4 Principles

Principle 01

Strive for Forms & Functions

Balance aesthetic form with practical function. Neither style without substance, nor function without beauty. Form and function should work together, not compete.

Principle 02

Justify Every Design Choice

Every visual element — shape, color, label position, interaction — must be deliberate and reasoned. Nothing should be accidental or arbitrary.

Principle 03

Create Accessibility Through Intuitive Design

Your visualization should be immediately understandable. Overload, clutter, and confusion are design failures, not information problems.

Principle 04

Never Deceive the Receiver

Visualizations can distort reality — intentionally or accidentally. Ethical visualization ensures an honest, accurate representation of the data.

Deep Dive: Principle 1 — Forms & Functions

Frank Lloyd Wright said: "Form and function should be one, joined in a spiritual union." This is the ideal for DataViz. The question is never "style or substance?" — it is always both.

Practical advice (from the lectures): When starting a project, first secure the functional aspects of the visualization (does it convey the right information accurately?), and only then explore ways to enhance its form (does it look good and engage the reader?).

Deep Dive: Principle 2 — Deliberate Design

Every single design feature in a visualization should be included for a reason:

Shape

Why a circle vs a bar? A circle divided into slices (a pie chart) encodes part-of-whole; bars encode magnitude comparisons. The choice must match the data's story.

Color Palette

Sequential vs. diverging vs. categorical palettes? Color blindness considerations? Each color choice must be justified.

Label Position

Inside the bar or outside? To the right of the point? Label placement affects readability and whether the reader even sees the annotation.

Interaction

Hover tooltips, drill-down filters, pan/zoom — every interactive feature must serve user exploration needs, not just exist for "coolness".

Amanda Cox (New York Times)

"We're so busy thinking about if we can do things, we forget to consider whether we should." — Just because a charting tool lets you add a 3D effect or an animation doesn't mean you should.

Deep Dive: Principle 3 — Accessibility Through Intuitive Design

A visualization should be usable without a manual. If your reader needs a lengthy explanation to understand the chart, the chart has failed. Intuitive design means leveraging natural human visual perception so that the message is immediately apparent.

Clutter — too many grid lines, labels, colors, and decorations — adds cognitive load without adding information. Every element you remove that adds no informational value increases the clarity of the remaining elements.

Deep Dive: Principle 4 — Never Deceive

Visualization ethics deals with the potential deception created by visual choices. Deception can be:

Intentional — deliberately designing a chart to mislead (e.g., a politician cherry-picking a date range to make a trend look favorable).
Unintentional — arising from an ineffective or inappropriate representation of data (e.g., a truncated Y-axis that makes a small difference look huge).
From ignorance — caused by a lack of understanding of visual perception (e.g., using area to encode a 1D value, making readers vastly over- or under-estimate).

Common Deception Patterns to Know

Truncated Y-axis: starting a bar chart's value axis above 0 exaggerates differences, because bar length is no longer proportional to the value.
Area vs. length confusion: scaling a bubble's radius (rather than its area) to a 1D value misleads, because readers judge the circle's area, which grows with the square of the radius.
Cherry-picked timeframes: selecting a window of data that shows a trend favorable to your argument.
Dual Y-axes: two unrelated scales can suggest false correlations, because the axis ranges can be adjusted until the lines appear to move together.
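
Two of the patterns above can be quantified with simple arithmetic. The sketch below is illustrative only: the helper names and the numbers (52 vs. 50, a baseline of 48, a maximum radius of 20) are assumptions chosen to make the effect visible.

```python
import math

def bar_height_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights for values a and b when the axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)

def radius_for_area_encoding(value, max_value, max_radius):
    """Radius that makes the circle's AREA (not its radius) proportional to the value."""
    return max_radius * math.sqrt(value / max_value)

# Truncated axis: 52 vs. 50 is a 4% difference in the data.
honest = bar_height_ratio(52, 50)                  # axis starts at 0
truncated = bar_height_ratio(52, 50, baseline=48)  # axis starts at 48

# Bubble scaling: a value 4x larger drawn with 4x the radius covers 16x the area.
naive_area_ratio = (math.pi * 4 ** 2) / (math.pi * 1 ** 2)

# Scaling by area instead keeps the visual ratio equal to the data ratio.
r_small = radius_for_area_encoding(1, 4, 20.0)
r_large = radius_for_area_encoding(4, 4, 20.0)
fair_area_ratio = (r_large / r_small) ** 2

print(honest, truncated, naive_area_ratio, fair_area_ratio)
```

With the truncated axis, the taller bar is drawn twice as high even though the underlying values differ by only 4%.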

Visualization Skills for the Masses (Stephen Few)

"The skills required for most effectively displaying information are not intuitive and rely largely on principles that must be learned." — This is the whole reason this course exists. Good visualization is a learned discipline, not an innate talent.

05 · Methodology

How to Build a Visualization: Two Frameworks

Both Fry's 7 Stages and Kirk's 5-Step process describe how a visualization project actually flows from data to finished product.

Framework 1 — Fry's 7 Stages of Visualizing Data (2008)

Ben Fry proposed a process model for creating data visualizations. These stages are iterative — you may loop back, skip, or re-order them depending on the project.

Acquire

Obtain the data from its source.

Parse

Structure and categorize the data.

Filter

Remove data that is not needed.

Mine

Apply statistics / data mining to find patterns.

Represent

Choose a visual model (bar, tree, map…).

Refine

Improve clarity and visual engagement.

Interact

Add interactivity for data exploration.

Note: These stages are often iterative and may have a flexible order or even be omitted in simple projects.
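
To make the stages concrete, here is a small sketch that walks through most of them on a tiny in-memory dataset. The data, the function names, and the text-based bar chart are all assumptions for illustration; Fry's model does not prescribe any particular code. The Interact stage is left out, which matches the note that stages may be omitted in simple projects.

```python
import csv
import io
from statistics import mean

RAW = """region,year,sales
North,2023,120
South,2023,80
North,2024,150
South,2024,
"""  # hypothetical sample data standing in for a real source

def acquire():
    # Acquire: obtain the data (here, from an in-memory string).
    return io.StringIO(RAW)

def parse(handle):
    # Parse: give the raw text structure and types.
    rows = []
    for r in csv.DictReader(handle):
        sales = float(r["sales"]) if r["sales"] else None
        rows.append({"region": r["region"], "year": int(r["year"]), "sales": sales})
    return rows

def filter_rows(rows):
    # Filter: drop records that are not needed (here, rows without a sales value).
    return [r for r in rows if r["sales"] is not None]

def mine(rows):
    # Mine: apply a simple statistic, the mean sales per region.
    regions = sorted({r["region"] for r in rows})
    return {reg: mean(r["sales"] for r in rows if r["region"] == reg) for reg in regions}

def represent(summary, width=30):
    # Represent + Refine: a text bar chart, sorted and labeled for readability.
    top = max(summary.values())
    lines = []
    for region, value in sorted(summary.items(), key=lambda kv: -kv[1]):
        bar = "#" * round(width * value / top)
        lines.append(f"{region:<6}{bar} {value:.0f}")
    return "\n".join(lines)

summary = mine(filter_rows(parse(acquire())))
chart = represent(summary)
print(chart)
```

Because each stage is a separate function, looping back (for example, changing the filter after seeing the chart) only means re-running the chain with one function changed.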

Framework 2 — Andy Kirk's 5-Step Methodology (2012)

This is the primary framework used throughout the course. It is more project-management-oriented than Fry's model.

Step 1

Purpose & Parameters

Define why, for whom, and under what constraints.

Step 2

Prepare & Explore Data

Acquire, clean, understand, and analyze your data.

Step 3

Formulate Questions

Identify the key questions your viz should answer.

Step 4

Design Concepting

Sketch and prototype visual solutions.

Step 5

Construct & Launch

Build, test, and publish the final visualization.

06 · Step 1: Purpose & Parameters

Visualization Function, Tone, Factors & Users

The first and most critical step in any visualization project. Get this wrong and everything downstream is misaligned.

Clarifying the Purpose: Two Questions

The reason for existing — What triggered this project? What is its scope and context? How much creative control do you have?
The intended effect — What should the reader think, feel, or do after seeing this visualization?

Establishing Intent: Visualization Function

Every visualization has one of three primary functions. This is a fundamental classification you must know for the exam:

1. Explanatory

Goal: Convey a specific narrative to the reader.

What it is: Based around a focused story. You already know what the key finding is, and you design the chart to communicate it clearly.

Examples: A corporate dashboard showing key performance figures; a newspaper infographic explaining the complexity of an economic crisis.

→ More about visual presentation of data.

2. Exploratory

Goal: Provide an interface for the user to explore the data themselves.

What it is: Lacks a single, predetermined narrative. The user drives the exploration and finds their own insights.

Examples: A scatterplot matrix for multivariate correlation exploration; interactive dashboards with filters, brushing, and sorting.

→ More about visual analysis of data.

3. Exhibition / Data Art

Goal: Express or exhibit data as an aesthetic or emotional experience.

What it is: The intent is removed from a pure desire to inform. Data becomes the raw material for artistic self-expression.

Examples: A visualization of all adjectives in a novel; artistic renderings of city heartbeat data.

→ More about form and aesthetic than information transfer.

Explanatory vs. Exploratory: Detailed Comparison

Dimension	Explanatory	Exploratory
Narrative	Based around a specific, focused narrative	Lacks a single specific narrative
Focus	Visual presentation of data	Visual analysis of data
Designer role	Creates a clear portrayal of interesting stories from the dataset	Builds a tool for users to seek personal discoveries and patterns
Finding	One specific finding defined beforehand	Opens up possibility for chance/serendipitous findings
Interactivity	Usually static or minimal	Usually highly interactive (filter, sort, brush, zoom)

Establishing Intent: Visualization Tone

Tone is about the type of stimulus or desired emotional response you are trying to create in your reader. There are two ends of a spectrum:

Pragmatic / Analytical Tone

The reader reacts analytically. They read values, compare numbers, track trends. Emotions stay low — unless the data reveals something alarming.

Example: "We need a chart to help monitor our quarterly sales performance."

→ Think: corporate dashboards, scientific reports, financial charts.

Emotive / Abstract Tone

The goal is a personal, impactful experience. Abstract or artistic visual choices are used to create feeling, not just to transfer data.

Example: "We need to present this in a way that persuades people to care." (Chris Jordan: "I fear we aren't feeling enough to digest these huge numbers.")

→ Think: data journalism, advocacy visualizations, data art.

Emotive Tone Note

In emotive/abstract visualizations, you sometimes move beyond bars and straight lines toward curves, circles, and organic shapes. Abstract tone is more about creating an aesthetic that portrays a general sense of the data's story — you might not be able to read exact values, but the visual impression carries the message.

Key Factors Surrounding a Visualization Project

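Beyond intent, every project is shaped by the real-world constraints listed below. (The related "8 hats" concept describes the many roles a DataViz designer must fill, that is, the hats they must wear, while working within these constraints.)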

The Aim

What is the specific goal of this project? Broad enough to guide creativity, specific enough to evaluate success.

Time Pressures

Deadlines constrain how deep the analysis and refinement can go. Know when "good enough" is good enough.

Costs

Budget affects tooling, data acquisition, and team size. A custom interactive D3.js build costs more than a Tableau screenshot.

Client Pressures

Client preferences, organizational culture, and politics all shape what you can and cannot do.

Format

Is this a static PDF report, an interactive web dashboard, a slide in a PowerPoint, or a printed poster? Format determines design choices.

Technical Capabilities

What tools and skills are available? A beautiful D3.js visualization is worthless if no one on the team can build or maintain it.

Understanding the Users

Visualizations are always made for someone. The user context fundamentally changes the design. Know these five common user environments:

User Context	Characteristics	Design Implications
Boardroom	Executives, high-stakes decisions, limited time	Simple, fast-reading summaries. Highlight the single most important number. High contrast.
One-to-One Exchange	Manager or analyst with a peer	More detail acceptable. Can support conversation and questions.
Large Range of Customers	Diverse backgrounds, variable expertise	Must work across knowledge levels. Clear labels, plain language. Avoid jargon.
Global Audience	Cross-cultural, multilingual	Mind color meanings (red ≠ danger universally), symbols, language, numeric formats.
Personal / Self	You are the only audience	Function over form. Quick EDA charts. No need for polished presentation.

User-Centered Design (UCD)

Good visualization design starts with understanding the user. The four UCD tools you should know:

User Persona

A fictional but research-based profile of a typical user — their job, goals, pain points, and technical literacy. Grounds design decisions in real human needs.

User Stories

"As a [type of user], I want to [do something] so that [I achieve a goal]." Breaks down user needs into concrete, testable requirements.

User Scenario

A narrative description of how a specific user accomplishes a specific goal with the visualization in a realistic context.

Empathy Map

A canvas that captures what the user says, thinks, does, and feels. Helps designers build empathy and uncover unstated needs.

Physical & Cognitive Characteristics of the User

Physical Capabilities

Color perception: ~8% of men have color vision deficiency. Never rely on color alone to encode information.
Ergonomics: Screen size, viewing distance, and input device (mouse vs. touch) affect usability.
Visual contrast: Low contrast is problematic for older users and those with vision impairments.

Cognitive Characteristics

Attention & memory: Working memory is limited. A cluttered chart forces the user to use cognitive resources on navigation rather than insight.
Recognition over recall: Users recognize familiar patterns faster than they recall abstract information. Use conventions.
Cognitive biases: Anchoring bias (first number seen anchors all other comparisons), confirmation bias (users seek evidence supporting existing beliefs).
Change blindness: Significant visual changes in dynamic visualizations can go unnoticed if not properly highlighted.

User Research Methods

How do you find out what your users need? User research methods are mapped across two dimensions:

Attitudinal vs. Behavioral

Attitudinal: What people say (surveys, interviews). Useful for stated preferences and opinions.

Behavioral: What people do (usability testing, analytics). Reveals actual behavior, which often differs from stated preferences.

Qualitative vs. Quantitative

Qualitative: More effective at revealing why — deep insights from small samples (interviews, usability sessions).

Quantitative: Shows what is happening and how much — statistical patterns from large samples (surveys, A/B tests, analytics).

Collaboration & Communication Contexts

Visualizations are often used in shared, multi-user settings:

Synchronous Communication: Real-time collaboration — live dashboards in meetings, conferencing, online games. Design for simultaneous group viewing.
Asynchronous Communication: Reports, email, social media — users interact at different times. Design must be fully self-explanatory without a presenter.

07 · Step 2: Prepare & Explore Data

Data Preparation: Editorial Focus & the 6 Mechanisms

Data preparation is typically the most time-consuming and intensive activity in any visualization project. Get it right — everything downstream depends on it.

A. Editorial Focus

Editorial focus is the story you want to tell through the visualization — the main narrative or message you want to emphasize to the reader. It determines the direction and goal of the visualization, not just what data to display.

Key Question for Editorial Focus

"What topic or question do I want readers to have answered after seeing this visualization?"

If you cannot answer this in one sentence, your editorial focus is not clear enough yet.

Why Do You Need Editorial Focus?

Ensures clarity — guarantees the visualization communicates a clear message.
Guides design decisions — determines what to emphasize and what to omit.
Prevents information overload — stops you from adding "everything" to a single chart.
Delivers the right insight — helps surface the finding that actually matters.

Classic Example

Without editorial focus: Show all products, all regions, all metrics in one chart → information overload.
With editorial focus (goal: show sales decline after 2023): Show only total sales per year, highlight 2023–2024, add annotation explaining the cause.

Widely admired visualizations from outlets such as the New York Times, The Guardian, and National Geographic succeed largely because of strong editorial focus. They do not dump data; they tell a specific, focused story.

B. Preparing & Familiarizing with Data

Data is the primary raw material. Without good data, there is no compelling story to tell. A strong visualization always starts from strong data. Datasets with errors or missing values do not just slow down analysis — they can corrupt the message you are trying to deliver.

The 6 Mechanisms of Data Preparation

Andy Kirk's 6-Step Data Preparation Process

Acquisition — Obtain your data. Sources include: a colleague or client, a download from an organizational system, manual data collection, web API extraction, web scraping, PDF extraction, and more.

⚠️ Ethical Concerns in Acquisition: Ensure data is (1) obtained ethically and responsibly; (2) legally compliant with relevant regulations; (3) respects privacy and confidentiality of sensitive data; (4) used according to its license — especially if publishing or monetizing.

Examination: Determine your confidence in the data. Use tools (Excel, Tableau, OpenRefine, formerly Google Refine) to scan, filter, sort, and search the dataset to establish its quality. Examination covers two dimensions:

Completeness — Is it all there?

Does it have all the categories needed?
Does it cover the full time period needed?
Are all expected fields/variables present?
Does it contain the expected number of records?

Quality — Is it clean?

Are there errors or incorrect values?
Unexplained classifications or coding conventions?
Formatting issues (unusual dates, stray or mis-encoded characters)?
Missing items or incomplete records?
Duplicate rows?
Accuracy issues — does the data appear plausible?
Unusual values or obvious outliers that need investigation?
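
Several of the completeness and quality questions above can be turned into quick programmatic checks. The following is a minimal sketch using only the Python standard library; the records, the expected fields, the expected years, and the "negative sales is implausible" rule are all hypothetical assumptions for this example.

```python
from collections import Counter

# Hypothetical records standing in for a dataset under examination.
records = [
    {"id": 1, "country": "ID", "year": 2022, "sales": 100.0},
    {"id": 2, "country": "MY", "year": 2023, "sales": None},
    {"id": 2, "country": "MY", "year": 2023, "sales": None},  # duplicate row
    {"id": 3, "country": "SG", "year": 2024, "sales": -5.0},  # implausible value
]
expected_fields = {"id", "country", "year", "sales"}
expected_years = {2022, 2023, 2024, 2025}

def examine(rows):
    report = {}
    # Completeness: are all fields and all expected periods present?
    seen_fields = set().union(*(r.keys() for r in rows))
    report["missing_fields"] = sorted(expected_fields - seen_fields)
    report["missing_years"] = sorted(expected_years - {r["year"] for r in rows})
    report["record_count"] = len(rows)
    # Quality: missing values, duplicate rows, implausible values.
    report["missing_values"] = sum(1 for r in rows for v in r.values() if v is None)
    counts = Counter(tuple(sorted(r.items())) for r in rows)
    report["duplicate_rows"] = sum(n - 1 for n in counts.values() if n > 1)
    report["negative_sales"] = [
        r["id"] for r in rows if r["sales"] is not None and r["sales"] < 0
    ]
    return report

report = examine(records)
print(report)
```

A report like this only flags candidates; deciding whether a flagged value is an error or a genuine outlier still requires looking at the data.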

Understand Data Types — Know the fundamental structure of your variables. The type of data determines which chart is valid, which statistics are meaningful, and which visual encodings are appropriate.

Transforming for Quality — Clean the data by resolving errors found in examination: removing duplicates, filling/handling missing data, cleaning erroneous values, and standardizing formats (dates, string encoding, etc.).
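
A minimal sketch of this cleaning step, assuming a small hypothetical dataset: it standardizes dates and text, makes missing values explicit, and then removes the duplicates that standardizing exposes. The two accepted date formats are an assumption; real data needs its own list, and day/month order must be confirmed with the data source rather than guessed.

```python
from datetime import datetime

# Hypothetical raw rows with a duplicate, a missing value, and mixed date formats.
raw = [
    {"date": "2024-01-05", "city": " Bandung ", "sales": "120"},
    {"date": "05/01/2024", "city": "bandung", "sales": "120"},
    {"date": "2024-01-06", "city": "Jakarta", "sales": ""},
]
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")  # assumed formats; extend as needed

def standardize_date(text):
    """Return the date as an ISO 8601 string, trying each assumed format in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")

def clean(rows):
    seen, cleaned = set(), []
    for r in rows:
        row = {
            "date": standardize_date(r["date"]),
            "city": r["city"].strip().title(),
            "sales": float(r["sales"]) if r["sales"] else None,  # keep gaps explicit
        }
        key = tuple(row.values())
        if key not in seen:  # drop exact duplicates after standardizing
            seen.add(key)
            cleaned.append(row)
    return cleaned

cleaned = clean(raw)
print(cleaned)
```

Note the order: the first two rows only become recognizable as duplicates once their dates and city names share one format.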

Transforming for Analysis — Prepare and refine data for analysis and presentation:

Parsing: Split up variables (e.g., extract year from a full date string).
Merging: Combine variables into new ones (e.g., first name + surname → full name).
Converting: Turn qualitative/free-text data into coded values or keywords.
Deriving: Create new values from existing ones (e.g., derive gender from title, sentiment score from text).
Calculating: Create new metrics (e.g., percentage proportions, ratios, moving averages).
Removing redundancy: Drop variables you have no planned use for in the visualization.
Determining resolution: Decide how granular to show the data (see Resolution Options below).
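
The sketch below applies four of these transformations (parsing, merging, calculating, removing redundancy) to a few hypothetical rows. The field names and values are invented for illustration, and the trailing two-point moving average is just one possible choice of derived metric.

```python
# Hypothetical cleaned rows.
rows = [
    {"date": "2024-01-05", "first": "Ada", "last": "Lovelace", "sales": 100.0},
    {"date": "2024-02-10", "first": "Alan", "last": "Turing", "sales": 300.0},
    {"date": "2024-03-15", "first": "Grace", "last": "Hopper", "sales": 200.0},
]

def moving_average(values, window):
    """Trailing moving average; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out

total = sum(r["sales"] for r in rows)
averages = moving_average([r["sales"] for r in rows], window=2)

prepared = []
for r, avg in zip(rows, averages):
    prepared.append({
        "year": int(r["date"][:4]),                       # parsing: extract year from date
        "full_name": f'{r["first"]} {r["last"]}',         # merging: two fields into one
        "share_pct": round(100 * r["sales"] / total, 1),  # calculating: percentage of total
        "sales_ma2": avg,                                 # calculating: moving average
    })  # removing redundancy: the original fields are not carried over

print(prepared)
```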

Consolidating — Even after preparation, there may be gaps. Additional layers of data may need to be combined with the existing dataset — for supplementary calculations, context, or to enhance the scope of communication. Pro tip: Always consider whether you need additional data to help frame the story before you begin designing.

Data Types You Must Know

Understanding your data type is not academic trivia — it determines which chart types are valid and which statistics are meaningful.

Type	Subtype	Description	Examples
Categorical	Nominal	Named groups with no inherent order. You can count and compare frequencies, but not rank or calculate averages.	Countries, gender, product category, text labels
Categorical	Ordinal	Named groups with a meaningful order, but the gaps between levels are not uniform or measurable.	Olympic medals (Gold/Silver/Bronze), Likert scale (Strongly Agree → Strongly Disagree), education level
Quantitative	Interval Scale	Numeric values where differences are meaningful, but there is no true zero. Ratios are meaningless.	Temperature in °C or °F, calendar dates (year 0 is arbitrary)
Quantitative	Ratio Scale	Numeric values with a true absolute zero. All arithmetic operations are valid. Ratios are meaningful.	Prices, age, distance, weight, speed, count of items

Interval vs. Ratio: The Key Difference

20°C is not "twice as hot" as 10°C — temperature on the Celsius scale has no true zero, so ratios are meaningless. But $20 is twice as much as $10 — money has a true zero. This affects what calculations and visual encodings are appropriate.
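
A short worked check of this point: converting the same two temperatures to Kelvin (a scale that does have a true zero) shows how far the naive Celsius ratio is from the physical one. The variable names are illustrative.

```python
def celsius_to_kelvin(c):
    return c + 273.15

naive_ratio = 20 / 10  # 2.0, but not meaningful on an interval scale
kelvin_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)  # about 1.035
difference = 20 - 10   # 10 degrees: differences ARE meaningful on an interval scale
price_ratio = 20 / 10  # 2.0, and meaningful: $0 is a true zero

print(naive_ratio, round(kelvin_ratio, 3), difference, price_ratio)
```

In absolute terms 20°C is only about 3.5% warmer than 10°C, which is why "twice as hot" is the wrong reading.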

Resolution Options — At What Level of Detail?

One of the most important decisions in data preparation is choosing the level of resolution at which to present the data. Showing too much detail creates visual noise; too little hides important patterns.

Five Resolution Strategies

Full

Full Resolution: Plot every available data point as an individual mark. Best for small datasets or when individual data points tell important stories (e.g., each patient in a clinical trial).

Filtered

Filtered Resolution: Exclude records based on specific criteria (e.g., show only transactions above $1,000; show only data from 2020–2024). Focuses attention on the relevant subset.

Aggregate

Aggregate Resolution: "Roll up" the data by a dimension — month, year, category, region. Instead of 365 daily data points, show 12 monthly averages. Reveals macro-trends that individual points obscure.

Sample

Sample Resolution: Apply mathematical selection rules to extract a fraction of potential data. Useful when datasets are so large (billions of records) that plotting all points would be computationally impossible or visually unreadable.

Headline

Headline Resolution: Show only the overall statistical totals — a single number, a ratio, a summary metric. Maximum simplification. Best for executive dashboards or quick key performance indicators.
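
The five strategies can be compared side by side on one dataset. This is a sketch under stated assumptions: the records are synthetic (28 values for each of 12 months), the filter threshold of 1000 is arbitrary, and simple random sampling with a fixed seed stands in for whatever selection rule a real project would use.

```python
import random
from collections import defaultdict
from statistics import mean

# Hypothetical daily records: (month, amount).
records = [(m, 100 * m + d) for m in range(1, 13) for d in range(1, 29)]

# Full: every record becomes its own mark.
full = records

# Filtered: keep only the rows that meet a criterion.
filtered = [r for r in records if r[1] > 1000]

# Aggregate: roll the daily values up to monthly averages.
by_month = defaultdict(list)
for month, amount in records:
    by_month[month].append(amount)
aggregate = {m: mean(v) for m, v in by_month.items()}

# Sample: a random 10% of the records (fixed seed so the result is repeatable).
rng = random.Random(0)
sample = rng.sample(records, k=len(records) // 10)

# Headline: a single overall total.
headline = sum(amount for _, amount in records)

print(len(full), len(filtered), len(aggregate), len(sample), headline)
```

The number of marks drops from 336 (full) to 12 (aggregate) to 1 (headline), which is the trade-off between detail and readability described above.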

08 · Glossary

Complete Term Reference

Every key term from the lectures defined clearly. Great for last-minute review.

DataViz Data Visualization — The use of computer-supported, interactive, visual representations of data to amplify cognition.

InfoViz Information Visualization — Like DataViz but specifically applied to abstract data (not raw numerical data).

Core Concept Representation — The choice of physical/visual forms used to encode data (shape, position, length, color, area, etc.).

Core Concept Presentation — How the data representation is integrated into the complete communicated work, including layout, color, annotations, and interactivity.

Core Concept Amplify Cognition — Maximizing how efficiently and effectively humans process visual information into thought, insights, and knowledge.

Principles Forms & Functions — Principle 1: Aesthetic form and practical function must work together, not against each other.

Principles Deliberate Design — Principle 2: Every visual element (shape, color, label, interaction) is included for a specific, reasoned purpose.

Principles Intuitive Design — Principle 3: Designing for accessibility such that the visualization is immediately understandable without training or manuals.

Principles Visualization Ethics — Principle 4: The responsibility to never deceive the receiver, whether intentionally or through poor design choices.

Gestalt Gestalt Laws — Psychological principles describing how humans naturally group and perceive visual elements (proximity, similarity, continuity, closure, figure/ground).

Methodology Fry's 7 Stages — Acquire → Parse → Filter → Mine → Represent → Refine → Interact. An iterative process model for creating data visualizations.

Methodology Kirk's 5 Steps — Purpose & Parameters → Prepare & Explore → Formulate Questions → Design Concepting → Construct & Launch. The primary course methodology.

Function Explanatory Viz — A visualization built around a specific, predetermined narrative to convey to the reader.

Function Exploratory Viz — A visualization designed as a tool for the user to conduct their own visual analysis, without a pre-set narrative.

Function Exhibition / Data Art — Visualization where the intent is aesthetic self-expression or emotional impact through data, not information transfer.

Tone Pragmatic Tone — A visualization designed for analytical, data-reading behavior. The reader extracts values and compares numbers.

Tone Emotive/Abstract Tone — A visualization designed to create a personal emotional experience or visceral impact, often using non-standard visual forms.

Data Prep Editorial Focus — The specific narrative or message you want to communicate; determines what data to show, highlight, and omit.

Data Prep Data Acquisition — The process of obtaining data from its source, including ethical, legal, and privacy considerations.

Data Prep Data Examination — Assessing data completeness (is it all there?) and quality (is it clean and accurate?).

Data Types Nominal Data — Categorical data with no inherent order (e.g., country, gender, product type).

Data Types Ordinal Data — Categorical data with a meaningful order, but non-uniform gaps between levels (e.g., Likert scale, Olympic medals).

Data Types Interval Data — Numeric data with meaningful differences but no true zero. Ratios between values are not meaningful (e.g., temperature in °C).

Data Types Ratio Data — Numeric data with a true absolute zero. All arithmetic operations are valid (e.g., price, age, distance).

Resolution Full Resolution — Showing every individual data point as a mark in the visualization.

Resolution Aggregate Resolution — Rolling up data to a coarser level (e.g., monthly totals instead of daily records).

Resolution Headline Resolution — Showing only the highest-level statistical summary (a single number or ratio).
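
The three resolution levels can be illustrated on one small dataset. This is a minimal sketch: the `daily_sales` records are invented for the example, and grouping by calendar month is just one possible aggregation choice.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales records (invented for this example).
daily_sales = [
    (date(2025, 1, 5), 120),
    (date(2025, 1, 20), 80),
    (date(2025, 2, 3), 150),
    (date(2025, 2, 17), 50),
]

# Full resolution: every record would become its own mark.
full = daily_sales

# Aggregate resolution: roll the daily records up to monthly totals.
monthly_totals = defaultdict(int)
for day, amount in daily_sales:
    monthly_totals[(day.year, day.month)] += amount
aggregate = dict(monthly_totals)

# Headline resolution: one summary number for the whole dataset.
headline = sum(amount for _, amount in daily_sales)

print(len(full), aggregate, headline)
```

Each step down in resolution hides detail in exchange for a faster read, so the choice depends on the editorial focus.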

UCD User Persona — A fictional but research-grounded profile of a target user, used to guide design decisions.

UCD Empathy Map — A canvas capturing what users say, think, do, and feel — used to build empathy and surface unstated needs.

Cognitive Change Blindness — The tendency for significant visual changes to go unnoticed, especially relevant in dynamic/animated visualizations.

Cognitive Anchoring Bias — The tendency for the first number encountered to disproportionately influence all subsequent judgments.

Cognitive Confirmation Bias — The tendency to interpret data in ways that confirm pre-existing beliefs. Designers must account for this in audience behavior.

01 · Design Concepting

Two Dimensions of Visualization Design

Step 4 of Kirk's methodology — Design Concepting — is where all earlier work (purpose, data prep, questions) gets translated into a visual artifact. Every design decision lives in one of two dimensions.

Where Does This Fit?

Kirk's 5-step methodology: Purpose & Parameters → Prepare & Explore Data → Formulate Questions → Design Concepting (you are here) → Construct & Launch.

Design Concepting asks: How do we give form to our data? The answer lives across two dimensions:

Data Representation

How we give form to our data through the use of "visual variables" to construct chart or graph types.

Think: What kind of chart? What shape encodes the data?

Covered by the 5 representation methods and all the chart types within them.

Data Presentation

The delivery format, appearance, and synthesis of the entire design. Concerns the layers of: color use, interactivity, annotation, and the arrangement of all elements.

Think: How does the whole thing look, feel, and behave?

Covered by color, interactivity, annotation, and architecture.

Data Representation Step: Two Sub-decisions

1. Choose the correct visualization method (which of the 5 categories does your data need?)
2. Choose the appropriate chart type within that method (which specific chart best fits the data and purpose?)

02 · The 5 Representation Methods

Five Categories of Visualization

Every chart type in existence belongs to one of these five purposes. Knowing which category your data falls into is the first decision in representation.

#	Method	What It Does	Classic Example
1	Comparing Categorical Values	Facilitate comparisons between the relative and absolute sizes of categorical values.	Bar Chart
2	Assessing Hierarchies & Part-of-a-Whole	Show a breakdown of categorical values in relationship to a population, or as elements of hierarchical structures.	Pie Chart
3	Showing Changes over Time	Exploit temporal data to show changing trends and patterns of values over a continuous time frame.	Line Chart
4	Plotting Connections & Relationships	Assess associations, distributions, and patterns between multivariate datasets. Usually facilitates exploratory analysis.	Scatter Plot
5	Mapping Geo-Spatial Data	Plot and present datasets with geo-spatial properties.	Choropleth Map

03 · Method 1

Comparing Categorical Values

Charts in this group allow you to compare the size, frequency, or magnitude of distinct categories against each other.

Method 1 · Comparing Categories

Chart Type	Data Variables	Visual Variables	What You Get / When to Use
Bar / Column Chart	1 categorical + 1 quantitative	Height / Length, Position	The workhorse of comparison. Compares magnitudes across discrete categories. Bars = horizontal, Columns = vertical.
Floating Bar (Gantt Chart)	1 categorical-nominal + 2 quantitative	Position, Length	Shows a range of quantitative values per category (bar stretches from min to max, not from zero). Reveals variation, overlap, and outliers across categories.
Pixelated Bar Chart	Multiple categorical + 1 quantitative	Height, Color-hue, Symbol	Two levels of resolution in one: global bar chart view (aggregate) + detail view inside each bar (pixels/symbols). Usually interactive — hover a pixel for precise detail.
Histogram	1 quantitative-interval + 1 quantitative-ratio	Height, Width	Shows frequency distribution of a continuous quantitative variable over binned intervals. Key difference from bar chart: no gaps between bars; used for continuous data, not categorical.
Slopegraph (Bumps / Table Chart)	1 categorical + 2 quantitative	Position, Connection, Color-hue	Compares two (or more) quantitative values linked to the same categories. Perfect for before–after or two-point-in-time comparisons. The slope direction and steepness encode change.
Radial / Circular Bar Chart	Multiple categorical + 1 categorical-ordinal	Position, Color-hue, Color-saturation, Texture	Displays changes over time (each ring = time period), proportional comparisons, and multi-category compositions. Good for overview and pattern detection; not for reading precise values.
Glyph Chart	Multiple categorical + multiple quantitative	Shape, Size, Position, Color-hue	Uses a repeated shape (e.g., a flower) where each part encodes a variable. Not for precise reading — for relative comparisons (big, medium, small). Usually interactive for exploration.
Sankey Diagram	Multiple categorical + multiple quantitative	Height, Position, Link, Width, Color-hue	Shows flow — how quantities move from one stage to another through connecting ribbons. Ribbon width = magnitude of flow. Best for multi-stage processes or transformations.
Area Size Chart (Bubble / Circle)	1 categorical + 1 quantitative-ratio	Area, Color-hue	Uses circle area to represent magnitude. Often used to emphasize stark inequality between categories. Area is less accurately perceived than length — use with caution for precise comparison.
Small Multiples (Trellis Chart)	Multiple categorical + multiple quantitative	Position + any visual variable	A grid of small identical charts, each showing one subset of the data. Exploits the eye's ability to quickly scan and compare many similar charts simultaneously. Best for many categories or time-series comparisons.
Word Cloud	1 categorical + 1 quantitative-ratio	Size	Font size encodes word frequency. Color is usually decorative only. Good for early exploratory text analysis to find key terms — not for precise frequency comparison. Requires good text preprocessing.
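
The Area Size Chart row above encodes magnitude as circle area, which means the radius has to grow with the square root of the value, not with the value itself. The sketch below shows that arithmetic. The function name `bubble_radius` and the default `max_radius` of 40 are assumptions made for this example.

```python
import math

def bubble_radius(value: float, max_value: float, max_radius: float = 40.0) -> float:
    """Return a radius so that circle AREA (not radius) is proportional to value."""
    if value < 0 or max_value <= 0:
        raise ValueError("area encoding needs non-negative ratio data")
    return max_radius * math.sqrt(value / max_value)

small = bubble_radius(25, 100)   # a quarter of the maximum value
large = bubble_radius(100, 100)  # the maximum value

# Area grows with the square of the radius, so the area ratio matches the data ratio.
area_ratio = (math.pi * small**2) / (math.pi * large**2)
print(small, large, area_ratio)
```

Scaling the radius linearly instead would make the larger circle look far bigger than the data justifies, which is one common way area charts mislead.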

04 · Method 2

Assessing Hierarchies & Part-of-a-Whole

These charts show how a total is broken down into constituent parts, or how elements are nested within larger structural hierarchies.

Method 2 · Hierarchies & Part-of-a-Whole

Chart Type	Data Variables	Visual Variables	What You Get / Notes
Pie Chart	1 categorical + 1 quantitative-ratio	Angle, Area, Color-hue	Often criticized because angles and areas are harder to compare accurately than length or position. Problems arise from misuse: too many categories, 3D effects, disorganized slices. Best practice: max 3 categories, start first slice at vertical, arrange logically.
Stacked Bar Chart	2 categorical + 1 quantitative-ratio	Length, Color-hue, Position, Color-saturation	Shows composition of categories using color + position. Can use absolute or normalized values. Weakness: inner segments are hard to compare accurately because they lack a shared baseline. Use ordinal ordering for ordinal data (e.g., sentiment: disagree → agree).
Square Pie / Waffle Chart / Unit Chart	1 categorical + 1 quantitative-ratio	Position, Color-hue / Symbol	More accurate than pie/donut because it uses grid areas (e.g., 100 squares = 100%). Small parts remain visible and distinguishable. Stays clean even with multiple categories. Good for percentage-based narratives.
Treemap	Multiple categorical-nominal + 1 quantitative-ratio	Area, Position, Color-hue, Color-saturation	Nested rectangles where area encodes magnitude. Great for showing large hierarchical datasets in a compact space. Color can encode a second dimension (e.g., growth rate). Area comparison is less precise than length.
Circle Packing Diagram	2 categorical + 1 quantitative-ratio	Area, Color-hue, Position	Many circles packed inside a large circle. Each circle = a category; size = quantitative value; color/position = hierarchy or grouping. Not for precise reading — for seeing relative scale and groupings.
Bubble Hierarchy	Multiple categorical + 1 quantitative-ratio	Area, Position, Color-hue	Similar to circle packing but with a more explicit hierarchical arrangement. Bubbles of different sizes grouped by category.
Tree Hierarchy (Dendrogram)	2 categorical + 1 quantitative-ratio	Angle/Area, Position, Color-hue	A tree-shaped diagram showing parent–child relationships. The branching structure makes hierarchical relationships and depth immediately visible.
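
For the waffle chart in the table above, each category needs a whole number of squares and the squares must add up to the full grid. One way to do that is largest-remainder rounding, sketched below. The function name `waffle_cells` and the rounding method are choices made for this example, not something prescribed in the lecture.

```python
def waffle_cells(counts: dict[str, float], total_cells: int = 100) -> dict[str, int]:
    """Split total_cells among categories in proportion to their counts.

    Uses largest-remainder rounding so the cells always sum to total_cells.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    exact = {name: value / total * total_cells for name, value in counts.items()}
    cells = {name: int(share) for name, share in exact.items()}  # round down first
    leftover = total_cells - sum(cells.values())
    # Hand the remaining cells to the categories with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda name: exact[name] - cells[name], reverse=True)
    for name in by_remainder[:leftover]:
        cells[name] += 1
    return cells

print(waffle_cells({"agree": 3, "disagree": 1}))
```

Rounding each share independently can leave the grid one square short or one square over, which is why the leftover cells are distributed explicitly.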

Pie Chart Best Practices (Exam-Likely)

Start the first slice from the 12 o'clock (vertical) position as a reference. Ideally, limit the chart to a maximum of 3 categories. Order segments logically. Avoid 3D effects, too many colors, and decorations. If you have more than 3 categories, use a bar chart instead.

05 · Method 3

Showing Changes over Time

Temporal charts exploit time as the primary axis, revealing trends, cycles, and changes in value across a continuous time frame.

Method 3 · Changes over Time

Chart Type	Data Variables	Visual Variables	What You Get / Notes
Line Chart	1 quant-interval (time) + 1 quant-ratio + 1 categorical	Position, Slope, Color-hue	The fundamental temporal chart. Slope encodes rate of change. Multiple lines can compare categories over time. Y-axis does not need to start at 0 for line charts.
Sparklines	1 quant-interval + 1 quant-ratio	Position, Slope	Line charts in miniature — Edward Tufte's "intense, word-sized graphics." Not a new chart type, just a very small line chart. Ideal for embedding trend context inside tables or dashboards where space is precious.
Area Chart	1 quant-interval + 1 categorical + 1 quant-ratio	Height, Slope, Area, Color-hue	Like a line chart but with the area below the line filled. The filled area emphasizes cumulative volume. Important: the Y-axis must start at zero, because the area encoding (unlike line slope) requires a true baseline for accurate interpretation.
Horizon Chart	1 quant-interval + 1 categorical + 2 quant-ratio	Height, Slope, Area, Color-hue, Color-saturation	A modified area chart that folds negative values upward and uses color to distinguish positive/negative. Allows many time series to be stacked vertically in very little space, enabling cross-series pattern comparison.
Stacked Area Chart	1 quant-interval + 1 categorical + 1 quant-ratio	Height, Area, Color-hue	Multiple area charts stacked on top of each other. Shows how the composition of categories changes over time. Weakness: middle bands are hard to read accurately because they lack a shared baseline.
Stream Graph	1 quant-interval + 1 categorical + 1 quant-ratio	Height, Area, Color-hue	Like a stacked area chart but without a baseline — layers flow organically around a central axis. Emphasizes peaks and troughs ("ebb and flow") over time. Not for reading precise values. Aesthetic and organic feel.
Candlestick Chart	1 quant-interval + 4 quant-ratio	Position, Height, Color-hue	Used in financial data. Shows OHLC (Open, High, Low, Close) for each time period. Bar height = range from open to close; color = price up or down; wicks = high/low range. Conceptually similar to a boxplot.
Barcode Chart	1 quant-interval + 3 categorical	Position, Symbol, Color-hue	A very compact visualization of event sequences over time using symbols and color. Similar space-efficiency to sparklines. Requires some familiarity to read, but packs a rich story in minimal space.
Flow Map	Multiple quant-interval + 1 categorical + 1 quant-ratio	Position, Height/Width, Color-hue	Like a Sankey diagram, but for change over time and/or location. Famous example: Napoleon's 1812 Russian campaign (Minard's map) where ribbon width = troops remaining. Geo-positions are roughly followed but the map is not fully detailed.
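
The candlestick row above relies on four summary values per period. The sketch below derives them from an ordered list of prices and splits a candle into its drawn parts. The function names `ohlc` and `candle_parts` and the sample prices are hypothetical.

```python
def ohlc(prices: list[float]) -> dict[str, float]:
    """Summarize one period's time-ordered prices as Open, High, Low, Close."""
    if not prices:
        raise ValueError("need at least one price in the period")
    return {"open": prices[0], "high": max(prices), "low": min(prices), "close": prices[-1]}

def candle_parts(candle: dict[str, float]) -> dict[str, object]:
    """Body spans open to close, wicks extend to high and low, color marks direction."""
    body_low, body_high = sorted((candle["open"], candle["close"]))
    return {
        "body": (body_low, body_high),
        "upper_wick": (body_high, candle["high"]),
        "lower_wick": (candle["low"], body_low),
        "direction": "up" if candle["close"] >= candle["open"] else "down",
    }

day = ohlc([10.0, 12.5, 9.0, 11.0])  # hypothetical prices within one day
parts = candle_parts(day)
print(day, parts)
```

Treating a flat period (close equal to open) as "up" is an arbitrary choice in this sketch; charting tools differ on that convention.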

Area Chart vs Line Chart: Y-Axis Rule

A line chart can have a Y-axis that does not start at zero — the slope carries the meaning. An area chart must have its Y-axis start at zero, because the filled area is what readers judge, and a truncated area creates false impressions of magnitude.
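
A quick calculation shows why a truncated baseline distorts a filled area. The sketch below compares how tall two values are drawn when the axis starts at zero versus at 80. The function name `apparent_ratio` and the sample values are invented for this example.

```python
def apparent_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn height of b to the drawn height of a for a given axis start."""
    if a <= baseline:
        raise ValueError("a must sit above the baseline")
    return (b - baseline) / (a - baseline)

true_ratio = apparent_ratio(90, 100)                    # axis starts at zero
truncated_ratio = apparent_ratio(90, 100, baseline=80)  # axis starts at 80

print(f"zero baseline: {true_ratio:.2f}x, truncated baseline: {truncated_ratio:.2f}x")
```

With the axis starting at 80, a value that is only about 11% larger is drawn twice as tall, and the filled area exaggerates the difference accordingly.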

06 · Method 4

Plotting Connections & Relationships

These charts assess associations, distributions, and patterns between variables — and they usually serve exploratory analysis rather than explanatory storytelling.

Method 4 · Connections & Relationships

Chart Type	Data Variables	Visual Variables	What You Get / Notes
Scatter Plot	2 quantitative	Position, Color-hue	The most fundamental relationship chart. Reveals correlations, clustering, and outliers between two continuous variables. X and Y positions encode the two variables.
Bubble Plot	3 quantitative + 1 categorical	Position, Area, Color-hue	A scatter plot extended with a third dimension: bubble area encodes a third quantitative variable; color encodes a category. More information in one chart — but area comparison is less accurate than position.
Scatter Plot Matrix	2 quantitative + 2 categorical	Position, Area, Color-hue	A grid of scatter plots showing every pairwise variable combination simultaneously. Like small multiples applied to correlation analysis. Excellent for multivariate datasets — lets the eye quickly scan across many variable pairings to spot strong/weak relationships.
Heatmap / Matrix Chart	Multiple categorical + 1 quantitative-ratio	Position, Color-saturation	A matrix where cells are colored by value. Like small multiples using color as the visual variable. Fast visual scanning for patterns, ordering, and hierarchy across category combinations. Good for correlation matrices, calendar heat, and confusion matrices.
Parallel Sets / Parallel Coordinates	Multiple categorical + multiple quant-ratio	Position, Width, Link, Color-hue	Multiple parallel axes, each representing a variable. Each data item is drawn as a polyline crossing all axes. Reveals multi-variable relationships, patterns, and consistency. Functionally similar to Sankey — both show connections across categories.
Chord / Radial Network Diagram	Multiple categorical + 2 quant-ratio	Position, Connection, Width, Color-hue, Color-lightness, Symbol, Size	A circular layout where connecting ribbons/chords between categories show the strength and direction of their relationships. Used for complex, bidirectional relationships. Not constrained by X/Y axes.
Network Diagram	Multiple categorical-nominal + 1 quant-ratio	Position, Connection, Area, Color-hue	Nodes (entities) connected by edges (relationships). Reveals clusters, sparse connections, dominant nodes, and structural patterns. Often visually complex and "hairball-like" for large datasets — requires careful layout algorithms.

07 · Method 5

Mapping Geo-Spatial Data

When your data has a geographic dimension, maps place it in spatial context. Geographic position itself becomes a visual variable.

Method 5 · Geo-Spatial Data

Chart Type	Data Variables	Visual Variables	What You Get / Notes
Choropleth Map	2 quant-interval + 1 quant-ratio	Position, Color-saturation/lightness	Geographic regions (countries, provinces) colored by quantitative value using a gradient (light → dark). Popular but has a critical weakness: larger regions visually dominate even if they have smaller populations, creating potential distortion.
Dot Plot Map	2 quant-interval	Position	Each data point is placed at its geographic coordinates as a dot. Simple and honest — each dot = one occurrence. Dense clusters emerge naturally without distortion from region size.
Bubble Plot Map	2 quant-interval + 1 quant-ratio + 1 categorical-nominal	Position, Area, Color-hue	Combines a map with a bubble plot: dots at geographic coordinates scaled by quantitative value, colored by category. Shows "how much" per location simultaneously.
Isarithmic Map (Contour / Isoline)	Multiple quantitative + multiple categorical	Position, Color-hue, Color-saturation, Color-darkness	Uses contour lines or color gradients to show continuous values over geographic space (like elevation, temperature, rainfall). Familiar from weather maps and topographic maps.
Particle Flow Map	Multiple quantitative	Position, Direction, Thickness, Speed	Animated particles or arrows that flow across the map to encode direction, magnitude, and movement — e.g., wind patterns, ocean currents. Direction and speed are the key encodings.
Cartogram	2 quant-interval + 1 quant-ratio	Position, Size	A distorted map where each region's size is proportional to a variable (e.g., population or GDP) rather than its true geographic area. Corrects the choropleth's size-dominance problem — but geographic shape is distorted.
Dorling Cartogram	2 categorical + 1 quant-ratio	Position, Size, Color-hue	A variant of the cartogram where each region is replaced by a circle sized by value. Geographic shapes are dropped rather than distorted: circles are simply positioned approximately where each region lies. Clean and readable.
Connection Map	2 quant-interval + 1 categorical-nominal	Position, Link, Color-hue	Lines drawn between geographic locations to show connections or flows between places (e.g., flight routes, migration, trade). Line weight and color can encode quantity or category.

08 · Choosing the Appropriate Chart Type

Three Criteria for Chart Selection

Once you know which representation method applies, use these three criteria to select the specific chart type within that method.

The Three Criteria (in order)

Accommodate the Physical Properties of the Data
The chart must match what your data actually is. Ask: How many data variables do I have? Are they categorical or quantitative? Are they nominal, ordinal, interval, or ratio? The chart's requirements (listed in every chart's "Data Variables" above) must align with your dataset's structure.

Example: If you have 1 categorical + 1 quantitative → bar chart works. Add a second categorical dimension → grouped or stacked bar chart. Add a second quantitative variable → consider a dot plot or scatter plot.

Facilitate the Desired Degree of Accuracy
Different visual variables allow different levels of perceptual accuracy. If the reader needs to compare precise values, choose a chart that uses position or length (most accurate). If an impressionistic sense of scale is sufficient, area or color can be used.

→ This connects directly to the Visual Variable Ranking (Part 9 below).

Create an Appropriate Metaphor (Stylistic Fit)
Integrate a visual quality that conveys a deeper connection between the data, the design, and the topic. This is the most subjective criterion — it requires design instinct and experience. A well-chosen metaphor makes the chart feel like it belongs to its subject.

Example: Using a flow-like stream graph to show "ebb and flow" of musical trends feels more appropriate than a stacked bar chart, even if both technically convey the same data.
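
The example given under the first criterion can be written as a tiny rule table. This is only a sketch of those three cases: the function name `suggest_chart` is hypothetical, and a real selection would also weigh the accuracy and metaphor criteria.

```python
def suggest_chart(n_categorical: int, n_quantitative: int) -> str:
    """Map counts of variable types to a starting chart type.

    Covers only the cases named in the example for the first criterion.
    """
    if n_categorical == 1 and n_quantitative == 1:
        return "bar chart"
    if n_categorical == 2 and n_quantitative == 1:
        return "grouped or stacked bar chart"
    if n_categorical == 1 and n_quantitative == 2:
        return "dot plot or scatter plot"
    return "no rule in this sketch; check the method tables"

for shape in [(1, 1), (2, 1), (1, 2), (3, 4)]:
    print(shape, "->", suggest_chart(*shape))
```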

The Final Solution (Andy Kirk)

"The key is not to set out to achieve an attractive and attention-grabbing work — let those qualities emerge as a by-product of good design. Focus instead on delivering the appropriate functional elements by employing the most suitable data representation."

09 · Visual Variables & Perceptual Ranking

What Are Visual Variables? How Accurately Can We Read Them?

Visual variables are the specific visual forms we assign to data to represent it. Understanding which ones humans perceive most accurately is crucial for choosing the right chart.

Definition: Visual Variable

A visual variable is the specific form we assign to data in order to represent it visually. Examples include:

The length or height of a bar
The position of a point on an axis
The color (hue or saturation) of a region on a map
The area of a bubble
The connection between two nodes in a network
The slope of a line between two points

Mackinlay's Perceptual Accuracy Ranking (1986)

Not all visual variables are perceived with equal accuracy. Mackinlay's ranking tells us which encodings allow the most precise comparisons, and which should be used only for general impressions.

Position on a common scale

Most accurate — e.g., aligned bar charts

Position on identical (non-aligned) scales

e.g., small multiples

Length

e.g., bar chart bar lengths

Angle / Slope

e.g., pie chart slices, line slopes

Area

e.g., bubble charts, treemaps

Volume

e.g., 3D charts — rarely recommended

Color saturation / density

e.g., choropleth shading intensity

Color hue

Good for categorical distinction, not quantity

Shape / Texture

Least accurate for quantity comparison

Key Takeaway for Exams

Position > Length > Angle > Area > Color in terms of perceptual accuracy. This is why bar charts (position + length) are generally more accurate than pie charts (angle + area), and why color alone is among the least effective ways to encode quantitative values.

Visual Variable Example (from lecture)

In a scatter plot of films: X-axis position = profit, Y-axis position = review score, circle area = budget, circle color (hue) = genre. Notice how the most important variable (profit) uses the most accurate encoding (position), while less critical variables use less precise ones (area, color).
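
The film example can be checked against the ranking with a small lookup. This is a sketch: the list `ACCURACY_RANKING` transcribes the order given in these notes, while the names `film_encoding` and `rank_of` are invented here.

```python
# Encodings ordered from most to least accurate, following the ranking above.
ACCURACY_RANKING = [
    "position (common scale)",
    "position (non-aligned scales)",
    "length",
    "angle / slope",
    "area",
    "volume",
    "color saturation",
    "color hue",
    "shape / texture",
]

def rank_of(encoding: str) -> int:
    """1 = most accurate encoding, larger numbers = less accurate."""
    return ACCURACY_RANKING.index(encoding) + 1

# Variable -> visual variable, as described in the film scatter plot example.
film_encoding = {
    "profit": "position (common scale)",
    "review score": "position (common scale)",
    "budget": "area",
    "genre": "color hue",
}

# Sort the data variables by how accurately their encoding can be read.
ordered = sorted(film_encoding, key=lambda name: rank_of(film_encoding[name]))
for name in ordered:
    print(f"{name}: {film_encoding[name]} (rank {rank_of(film_encoding[name])})")
```

Reading the output top to bottom gives the order in which a viewer can compare the variables most precisely, which should match the order of importance in the story.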

10 · Data Presentation — Color

The Use of Color in Data Visualization

Color is the most powerful — and most misused — tool in data visualization. It most efficiently taps into the pre-attentive processing of the visual system, but must be used carefully and with a clear purpose.

Two Key Rules of Color Use

Rule 1 — Use It Unobtrusively

Color should not mislead by implying a representation (category distinction, quantity ordering) when no such representation was intended. Using random colors just for decoration creates false visual groupings in the reader's mind.

Rule 2 — Strive for Elegance, Not Novelty

The sensible objective is elegance over attractiveness. A restrained, purposeful color palette almost always beats a flashy rainbow. As with all design layers, every color choice must serve a function.

Three Purposes of Color

Purpose 01

To Represent Data

Using color to encode a variable — either a category (by hue) or a quantity (by saturation/lightness). This is color at its most functional and informative.

Purpose 02

To Bring the Data Layer to the Fore

Using color contrast to make the data visually prominent — separating it from background elements, grid lines, and labels. High contrast ensures the data is what the eye goes to first.

Purpose 03

To Conform to Design Requirements

Using an organization's brand color palette or design system. Corporate dashboards must often use predefined colors regardless of perceptual optimality.

Color to Represent Quantitative Data

Color hue (red, blue, green…) has no inherent hierarchy or order of magnitude in human perception. You cannot tell which of red, blue, or green is "bigger" just from the hue alone. Therefore:

Color Scheme Types for Quantitative Data

Sequential

Use lightness (light → dark) of a single hue to encode order of magnitude. The darkest shade = highest value. Readers intuitively associate dark with "more." Example: Light blue → Dark blue for low-to-high population density.

Diverging

Use two contrasting hues with a neutral midpoint to encode data that diverges around a meaningful center (e.g., positive vs. negative, above vs. below average). Example: Red ← white → Blue for political lean or temperature anomaly.

Traffic Light

Red (bad) → Amber (average) → Green (good). A widely understood metaphor for performance. Critical caveat: roughly 10% of males (and a far smaller share of females) have red-green color deficiency. Solution: swap green for blue (Red → Amber → Blue).
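
Sequential and diverging schemes both come down to interpolating between anchor colors. The sketch below shows the idea with plain RGB interpolation. The anchor colors and function names are assumptions made for this example, and real palettes are usually built in a perceptually uniform color space rather than raw RGB.

```python
def lerp_color(start, end, t):
    """Linearly interpolate between two RGB tuples; t is clamped to [0, 1]."""
    t = max(0.0, min(1.0, t))
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))

def to_hex(rgb):
    """Format an RGB tuple as a hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)

# Hypothetical anchor colors for the two schemes.
LIGHT_BLUE, DARK_BLUE = (222, 235, 247), (8, 48, 107)
RED, WHITE, BLUE = (178, 24, 43), (255, 255, 255), (33, 102, 172)

def sequential(value, lo, hi):
    """Single hue: light for low values, dark for high values."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return lerp_color(LIGHT_BLUE, DARK_BLUE, (value - lo) / (hi - lo))

def diverging(value, lo, mid, hi):
    """Two hues: red below the midpoint, white at it, blue above it."""
    if not lo < mid < hi:
        raise ValueError("expected lo < mid < hi")
    if value < mid:
        return lerp_color(RED, WHITE, (value - lo) / (mid - lo))
    return lerp_color(WHITE, BLUE, (value - mid) / (hi - mid))

print(to_hex(sequential(50, 0, 100)), to_hex(diverging(-0.5, -1, 0, 1)))
```

The diverging function needs an explicit midpoint because the neutral color should sit at the meaningful center (zero, the average, a target), which is not always the middle of the data range.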

Color for Categorical Variables

Color hue is a particularly strong aid for distinguishing categorical variables — it triggers pre-attentive processing instantly.
Use a maximum of 12 different hues for category distinction. Beyond that, the palette becomes confusing and categories become hard to tell apart.
Be sensitive to cultural color meanings — colors carry different symbolic associations across regions (e.g., white = mourning in some Asian cultures; red = luck in China but danger in the West).

Color for Contrast: Foreground vs. Background

Some color combinations are inherently low-contrast and should be avoided:

Blue on black — many people struggle to discriminate.
Yellow on white — nearly invisible for most viewers.
Always check your chosen palette against a color blindness simulator (e.g., Vischeck at vischeck.com) to test perceptibility for users with color vision deficiencies.
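
Low-contrast pairs like the ones above can also be checked numerically. The sketch below uses the contrast ratio defined in the WCAG 2.x accessibility guidelines, which is an outside reference and not part of the lecture. The function names are invented for this example.

```python
def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb) -> float:
    """Weighted sum of the linear channels; 0 is black, 1 is white."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(foreground, background) -> float:
    """Return a ratio from 1 (identical colors) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

BLACK, WHITE = (0, 0, 0), (255, 255, 255)
BLUE, YELLOW = (0, 0, 255), (255, 255, 0)

pairs = [("blue on black", BLUE, BLACK), ("yellow on white", YELLOW, WHITE),
         ("black on white", BLACK, WHITE)]
for name, fg, bg in pairs:
    print(f"{name}: {contrast_ratio(fg, bg):.2f}")
```

Both problem pairs score well below the 4.5:1 ratio that WCAG recommends for normal text, while black on white reaches the maximum. A contrast ratio does not replace a color blindness simulator, since it measures lightness difference only.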

Color Blindness — Always Consider This

Approximately 10% of males have red-green color deficiency. If your visualization relies on red vs. green to communicate a distinction, it fails for 1 in 10 male readers. Use a color blindness simulator during design, and always include a secondary encoding (shape, pattern, label) alongside color.

11 · Data Presentation — Interactivity

The Potential of Interactive Features

Interactivity transforms a static visualization into a tool. The four categories of interactive features each serve a different user need.

Feature 01

Manipulating Variables & Parameters

Giving the user control over what data is displayed:

Select — choose a specific data item or category
Filter — include/exclude based on criteria
Exclude — remove unwanted categories
Modify variables — change which variables are shown on each axis
Grouping & sorting — reorganize the data by different dimensions
Brushing — click-drag to highlight a set of data marks across linked views
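
Behind the interface, most of these controls are simple operations on the underlying records. The sketch below shows filter, sort, and brushing on a small invented dataset. The `films` records and the function names are hypothetical, and a real tool would wire these to mouse events.

```python
# Hypothetical records; each mark in every linked view carries the same "id".
films = [
    {"id": 1, "genre": "drama", "budget": 20, "score": 7.5},
    {"id": 2, "genre": "action", "budget": 150, "score": 6.1},
    {"id": 3, "genre": "drama", "budget": 60, "score": 8.2},
]

def filter_by(rows, **criteria):
    """Filter: keep rows whose fields equal every given criterion."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]

def sort_by(rows, field, descending=False):
    """Sorting: reorder rows by one field without changing the data."""
    return sorted(rows, key=lambda r: r[field], reverse=descending)

def brush(rows, field, low, high):
    """Brushing: ids of rows whose field falls inside the dragged range.

    Every linked view then highlights the marks that carry these ids.
    """
    return {r["id"] for r in rows if low <= r[field] <= high}

dramas = filter_by(films, genre="drama")
by_score = sort_by(films, "score", descending=True)
highlighted = brush(films, "budget", 0, 100)
print([r["id"] for r in dramas], [r["id"] for r in by_score], highlighted)
```

Returning ids from the brush, rather than the rows themselves, is what lets several views highlight the same items even when each view plots different variables.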

Feature 02

Adjusting the View

Adjusting the user's lens or window into the subject:

Vertical exploration — drill down through hierarchical layers of detail (e.g., country → region → city)
Horizontal tabs/panels — switch between different views or cuts of the data
Pan & zoom — navigate spatial or temporal data at different scales

Feature 03

Annotated Details (Tooltips)

Creating extra layers of data detail through hover or click events:

Reveals actual data values on demand (avoid cluttering the main view with all labels)
Provides extra detail about a specific data point, category, or event
Supports: titles, introductions, user guides, visual annotation, legends/keys, units, data sources, and attribution

Feature 04

Animation

Using time-based transitions to tell a story:

Play / Pause / Reset controls for time-series playback
Manually controllable time sliders — let users scrub through time at their own pace
Chapter navigation — skip to key milestones in a narrative visualization
Use sparingly — animation should explain, not just impress

12 · Data Presentation — Architecture & Arrangement

Layout, Placement & Organization

The final presentation layer — how all visible elements are positioned and organized in the overall design.

The Architecture Layer

Architecture and arrangement considers how to lay out the overall design: the placement, organization, and visual hierarchy of all elements (charts, legends, titles, annotations, controls).

Two Core Aims

For the Eye

Reduce the amount of work the eye has to undertake to navigate around the design and decipher the sequence and hierarchy of the display. The eye should naturally flow to the most important element first.

For the Brain

Minimize the amount of thinking and "working out" that goes on in order to understand the layout. A well-arranged design is immediately readable — the reader should never have to ask "where do I look first?"

Guiding Principle: Must Be Intuitive

The key aim of architecture and arrangement is to make the visualization intuitive to navigate. Readers should not need to read a legend to understand the layout structure. The position, size, and grouping of elements should naturally communicate their relationship to each other.

Explanatory Annotation — The 8 Elements

Effective annotation in a visualization includes: Titles · Introductions · User guides · Visual annotation · Legends/keys · Units · Data sources · Attribution. These create the informational scaffold that makes the visualization self-contained and trustworthy.

13 · Recap

High-Yield Summary

Everything from this lecture in one tight review pass.

Two Dimensions of Design

Data Representation — choosing chart type using visual variables.
Data Presentation — color, interactivity, annotation, and layout.

5 Representation Methods — The Categories

Comparing Categorical Values → Bar, Sankey, Histogram, Small Multiples…
Hierarchies & Part-of-a-Whole → Pie, Treemap, Stacked Bar, Waffle…
Changes over Time → Line, Area, Sparklines, Candlestick, Stream Graph…
Connections & Relationships → Scatter, Heatmap, Network, Chord…
Geo-Spatial Data → Choropleth, Cartogram, Bubble Map, Connection Map…

Chart Selection Criteria (3 steps)

Accommodate the physical properties of the data (data variable types and count).
Facilitate the desired degree of accuracy (choose visual variables appropriately).
Create an appropriate metaphor (stylistic fit to the subject).

Perceptual Accuracy Order (Mackinlay)

Position > Length > Angle > Area > Volume > Color Saturation > Color Hue > Shape

Color — 3 Purposes, 2 Rules, 3 Schemes

Purposes: Represent data · Bring data to foreground · Conform to design requirements.
Rules: Use unobtrusively · Strive for elegance not novelty.
Quantitative schemes: Sequential (lightness gradient) · Diverging (two hues + midpoint) · Traffic light (Red/Amber/Green — use Blue instead of Green for accessibility).
Max 12 hues for categorical distinction. Always test for color blindness.

4 Interactive Feature Types

Manipulate variables & parameters — filter, select, brush, sort, group.
Adjust the view — drill down, horizontal tabs, pan/zoom.
Annotated details — tooltips, hover/click reveals.
Animation — play/pause/slider for temporal data.

Architecture Aim

Minimize eye travel and cognitive effort. Must be intuitive. A layout succeeds when the reader never has to figure out where to look next.

The Design Concepting Mantra

Let attractiveness emerge as a by-product of good design — not as the goal. Focus on functional representation first. Every visual variable, every color, every interactive feature should earn its place by serving the reader's comprehension.

09 · Recap

High-Yield Summary — What to Remember for the Midterm

The most likely exam targets, organized by topic. If you only have 30 minutes left before the exam, read this section.

Must-Know Definitions

DataViz = Representation + Presentation of data that exploits visual perception to amplify cognition.
Difference between DataViz (raw data) and InfoViz (abstract data).
Representation = choice of visual form. Presentation = the full delivered work including colors, layout, annotations.
Amplify Cognition = making insight extraction faster and more accurate.

The 4 Key Principles (Number them correctly!)

1. Strive for Form and Function: both, not either/or.
2. Always justify every design choice: design is deliberate; nothing is arbitrary.
3. Create accessibility through intuitive design: clutter is a design failure.
4. Never deceive the receiver: intentional or unintentional, deception is unethical.

Two Methodologies — Know Both

Fry (2008): 7 stages — Acquire, Parse, Filter, Mine, Represent, Refine, Interact.
Kirk (2012): 5 steps — Purpose & Parameters, Prepare & Explore, Formulate Questions, Design Concepting, Construct & Launch.

Three Visualization Functions

Explanatory — specific narrative, you define the story, visual presentation.
Exploratory — user-driven, no single narrative, visual analysis tool.
Exhibition/Data Art — aesthetic self-expression, emotional impact, not pure information transfer.

Two Tones

Pragmatic — analytical reading, value extraction, corporate/scientific.
Emotive/Abstract — emotional impact, persuasion, art, curves and organic forms.

Four Data Types

Nominal: categories with no order (country, gender).
Ordinal: categories with an order, but the gaps between ranks are not guaranteed to be equal (Likert, medals).
Interval: numeric with no true zero; differences are meaningful, ratios are not (dates, °C).
Ratio: numeric with a true zero; ratios are meaningful (age, price, distance).
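
The four types form a ladder: each one supports every comparison of the type before it, plus one more. The sketch below encodes that ladder in Python. The `Scale` enum, the `REQUIRED_SCALE` table, and the comparison labels are hypothetical names made up for this illustration.

```python
from enum import IntEnum


class Scale(IntEnum):
    """The four data types, ordered by how much arithmetic they support."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# The weakest scale on which each comparison is meaningful.
REQUIRED_SCALE = {
    "same or different": Scale.NOMINAL,
    "greater or less": Scale.ORDINAL,
    "difference": Scale.INTERVAL,
    "ratio": Scale.RATIO,
}


def meaningful_comparisons(scale):
    """List the comparisons that make sense for data on `scale`."""
    return [name for name, needed in REQUIRED_SCALE.items() if scale >= needed]


print(meaningful_comparisons(Scale.INTERVAL))
# ['same or different', 'greater or less', 'difference']

# Why °C is interval, not ratio: 20 °C is not "twice as hot" as 10 °C.
# The same temperatures in kelvin (a ratio scale) give a ratio near 1.04.
print(round((20 + 273.15) / (10 + 273.15), 2))  # 1.04
```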

Six Data Preparation Steps

Acquisition → Examination → Understand Data Types → Transforming for Quality → Transforming for Analysis → Consolidating.

Five Resolution Options

Full → Filtered → Aggregate → Sample → Headline.
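
As a rough sketch, the five levels can be applied to one small dataset in plain Python. The sales records are made up for illustration, and the code assumes the usual reading of each level (all records, a subset by criterion, a roll-up, a drawn subset, a single overall figure).

```python
import random
from collections import defaultdict

# Made-up records, for illustration only.
SALES = [
    {"month": "Jan", "region": "West", "amount": 120},
    {"month": "Jan", "region": "East", "amount": 80},
    {"month": "Feb", "region": "West", "amount": 150},
    {"month": "Feb", "region": "East", "amount": 90},
    {"month": "Mar", "region": "West", "amount": 130},
    {"month": "Mar", "region": "East", "amount": 110},
]

# Full: every record is plotted.
full = list(SALES)

# Filtered: only the records that match a criterion.
filtered = [row for row in SALES if row["region"] == "West"]

# Aggregate: records rolled up to a coarser level (here, per month).
aggregate = defaultdict(int)
for row in SALES:
    aggregate[row["month"]] += row["amount"]

# Sample: a subset drawn by a selection rule (seeded so it is repeatable).
sample = random.Random(0).sample(SALES, k=3)

# Headline: a single overall figure.
headline = sum(row["amount"] for row in SALES)

print(len(full), len(filtered), dict(aggregate), len(sample), headline)
# 6 3 {'Jan': 200, 'Feb': 240, 'Mar': 240} 3 680
```

Each step down the list shows the reader less data, so the choice depends on how much detail the story needs.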

The Golden Thread

Every concept in this course connects back to one idea: effective visualization serves the reader's ability to understand data. Whether you are choosing a chart type, a color, a tone, or a resolution level, always ask: "Does this decision help the reader understand the data better and faster?" If the answer is no, cut it.