IF 4061 · Data and Information Visualization · STEI ITB

Data Visualization
Complete Midterm Notes

Intro · Key Principles · Methodology · Data Preparation
Representation & Presentation · 4 Key Principles · Fry & Kirk Methodology · Viz Function & Tone · Data Types · Data Preparation · Editorial Focus · User-Centered Design
References: Andy Kirk (2012) · Colin Ware (2004) · Fry (2008) · Lecture Decks IF4061 Sem 2 2025/2026 · Dessi Puji Lestari
01 · What Is Data Visualization?

Definitions, Core Elements & Key Concepts

Before you can design anything, you need to understand what visualization actually is and why it works. Start here.

The Core Definition

Two closely related terms are widely used: Data Visualization and Information Visualization. Their definitions sound almost identical but differ in one important word ("abstract"):

Data Visualization

The use of computer-supported, interactive, visual representations of data to amplify cognition.

→ Focus: raw data (structured numbers, tables, etc.)

Information Visualization

The use of computer-supported, interactive, visual representations of abstract data to amplify cognition.

→ Focus: abstract/conceptual data (relationships, hierarchies, etc.)

Simplified Definition (Andy Kirk)

DataViz = the representation and presentation of data that exploits our visual perception abilities in order to amplify cognition. This definition breaks into three inseparable ideas: Representation, Presentation, and Amplify Cognition.

The Three Pillars of the Definition

Every visualization you ever make is built on these three ideas. Understand each one deeply.

1. Representation

Taking data as the raw material and creating a visual form to best portray its attributes. It is the choice of physical forms (shapes, lines, colors, positions) used to encode the data. Think of it as answering: "What shape does my data take?"

2. Presentation

Presentation goes beyond just showing the data. It concerns how you integrate the data representation into the overall communicated work. It includes decisions about:

  • Colors and color palettes
  • Layout and composition
  • Annotations, labels, and titles
  • Interactive features (hover tooltips, filters, etc.)

Think of it as: "How does my visual look and feel as a complete piece of work?"

3. Amplify Cognition

This is the why. Amplifying cognition means maximizing how efficiently and effectively we process information into thought, insights, and knowledge. A visualization that looks beautiful but confuses the viewer has failed. The goal is always to make the reader think better, faster, or more accurately.

The Simple Mental Model

DataViz = Representation + Presentation → Amplify Cognition

Every design decision must serve the goal of helping the reader understand the data faster and more correctly.

Art or Science?

Data visualization is both, but it leans more toward science than most people think. Doing it well requires knowledge from several traditionally separate fields:

Cognitive Science
How humans perceive, process, and remember visual information. Foundation of why certain charts work and others don't.
Statistics
Understanding what the data actually means, choosing the right summary measures, and avoiding misleading aggregations.
Graphic Design
Visual hierarchy, typography, color theory, and layout — making the chart readable and aesthetically coherent.
Cartography
For spatial/geographic data, applying principles developed over centuries of map-making.
Computer Science
Tools, algorithms, interaction design, and performance — implementing the visualization in software.
Key Quote — Stephen Few

"Getting visualization right is much more a science than an art, which we can only achieve by studying human perception."

Gestalt Laws (Theoretical Ancestry)

The Gestalt laws are psychological principles that explain how humans naturally group and perceive visual elements. They are the scientific backbone of why certain visual arrangements feel intuitive. You don't need to memorize all of them for the midterm, but know they exist and why they matter for DataViz design.

Proximity
Elements placed close together are perceived as a group. Use spacing to signal groupings in your chart.
Similarity
Elements that look alike (same color, shape, size) are seen as belonging together. The basis of color encoding in legends.
Continuity
The eye naturally follows lines and curves. Line charts exploit this to show trends over time.
Closure
The brain fills in gaps to perceive complete shapes. Partially drawn outlines are still recognized.
Figure/Ground
We automatically distinguish a foreground object from its background. Important for contrast and readability.

How to Make Good Visualization

Three things must be understood and balanced:

  1. Properties of the data and information — what type of data is it? What story does it hold?
  2. Properties of pictures — what visual encodings (position, length, color, area) are most accurately perceived by humans?
  3. Rules to map data into pictures — the grammar of graphics, the design principles, the methodology.
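
As a rough illustration of the third item, a rule that maps data into a picture can be as small as a linear scale that turns a data value into a position on screen. This is a minimal sketch and not something taken from the lecture decks; the function name `linear_scale`, the score data, and the 400-pixel axis are all assumptions made for the example.

```python
def linear_scale(value, domain, pixel_range):
    """Map a data value from the data domain onto a pixel range (position encoding)."""
    d_min, d_max = domain
    p_min, p_max = pixel_range
    if d_max == d_min:
        raise ValueError("domain must span more than a single value")
    fraction = (value - d_min) / (d_max - d_min)  # 0.0 at d_min, 1.0 at d_max
    return p_min + fraction * (p_max - p_min)


# Hypothetical data: exam scores between 0 and 100, drawn on a 400-pixel-wide axis.
scores = [0, 25, 50, 100]
positions = [linear_scale(s, domain=(0, 100), pixel_range=(0, 400)) for s in scores]
print(positions)  # [0.0, 100.0, 200.0, 400.0]
```

Position along a common scale is used here because it is one of the encodings listed in item 2; the same function could feed a length or a color ramp instead.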
02 · Purpose of Data Visualization

Why Do We Visualize? The Two Core Purposes

Every visualization project is created for one of two reasons — or a blend of both. Know the difference.

Purpose 1 — Data Analysis

Using visualization to understand data and extract comprehensive information from it. The chart is a tool for you (the analyst), not necessarily for a general audience. When you visualize data to analyze it, you are exploring — looking for patterns, outliers, and hypotheses.

Famous Quote — John W. Tukey

"The greatest value of a picture is when it forces us to notice what we never expected to see." — This is the essence of exploratory data analysis (EDA).

Advantages of Visualization for Data Analysis

  • Understand large datasets faster — patterns that are invisible in a spreadsheet become obvious in a chart.
  • Capture important properties — distribution shape, outliers, trends, clusters.
  • Capture problems — visualization is a tool for quality control. Dirty data often shows up visually before you find it programmatically.
  • Facilitate new hypotheses — a chart can suggest relationships you had not thought to test.

Purpose 2 — Communication

Using visualization to communicate information to an audience. The emphasis here is on clarity, simplicity, and emotional tone. Visualization for communication incorporates simplification (removing noise) and tonal intent (the feeling you want to create in the reader).

Key Quote — Edward Tufte

"Overload, clutter, and confusion are not attributes of information — they are failures of design." If the reader is confused, the designer is at fault, not the data.

The Ultimate Goal

Regardless of purpose (analysis or communication), the ultimate goal of any visualization is to make readers feel like they have become better informed about a subject.

Mackinlay's Principle of Effectiveness

"Visualization A is more effective than B if the information conveyed by A is more readily perceived than the information in B." — Jock Mackinlay

Effectiveness is not about beauty. It is about perceptual efficiency — how fast and accurately a reader extracts the information.
03 · History & Milestones

A Brief History of Data Visualization

DataViz is not a new trend — it has existed for centuries. Understanding its history helps you appreciate how current practice evolved.

Historical Timeline

Visualization Milestones by Era
pre-1600
Maps & Diagrams. The earliest visualizations were geographic maps. Humans have been encoding spatial information visually for millennia — from ancient cave paintings to medieval cartography.
1600–1799
Theory & Metrics. The development of formal measurement systems, coordinate systems, and early statistical graphs. Scientists began plotting data points on axes to understand astronomical and physical phenomena.
1800–1974
Modern Infographics Begin. Many chart types we use today were invented at the start of this era or just before it: William Playfair introduced the line chart and bar chart in 1786 and the pie chart in 1801. John Snow's famous 1854 cholera map is a landmark of data-driven visual reasoning. Florence Nightingale's polar area diagrams influenced public health policy.
1975–Now
Computer-Aided Visualization. Catalyzed by powerful computing and a cultural shift toward transparency and data accessibility. The internet made data and visualizations broadly accessible. "Data is the new oil" (a phrase usually credited to Clive Humby and publicized by Michael Palmer in 2006). Tools like Tableau, D3.js, and Python libraries democratized visualization creation.
Why is DataViz So Important Now?

Catalyzed by two forces: (1) powerful new technological capabilities — cheap computing, cloud storage, open data; and (2) a cultural shift toward transparency and accessibility of data. As Hal Varian (Google Chief Economist) said: "The ability to take data, understand it, process it, extract value from it, visualize it, communicate it — that's going to be a hugely important skill in the next decades."

04 · Key Principles

The Four Key Principles of Data Visualization

These are the non-negotiable rules that separate good visualization from bad. Know them, apply them, and be able to explain each with an example.

Overview of the 4 Principles

Principle 01
Strive for Forms & Functions
Balance aesthetic form with practical function. Neither style without substance, nor function without beauty. Form and function should work together, not compete.
Principle 02
Justify Every Design Choice
Every visual element — shape, color, label position, interaction — must be deliberate and reasoned. Nothing should be accidental or arbitrary.
Principle 03
Create Accessibility Through Intuitive Design
Your visualization should be immediately understandable. Overload, clutter, and confusion are design failures, not information problems.
Principle 04
Never Deceive the Receiver
Visualizations can distort reality — intentionally or accidentally. Ethical visualization ensures an honest, accurate representation of the data.

Deep Dive: Principle 1 — Forms & Functions

Frank Lloyd Wright said: "Form and function should be one, joined in a spiritual union." This is the ideal for DataViz. The question is never "style or substance?" — it is always both.

Practical advice (from the lectures): When starting a project, first secure the functional aspects of the visualization (does it convey the right information accurately?), and only then explore ways to enhance its form (does it look good and engage the reader?).

Deep Dive: Principle 2 — Deliberate Design

Every single design feature in a visualization should be included for a reason:

Shape
Why a circle (pie) vs a bar? Pie charts encode part-to-whole relationships; bars encode magnitude comparisons. The choice must match the data's story.
Color Palette
Sequential vs. diverging vs. categorical palettes? Color blindness considerations? Each color choice must be justified.
Label Position
Inside the bar or outside? To the right of the point? Label placement affects readability and whether the reader even sees the annotation.
Interaction
Hover tooltips, drill-down filters, pan/zoom — every interactive feature must serve user exploration needs, not just exist for "coolness".
Amanda Cox (New York Times)

"We're so busy thinking about if we can do things, we forget to consider whether we should." — Just because a charting tool lets you add a 3D effect or an animation doesn't mean you should.

Deep Dive: Principle 3 — Accessibility Through Intuitive Design

A visualization should be usable without a manual. If your reader needs a lengthy explanation to understand the chart, the chart has failed. Intuitive design means leveraging natural human visual perception so that the message is immediately apparent.

Clutter — too many grid lines, labels, colors, and decorations — adds cognitive load without adding information. Every element you remove that adds no informational value increases the clarity of the remaining elements.

Deep Dive: Principle 4 — Never Deceive

Visualization ethics deals with the potential deception created by visual choices. Deception can be:

  • Intentional — deliberately designing a chart to mislead (e.g., a politician cherry-picking a date range to make a trend look favorable).
  • Unintentional — arising from an ineffective or inappropriate representation of data (e.g., a truncated Y-axis that makes a small difference look huge).
  • From ignorance — caused by a lack of understanding of visual perception (e.g., using area to encode a 1D value, making readers vastly over- or under-estimate).
Common Deception Patterns to Know
  • Truncated Y-axis: starting a bar chart's axis above 0 exaggerates differences, because readers compare bar lengths.
  • Area vs. length confusion: readers judge a bubble by its area, so scaling the radius to the value makes a doubled value look four times as large. Scale the area to the value instead.
  • Cherry-picked timeframes: selecting a window of data that shows a trend favorable to your argument.
  • Dual Y-axes: two unrelated scales can suggest false correlations, because the axis ranges can be adjusted until the two series appear to move together.
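
The first two patterns can be checked with simple arithmetic. The sketch below is only an illustration with made-up numbers; the helper names `drawn_ratio` and `bubble_radius` are hypothetical and do not come from the lectures.

```python
import math


def drawn_ratio(a, b, baseline=0.0):
    """Ratio of bar lengths as drawn when the axis starts at `baseline`."""
    if a <= baseline or b <= baseline:
        raise ValueError("both values must lie above the axis baseline")
    return (a - baseline) / (b - baseline)


def bubble_radius(value, area_per_unit=1.0):
    """Radius that makes the circle's AREA proportional to the value."""
    return math.sqrt(value * area_per_unit / math.pi)


# Hypothetical values: 52 vs 50 (a 4% difference in the data).
honest = drawn_ratio(52, 50, baseline=0)      # 1.04: the bars look almost equal
truncated = drawn_ratio(52, 50, baseline=48)  # 2.0: one bar looks twice as long

# Doubling the value should double the area, so the radius grows by sqrt(2), not 2.
radius_growth = bubble_radius(200) / bubble_radius(100)
print(honest, truncated, round(radius_growth, 3))
```

Comparing the drawn ratio with the true ratio is a quick self-check before publishing a bar chart with a non-zero baseline.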

Visualization Skills for the Masses (Stephen Few)

"The skills required for most effectively displaying information are not intuitive and rely largely on principles that must be learned." — This is the whole reason this course exists. Good visualization is a learned discipline, not an innate talent.

05 · Methodology

How to Build a Visualization: Two Frameworks

Both Fry's 7 Stages and Kirk's 5-Step process describe how a visualization project actually flows from data to finished product.

Framework 1 — Fry's 7 Stages of Visualizing Data (2008)

Ben Fry proposed a process model for creating data visualizations. These stages are iterative — you may loop back, skip, or re-order them depending on the project.

01
Acquire
Obtain the data from its source.
02
Parse
Structure and categorize the data.
03
Filter
Remove data that is not needed.
04
Mine
Apply statistics / data mining to find patterns.
05
Represent
Choose a visual model (bar, tree, map…).
06
Refine
Improve clarity and visual engagement.
07
Interact
Add interactivity for data exploration.

Note: These stages are often iterative and may have a flexible order or even be omitted in simple projects.
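
To make the stages concrete, here is a minimal sketch of the first five as small Python functions chained together. It is an illustration only, not Fry's own code: the dataset is invented, the function names are hypothetical, and Refine and Interact are left out because they depend on the output medium.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data standing in for a downloaded file.
RAW_CSV = """city,year,temp_c
Bandung,2023,23.5
Bandung,2024,24.1
Jakarta,2023,28.0
Jakarta,2024,28.6
Surabaya,2024,n/a
"""


def acquire():
    """Acquire: obtain the data (here, from an in-memory string)."""
    return io.StringIO(RAW_CSV)


def parse(source):
    """Parse: give the raw text structure (a list of dict rows)."""
    return list(csv.DictReader(source))


def filter_rows(rows):
    """Filter: drop rows that cannot be used (non-numeric temperature)."""
    kept = []
    for row in rows:
        try:
            kept.append({"city": row["city"], "temp_c": float(row["temp_c"])})
        except ValueError:
            continue
    return kept


def mine(rows):
    """Mine: compute a simple statistic, the mean temperature per city."""
    by_city = {}
    for row in rows:
        by_city.setdefault(row["city"], []).append(row["temp_c"])
    return {city: mean(values) for city, values in by_city.items()}


def represent(stats, width=30):
    """Represent: draw a text bar chart whose bars all start at zero."""
    top = max(stats.values())
    lines = []
    for city, value in sorted(stats.items()):
        bar = "#" * round(value / top * width)
        lines.append(f"{city:<10} {bar} {value:.1f}")
    return "\n".join(lines)


stats = mine(filter_rows(parse(acquire())))
print(represent(stats))
```

Keeping each stage as its own function makes it easy to loop back, for example to change the filter rule, without touching the other stages.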

Framework 2 — Andy Kirk's 5-Step Methodology (2012)

This is the primary framework used throughout the course. It is more project-management-oriented than Fry's model.

Step 1
Purpose & Parameters
Define why, for whom, and under what constraints.
Step 2
Prepare & Explore Data
Acquire, clean, understand, and analyze your data.
Step 3
Formulate Questions
Identify the key questions your viz should answer.
Step 4
Design Concepting
Sketch and prototype visual solutions.
Step 5
Construct & Launch
Build, test, and publish the final visualization.
06 · Step 1: Purpose & Parameters

Visualization Function, Tone, Factors & Users

The first and most critical step in any visualization project. Get this wrong and everything downstream is misaligned.

Clarifying the Purpose: Two Questions

  1. The reason for existing — What triggered this project? What is its scope and context? How much creative control do you have?
  2. The intended effect — What should the reader think, feel, or do after seeing this visualization?

Establishing Intent: Visualization Function

Every visualization has one of three primary functions. This is a fundamental classification you must know for the exam:

1. Explanatory

Goal: Convey a specific narrative to the reader.

What it is: Based around a focused story. You already know what the key finding is, and you design the chart to communicate it clearly.

Examples: A corporate dashboard showing key performance figures; a newspaper infographic explaining the complexities of an economic crisis.

→ More about visual presentation of data.

2. Exploratory

Goal: Provide an interface for the user to explore the data themselves.

What it is: Lacks a single, predetermined narrative. The user drives the exploration and finds their own insights.

Examples: A scatterplot matrix for multivariate correlation exploration; interactive dashboards with filters, brushing, and sorting.

→ More about visual analysis of data.

3. Exhibition / Data Art

Goal: Express or exhibit data as an aesthetic or emotional experience.

What it is: The intent is removed from a pure desire to inform. Data becomes the raw material for artistic self-expression.

Examples: A visualization of all adjectives in a novel; artistic renderings of city heartbeat data.

→ More about form and aesthetic than information transfer.

Explanatory vs. Exploratory: Detailed Comparison

| Dimension | Explanatory | Exploratory |
| --- | --- | --- |
| Narrative | Based around a specific, focused narrative | Lacks a single specific narrative |
| Focus | Visual presentation of data | Visual analysis of data |
| Designer role | Creates a clear portrayal of interesting stories from the dataset | Builds a tool for users to seek personal discoveries and patterns |
| Finding | One specific finding defined beforehand | Opens up possibility for chance/serendipitous findings |
| Interactivity | Usually static or minimal | Usually highly interactive (filter, sort, brush, zoom) |

Establishing Intent: Visualization Tone

Tone is about the type of stimulus or desired emotional response you are trying to create in your reader. There are two ends of a spectrum:

Pragmatic / Analytical Tone

The reader reacts analytically. They read values, compare numbers, track trends. Emotions stay low — unless the data reveals something alarming.

Example: "We need a chart to help monitor our quarterly sales performance."

→ Think: corporate dashboards, scientific reports, financial charts.

Emotive / Abstract Tone

The goal is a personal, impactful experience. Abstract or artistic visual choices are used to create feeling, not just to transfer data.

Example: "We need to present this in a way that persuades people to care." (Chris Jordan: "I fear we aren't feeling enough to digest these huge numbers.")

→ Think: data journalism, advocacy visualizations, data art.

Emotive Tone Note

In emotive/abstract visualizations, you sometimes move beyond bars and straight lines toward curves, circles, and organic shapes. Abstract tone is more about creating an aesthetic that portrays a general sense of the data's story — you might not be able to read exact values, but the visual impression carries the message.

Key Factors Surrounding a Visualization Project

Beyond intent, every project is shaped by real-world constraints, listed below. (The related "8 hats" concept refers to the many roles, or hats, that a DataViz designer must take on.)

The Aim
What is the specific goal of this project? Broad enough to guide creativity, specific enough to evaluate success.
Time Pressures
Deadlines constrain how deep the analysis and refinement can go. Know when "good enough" is good enough.
Costs
Budget affects tooling, data acquisition, and team size. A custom interactive D3.js build costs more than a Tableau screenshot.
Client Pressures
Client preferences, organizational culture, and politics all shape what you can and cannot do.
Format
Is this a static PDF report, an interactive web dashboard, a slide in a PowerPoint, or a printed poster? Format determines design choices.
Technical Capabilities
What tools and skills are available? A beautiful D3.js visualization is worthless if no one on the team can build or maintain it.

Understanding the Users

Visualizations are always made for someone. The user context fundamentally changes the design. Know these five common user environments:

| User Context | Characteristics | Design Implications |
| --- | --- | --- |
| Boardroom | Executives, high-stakes decisions, limited time | Simple, fast-reading summaries. Highlight the single most important number. High contrast. |
| One-to-One Exchange | Manager or analyst with a peer | More detail acceptable. Can support conversation and questions. |
| Large Range of Customers | Diverse backgrounds, variable expertise | Must work across knowledge levels. Clear labels, plain language. Avoid jargon. |
| Global Audience | Cross-cultural, multilingual | Mind color meanings (red ≠ danger universally), symbols, language, numeric formats. |
| Personal / Self | You are the only audience | Function over form. Quick EDA charts. No need for polished presentation. |

User-Centered Design (UCD)

Good visualization design starts with understanding the user. The four UCD tools you should know:

User Persona
A fictional but research-based profile of a typical user — their job, goals, pain points, and technical literacy. Grounds design decisions in real human needs.
User Stories
"As a [type of user], I want to [do something] so that [I achieve a goal]." Breaks down user needs into concrete, testable requirements.
User Scenario
A narrative description of how a specific user accomplishes a specific goal with the visualization in a realistic context.
Empathy Map
A canvas that captures what the user says, thinks, does, and feels. Helps designers build empathy and uncover unstated needs.

Physical & Cognitive Characteristics of the User

Physical Capabilities
  • Color perception: ~8% of men have color vision deficiency. Never rely on color alone to encode information.
  • Ergonomics: Screen size, viewing distance, and input device (mouse vs. touch) affect usability.
  • Visual contrast: Low contrast is problematic for older users and those with vision impairments.
Cognitive Characteristics
  • Attention & memory: Working memory is limited. A cluttered chart forces the user to use cognitive resources on navigation rather than insight.
  • Recognition over recall: Users recognize familiar patterns faster than they recall abstract information. Use conventions.
  • Cognitive biases: Anchoring bias (first number seen anchors all other comparisons), confirmation bias (users seek evidence supporting existing beliefs).
  • Change blindness: Significant visual changes in dynamic visualizations can go unnoticed if not properly highlighted.

User Research Methods

How do you find out what your users need? User research methods are mapped across two dimensions:

Attitudinal vs. Behavioral

Attitudinal: What people say (surveys, interviews). Useful for stated preferences and opinions.

Behavioral: What people do (usability testing, analytics). Reveals actual behavior, which often differs from stated preferences.

Qualitative vs. Quantitative

Qualitative: More effective at revealing why — deep insights from small samples (interviews, usability sessions).

Quantitative: Shows what is happening and how much — statistical patterns from large samples (surveys, A/B tests, analytics).

Collaboration & Communication Contexts

Visualizations are often used in shared, multi-user settings:

  • Synchronous Communication: Real-time collaboration — live dashboards in meetings, conferencing, online games. Design for simultaneous group viewing.
  • Asynchronous Communication: Reports, email, social media — users interact at different times. Design must be fully self-explanatory without a presenter.
07 · Step 2: Prepare & Explore Data

Data Preparation: Editorial Focus & the 6 Mechanisms

Data preparation is typically the most time-consuming and intensive activity in any visualization project. Get it right — everything downstream depends on it.

A. Editorial Focus

Editorial focus is the story you want to tell through the visualization — the main narrative or message you want to emphasize to the reader. It determines the direction and goal of the visualization, not just what data to display.

Key Question for Editorial Focus

"What topic or question do I want readers to have answered after seeing this visualization?"

If you cannot answer this in one sentence, your editorial focus is not clear enough yet.

Why Do You Need Editorial Focus?

  • Ensures clarity — guarantees the visualization communicates a clear message.
  • Guides design decisions — determines what to emphasize and what to omit.
  • Prevents information overload — stops you from adding "everything" to a single chart.
  • Delivers the right insight — helps surface the finding that actually matters.
Classic Example

Without editorial focus: Show all products, all regions, all metrics in one chart → information overload.
With editorial focus (goal: show sales decline after 2023): Show only total sales per year, highlight 2023–2024, add annotation explaining the cause.

Many widely praised data visualizations, such as those from the New York Times, The Guardian, and National Geographic, succeed largely because of strong editorial focus. They do not dump data; they tell a specific, focused story.

B. Preparing & Familiarizing with Data

Data is the primary raw material. Without good data, there is no compelling story to tell. A strong visualization always starts from strong data. Datasets with errors or missing values do not just slow down analysis — they can corrupt the message you are trying to deliver.

The 6 Mechanisms of Data Preparation

Andy Kirk's 6-Step Data Preparation Process
01
Acquisition — Obtain your data. Sources include: a colleague or client, a download from an organizational system, manual data collection, web API extraction, web scraping, PDF extraction, and more.

⚠️ Ethical Concerns in Acquisition: Ensure data is (1) obtained ethically and responsibly; (2) legally compliant with relevant regulations; (3) respects privacy and confidentiality of sensitive data; (4) used according to its license — especially if publishing or monetizing.

02
Examination: Determine your confidence in the data. Use tools (Excel, Tableau, OpenRefine, formerly Google Refine) to scan, filter, sort, and search the dataset to establish its quality. Examination covers two dimensions:

Completeness — Is it all there?

  • Does it have all the categories needed?
  • Does it cover the full time period needed?
  • Are all expected fields/variables present?
  • Does it contain the expected number of records?

Quality — Is it clean?

  • Are there errors or incorrect values?
  • Unexplained classifications or coding conventions?
  • Formatting issues (unusual dates, weird ASCII characters)?
  • Missing items or incomplete records?
  • Duplicate rows?
  • Accuracy issues — does the data appear plausible?
  • Unusual values or obvious outliers that need investigation?
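
Several of the completeness and quality questions above can be answered with a few lines of code before opening any charting tool. The sketch below is one possible way to do it, using only the Python standard library; the records, the field names, and the `examine` helper are hypothetical.

```python
from collections import Counter

# Hypothetical records; None marks a missing value.
records = [
    {"id": 1, "region": "West", "sales": 120.0},
    {"id": 2, "region": "East", "sales": None},
    {"id": 3, "region": "west", "sales": 95.0},
    {"id": 3, "region": "west", "sales": 95.0},    # exact duplicate row
    {"id": 4, "region": "North", "sales": -40.0},  # implausible negative value
]
EXPECTED_FIELDS = {"id", "region", "sales"}


def examine(rows, expected_fields):
    """Return a small report on completeness and quality."""
    missing_fields = [sorted(expected_fields - set(row)) for row in rows]
    row_counts = Counter(tuple(sorted(row.items())) for row in rows)
    return {
        "record_count": len(rows),
        "rows_lacking_fields": sum(1 for m in missing_fields if m),
        "missing_values": sum(1 for row in rows for v in row.values() if v is None),
        "duplicate_rows": sum(n - 1 for n in row_counts.values() if n > 1),
        "negative_sales": [row["id"] for row in rows
                           if row.get("sales") is not None and row["sales"] < 0],
        "region_spellings": sorted({row["region"] for row in rows if "region" in row}),
    }


report = examine(records, EXPECTED_FIELDS)
for check, result in report.items():
    print(f"{check}: {result}")
```

Listing the distinct spellings of a category ("West" vs "west") is a cheap way to spot unexplained coding conventions; whether a negative sale is an error or a refund still needs a human decision.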
03
Understand Data Types — Know the fundamental structure of your variables. The type of data determines which chart is valid, which statistics are meaningful, and which visual encodings are appropriate.
04
Transforming for Quality — Clean the data by resolving errors found in examination: removing duplicates, filling/handling missing data, cleaning erroneous values, and standardizing formats (dates, string encoding, etc.).
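
A minimal sketch of this cleaning step follows, assuming the examination found mixed date formats, one duplicate, and one missing amount. The data, the two date formats, and the helper names are assumptions for illustration; missing amounts are kept as None here, since whether to fill or drop them depends on the project.

```python
from datetime import datetime

# Hypothetical messy rows: mixed date formats, a duplicate, and a missing amount.
raw_rows = [
    {"date": "2024-01-05", "amount": "100"},
    {"date": "05/01/2024", "amount": "100"},  # same day, written as DD/MM/YYYY
    {"date": "2024-01-06", "amount": ""},     # missing amount
    {"date": "2024-01-07", "amount": "130"},
]
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")  # assumed formats for this example


def standardize_date(text):
    """Return the date as ISO 8601 text, trying each assumed format in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def clean(rows):
    """Standardize formats, mark missing amounts as None, and drop exact duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        amount = float(row["amount"]) if row["amount"].strip() else None
        record = (standardize_date(row["date"]), amount)
        if record in seen:
            continue  # duplicate once the formats have been standardized
        seen.add(record)
        cleaned.append({"date": record[0], "amount": amount})
    return cleaned


cleaned_rows = clean(raw_rows)
print(cleaned_rows)
```

Note that the duplicate only becomes visible after the dates are standardized, which is why format cleaning comes before de-duplication in this sketch.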
05
Transforming for Analysis — Prepare and refine data for analysis and presentation:
  • Parsing: Split up variables (e.g., extract year from a full date string).
  • Merging: Combine variables into new ones (e.g., first name + surname → full name).
  • Converting: Turn qualitative/free-text data into coded values or keywords.
  • Deriving: Create new values from existing ones (e.g., derive gender from title, sentiment score from text).
  • Calculating: Create new metrics (e.g., percentage proportions, ratios, moving averages).
  • Removing redundancy: Drop variables you have no planned use for in the visualization.
  • Determining resolution: Decide how granular to show the data (see Resolution Options below).
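
Four of these transformations (parsing, merging, calculating, and removing redundancy) are shown in the sketch below. The order records and field names are invented for the example, and this is only one of many ways to write these steps.

```python
from datetime import date

# Hypothetical order records.
orders = [
    {"first": "Ana", "last": "Putri", "ordered": "2023-11-02", "amount": 40.0},
    {"first": "Budi", "last": "Santoso", "ordered": "2024-02-14", "amount": 60.0},
    {"first": "Citra", "last": "Dewi", "ordered": "2024-03-09", "amount": 100.0},
]
total = sum(o["amount"] for o in orders)

prepared = []
for o in orders:
    prepared.append({
        # Parsing: pull the year out of the full date string.
        "year": date.fromisoformat(o["ordered"]).year,
        # Merging: combine two variables into one.
        "customer": f'{o["first"]} {o["last"]}',
        # Calculating: each order's share of the total, in percent.
        "share_pct": round(100 * o["amount"] / total, 1),
    })
    # Removing redundancy: "first", "last", and "ordered" are not carried over.

for row in prepared:
    print(row)
```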
06
Consolidating — Even after preparation, there may be gaps. Additional layers of data may need to be combined with the existing dataset — for supplementary calculations, context, or to enhance the scope of communication. Pro tip: Always consider whether you need additional data to help frame the story before you begin designing.

Data Types You Must Know

Understanding your data type is not academic trivia — it determines which chart types are valid and which statistics are meaningful.

| Type | Subtype | Description | Examples |
| --- | --- | --- | --- |
| Categorical | Nominal | Named groups with no inherent order. You can count and compare frequencies, but not rank or calculate averages. | Countries, gender, product category, text labels |
| Categorical | Ordinal | Named groups with a meaningful order, but the gaps between levels are not uniform or measurable. | Olympic medals (Gold/Silver/Bronze), Likert scale (Strongly Agree → Strongly Disagree), education level |
| Quantitative | Interval Scale | Numeric values where differences are meaningful, but there is no true zero. Ratios are meaningless. | Temperature in °C or °F, calendar dates (year 0 is arbitrary) |
| Quantitative | Ratio Scale | Numeric values with a true absolute zero. All arithmetic operations are valid. Ratios are meaningful. | Prices, age, distance, weight, speed, count of items |
Interval vs. Ratio: The Key Difference

20°C is not "twice as hot" as 10°C: 0°C is an arbitrary reference point (the freezing point of water) rather than an absence of heat, so ratios of Celsius values are meaningless. But $20 is twice as much as $10, because money has a true zero. This affects what calculations and visual encodings are appropriate.
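
A quick calculation shows why the Celsius ratio misleads. Converting to Kelvin, a scale whose zero is absolute zero, gives the physically meaningful ratio. The function name below is just an illustrative helper.

```python
def celsius_to_kelvin(celsius):
    """Convert to Kelvin, a temperature scale whose zero is absolute zero."""
    return celsius + 273.15


naive_ratio = 20 / 10                                         # 2.0, but not meaningful
kelvin_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)  # about 1.035
difference = 20 - 10                                          # differences remain valid

print(naive_ratio, round(kelvin_ratio, 3), difference)
```

On the ratio-scale version of the same quantity, 20°C is only about 3.5% "more" than 10°C, while the 10-degree difference is valid on both scales.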

Resolution Options — At What Level of Detail?

One of the most important decisions in data preparation is choosing the level of resolution at which to present the data. Showing too much detail creates visual noise; too little hides important patterns.

Five Resolution Strategies
Full Resolution: Plot every available data point as an individual mark. Best for small datasets or when individual data points tell important stories (e.g., each patient in a clinical trial).
Filtered Resolution: Exclude records based on specific criteria (e.g., show only transactions above $1,000; show only data from 2020–2024). Focuses attention on the relevant subset.
Aggregate Resolution: "Roll up" the data by a dimension such as month, year, category, or region. Instead of 365 daily data points, show 12 monthly averages. Reveals macro-trends that individual points obscure.
Sample Resolution: Apply mathematical selection rules to extract a fraction of potential data. Useful when datasets are so large (billions of records) that plotting all points would be computationally impossible or visually unreadable.
Headline Resolution: Show only the overall statistical totals: a single number, a ratio, or a summary metric. Maximum simplification. Best for executive dashboards or quick key performance indicators.
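
The five strategies can be compared side by side on one small dataset. The sketch below uses invented daily sales figures; simple random sampling with a fixed seed stands in for the "mathematical selection rules" of sample resolution, which is an assumption, since other sampling rules are equally valid.

```python
import random
from collections import defaultdict
from statistics import mean

# Hypothetical daily sales: (ISO date, amount).
daily = [
    ("2024-01-03", 900), ("2024-01-17", 1500), ("2024-01-29", 1200),
    ("2024-02-05", 700), ("2024-02-20", 1300), ("2024-03-11", 2000),
]

# Full: every point is kept.
full = daily

# Filtered: only amounts above 1,000.
filtered = [(d, v) for d, v in daily if v > 1000]

# Aggregate: roll daily values up to a monthly average, keyed by "YYYY-MM".
by_month = defaultdict(list)
for d, v in daily:
    by_month[d[:7]].append(v)
aggregate = {month: mean(vals) for month, vals in by_month.items()}

# Sample: a random subset, with a fixed seed so the result is repeatable.
rng = random.Random(42)
sample = rng.sample(daily, k=3)

# Headline: a single overall total.
headline = sum(v for _, v in daily)

print(len(full), len(filtered), aggregate, len(sample), headline)
```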
08 · Glossary

Complete Term Reference

Every key term from the lectures defined clearly. Great for last-minute review.

| Category | Term | Definition |
| --- | --- | --- |
| DataViz | Data Visualization | The use of computer-supported, interactive, visual representations of data to amplify cognition. |
| InfoViz | Information Visualization | Like DataViz but specifically applied to abstract data (not raw numerical data). |
| Core Concept | Representation | The choice of physical/visual forms used to encode data (shape, position, length, color, area, etc.). |
| Core Concept | Presentation | How the data representation is integrated into the complete communicated work, including layout, color, annotations, and interactivity. |
| Core Concept | Amplify Cognition | Maximizing how efficiently and effectively humans process visual information into thought, insights, and knowledge. |
| Principles | Forms & Functions | Principle 1: Aesthetic form and practical function must work together, not against each other. |
| Principles | Deliberate Design | Principle 2: Every visual element (shape, color, label, interaction) is included for a specific, reasoned purpose. |
| Principles | Intuitive Design | Principle 3: Designing for accessibility such that the visualization is immediately understandable without training or manuals. |
| Principles | Visualization Ethics | Principle 4: The responsibility to never deceive the receiver, whether intentionally or through poor design choices. |
| Gestalt | Gestalt Laws | Psychological principles describing how humans naturally group and perceive visual elements (proximity, similarity, continuity, closure, figure/ground). |
| Methodology | Fry's 7 Stages | Acquire → Parse → Filter → Mine → Represent → Refine → Interact. An iterative process model for creating data visualizations. |
| Methodology | Kirk's 5 Steps | Purpose & Parameters → Prepare & Explore → Formulate Questions → Design Concepting → Construct & Launch. The primary course methodology. |
| Function | Explanatory Viz | A visualization built around a specific, predetermined narrative to convey to the reader. |
| Function | Exploratory Viz | A visualization designed as a tool for the user to conduct their own visual analysis, without a pre-set narrative. |
| Function | Exhibition / Data Art | Visualization where the intent is aesthetic self-expression or emotional impact through data, not information transfer. |
| Tone | Pragmatic Tone | A visualization designed for analytical, data-reading behavior. The reader extracts values and compares numbers. |
| Tone | Emotive/Abstract Tone | A visualization designed to create a personal emotional experience or visceral impact, often using non-standard visual forms. |
| Data Prep | Editorial Focus | The specific narrative or message you want to communicate; determines what data to show, highlight, and omit. |
| Data Prep | Data Acquisition | The process of obtaining data from its source, including ethical, legal, and privacy considerations. |
| Data Prep | Data Examination | Assessing data completeness (is it all there?) and quality (is it clean and accurate?). |
| Data Types | Nominal Data | Categorical data with no inherent order (e.g., country, gender, product type). |
| Data Types | Ordinal Data | Categorical data with a meaningful order, but non-uniform gaps between levels (e.g., Likert scale, Olympic medals). |
| Data Types | Interval Data | Numeric data with meaningful differences but no true zero. Ratios between values are not meaningful (e.g., temperature in °C). |
| Data Types | Ratio Data | Numeric data with a true absolute zero. All arithmetic operations are valid (e.g., price, age, distance). |
| Resolution | Full Resolution | Showing every individual data point as a mark in the visualization. |
| Resolution | Aggregate Resolution | Rolling up data to a coarser level (e.g., monthly totals instead of daily records). |
| Resolution | Headline Resolution | Showing only the highest-level statistical summary (a single number or ratio). |
| UCD | User Persona | A fictional but research-grounded profile of a target user, used to guide design decisions. |
UCD Empathy Map — A canvas capturing what users say, think, do, and feel — used to build empathy and surface unstated needs.
Cognitive Change Blindness — The tendency for significant visual changes to go unnoticed, especially relevant in dynamic/animated visualizations.
Cognitive Anchoring Bias — The tendency for the first number encountered to disproportionately influence all subsequent judgments.
Cognitive Confirmation Bias — The tendency to interpret data in ways that confirm pre-existing beliefs. Designers must account for this in audience behavior.
01 · Design Concepting

Two Dimensions of Visualization Design

Step 4 of Kirk's methodology — Design Concepting — is where all earlier work (purpose, data prep, questions) gets translated into a visual artifact. Every design decision lives in one of two dimensions.

Where Does This Fit?

Kirk's 5-step methodology: Purpose & Parameters → Prepare & Explore Data → Formulate Questions → Design Concepting (you are here) → Construct & Launch.

Design Concepting asks: How do we give form to our data? The answer lives across two dimensions:

Data Representation
How we give form to our data through the use of "visual variables" to construct chart or graph types.

Think: What kind of chart? What shape encodes the data?

Covered by the 5 representation methods and all the chart types within them.
Data Presentation
The delivery format, appearance, and synthesis of the entire design. Concerns the layers of: color use, interactivity, annotation, and the arrangement of all elements.

Think: How does the whole thing look, feel, and behave?

Covered by color, interactivity, annotation, and architecture.
Data Representation Step: Two Sub-decisions

1. Choose the correct visualization method (which of the 5 categories does your data need?)
2. Choose the appropriate chart type within that method (which specific chart best fits the data and purpose?)

02 · The 5 Representation Methods

Five Categories of Visualization

In this framework, every chart type is assigned to one of these five purposes. Knowing which category your data falls into is the first decision in representation.

# | Method | What It Does | Classic Example
1 | Comparing Categorical Values | Facilitate comparisons between the relative and absolute sizes of categorical values. | Bar Chart
2 | Assessing Hierarchies & Part-of-a-Whole | Show a breakdown of categorical values in relationship to a population, or as elements of hierarchical structures. | Pie Chart
3 | Showing Changes over Time | Exploit temporal data to show changing trends and patterns of values over a continuous time frame. | Line Chart
4 | Plotting Connections & Relationships | Assess associations, distributions, and patterns between multivariate datasets. Usually facilitates exploratory analysis. | Scatter Plot
5 | Mapping Geo-Spatial Data | Plot and present datasets with geo-spatial properties. | Choropleth Map
03 · Method 1

Comparing Categorical Values

Charts in this group allow you to compare the size, frequency, or magnitude of distinct categories against each other.

Method 1 · Comparing Categories
Chart Type | Data Variables | Visual Variables | What You Get / When to Use
Bar / Column Chart | 1 categorical + 1 quantitative | Height / Length, Position | The workhorse of comparison. Compares magnitudes across discrete categories. Bars = horizontal, Columns = vertical.
Floating Bar (Gantt Chart) | 1 categorical-nominal + 2 quantitative | Position, Length | Shows a range of quantitative values per category (bar stretches from min to max, not from zero). Reveals variation, overlap, and outliers across categories.
Pixelated Bar Chart | Multiple categorical + 1 quantitative | Height, Color-hue, Symbol | Two levels of resolution in one: global bar chart view (aggregate) + detail view inside each bar (pixels/symbols). Usually interactive: hover a pixel for precise detail.
Histogram | 1 quantitative-interval + 1 quantitative-ratio | Height, Width | Shows frequency distribution of a continuous quantitative variable over binned intervals. Key difference from bar chart: no gaps between bars; used for continuous data, not categorical.
Slopegraph (Bumps / Table Chart) | 1 categorical + 2 quantitative | Position, Connection, Color-hue | Compares two (or more) quantitative values linked to the same categories. Perfect for before–after or two-point-in-time comparisons. The slope direction and steepness encode change.
Radial / Circular Bar Chart | Multiple categorical + 1 categorical-ordinal | Position, Color-hue, Color-saturation, Texture | Displays changes over time (each ring = time period), proportional comparisons, and multi-category compositions. Good for overview and pattern detection; not for reading precise values.
Glyph Chart | Multiple categorical + multiple quantitative | Shape, Size, Position, Color-hue | Uses a repeated shape (e.g., a flower) where each part encodes a variable. Not for precise reading; meant for relative comparisons (big, medium, small). Usually interactive for exploration.
Sankey Diagram | Multiple categorical + multiple quantitative | Height, Position, Link, Width, Color-hue | Shows flow: how quantities move from one stage to another through connecting ribbons. Ribbon width = magnitude of flow. Best for multi-stage processes or transformations.
Area Size Chart (Bubble / Circle) | 1 categorical + 1 quantitative-ratio | Area, Color-hue | Uses circle area to represent magnitude. Often used to emphasize stark inequality between categories. Area is less accurately perceived than length, so use with caution for precise comparison.
Small Multiples (Trellis Chart) | Multiple categorical + multiple quantitative | Position + any visual variable | A grid of small identical charts, each showing one subset of the data. Exploits the eye's ability to quickly scan and compare many similar charts simultaneously. Best for many categories or time-series comparisons.
Word Cloud | 1 categorical + 1 quantitative-ratio | Size | Font size encodes word frequency. Color is usually decorative only. Good for early exploratory text analysis to find key terms, not for precise frequency comparison. Requires good text preprocessing.
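The histogram row above depends on binning a continuous variable into intervals before any bars are drawn. A minimal Python sketch of that binning step follows; the function name, the half-open bin convention, and the exam scores are assumptions made for illustration.

```python
def histogram_counts(values, bin_start, bin_width, n_bins):
    """Count how many values fall into each of n_bins equal-width intervals.

    Bins are half-open [lo, hi), except the last bin, which also includes
    its upper edge so the maximum value is not dropped.
    """
    counts = [0] * n_bins
    upper = bin_start + bin_width * n_bins
    for v in values:
        if v < bin_start or v > upper:
            continue  # outside the binned range
        idx = int((v - bin_start) // bin_width)
        idx = min(idx, n_bins - 1)  # put v == upper into the last bin
        counts[idx] += 1
    return counts

# Hypothetical exam scores (a continuous quantitative variable).
scores = [52, 58, 61, 65, 67, 70, 71, 74, 78, 83, 85, 91, 100]
counts = histogram_counts(scores, bin_start=50, bin_width=10, n_bins=5)
```

Because the bins tile the axis with no gaps, the resulting bars touch each other, which is the visual difference from a bar chart of separate categories.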
04 · Method 2

Assessing Hierarchies & Part-of-a-Whole

These charts show how a total is broken down into constituent parts, or how elements are nested within larger structural hierarchies.

Method 2 · Hierarchies & Part-of-a-Whole
Chart Type | Data Variables | Visual Variables | What You Get / Notes
Pie Chart | 1 categorical + 1 quantitative-ratio | Angle, Area, Color-hue | Often criticized because angles and areas are harder to compare accurately than length or position. Problems arise from misuse: too many categories, 3D effects, disorganized slices. Best practice: max 3 categories, start first slice at vertical, arrange logically.
Stacked Bar Chart | 2 categorical + 1 quantitative-ratio | Length, Color-hue, Position, Color-saturation | Shows composition of categories using color + position. Can use absolute or normalized values. Weakness: inner segments are hard to compare accurately because they lack a shared baseline. Use ordinal ordering for ordinal data (e.g., sentiment: disagree → agree).
Square Pie / Waffle Chart / Unit Chart | 1 categorical + 1 quantitative-ratio | Position, Color-hue / Symbol | More accurate than pie/donut because it uses grid areas (e.g., 100 squares = 100%). Small parts remain visible and distinguishable. Stays clean even with multiple categories. Good for percentage-based narratives.
Treemap | Multiple categorical-nominal + 1 quantitative-ratio | Area, Position, Color-hue, Color-saturation | Nested rectangles where area encodes magnitude. Great for showing large hierarchical datasets in a compact space. Color can encode a second dimension (e.g., growth rate). Area comparison is less precise than length.
Circle Packing Diagram | 2 categorical + 1 quantitative-ratio | Area, Color-hue, Position | Many circles packed inside a large circle. Each circle = a category; size = quantitative value; color/position = hierarchy or grouping. Not for precise reading; meant for seeing relative scale and groupings.
Bubble Hierarchy | Multiple categorical + 1 quantitative-ratio | Area, Position, Color-hue | Similar to circle packing but with a more explicit hierarchical arrangement. Bubbles of different sizes grouped by category.
Tree Hierarchy (Dendrogram) | 2 categorical + 1 quantitative-ratio | Angle/Area, Position, Color-hue | A tree-shaped diagram showing parent–child relationships. The branching structure makes hierarchical relationships and depth immediately visible.
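For the waffle chart row above (100 squares = 100%), the shares have to be rounded to whole squares without losing or gaining a cell. One way to do this, sketched below, is the largest-remainder method; that choice, the function name, and the survey counts are assumptions for illustration rather than something prescribed in the lecture.

```python
def waffle_cells(values, total_cells=100):
    """Allocate grid cells to categories in proportion to their values.

    Uses the largest-remainder method so the cells always sum to total_cells.
    """
    total = sum(values.values())
    exact = {k: v / total * total_cells for k, v in values.items()}
    cells = {k: int(x) for k, x in exact.items()}  # floor each share
    leftover = total_cells - sum(cells.values())
    # Hand the remaining cells to the largest fractional remainders.
    by_remainder = sorted(exact, key=lambda k: exact[k] - cells[k], reverse=True)
    for k in by_remainder[:leftover]:
        cells[k] += 1
    return cells

shares = {"Agree": 45, "Neutral": 35, "Disagree": 10}  # hypothetical counts
grid = waffle_cells(shares)
```

Plain rounding of each share could produce 99 or 101 squares in total; allocating the leftover cells explicitly keeps the grid complete.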
Pie Chart Best Practices (Exam-Likely)

Start the first slice from the 12 o'clock (vertical) position as a reference. Ideally, limit the chart to at most 3 categories. Order segments logically. Avoid 3D effects, too many colors, or decorations. If you have more than 3 categories, use a bar chart instead.
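These practices can be expressed as a small Python sketch that computes slice angles clockwise from 12 o'clock and refuses to build a pie with too many categories. The function name, the largest-first ordering (one possible "logical" order), and the sample values are assumptions.

```python
MAX_PIE_CATEGORIES = 3  # limit suggested in these notes

def pie_slices(values):
    """Return (label, start_deg, end_deg) tuples, measured clockwise from 12 o'clock.

    Slices are ordered largest-first, which is one possible "logical" order.
    Raises ValueError when there are too many categories for a pie chart.
    """
    if len(values) > MAX_PIE_CATEGORIES:
        raise ValueError("too many categories for a pie chart; consider a bar chart")
    total = sum(values.values())
    slices, start = [], 0.0
    for label, v in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        end = start + 360.0 * v / total
        slices.append((label, start, end))
        start = end
    return slices

slices = pie_slices({"A": 50, "B": 25, "C": 25})
```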

05 · Method 3

Showing Changes over Time

Temporal charts exploit time as the primary axis, revealing trends, cycles, and changes in value across a continuous time frame.

Method 3 · Changes over Time
Chart Type | Data Variables | Visual Variables | What You Get / Notes
Line Chart | 1 quant-interval (time) + 1 quant-ratio + 1 categorical | Position, Slope, Color-hue | The fundamental temporal chart. Slope encodes rate of change. Multiple lines can compare categories over time. Y-axis does not need to start at 0 for line charts.
Sparklines | 1 quant-interval + 1 quant-ratio | Position, Slope | Line charts in miniature: Edward Tufte's "intense, word-sized graphics." Not a new chart type, just a very small line chart. Ideal for embedding trend context inside tables or dashboards where space is precious.
Area Chart | 1 quant-interval + 1 categorical + 1 quant-ratio | Height, Slope, Area, Color-hue | Like a line chart but with the area below the line filled. The filled area emphasizes cumulative volume. Important: the Y-axis must start at zero, because the area encoding (unlike line slope) requires a true baseline for accurate interpretation.
Horizon Chart | 1 quant-interval + 1 categorical + 2 quant-ratio | Height, Slope, Area, Color-hue, Color-saturation | A modified area chart that folds negative values upward and uses color to distinguish positive/negative. Allows many time series to be stacked vertically in very little space, enabling cross-series pattern comparison.
Stacked Area Chart | 1 quant-interval + 1 categorical + 1 quant-ratio | Height, Area, Color-hue | Multiple area charts stacked on top of each other. Shows how the composition of categories changes over time. Weakness: middle bands are hard to read accurately because they lack a shared baseline.
Stream Graph | 1 quant-interval + 1 categorical + 1 quant-ratio | Height, Area, Color-hue | Like a stacked area chart but without a baseline: layers flow organically around a central axis. Emphasizes peaks and troughs ("ebb and flow") over time. Not for reading precise values. Aesthetic and organic feel.
Candlestick Chart | 1 quant-interval + 4 quant-ratio | Position, Height, Color-hue | Used in financial data. Shows OHLC (Open, High, Low, Close) for each time period. Bar height = range from open to close; color = price up or down; wicks = high/low range. Conceptually similar to a boxplot.
Barcode Chart | 1 quant-interval + 3 categorical | Position, Symbol, Color-hue | A very compact visualization of event sequences over time using symbols and color. Similar space-efficiency to sparklines. Requires some familiarity to read, but packs a rich story in minimal space.
Flow Map | Multiple quant-interval + 1 categorical + 1 quant-ratio | Position, Height/Width, Color-hue | Like a Sankey diagram, but for change over time and/or location. Famous example: Napoleon's 1812 Russian campaign (Minard's map) where ribbon width = troops remaining. Geo-positions are roughly followed but the map is not fully detailed.
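The candlestick row above packs four numbers into one glyph. As a rough Python sketch, assuming a period's prices arrive as an ordered list (the function names and sample prices are hypothetical), the OHLC summary and the drawable parts of a candle could be derived like this:

```python
def ohlc(prices):
    """Summarize an ordered list of prices for one period as OHLC."""
    if not prices:
        raise ValueError("need at least one price")
    return {
        "open": prices[0],
        "high": max(prices),
        "low": min(prices),
        "close": prices[-1],
    }

def candle(prices):
    """Derive the drawable parts of one candlestick from a period's prices."""
    o = ohlc(prices)
    return {
        "body": (min(o["open"], o["close"]), max(o["open"], o["close"])),
        "wick": (o["low"], o["high"]),
        "direction": "up" if o["close"] >= o["open"] else "down",
    }

day = [101.0, 104.5, 99.0, 103.0]  # hypothetical intraday prices, in order
one_candle = candle(day)
```

The body spans open to close, the wick spans low to high, and the direction would drive the color.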
Area Chart vs Line Chart: Y-Axis Rule

A line chart can have a Y-axis that does not start at zero — the slope carries the meaning. An area chart must have its Y-axis start at zero, because the filled area is what readers judge, and a truncated area creates false impressions of magnitude.
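A small Python sketch can make the rule concrete. It assumes non-negative data; the function names, the padding value, and the sample numbers are illustrative only.

```python
def y_axis_limits(values, chart_type, padding=0.05):
    """Suggest (y_min, y_max) for non-negative data.

    Area charts are anchored at zero; line charts may zoom in on the data.
    """
    lo, hi = min(values), max(values)
    pad = (hi - lo) * padding
    if chart_type == "area":
        return (0.0, hi + pad)
    if chart_type == "line":
        return (lo - pad, hi + pad)
    raise ValueError(f"unknown chart type: {chart_type}")

def apparent_ratio(a, b, baseline):
    """Ratio of filled heights for values a and b when the axis starts at baseline."""
    return (a - baseline) / (b - baseline)

honest = apparent_ratio(100, 90, baseline=0)      # about 1.11
truncated = apparent_ratio(100, 90, baseline=80)  # 2.0
```

With a zero baseline, 100 looks about 11% taller than 90, which matches the data. With the axis cut at 80, the filled region for 100 is twice as tall as the one for 90, which is the false impression of magnitude described above.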

06 · Method 4

Plotting Connections & Relationships

These charts assess associations, distributions, and patterns between variables — and they usually serve exploratory analysis rather than explanatory storytelling.

Method 4 · Connections & Relationships
Chart Type | Data Variables | Visual Variables | What You Get / Notes
Scatter Plot | 2 quantitative | Position, Color-hue | The most fundamental relationship chart. Reveals correlations, clustering, and outliers between two continuous variables. X and Y positions encode the two variables.
Bubble Plot | 3 quantitative + 1 categorical | Position, Area, Color-hue | A scatter plot extended with a third dimension: bubble area encodes a third quantitative variable; color encodes a category. More information in one chart, but area comparison is less accurate than position.
Scatter Plot Matrix | 2 quantitative + 2 categorical | Position, Area, Color-hue | A grid of scatter plots showing every pairwise variable combination simultaneously. Like small multiples applied to correlation analysis. Excellent for multivariate datasets: lets the eye quickly scan across many variable pairings to spot strong/weak relationships.
Heatmap / Matrix Chart | Multiple categorical + 1 quantitative-ratio | Position, Color-saturation | A matrix where cells are colored by value. Like small multiples using color as the visual variable. Fast visual scanning for patterns, ordering, and hierarchy across category combinations. Good for correlation matrices, calendar heat, and confusion matrices.
Parallel Sets / Parallel Coordinates | Multiple categorical + multiple quant-ratio | Position, Width, Link, Color-hue | Multiple parallel axes, each representing a variable. Each data item is drawn as a polyline crossing all axes. Reveals multi-variable relationships, patterns, and consistency. Functionally similar to Sankey: both show connections across categories.
Chord / Radial Network Diagram | Multiple categorical + 2 quant-ratio | Position, Connection, Width, Color-hue, Color-lightness, Symbol, Size | A circular layout where connecting ribbons/chords between categories show the strength and direction of their relationships. Used for complex, bidirectional relationships. Not constrained by X/Y axes.
Network Diagram | Multiple categorical-nominal + 1 quant-ratio | Position, Connection, Area, Color-hue | Nodes (entities) connected by edges (relationships). Reveals clusters, sparse connections, dominant nodes, and structural patterns. Often visually complex and "hairball-like" for large datasets, so it requires careful layout algorithms.
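The scatter plot row above mentions revealing correlations. As a numeric companion to what the eye sees, the Pearson correlation coefficient can be computed by hand; the sketch below is a plain-Python version with a hypothetical function name, not a method taken from the lecture.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)  # fails with ZeroDivisionError if a sequence is constant

rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
falling = pearson_r([1, 2, 3], [3, 2, 1])
```

A value near +1 corresponds to points rising along a line, a value near -1 to points falling along a line. The coefficient only captures linear association, so clusters and outliers still need the plot itself.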
07 · Method 5

Mapping Geo-Spatial Data

When your data has a geographic dimension, maps place it in spatial context. Geographic position itself becomes a visual variable.

Method 5 · Geo-Spatial Data
Chart Type | Data Variables | Visual Variables | What You Get / Notes
Choropleth Map | 2 quant-interval + 1 quant-ratio | Position, Color-saturation/lightness | Geographic regions (countries, provinces) colored by quantitative value using a gradient (light → dark). Popular but has a critical weakness: larger regions visually dominate even if they have smaller populations, creating potential distortion.
Dot Plot Map | 2 quant-interval | Position | Each data point is placed at its geographic coordinates as a dot. Simple and honest: each dot = one occurrence. Dense clusters emerge naturally without distortion from region size.
Bubble Plot Map | 2 quant-interval + 1 quant-ratio + 1 categorical-nominal | Position, Area, Color-hue | Combines a map with a bubble plot: dots at geographic coordinates scaled by quantitative value, colored by category. Shows "where" and "how much" per location simultaneously.
Isarithmic Map (Contour / Isoline) | Multiple quantitative + multiple categorical | Position, Color-hue, Color-saturation, Color-darkness | Uses contour lines or color gradients to show continuous values over geographic space (like elevation, temperature, rainfall). Familiar from weather maps and topographic maps.
Particle Flow Map | Multiple quantitative | Position, Direction, Thickness, Speed | Animated particles or arrows that flow across the map to encode direction, magnitude, and movement (e.g., wind patterns, ocean currents). Direction and speed are the key encodings.
Cartogram | 2 quant-interval + 1 quant-ratio | Position, Size | A distorted map where each region's size is proportional to a variable (e.g., population or GDP) rather than its true geographic area. Corrects the choropleth's size-dominance problem, but geographic shape is distorted.
Dorling Cartogram | 2 categorical + 1 quant-ratio | Position, Size, Color-hue | A variant of cartogram where each region is replaced by a circle scaled by value. Regional outlines are dropped rather than distorted; circles are simply positioned approximately where the region is. Clean and readable.
Connection Map | 2 quant-interval + 1 categorical-nominal | Position, Link, Color-hue | Lines drawn between geographic locations to show connections or flows between places (e.g., flight routes, migration, trade). Line weight and color can encode quantity or category.
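For the bubble plot map and Dorling cartogram rows above, the value is encoded by circle area, so the radius has to grow with the square root of the value. The Python sketch below illustrates this; the function name, the `max_radius` parameter, and the region values are hypothetical.

```python
from math import sqrt

def circle_radii(values, max_radius=40.0):
    """Scale circle radii so that circle AREA is proportional to each value.

    The largest value gets max_radius; radius grows with the square root
    of the value, because area = pi * r**2.
    """
    biggest = max(values.values())
    return {k: max_radius * sqrt(v / biggest) for k, v in values.items()}

population = {"A": 400, "B": 100, "C": 25}  # hypothetical regions
radii = circle_radii(population)
```

Region A has four times the value of B but only twice the radius. Scaling the radius linearly instead would make A look sixteen times larger in area than B, overstating the difference.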
08 · Choosing the Appropriate Chart Type

Three Criteria for Chart Selection

Once you know which representation method applies, use these three criteria to select the specific chart type within that method.

The Three Criteria (in order)
1. Accommodate the Physical Properties of the Data
The chart must match what your data actually is. Ask: How many data variables do I have? Are they categorical or quantitative? Are they nominal, ordinal, interval, or ratio? The chart's requirements (listed in every chart's "Data Variables" above) must align with your dataset's structure.

Example: If you have 1 categorical + 1 quantitative → bar chart works. Add a second categorical dimension → grouped or stacked bar chart. Add a quantitative second variable → consider a dot plot or scatter plot.

2. Facilitate the Desired Degree of Accuracy
Different visual variables allow different levels of perceptual accuracy. If the reader needs to compare precise values, choose a chart that uses position or length (most accurate). If an impressionistic sense of scale is sufficient, area or color can be used.

→ This connects directly to the Visual Variable Ranking (Part 9 below).

3. Create an Appropriate Metaphor (Stylistic Fit)
Integrate a visual quality that conveys a deeper connection between the data, the design, and the topic. This is the most subjective criterion — it requires design instinct and experience. A well-chosen metaphor makes the chart feel like it belongs to its subject.

Example: Using a flow-like stream graph to show "ebb and flow" of musical trends feels more appropriate than a stacked bar chart, even if both technically convey the same data.
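The first criterion (matching the chart to the data's physical properties) can be sketched as a simple lookup in Python. The function name is hypothetical, and the table encodes only the example cases given under that criterion; it is a starting point, not a complete selection rule.

```python
def suggest_chart(n_categorical, n_quantitative):
    """Map a dataset's variable counts to a starting chart suggestion.

    Encodes only the example cases from the notes; anything else returns None.
    """
    suggestions = {
        (1, 1): "bar chart",
        (2, 1): "grouped or stacked bar chart",
        (1, 2): "dot plot or scatter plot",
    }
    return suggestions.get((n_categorical, n_quantitative))
```

A lookup like this covers only criterion 1. The accuracy and metaphor criteria still require judgment about the reader's task and the subject matter.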

The Final Solution (Andy Kirk)

"The key is not to set out to achieve an attractive and attention-grabbing work — let those qualities emerge as a by-product of good design. Focus instead on delivering the appropriate functional elements by employing the most suitable data representation."

09 · Visual Variables & Perceptual Ranking

What Are Visual Variables? How Accurately Can We Read Them?

Visual variables are the specific visual forms we assign to data to represent it. Understanding which ones humans perceive most accurately is crucial for choosing the right chart.

Definition: Visual Variable

A visual variable is the specific form we assign to data in order to represent it visually. Examples include:

  • The length or height of a bar
  • The position of a point on an axis
  • The color (hue or saturation) of a region on a map
  • The area of a bubble
  • The connection between two nodes in a network
  • The slope of a line between two points

Mackinlay's Perceptual Accuracy Ranking (1986)

Not all visual variables are perceived with equal accuracy. Mackinlay's ranking tells us which encodings allow the most precise comparisons, and which should be used only for general impressions.

  1. Position on a common scale: most accurate (e.g., aligned bar charts)
  2. Position on identical (non-aligned) scales (e.g., small multiples)
  3. Length (e.g., bar chart bar lengths)
  4. Angle / Slope (e.g., pie chart slices, line slopes)
  5. Area (e.g., bubble charts, treemaps)
  6. Volume (e.g., 3D charts; rarely recommended)
  7. Color saturation / density (e.g., choropleth shading intensity)
  8. Color hue: good for categorical distinction, not quantity
  9. Shape / Texture: least accurate for quantity comparison

Key Takeaway for Exams

Position > Length > Angle > Area > Color in terms of perceptual accuracy. This is why bar charts (position + length) are generally more accurate than pie charts (angle + area), and why using color alone to encode quantitative values is among the least effective approaches.

Visual Variable Example (from lecture)

In a scatter plot of films: X-axis position = profit, Y-axis position = review score, circle area = budget, circle color (hue) = genre. Notice how the most important variable (profit) uses the most accurate encoding (position), while less critical variables use less precise ones (area, color).
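The idea of giving the most important variable the most accurate encoding can be sketched in Python. The ranking list mirrors the order given in these notes; the function names and the simplified three-variable version of the film example are assumptions.

```python
# Encodings ordered from most to least accurate, following the ranking in these notes.
ACCURACY_RANKING = [
    "position (common scale)",
    "position (non-aligned scales)",
    "length",
    "angle/slope",
    "area",
    "volume",
    "color saturation",
    "color hue",
    "shape/texture",
]

def more_accurate(a, b):
    """Return whichever of two encodings ranks higher for quantitative accuracy."""
    return min(a, b, key=ACCURACY_RANKING.index)

def assign_encodings(variables_by_importance, available):
    """Pair variables (most important first) with the most accurate available encodings."""
    ranked = sorted(available, key=ACCURACY_RANKING.index)
    return dict(zip(variables_by_importance, ranked))

film = assign_encodings(
    ["profit", "budget", "genre"],
    ["color hue", "area", "position (common scale)"],
)
```

The ranking concerns quantitative accuracy, so a categorical variable such as genre landing on color hue is not a penalty: hue is well suited to distinguishing categories.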

10 · Data Presentation — Color

The Use of Color in Data Visualization

Color is the most powerful — and most misused — tool in data visualization. It most efficiently taps into the pre-attentive processing of the visual system, but must be used carefully and with a clear purpose.

Two Key Rules of Color Use

Rule 1 — Use It Unobtrusively

Color should not mislead by implying a representation (category distinction, quantity ordering) when no such representation was intended. Using random colors just for decoration creates false visual groupings in the reader's mind.

Rule 2 — Strive for Elegance, Not Novelty

The sensible objective is elegance over attractiveness. A restrained, purposeful color palette almost always beats a flashy rainbow. As with all design layers, every color choice must serve a function.

Three Purposes of Color

Purpose 01
To Represent Data
Using color to encode a variable — either a category (by hue) or a quantity (by saturation/lightness). This is color at its most functional and informative.
Purpose 02
To Bring the Data Layer to the Fore
Using color contrast to make the data visually prominent — separating it from background elements, grid lines, and labels. High contrast ensures the data is what the eye goes to first.
Purpose 03
To Conform to Design Requirements
Using an organization's brand color palette or design system. Corporate dashboards must often use predefined colors regardless of perceptual optimality.

Color to Represent Quantitative Data

Color hue (red, blue, green…) has no inherent hierarchy or order of magnitude in human perception. You cannot tell which of red, blue, or green is "bigger" just from the hue alone. Therefore:

Color Scheme Types for Quantitative Data
Sequential
Use lightness (light → dark) of a single hue to encode order of magnitude. The darkest shade = highest value. Readers intuitively associate dark with "more." Example: Light blue → Dark blue for low-to-high population density.
Diverging
Use two contrasting hues with a neutral midpoint to encode data that diverges around a meaningful center (e.g., positive vs. negative, above vs. below average). Example: Red ← white → Blue for political lean or temperature anomaly.
Traffic Light
Red (bad) → Amber (average) → Green (good). A widely understood metaphor for performance. Critical caveat: roughly 10% of males have red-green color deficiency (the rate among females is much lower). Solution: Switch green for blue (Red → Amber → Blue).
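The sequential and diverging schemes can be sketched as simple color interpolation in Python. The endpoint colors and function names below are illustrative assumptions, and linear interpolation in RGB is used only for brevity; it is not perceptually uniform, so a production palette would normally come from a tested color library.

```python
def lerp_color(c0, c1, t):
    """Linearly interpolate between two RGB tuples; t is clamped to [0, 1]."""
    t = max(0.0, min(1.0, t))
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))

def to_hex(rgb):
    """Format an (r, g, b) tuple as a #rrggbb string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)

LIGHT_BLUE, DARK_BLUE = (222, 235, 247), (8, 48, 107)  # illustrative endpoints
RED, WHITE, BLUE = (178, 24, 43), (255, 255, 255), (33, 102, 172)

def sequential(value, vmin, vmax):
    """Light-to-dark single-hue scale: higher values get darker shades."""
    return to_hex(lerp_color(LIGHT_BLUE, DARK_BLUE, (value - vmin) / (vmax - vmin)))

def diverging(value, vmin, midpoint, vmax):
    """Two hues meeting at a neutral midpoint."""
    if value < midpoint:
        return to_hex(lerp_color(RED, WHITE, (value - vmin) / (midpoint - vmin)))
    return to_hex(lerp_color(WHITE, BLUE, (value - midpoint) / (vmax - midpoint)))
```

In the diverging scale the midpoint maps to white, so a value exactly at the center (for example, zero anomaly) carries no hue at all, and distance from the center shows up as increasing color strength on either side.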

Color for Categorical Variables

  • Color hue is a particularly strong aid for distinguishing categorical variables — it triggers pre-attentive processing instantly.
  • Use a maximum of 12 different hues for category distinction. Beyond that, the palette becomes confusing and categories become hard to tell apart.
  • Be sensitive to cultural color meanings — colors carry different symbolic associations across regions (e.g., white = mourning in some Asian cultures; red = luck in China but danger in the West).

Color for Contrast: Foreground vs. Background

Some color combinations are inherently low-contrast and should be avoided:

  • Blue on black — many people struggle to discriminate.
  • Yellow on white — nearly invisible for most viewers.
  • Always check your chosen palette against a color blindness simulator (e.g., Vischeck at vischeck.com) to test perceptibility for users with color vision deficiencies.
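The low-contrast pairs listed above can also be checked numerically. The sketch below uses the contrast-ratio formula from the WCAG 2.x accessibility guidelines, which is an outside reference brought in for illustration and not part of the lecture material; the pure-color RGB values are assumptions.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    """Relative luminance of an (r, g, b) color, from 0.0 (black) to 1.0 (white)."""
    r, g, b = (_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """WCAG contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

BLACK, WHITE = (0, 0, 0), (255, 255, 255)
BLUE, YELLOW = (0, 0, 255), (255, 255, 0)

blue_on_black = contrast_ratio(BLUE, BLACK)      # roughly 2.4
yellow_on_white = contrast_ratio(YELLOW, WHITE)  # roughly 1.07
```

Both pairs fall well below the 4.5:1 ratio that WCAG recommends for normal-size text, which is consistent with the advice to avoid them.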
Color Blindness — Always Consider This

Approximately 10% of males have red-green color deficiency. If your visualization relies on red vs. green to communicate a distinction, it fails for 1 in 10 male readers. Use a color blindness simulator during design, and always include a secondary encoding (shape, pattern, label) alongside color.

11 · Data Presentation — Interactivity

The Potential of Interactive Features

Interactivity transforms a static visualization into a tool. The four categories of interactive features each serve a different user need.

Feature 01
Manipulating Variables & Parameters
Giving the user control over what data is displayed:
  • Select — choose a specific data item or category
  • Filter — include/exclude based on criteria
  • Exclude — remove unwanted categories
  • Modify variables — change which variables are shown on each axis
  • Grouping & sorting — reorganize the data by different dimensions
  • Brushing — click-drag to highlight a set of data marks across linked views
Feature 02
Adjusting the View
Adjusting the user's lens or window into the subject:
  • Vertical exploration — drill down through hierarchical layers of detail (e.g., country → region → city)
  • Horizontal tabs/panels — switch between different views or cuts of the data
  • Pan & zoom — navigate spatial or temporal data at different scales
Feature 03
Annotated Details (Tooltips)
Creating extra layers of data detail through hover or click events:
  • Reveals actual data values on demand (avoid cluttering the main view with all labels)
  • Provides extra detail about a specific data point, category, or event
  • Supports: titles, introductions, user guides, visual annotation, legends/keys, units, data sources, and attribution
Feature 04
Animation
Using time-based transitions to tell a story:
  • Play / Pause / Reset controls for time-series playback
  • Manually controllable time sliders — let users scrub through time at their own pace
  • Chapter navigation — skip to key milestones in a narrative visualization
  • Use sparingly — animation should explain, not just impress
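Several of the Feature 01 interactions (filter, group, sort, brush) come down to small data operations that run each time the user acts. A minimal Python sketch follows; the records, field names, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical records behind a chart.
rows = [
    {"id": 1, "region": "Asia", "sales": 120},
    {"id": 2, "region": "Europe", "sales": 80},
    {"id": 3, "region": "Asia", "sales": 45},
    {"id": 4, "region": "Africa", "sales": 60},
]

def filter_rows(data, predicate):
    """Filter: keep only rows that satisfy a criterion."""
    return [r for r in data if predicate(r)]

def group_totals(data, key, value):
    """Group: sum a quantitative field per category."""
    totals = defaultdict(int)
    for r in data:
        totals[r[key]] += r[value]
    return dict(totals)

def sort_rows(data, key, descending=True):
    """Sort: reorder rows by one field."""
    return sorted(data, key=lambda r: r[key], reverse=descending)

def brush(data, field, lo, hi):
    """Brush: return ids of rows inside a dragged range, for highlighting in linked views."""
    return {r["id"] for r in data if lo <= r[field] <= hi}
```

Returning a set of ids from `brush` is one way to support linked views: every chart that shows the same records can highlight the marks whose ids are in the set.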
12 · Data Presentation — Architecture & Arrangement

Layout, Placement & Organization

The final presentation layer — how all visible elements are positioned and organized in the overall design.

The Architecture Layer

Architecture and arrangement considers how to lay out the overall design: the placement, organization, and visual hierarchy of all elements (charts, legends, titles, annotations, controls).

Two Core Aims

For the Eye

Reduce the amount of work the eye has to undertake to navigate around the design and decipher the sequence and hierarchy of the display. The eye should naturally flow to the most important element first.

For the Brain

Minimize the amount of thinking and "working out" that goes on in order to understand the layout. A well-arranged design is immediately readable — the reader should never have to ask "where do I look first?"

Guiding Principle: Must Be Intuitive

The key aim of architecture and arrangement is to make the visualization intuitive to navigate. Readers should not need to read a legend to understand the layout structure. The position, size, and grouping of elements should naturally communicate their relationship to each other.

Explanatory Annotation — The 8 Elements

Effective annotation in a visualization includes: Titles · Introductions · User guides · Visual annotation · Legends/keys · Units · Data sources · Attribution. These create the informational scaffold that makes the visualization self-contained and trustworthy.

13 · Recap

High-Yield Summary

Everything from this lecture in one tight review pass.

Two Dimensions of Design

  • Data Representation — choosing chart type using visual variables.
  • Data Presentation — color, interactivity, annotation, and layout.

5 Representation Methods — The Categories

  1. Comparing Categorical Values → Bar, Sankey, Histogram, Small Multiples…
  2. Hierarchies & Part-of-a-Whole → Pie, Treemap, Stacked Bar, Waffle…
  3. Changes over Time → Line, Area, Sparklines, Candlestick, Stream Graph…
  4. Connections & Relationships → Scatter, Heatmap, Network, Chord…
  5. Geo-Spatial Data → Choropleth, Cartogram, Bubble Map, Connection Map…

Chart Selection Criteria (3 steps)

  1. Accommodate the physical properties of the data (data variable types and count).
  2. Facilitate the desired degree of accuracy (choose visual variables appropriately).
  3. Create an appropriate metaphor (stylistic fit to the subject).

Perceptual Accuracy Order (Mackinlay)

Position > Length > Angle > Area > Volume > Color Saturation > Color Hue > Shape

Color — 3 Purposes, 2 Rules, 3 Schemes

  • Purposes: Represent data · Bring data to foreground · Conform to design requirements.
  • Rules: Use unobtrusively · Strive for elegance not novelty.
  • Quantitative schemes: Sequential (lightness gradient) · Diverging (two hues + midpoint) · Traffic light (Red/Amber/Green — use Blue instead of Green for accessibility).
  • Max 12 hues for categorical distinction. Always test for color blindness.

4 Interactive Feature Types

  1. Manipulate variables & parameters — filter, select, brush, sort, group.
  2. Adjust the view — drill down, horizontal tabs, pan/zoom.
  3. Annotated details — tooltips, hover/click reveals.
  4. Animation — play/pause/slider for temporal data.

Architecture Aim

Minimize eye travel and cognitive effort. Must be intuitive. A layout succeeds when the reader never has to figure out where to look next.

The Design Concepting Mantra

Let attractiveness emerge as a by-product of good design — not as the goal. Focus on functional representation first. Every visual variable, every color, every interactive feature should earn its place by serving the reader's comprehension.

09 · Recap

High-Yield Summary — What to Remember for the Midterm

The most likely exam targets, organized by topic. If you only have 30 minutes left before the exam, read this section.

Must-Know Definitions

  • DataViz = Representation + Presentation of data that exploits visual perception to amplify cognition.
  • DataViz vs. InfoViz: DataViz focuses on raw data (structured numbers, tables); InfoViz focuses on abstract data (relationships, hierarchies).
  • Representation = choice of visual form. Presentation = the full delivered work including colors, layout, annotations.
  • Amplify Cognition = making insight extraction faster and more accurate.

The 4 Key Principles (Number them correctly!)

  1. Strive for Form and Function: both, not either/or.
  2. Always justify every design choice — deliberate design; nothing is arbitrary.
  3. Create accessibility through intuitive design — clutter is a design failure.
  4. Never deceive the receiver — intentional or unintentional, deception is unethical.

Two Methodologies — Know Both

  • Fry (2008): 7 stages — Acquire, Parse, Filter, Mine, Represent, Refine, Interact.
  • Kirk (2012): 5 steps — Purpose & Parameters, Prepare & Explore, Formulate Questions, Design Concepting, Construct & Launch.

Three Visualization Functions

  • Explanatory — specific narrative, you define the story, visual presentation.
  • Exploratory — user-driven, no single narrative, visual analysis tool.
  • Exhibition/Data Art — aesthetic self-expression, emotional impact, not pure information transfer.

Two Tones

  • Pragmatic — analytical reading, value extraction, corporate/scientific.
  • Emotive/Abstract — emotional impact, persuasion, art, curves and organic forms.

Four Data Types

  • Nominal — categories, no order (country, gender).
  • Ordinal — categories with order, uneven gaps (Likert, medals).
  • Interval — numeric, no true zero, differences meaningful (dates, °C).
  • Ratio — numeric, true zero, ratios meaningful (age, price, distance).
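One way to remember the four types is that each level permits every operation of the level below it plus one more. The sketch below encodes that as a small lookup; the Scale enum and the allowed helper are hypothetical names used only for illustration.

```python
from enum import IntEnum


class Scale(IntEnum):
    """The four data types, ordered from least to most permissive."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# The lowest scale at which each operation becomes meaningful.
MIN_SCALE = {
    "equals": Scale.NOMINAL,       # same category or not
    "order": Scale.ORDINAL,        # greater than / less than
    "difference": Scale.INTERVAL,  # a - b is meaningful
    "ratio": Scale.RATIO,          # a / b is meaningful (needs a true zero)
}


def allowed(scale: Scale, operation: str) -> bool:
    """Return True if the operation is meaningful for data at this scale."""
    return scale >= MIN_SCALE[operation]


if __name__ == "__main__":
    # Temperature in °C is interval data: differences yes, ratios no.
    print(allowed(Scale.INTERVAL, "difference"))
    print(allowed(Scale.INTERVAL, "ratio"))
```

For example, 20 °C minus 10 °C is a meaningful 10-degree difference, but 20 °C is not "twice as hot" as 10 °C, because 0 °C is not a true zero.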

Six Data Preparation Steps

Acquisition → Examination → Understand Data Types → Transforming for Quality → Transforming for Analysis → Consolidating.

Five Resolution Options

Full → Filtered → Aggregate → Sample → Headline.
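The five options can be seen side by side on a toy dataset. The records below are made up purely for illustration, and the specific filter, grouping, and sample size are assumptions; the point is only how much detail each resolution keeps.

```python
import random
from collections import defaultdict

# Hypothetical records: (region, month, sales). Values are invented.
records = [
    ("North", "Jan", 120), ("North", "Feb", 135),
    ("South", "Jan", 80),  ("South", "Feb", 95),
    ("East",  "Jan", 60),  ("East",  "Feb", 70),
]

# Full: every record is shown.
full = records

# Filtered: only records that match a criterion.
filtered = [r for r in records if r[0] == "North"]

# Aggregate: records rolled up to a coarser level (here, per region).
aggregate = defaultdict(int)
for region, _month, sales in records:
    aggregate[region] += sales

# Sample: a subset chosen at random (seeded so the run is repeatable).
rng = random.Random(0)
sample = rng.sample(records, k=3)

# Headline: a single overall figure.
headline = sum(sales for _region, _month, sales in records)

if __name__ == "__main__":
    print(len(full), len(filtered), dict(aggregate), len(sample), headline)
```

Moving from Full to Headline trades detail for immediacy: the reader sees less of the data but reaches the main message faster.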

The Golden Thread

Every concept in this course connects back to one idea: effective visualization serves the reader's ability to understand data. Whether you are choosing a chart type, a color, a tone, or a resolution level, always ask: "Does this decision help the reader understand the data better and faster?" If not, cut it.