Data Visualization with D3 and React - Part 1

Introduction

If you, dear reader, are like me, a developer working with React who wants to visualize some data in a web application, just knowing where to start can be daunting. Data visualization is a discipline of its own, with its own history and its own popular libraries and frameworks, not all of which play nicely with a React app. In this article I will lay the groundwork for using D3.js in tandem with React to render high-quality data visualizations efficiently. Let’s begin with a quick overview of our stack.

React

As its own marketing materials will tell you, “React is a JavaScript library for building user interfaces.” It has become very popular in the last few years, and rightfully so. In my experience, building user interfaces with React feels right. The interface is fairly declarative: you as a developer describe the elements you want to render based on some underlying data, and React deals with updating your UI when that data changes. For simplicity’s sake the remainder of this article will assume a working knowledge of React, and in particular react-dom, the module binding the DOM to React’s core lifecycle for web application development.

D3

D3 stands for Data-Driven Documents and boy-howdy there’s a lot to unpack there. It’s been around a few years longer than React, has an infamously steep learning curve, and while even experts fail to agree on what exactly it is, it is fairly safe to call it an industry-standard resource for data visualization. It is broken up into a sizable collection of JavaScript modules. Taken all together, D3 gives you pretty much everything you need to make some pretty fancy data visualizations, as can be seen in its gallery. I would describe D3 as a very mature set of low-level abstractions that give you the building blocks you need for visualizing data in a browser with fine-grained control.

Common Ground

At the center of both React and D3’s abstractions is the underlying problem “How do I update the DOM when data changes?”, and each library has its own opinionated approach on how to address that problem. Let’s take a look at them.

Reconciliation

So with React you write components with a declarative interface that describes the DOM you want based on the current state of that component. The library is responsible for deciding when to render that output, and how to update the DOM based on what is rendered. The “When to render” part is handled by detecting changes either in component state (triggers when setState is called), or in received props (triggers when rendering a child component with new props). But the output of rendering a component is a JavaScript data structure, not the actual DOM on the page. The process by which React uses that output to update the DOM is known as reconciliation. Fundamentally, the rendered output structurally mirrors the DOM, and React has to decide whether a given component in that tree already exists and needs to be updated, doesn’t exist yet and needs to be created, or no longer exists and needs to be removed.
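To make the create/update/remove decision concrete, here is a deliberately tiny sketch in plain JavaScript. It is not React’s actual algorithm (which also deals with keys, components, and much more); the reconcile function and the { type, props, children } element shape are hypothetical, made up for illustration. It compares two rendered outputs and lists the operations that would bring the DOM up to date.

```javascript
// Hypothetical, simplified stand-in for reconciliation; not React's real algorithm.
// An "element" here is a plain object: { type, props, children }.
function reconcile(prev, next, path = "root") {
  if (prev == null && next == null) return [];
  // Not in the previous output: the node has to be created.
  if (prev == null) return [{ op: "create", path, type: next.type }];
  // Not in the next output: the node has to be removed.
  if (next == null) return [{ op: "remove", path, type: prev.type }];
  // A different element type: replace instead of updating in place.
  if (prev.type !== next.type) {
    return [
      { op: "remove", path, type: prev.type },
      { op: "create", path, type: next.type },
    ];
  }

  const ops = [];
  // Same type: record an update only if some prop actually changed.
  const prevProps = prev.props || {};
  const nextProps = next.props || {};
  const names = new Set([...Object.keys(prevProps), ...Object.keys(nextProps)]);
  const changed = [...names].filter((name) => prevProps[name] !== nextProps[name]);
  if (changed.length > 0) ops.push({ op: "update", path, changed });

  // Recurse into children, matched up by position.
  const prevChildren = prev.children || [];
  const nextChildren = next.children || [];
  const count = Math.max(prevChildren.length, nextChildren.length);
  for (let i = 0; i < count; i++) {
    ops.push(...reconcile(prevChildren[i], nextChildren[i], `${path}.${i}`));
  }
  return ops;
}

const before = {
  type: "svg",
  props: { width: 100 },
  children: [{ type: "path", props: { d: "M0,0L10,10" } }],
};
const after = {
  type: "svg",
  props: { width: 200 },
  children: [
    { type: "path", props: { d: "M0,0L20,5" } },
    { type: "circle", props: { r: 3 } },
  ],
};

const operations = reconcile(before, after);
console.log(operations);
```

For this input the sketch reports an update to the svg (its width changed), an update to the path (its d changed), and a create for the new circle.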

Data Joins

D3, on the other hand, uses what are called data joins. The original creator of D3 gave a really nice explanation of them back in 2012. Data joins tie data to DOM elements, and are responsible for updating the DOM when data changes. When new data comes in, DOM elements need to be created. When existing data changes, DOM elements need to be updated. When existing data is removed, DOM elements need to be deleted. Sound familiar? Of the great big pile of D3 modules, d3-selection is the one that implements data joins, and it can be seen as something of a lynchpin sitting in the center of the entirety of D3. In a sense it is the “Data-Driven Documents” part of the greater library of the same name.
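The three cases can be sketched without D3 at all. The joinData helper below is hypothetical and is not how d3-selection is implemented; it only partitions a data update into the groups that D3 calls enter, update, and exit, assuming every datum can be identified by a key.

```javascript
// Hypothetical helper, not part of D3: split a data update into the three
// groups a data join distinguishes. `bound` maps key -> datum currently drawn.
function joinData(bound, data, key) {
  const incoming = new Map(data.map((datum) => [key(datum), datum]));
  const enter = []; // new data: elements need to be created
  const update = []; // data that already has an element: update it
  const exit = []; // elements whose data is gone: remove them

  for (const [k, datum] of incoming) {
    (bound.has(k) ? update : enter).push(datum);
  }
  for (const [k, datum] of bound) {
    if (!incoming.has(k)) exit.push(datum);
  }
  return { enter, update, exit };
}

const drawn = new Map([
  ["a", { id: "a", value: 1 }],
  ["b", { id: "b", value: 2 }],
]);
const nextData = [
  { id: "b", value: 20 },
  { id: "c", value: 30 },
];

const groups = joinData(drawn, nextData, (datum) => datum.id);
console.log(groups.enter.map((d) => d.id)); // [ 'c' ]
console.log(groups.update.map((d) => d.id)); // [ 'b' ]
console.log(groups.exit.map((d) => d.id)); // [ 'a' ]
```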

Library Strengths

So both React and D3 have their own answers to the same problem, and those answers aren’t exactly neatly compatible with each other. Using two libraries to solve the same problem in two different ways is costly both in terms of the cognitive load of writing and maintaining code as well as the performance implication of having to load more JavaScript on a webpage. Let’s look at a way to use each library to its own strengths, without unnecessary complexity or bloat.

React: DOM Manipulation & Testing

I’m assuming that you’re coming at this from the perspective of a developer who is already using React and needs to visualize data. React is really good at efficiently manipulating the DOM. Reconciliation is fast, and the React API is expressive enough to build all manner of user interfaces, not just graphical data visualizations.

The React world has a whole pile of libraries and frameworks and blog posts and gizmos and doo-dads for testing your UI code. In my examples I will use Jest and Enzyme, and we will see that by using React for DOM manipulation we get access to sophisticated tools for testing said DOM manipulation. Conversely, while D3 has a great many examples of how to cobble together a visualization, there is a notable lack of tooling and practice around testing visualizations made with D3.

Let’s use React for what it is good at and what we are already using it for. Its lack of opinion about where your data comes from and how it is formatted both gives it the expressiveness that makes it so useful and leaves a hole in our design: how do we prepare data for React to render? Whatever might be able to fill that void?

D3: Data Manipulation

While data joins are the defining concept of D3, they aren’t the only concept. D3’s modules make up a whole ecosystem of handy utilities and abstractions for data visualization that are only loosely coupled with its data join module (d3-selection). Broadly speaking, D3’s API can be sorted into three buckets:

  • d3-selection
  • modules that depend on d3-selection to draw complex things into the DOM or handle user input. Examples include:

    • d3-axis
    • d3-zoom
    • d3-brush
  • modules that don’t depend on d3-selection and help manipulate and transform data to prepare it for visualization. Examples include:

    • d3-scale
    • d3-shape
    • d3-interpolate

So for now let’s have React handle d3-selection’s responsibilities, throw away the second bucket entirely, and see some examples of what we can do with the third.

d3-scale gives you a nice interface for scaling from domain data (be it time, or percentiles, or meters, or discrete labels, or what have you) to the range of visualization data (pixel coordinates, SVG coordinates, color gradients, etc.).
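To see what “scaling from domain to range” boils down to in the simplest case, here is a hand-written linear scale. It is an illustration only, not D3’s implementation, and the linearScale helper is a made-up name; d3-scale’s scaleLinear covers far more (clamping, ticks, inversion, and so on).

```javascript
// Hand-written stand-in for a linear scale; an illustration, not D3's code.
function linearScale([d0, d1], [r0, r1]) {
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

const width = 1000;
const height = 300;

// x: the domain [0, 4] maps left-to-right onto [0, width].
const x = linearScale([0, 4], [0, width]);
// y: the range is flipped because SVG's y axis points down.
const y = linearScale([-1, 1], [height, 0]);

console.log(x(0), x(2), x(4)); // 0 500 1000
console.log(y(-1), y(0), y(1)); // 300 150 0
```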

A module like d3-shape gives you the tools you need to prepare complex shapes for rendering into an SVG or a canvas. For instance, it can build you a function that takes in a series of data points as an argument and returns a string of SVG path commands.
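As a rough picture of what such a function does, the sketch below builds a path string out of straight segments by hand. The linePath helper is hypothetical and far simpler than d3-shape, which also handles curves, missing data, and canvas output.

```javascript
// Hypothetical helper: build an SVG path string from points, using "M" (move to)
// for the first point and "L" (line to) for every point after it.
function linePath(points, getX, getY) {
  return points
    .map((point, i) => `${i === 0 ? "M" : "L"}${getX(point)},${getY(point)}`)
    .join("");
}

const points = [
  { x: 0, y: 0 },
  { x: 1, y: 2 },
  { x: 2, y: 1 },
];

// Scale inline for brevity: 10 pixels per unit, y flipped within a height of 20.
const d = linePath(points, (p) => p.x * 10, (p) => 20 - p.y * 10);
console.log(d); // M0,20L10,0L20,10
```

The resulting string is what ends up in the d attribute of an SVG path element.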

Basically, we want to use these D3 modules to perform the kind of data manipulation from domain space to visual space that is tedious at best and downright hellish at worst to implement yourself.

The best of both worlds

So now that we’ve got an idea of how we want to use each of these libraries, let’s contrive an example!

The Component

// MyChart.js
import React from "react";
import { scaleLinear } from "d3-scale";
import { extent } from "d3-array";
import { line, curveMonotoneX } from "d3-shape";

const MyChart = (props) => {
  const {
    width,
    height,
    data
  } = props;

  // The smallest x value in the data set will be mapped to the left edge (0)
  // The largest x value in the data set will be mapped to the right edge (width)
  // Everything in between will be interpolated linearly
  const xScale = scaleLinear()
    .domain(extent(data, datum => datum.x))
    .range([0, width]);

  // The smallest y value in the data set will be mapped to the bottom edge (height)
  // The largest y value in the data set will be mapped to the top edge (0)
  // Everything in between will be interpolated linearly
  const yScale = scaleLinear()
    .domain(extent(data, datum => datum.y))
    .range([height, 0]);

  // Create a function that when given [{ x, y }] in domain coordinates returns
  // svg path data string based on the xScale and yScale we just made.
  // Smoothly interpolate the path between points in the data.
  const path = line()
    .x(datum => xScale(datum.x))
    .y(datum => yScale(datum.y))
    .curve(curveMonotoneX);

  return (
    <svg width={width} height={height}>
      <path d={path(data)} fill="none" stroke="black" stroke-width="3" />
    </svg>
  );
}

export default MyChart

The Test

// MyChart.test.js
import React from "react";
import MyChart from "./MyChart"
import { shallow } from "enzyme";

const data = Array(5).fill().map((_, i) => ({
  x: i,
  y: Math.sin(i)
}))

describe("MyChart", () => {
  let wrapper;
  beforeAll(() => {
    wrapper = shallow(<MyChart width="1000" height="300" data={data}/>);
  });

  it("renders", () => {
    expect(wrapper).toMatchSnapshot();
  });

  it("renders a path inside an svg", () => {
    const path = wrapper.find('svg').find('path')
    expect(path).toHaveLength(1);
    expect(path.prop('d')).toMatch(/.+/)
  });
});

The Snapshot

// __snapshots__/MyChart.test.js.snap
// Jest Snapshot v1, https://goo.gl/fbAQLP

exports[`MyChart renders 1`] = `
<svg
  height="300"
  width="1000"
>
  <path
    d="M0,163.72921241024383C83.33333333333333,92.04203330849256,166.66666666666669,20.3548542067413,250,12.212912524044782C333.3333333333333,4.070970841348261,416.6666666666667,0,500,0C583.3333333333334,0,666.6666666666666,88.31897028998452,750,138.31897028998452C833.3333333333334,188.31897028998452,916.6666666666666,244.15948514499226,1000,300"
    fill="none"
    stroke="black"
    stroke-width="3"
  />
</svg>
`;

The outcome: whenever this component is mounted somewhere in a React component hierarchy, it receives width, height, and data from its parent. It uses D3 to compute a path for this data within a coordinate system based on the given width and height. We design and lay out the DOM with JSX syntax, where we can reason about it in a way that is structurally analogous to the DOM, and easily control other aspects of how it renders (give things a className for styling, add other chart elements, etc.). React makes sure the DOM elements get created when the component is mounted, updated when width, height, or data change, and removed when the component is unmounted. We get to test this output using tools from the React ecosystem. React has done its job well, and we have chosen the pieces of D3 that are difficult to implement ourselves and that nicely complement what React is doing.