Scatterplot of cereal protein content and calories

This tutorial uses the Breakfast Cereal example data from Interactive Data Visualization and D3 example blocks by Mike Bostock and Michele Weigle. This tutorial is part of a course I teach — Fundamentals of Data Visualization — at General Assembly Boston.

D3.js is one of the most popular frameworks for data visualization on the web. But it can be difficult to work with unless you have a firm grounding in JavaScript and can host your files on a web server. This tutorial takes you through the process step by step without requiring you to know a lot of JavaScript, and with a workaround that lets you quickly view your D3 visualizations without uploading files to a web server.

D3 uses JavaScript to visualize the datasets you give it. In this case, that dataset is a CSV file showing the calorie and protein content of popular breakfast cereals.

Technically, D3.js is a “library,” that is, a suite of prewritten code blocks that we can call on so that we don’t have to code everything from scratch ourselves. You can see many examples of D3 in action at D3js.org.

Your D3 visualization will have two components: an HTML document (which we’ll call index.html), that will incorporate Javascript that makes use of some of the components that D3 makes available to us, and a .CSV file (which we’ll name cereal.csv) that contains the data we are visualizing.

In order to be able to see our visualization, though, we’re going to make use of Github and Bl.ocks.org. Github is where we will build our code, and Bl.ocks.org will let us see our visualizations without the hassle of uploading files to a webserver. If you do have a web hosting account with space and the ability to add random files, all the code we use here will work there too — you’ll just need to put both files in the same folder.

Recipe:

Ingredients: 

  1. A free Github account
  2. index.html
  3. cereal.csv

Both index.html and cereal.csv are linked in the steps of our recipe below, so you’ll have them when you need them.

Steps

Step 1: Get a Free Github Account And Create A Gist
You will need a free Github account. Once you get your account and log in, go to gist.github.com. Github’s Gist feature allows you to save and embed snippets of code you find useful. Once you are logged in and at gist.github.com, click “New Gist” in the upper right near your username.

Give the file in your Gist a name. In our case you must call it index.html.

Give the Gist itself an easy-to-remember description, too, in the field above index.html.

Now you will paste some code into the window under index.html. Look at the code below. What you see is a pretty plain-vanilla HTML page, with some CSS to style our visualization. In the lower right, you’ll see a link that says “view raw.” Click that.  A new tab will open. Copy all the code you see there and paste it into your new index.html Gist.

Note: if you have Javascript turned off in your browser, you will not see the chunks of code I walk you through below. If you prefer, you can look at the code in Github directly. It’s the second file on this page, index.html.

Step 2: Start Building our D3 dataviz

In the next few steps, you’re going to be adding code to your Gist. Each step’s code goes immediately below the previous step’s code; we are working from top to bottom.

First, let’s start by drawing the margins of our dataviz. Click “view raw” on the bottom left, copy the code you see there, and paste it into your Gist between the two </script> tags.
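If you are curious what "drawing the margins" usually amounts to, here is a minimal plain-JavaScript sketch of the margin convention that many D3 examples follow. The canvas size and margin numbers are assumptions I picked for illustration, and `fullWidth` and `fullHeight` are names made up for this sketch, so they may not match the code you just pasted.

```javascript
// Full size of the graphic, in pixels (assumed values).
const fullWidth = 960;
const fullHeight = 500;

// Space reserved around the plot for axes and labels (assumed values).
const margin = { top: 20, right: 20, bottom: 30, left: 40 };

// The inner plotting area is whatever is left after subtracting the margins.
const width = fullWidth - margin.left - margin.right;
const height = fullHeight - margin.top - margin.bottom;

console.log(width, height); // 900 450
```

Everything drawn later (dots, axes) is positioned inside that inner area, which is why the margins get worked out first.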

Step 3: Build the X and Y axes

Here’s where we get into some pretty deep water if you’re not familiar with Javascript or D3. That’s okay! Allow me to repeat that: that’s okay! Really. Copying and pasting chunks of code you know work, then tinkering with them, breaking and fixing them, is a great way to learn. Remember that every single program you will ever write stands on top of a HUGE mountain of code, starting from the firmware to run the chips, to the operating system, the web server, and the browser serving you the web page you are reading right now.  Incorporating and reusing code is now part of what programming is — unless you really want to start from the chips up. But that’s a whole other tutorial!
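For a rough idea of what sits underneath an axis: a scale turns a data value (say, grams of protein) into a pixel position. The sketch below re-creates that idea in plain JavaScript. `makeLinearScale` is a hypothetical helper written for this example, not D3's own function, and the domain and range numbers are assumptions.

```javascript
// Hypothetical helper: returns a function that maps a value in [d0, d1]
// to the proportional position in [r0, r1].
function makeLinearScale([d0, d1], [r0, r1]) {
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// X axis: data values 0..10 spread across 0..900 pixels (assumed numbers).
const xScaleSketch = makeLinearScale([0, 10], [0, 900]);

// Y axis: in SVG the y coordinate grows downward, so the range is flipped
// to put small values at the bottom of the chart.
const yScaleSketch = makeLinearScale([0, 200], [450, 0]);

console.log(xScaleSketch(5));   // 450
console.log(yScaleSketch(200)); // 0
```

D3 ships its own scale and axis functions, so you would not write this by hand in a real chart. The point is only that an axis is a mapping from data to pixels.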

Step 4: Set up some variables to draw some stuff

Next we’ll start creating JavaScript variables that we will “call” later to draw our scatterplot chart. As usual, click “view raw” and a new tab will open. Copy and paste what you see in this step directly below the code we used to set up the X and Y axes of our chart.
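To make the idea of variables you "call" later a little more concrete, here is a small plain-JavaScript sketch. A JavaScript variable can hold a function, so each one below is a reusable recipe for pulling something out of a row of data. The row is invented, and the names (`xValue`, `toPixels`, `xMap`) and column names are assumptions for illustration only.

```javascript
// One invented row of data; the column names are assumed.
const sampleRow = { name: "Example Cereal", calories: 100, protein: 3 };

// Variables holding functions: nothing happens until they are called.
const xValue = (d) => d.calories;        // data row -> raw value
const toPixels = (v) => v * 4;           // stand-in for a real scale (assumed factor)
const xMap = (d) => toPixels(xValue(d)); // data row -> position on screen

console.log(xValue(sampleRow)); // 100
console.log(xMap(sampleRow));   // 400
```

Splitting the work this way means the drawing code later only has to say "use xMap for the horizontal position" without repeating the details.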

Step 5: Let’s load some data

Here’s where the rubber meets the road. We’re going to tell our script where to find the data we want to visualize. In this case, we are loading cereal.csv, and using D3 to turn the text into numerical values that our script can read. Take this and paste it immediately below the code from Step 4.
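Here is a hedged sketch of the "turn the text into numerical values" part, in plain JavaScript so it runs without D3 or a web server. The CSV rows and column names are invented stand-ins for cereal.csv, and `parseSimpleCsv` is a hypothetical, deliberately naive parser (it does not handle quoted fields). In the real chart, D3 does the loading and parsing for you.

```javascript
// A tiny stand-in for cereal.csv; the rows and column names are invented.
const csvText = [
  "name,calories,protein",
  "Example Flakes,100,3",
  "Sample Crunch,120,2",
].join("\n");

// Hypothetical minimal parser: splits on commas, no support for quoted fields.
function parseSimpleCsv(text) {
  const [headerLine, ...lines] = text.trim().split("\n");
  const headers = headerLine.split(",");
  return lines.map((line) => {
    const cells = line.split(",");
    return Object.fromEntries(headers.map((h, i) => [h, cells[i]]));
  });
}

const rows = parseSimpleCsv(csvText);

// Everything read from a CSV starts out as text, so convert the
// numeric columns with unary plus before doing any math on them.
rows.forEach((d) => {
  d.calories = +d.calories;
  d.protein = +d.protein;
});

console.log(typeof rows[0].calories);             // "number"
console.log(rows[0].calories + rows[1].calories); // 220
```

Without the conversion, adding "100" and "120" would glue the two strings together into "100120" instead of giving 220, which is why this step matters before plotting.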

Step 6: Let’s get down to actual drawing

Up until this point, we’ve mainly been assigning values to variables. But unless we call our variables and do something with them, we won’t see our script doing much of anything at all. D3 typically draws with an image format known as SVG (Scalable Vector Graphics). Most other image formats you encounter, such as GIF, store a bitmap: a grid of colored pixels. It’s more useful to think of SVG as “HTML for images.” SVG lets you write instructions that tell your browser what to draw on the screen. In this case, we’re going to use both the variables we set up above and our data as part of the instructions in our SVG.
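To see why SVG is like "HTML for images," here is a small plain-JavaScript sketch that writes the SVG instructions out as text, one circle per data point. The points, the radius, and the canvas size are all invented for illustration. D3 builds these elements in the page for you rather than assembling a string like this.

```javascript
// Invented points, already converted to pixel positions.
const points = [
  { cx: 100, cy: 300 },
  { cx: 250, cy: 120 },
];

// Each data point becomes one <circle> instruction for the browser.
const circles = points
  .map((p) => `  <circle cx="${p.cx}" cy="${p.cy}" r="3.5" />`)
  .join("\n");

// Wrap the circles in an <svg> element of an assumed size.
const svgMarkup =
  `<svg xmlns="http://www.w3.org/2000/svg" width="960" height="500">\n` +
  `${circles}\n</svg>`;

console.log(svgMarkup);
```

If you save the printed text in a file ending in .svg and open it in a browser, you should see two small dots. That is all a scatterplot is underneath: a list of drawing instructions.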

Step 7: Adding Our Datafile And Making Our Gist Public

Now we’ve got a fair amount of code, but we’ll need to add the cereal.csv file we told our script about in Step 5 if we expect our data visualization to work. Look at the bottom left of the code window you are working in. You’ll see a button called Add File. A Gist can be composed of multiple files. Ours will have two: index.html and cereal.csv. Click “Add File.”

Once you do that, a new window will open beneath. Copy and paste the data from here into the window, and call your new file cereal.csv. Remember, if it doesn’t have the same name, spelled exactly the same way, your script will not know where your data is, and will give up.

Last, click the button in the lower left called “Create Public Gist.” You should now have a page with a gist called cereal.csv. If you scroll down, you’ll see all the code you’ve added in another file called index.html.

Step 8: Look and See

All this way, and we still haven’t seen anything yet? That’s not much fun, is it? With HTML and CSS, you can often quickly view how changes you make to your code affect the outcome by just looking at them in your browser. In the case of this code, we need our files to be living on a web server for them to work. At home, I use MAMP, a “web stack for laptops” that lets me put a little virtual web server right on my personal machine. But setting up MAMP is a little too much heavy lifting for this tutorial, so I’ve got an easier way for you to do it. We’re going to use Bl.ocks.org, which will allow you to see the end results of the code and data you’ve loaded on Github.

First, visit http://bl.ocks.org/yourgithubusername. 

If you’ve forgotten your Github username, you can find it by clicking the icon in the upper right of the page on Github with your new public Gist.

In my case, I can see my public Gists at http://bl.ocks.org/lisawilliams.

Just click on “Cereal Scatterplot,” or whatever you named your Gist in Step 1.

Voilà! You should see a functional scatterplot of cereal protein content and calories!

Troubleshooting Break: Mine Didn’t Work

That’s okay. It can be easy to accidentally place something as small as a semicolon in the wrong spot. If your visualization is not working, copy and paste this code into your index.html. This replaces everything in the file! You’ll need to visit your Gist and click the Edit button:

Scroll down past cereal.csv and select everything in index.html and delete it. Now paste in this code.

Step 9: Add A Legend & Tooltips

This is all great, but nobody knows what those dots are! Let’s add a legend and tooltips to our visualization so that people can read it.

Go back to your Gist and edit it by clicking the Edit button.

Scroll down to index.html.

Add the following right after your last block of code, but before the final </script> tag:

Now you should have a scatterplot with a legend and tooltips. Just run your mouse over some of the dots to see the name and data for each cereal!
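If you want a feel for what a legend and tooltips need from the data, here is a plain-JavaScript sketch. A legend needs one entry per category, and a tooltip needs a line of text for whichever dot the mouse is over. The rows, the `maker` column, and the wording of the tooltip are all invented for this example and may differ from the code you pasted.

```javascript
// Invented rows; the column names are assumptions.
const cereals = [
  { name: "Example Flakes", maker: "Maker A", calories: 100, protein: 3 },
  { name: "Sample Crunch", maker: "Maker B", calories: 120, protein: 2 },
  { name: "Demo Puffs", maker: "Maker A", calories: 110, protein: 1 },
];

// Legend: one entry per distinct category, in order of first appearance.
const legendEntries = [...new Set(cereals.map((d) => d.maker))];

// Tooltip: the text to show while the mouse is over a dot.
const tooltipText = (d) =>
  `${d.name}: ${d.calories} calories, ${d.protein} g protein`;

console.log(legendEntries.length);    // 2
console.log(tooltipText(cereals[0])); // Example Flakes: 100 calories, 3 g protein
```

In the chart itself, D3 attaches the tooltip text to mouse events on each dot and draws a colored square next to each legend entry.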

If you just made your first foray into D3, congratulations! If you’d like to learn more, check out my reading list.

Remember, if it doesn’t work, copy and paste this code into the index.html of your Gist, save, and then go look at http://bl.ocks.org/yourgithubusername again. Be patient, and remember to reload the page on Blocks with your visualization. Blocks grabs the code for your visualization from Github through an API, and services that depend on APIs sometimes have a short delay between the time you edit your code and the time the changes show up. Breathe deep, and remember the rules for getting stuck.
