In a recent Tableau project, I wanted to divide a long span of years into decades, as this would provide a more visually effective way to grasp the growth of revenue from top movies (data from The Movie Database) over time. With a little searching, I found the pieces I needed. Below I’ll include a description of my process, followed by links to the helpful sources of insight I found on this topic.
First, here is the visualization with total revenues year by year. Notice that despite its current width you still have to scroll left to reach the early 1900s. Meanwhile, the difference year to year is not in itself that interesting.
Now here is the visualization when years are chunked into decades. Much more effective!
DISCLAIMER: These charts use revenue numbers as entered in The Movie Database by contributors, based on publicly reported figures, so the data covers only a portion of all movies. I have not yet adjusted for inflation.
Getting to Decades from Dates
Now for the process of getting decades from dates. I broke my approach into two steps:
- First, I created a calculated field to pull the year from the Release Date field, using Tableau’s DATEPART function. Once that field was created, I moved it from Measures to Dimensions, where it belongs.
- Then I created a Decades dimension as a second calculated field. This calculation uses Year and the modulo operator to round each year down to the nearest multiple of 10. As before, once it was created I moved this field to Dimensions.
That’s all it took!
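For readers working outside Tableau, the same two-step logic can be sketched in plain Python. (The function names here are my own illustration, not Tableau functions.)

```python
from datetime import date

def year_of(release_date: date) -> int:
    """Equivalent of Tableau's DATEPART('year', [Release Date])."""
    return release_date.year

def decade_of(year: int) -> int:
    """Round a year down to the nearest multiple of 10 using modulo."""
    return year - (year % 10)

# A 1994 release lands in the 1990 decade bucket.
print(decade_of(year_of(date(1994, 7, 6))))  # 1990
print(decade_of(1987))                       # 1980
```

Grouping on the result of `decade_of` gives exactly the decade buckets shown in the second chart.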
Many thanks to Nick Parsons and Erik Bokobza for their helpful replies in the Tableau Community Forums. Links below.
This article from TechRepublic is worth a read. In summary: The data revolution is now transforming the world of finance. A recent Deloitte survey reveals that traditional roles are being automated. To be a human working in finance, you need skills in data science, analytics, and visualization. More than manipulating spreadsheets, you need to create business value with data-informed innovations.
The finance robots are coming — TechRepublic.com
I’ve been reflecting on Elijah Meeks’ provocative essay, “3rd Wave Data Visualization”. In this post, I want to reflect on the tension between his first and third “waves.” I’ll refer to these as attitudes. (Meeks himself acknowledges that none of his “waves” have washed away. Each lives on.) He refers to them as Wave 1: Clarity and Wave 3: Convergence.
Upon re-reading his argument a few times, I believe we might usefully understand the contrast Meeks highlights as the tension between these two imperatives:
Attitude 1: Design with Clarity. (Make sure we don’t miss the message.)
Attitude 2: Bring back the Creativity and Fun. (Give us some enjoyment.)
I’ll talk about these attitudes in more detail in a later post.
For now, I’m going to spend some time going out and evaluating a number of data visualizations bearing in mind questions such as these:
- How clear is this visualization? How easy is it to understand and interpret? Is that a good or a bad thing?
- How creative and fun is this visualization? Am I motivated to explore it further? Why or why not?
- Are there times, places, and audiences for whom clarity is more important than creativity? And vice versa?
The Tableau Public Gallery is a good place to start. And there are many others.
I’d be interested in your responses below. Include a link to a relevant data visualization.
I’ll report back with an update to this post.
This impressive interactive data visualization demonstrates the value of the format. More than merely interesting, or intriguing, or even fun — it massively amplifies the communicative power of its subject matter.
Check it out:
How to Cut U.S. Emissions Faster? Do What These Countries Are Doing.
By Brad Plumer and Blacki Migliozzi — FEB. 13, 2019
I’ve just published a Tableau story on the birth, growth, and rise of Bitcoin.
I would love feedback and recommendations, as I intend to develop this project over time.
Bitcoin: Birth, Growth, and Rise – Tableau Public
Hal Varian, Google’s chief economist, gave a nice summary of a major need of our era.
“The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complementary scarce factor is the ability to understand that data and extract value from it.
“I think statisticians are part of it, but it’s just a part. You also want to be able to visualize the data, communicate the data, and utilize it effectively. … being able to access, understand, and communicate the insights you get from data analysis —are going to be extremely important.”
Hal Varian, Google’s Chief Economist, 2009
Martin O’Leary recently posted some sound advice for Kaggle competitors. You can find the three-paragraph version in the Kaggle wiki.
Here I’ll break it into four key points:
- Spend a while on visualization, making graphs of various properties of the data and trying to get a feel for how everything fits together.
- Test the performance of a variety of standard algorithms (random forests, SVMs, elastic net, etc.) to see how they compare. It’s often very informative to look at which data points are the least well predicted by standard algorithms, as this can give you a good idea of what direction to move in. (Be warned: Home-brew algorithms can be useful later on in a project, but in the early stages you want to try out as many things as possible, not get bogged down in the details of implementing a particular algorithm.)
- Then move into the nitty-gritty details once you have a sense for the lay of the land.
- Of course, all this assumes a certain kind of problem, where the data is already in numeric/categorical form. For more “interesting” datasets, such as the recent Automated Essay Scoring competition, a lot of the early work is in feature extraction — just looking for numbers which you can pull out of the data. That tends to be a bit more creative, and I use a variety of tools to see what works best. However, one of the joys of this kind of problem is that every one is different, so it’s hard to give general advice.
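As a toy illustration of the “least well predicted” tip above, here is a minimal Python sketch. The data and the baseline are invented for the example (the “model” is just the mean of the targets); in practice you would rank residuals from a real fitted model such as a random forest.

```python
# Rank data points by how badly a baseline model predicts them.
# Here the "model" is simply the mean of the targets; with a real
# fitted model you'd substitute its predictions.
y = [3.0, 2.5, 4.1, 30.0, 3.3, 2.9]  # toy targets; index 3 is an outlier
mean_pred = sum(y) / len(y)

# Absolute residual for each point, sorted largest first.
residuals = sorted(
    ((i, abs(v - mean_pred)) for i, v in enumerate(y)),
    key=lambda t: t[1],
    reverse=True,
)
print(residuals[0][0])  # index of the worst-predicted point -> 3
```

Inspecting the points at the top of that ranking is often the fastest way to see what structure your current model is missing.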