Country | GDP |
---|---|
str | i64 |
"USA" | 22670 |
"China" | 16640 |
"Japan" | 5370 |
"Germany" | 4310 |
"UK" | 3120 |
"India" | 3040 |
"France" | 2930 |
"Italy" | 2100 |
"Canada" | 1880 |
"Korea" | 1800 |
How to improve a bad graph with plotly
All data visualizations should, first and foremost, inform. Any visualization that falls short of this is simply data art. Uninformative visualizations are bad, but misleading ones are worse. Here’s an example of a bad visualization that Olga Berezovsky posted on LinkedIn. I thought I’d recreate it, applying the improvements she suggested.
The dataset
Below is a polars dataframe showing the data I’ll use to recreate the bad graph. This is the original data as it’s shown on the bad graph.
Since the GDP numbers will be presented in trillions rather than billions, I’ll divide the values by 1,000 and round them to 2 decimal places. Here’s the resulting dataframe.
df = (data
    .with_columns((pl.col('GDP') / 1000).round(2))
)
df
Country | GDP |
---|---|
str | f64 |
"USA" | 22.67 |
"China" | 16.64 |
"Japan" | 5.37 |
"Germany" | 4.31 |
"UK" | 3.12 |
"India" | 3.04 |
"France" | 2.93 |
"Italy" | 2.1 |
"Canada" | 1.88 |
"Korea" | 1.8 |
Creating a better graph
I’ll develop the graph in a step-by-step manner, addressing the problems Olga highlighted in the bad graph until we have a perfect graph that is not only informative but beautiful to look at. I’ll use the graphing library called Plotly.
Out-of-the-box plot
Here’s the default plot, with no tweaks to make it look better.
import plotly.graph_objects as go
fig = go.Figure(go.Bar(
    x=df["Country"],
    y=df["GDP"]
))
fig.show(renderer='iframe')
This default graph already solves the problem of rotated country names, so they can be read without tilting your head. It also addresses the units: the figures are now in trillions rather than billions. However, the graph itself still doesn’t say that the values are in trillions. In the next iteration, we’ll add a title to make this clear.
Add descriptive title
The title is an important part of the graph because it tells us what to focus on. But some titles can be too vague. You wouldn’t, for instance, use a title like “GDP of countries” for this graph. You want a descriptive title, one that tells the audience what to focus on.
fig = go.Figure(go.Bar(
    x=df["Country"],
    y=df["GDP"]
))
fig.update_layout(
    title="<b>Countries with the highest nominal GDP</b><br><sup><b>(in US $trillion, 2021)</b></sup>",
    title_font=dict(size=20),
)
fig.show(renderer='iframe')
Make GDP values explicit
It’s very difficult to read the exact GDP values off the y-axis. To solve this problem, we’ll place each country’s GDP value on top of its bar. Additionally, I’ll hide the y-axis values since they won’t be needed anymore.
fig = go.Figure(go.Bar(
    x=df["Country"],
    y=df["GDP"],
    text=df['GDP'],
    textposition='outside'
))
fig.update_layout(
    title="<b>Countries with the highest nominal GDP</b><br><sup><b>(in US $trillion, 2021)</b></sup>",
    title_font=dict(size=20),
    yaxis=dict(visible=False),
)
fig.show(renderer='iframe')
Remove grid lines
Because we have removed the values on the y-axis, it’s pointless to have the horizontal grid lines. We’ll remove them and we’ll also change the background color of the graph.
fig = go.Figure(go.Bar(
    x=df["Country"],
    y=df["GDP"],
    text=df['GDP'],
    textposition='outside'
))
fig.update_layout(
    title="<b>Countries with the highest nominal GDP</b><br><sup><b>(in US $trillion, 2021)</b></sup>",
    title_font=dict(size=20),
    yaxis=dict(showgrid=False, visible=False),
    plot_bgcolor='#FFE4B5',
    paper_bgcolor='#FFE4B5',
)
fig.show(renderer='iframe')
Include source and add padding
Finally, let’s include the source at the bottom of the graph. We’ll also add padding (spacing) at the top and bottom of the graph so that the title text and the source text are not too close to the edge of the graph.
import plotly.graph_objects as go
fig = go.Figure(data=[
    go.Bar(
        x=df['Country'],
        y=df['GDP'],
        marker_color='#1E90FF',
        text=df['GDP'],
        textposition='outside'
    )
])
fig.update_layout(
    title="<b>Countries with the highest nominal GDP</b><br><sup><b>(in US $trillion, 2021)</b></sup>",
    title_font=dict(size=20),
    plot_bgcolor='#FFE4B5',
    paper_bgcolor='#FFE4B5',
    yaxis=dict(showgrid=False, visible=False),
    margin=dict(t=70, b=100)
)
# encoded_image must hold the base64-encoded PNG logo; it is defined elsewhere (not shown here)
fig.add_layout_image(
    dict(
        source=f"data:image/png;base64,{encoded_image}",
        xref="paper",
        yref="paper",
        x=0.98,
        y=-0.26,
        xanchor="right",
        yanchor="bottom",
        sizex=0.22,
        sizey=0.22,
        layer="above"
    )
)
fig.add_annotation(
    dict(
        text="<b>Source</b>: IMF, April 2021",
        x=0.01,  # x position (0 means far left)
        y=-0.26,  # y position (adjust as necessary)
        xref="paper",
        yref="paper",
        showarrow=False,  # No arrow
        font=dict(
            size=11,  # Font size
            color='grey'  # Font color
        ),
        align="left"
    ),
)
fig.show(renderer='iframe')
Notice that I’ve added the logo for our data consulting company. Contact us if you need any services regarding your data.
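The logo code refers to an `encoded_image` variable that isn't defined anywhere in this post. Here is a minimal sketch of how such a value could be produced with Python's standard library. The helper name `encode_image` and the file name `logo.png` are hypothetical and not taken from the original.

```python
import base64
from pathlib import Path


def encode_image(path):
    """Read a file and return its contents as a base64-encoded ASCII string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


# Hypothetical usage: "logo.png" is a placeholder path, not from the original post.
# encoded_image = encode_image("logo.png")
```

The resulting string can then be interpolated into the `data:image/png;base64,...` URI passed as `source` to `fig.add_layout_image`.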