New in version 0.17.1
*Provisional: This is a new feature and still under development. We'll be adding features and possibly making breaking changes in future releases. We'd love to hear your feedback.*
This document is written as a Jupyter Notebook, and can be viewed or downloaded here.
You can apply conditional formatting, the visual styling of a DataFrame
depending on the data within, by using the DataFrame.style
property.
This is a property that returns a Styler
object, which has
useful methods for formatting and displaying DataFrames.
The styling is accomplished using CSS.
You write "style functions" that take scalars, DataFrames or Series, and return like-indexed DataFrames or Series with CSS "attribute: value" pairs for the values.
These functions can be incrementally passed to the Styler, which collects the styles before rendering.
Pass your style functions into one of the following methods:
- Styler.applymap: elementwise
- Styler.apply: column-/row-/table-wise
Both of those methods take a function (and some other keyword arguments) and apply your function to the DataFrame in a certain way.
Styler.applymap works through the DataFrame elementwise.
Styler.apply passes each column or row of your DataFrame into your function one at a time, or the entire table at once, depending on the axis keyword argument.
For columnwise use axis=0, for rowwise use axis=1, and for the entire table at once use axis=None.
For Styler.applymap
your function should take a scalar and return a single string with the CSS attribute-value pair.
For Styler.apply
your function should take a Series or DataFrame (depending on the axis parameter), and return a Series or DataFrame with an identical shape where each value is a string with a CSS attribute-value pair.
Let's see some examples.
In [ ]:
import pandas as pd
import numpy as np
np.random.seed(24)
df = pd.DataFrame({'A': np.linspace(1, 10, 10)})
df = pd.concat([df, pd.DataFrame(np.random.randn(10, 4), columns=list('BCDE'))],
axis=1)
df.iloc[0, 2] = np.nan
Here's a boring example of rendering a DataFrame, without any (visible) styles:
In [ ]:
df.style
Note: The DataFrame.style attribute is a property that returns a Styler object. Styler has a _repr_html_ method defined on it, so it is rendered automatically in the notebook. If you want the actual HTML back for further processing or for writing to a file, call the .render() method, which returns a string.
The above output looks very similar to the standard DataFrame HTML representation. But we've done some work behind the scenes to attach CSS classes to each cell. We can view these by calling the .render
method.
In [ ]:
df.style.highlight_null().render().split('\n')[:10]
The row0_col2
is the identifier for that particular cell. We've also prepended each row/column identifier with a UUID unique to each DataFrame so that the style from one doesn't collide with the styling from another within the same notebook or page (you can set the uuid
if you'd like to tie together the styling of two DataFrames).
When writing style functions, you take care of producing the CSS attribute / value pairs you want. Pandas matches those up with the CSS classes that identify each cell.
Let's write a simple style function that will color negative numbers red and positive numbers black.
In [ ]:
def color_negative_red(val):
    """
    Takes a scalar and returns a string with
    the css property `'color: red'` for negative
    numbers, black otherwise.
    """
    color = 'red' if val < 0 else 'black'
    return 'color: %s' % color
In this case, the cell's style depends only on its own value.
That means we should use the Styler.applymap
method which works elementwise.
In [ ]:
s = df.style.applymap(color_negative_red)
s
Notice the similarity with the standard df.applymap
, which operates on DataFrames elementwise. We want you to be able to reuse your existing knowledge of how to interact with DataFrames.
Notice also that our function returned a string containing the CSS attribute and value, separated by a colon just like in a <style>
tag. This will be a common theme.
Finally, the input shapes matched. Styler.applymap
calls the function on each scalar input, and the function returns a scalar output.
Now suppose you wanted to highlight the maximum value in each column.
We can't use .applymap
anymore since that operates elementwise.
Instead, we'll turn to .apply
which operates columnwise (or rowwise using the axis
keyword). Later on we'll see that something like highlight_max
is already defined on Styler
so you wouldn't need to write this yourself.
In [ ]:
def highlight_max(s):
    '''
    highlight the maximum in a Series yellow.
    '''
    is_max = s == s.max()
    return ['background-color: yellow' if v else '' for v in is_max]
In [ ]:
df.style.apply(highlight_max)
In this case the input is a Series
, one column at a time.
Notice that the output shape of highlight_max
matches the input shape, an array with len(s)
items.
We encourage you to use method chains to build up a style piecewise, before finally rendering at the end of the chain.
In [ ]:
df.style.\
applymap(color_negative_red).\
apply(highlight_max)
Above we used Styler.apply
to pass in each column one at a time.
*Debugging Tip*: If you're having trouble writing your style function, try just passing it into DataFrame.apply
. Internally, Styler.apply
uses DataFrame.apply
so the result should be the same.
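As a minimal sketch of that debugging tip, the snippet below runs a column-wise style function through plain DataFrame.apply and prints the strings it produces. The small frame and the name highlight_max_series are made up for illustration; returning a Series indexed like the input is just one way to keep the output like-indexed.

```python
import numpy as np
import pandas as pd


def highlight_max_series(s):
    """Return a like-indexed Series of CSS strings marking the column maximum."""
    is_max = s == s.max()
    return pd.Series(np.where(is_max, 'background-color: yellow', ''),
                     index=s.index)


# Illustrative data, not the frame used elsewhere in this document.
small = pd.DataFrame({'A': [1.0, 5.0, 3.0], 'B': [9.0, 2.0, 4.0]})

# DataFrame.apply calls the function once per column (axis=0 is the default).
css = small.apply(highlight_max_series)
print(css)
```

If the printed frame has the same labels as the input and contains only CSS strings, the function is probably ready to hand to Styler.apply.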
What if you wanted to highlight just the maximum value in the entire table?
Use .apply(function, axis=None)
to indicate that your function wants the entire table, not one column or row at a time. Let's try that next.
We'll rewrite our highlight-max
to handle either Series (from .apply(axis=0 or 1)
) or DataFrames (from .apply(axis=None)
). We'll also allow the color to be adjustable, to demonstrate that .apply
, and .applymap
pass along keyword arguments.
In [ ]:
def highlight_max(data, color='yellow'):
    '''
    highlight the maximum in a Series or DataFrame
    '''
    attr = 'background-color: {}'.format(color)
    if data.ndim == 1:  # Series from .apply(axis=0) or axis=1
        is_max = data == data.max()
        return [attr if v else '' for v in is_max]
    else:  # from .apply(axis=None)
        is_max = data == data.max().max()
        return pd.DataFrame(np.where(is_max, attr, ''),
                            index=data.index, columns=data.columns)
When using Styler.apply(func, axis=None)
, the function must return a DataFrame with the same index and column labels.
In [ ]:
df.style.apply(highlight_max, color='darkorange', axis=None)
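One way to check the "same index and column labels" requirement is to call a table-wise function directly on a small frame before giving it to Styler. The sketch below does that; the name highlight_table_max and the sample data are assumptions made for this example.

```python
import numpy as np
import pandas as pd


def highlight_table_max(data, color='yellow'):
    """Table-wise style function: mark every cell equal to the overall maximum."""
    attr = 'background-color: {}'.format(color)
    is_max = data == data.max().max()
    return pd.DataFrame(np.where(is_max, attr, ''),
                        index=data.index, columns=data.columns)


# Illustrative frame with non-default row labels and one missing value.
small = pd.DataFrame({'A': [1.0, 5.0], 'B': [9.0, np.nan]}, index=['x', 'y'])
css = highlight_table_max(small, color='darkorange')

# The labels must survive the round trip so the styles line up with the cells.
print(css.index.equals(small.index), css.columns.equals(small.columns))
print(css)
```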
Style functions should return strings with one or more CSS attribute: value pairs, delimited by semicolons. Use
- Styler.applymap(func) for elementwise styles
- Styler.apply(func, axis=0) for columnwise styles
- Styler.apply(func, axis=1) for rowwise styles
- Styler.apply(func, axis=None) for tablewise styles
And crucially the input and output shapes of func must match. If x is the input then func(x).shape == x.shape.
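To illustrate a style string that carries more than one CSS pair, here is a small elementwise function written in plain Python. The name emphasize_negative and the chosen properties are only an example; the point is the semicolon-delimited return value.

```python
def emphasize_negative(val):
    """Scalar in, one CSS string out; several pairs are joined with semicolons."""
    if val < 0:
        props = {'color': 'red', 'font-weight': 'bold'}
    else:
        props = {'color': 'black'}
    return '; '.join('{}: {}'.format(attr, value) for attr, value in props.items())


negative_style = emphasize_negative(-1.5)
positive_style = emphasize_negative(2.0)
print(negative_style)
print(positive_style)
```

A function like this takes a scalar and returns a single string, so it fits the elementwise case.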
Both Styler.apply and Styler.applymap accept a subset keyword.
This allows you to apply styles to specific rows or columns, without having to code that logic into your style function.
The value passed to subset behaves similarly to slicing a DataFrame. In particular, a tuple is treated as (row_indexer, column_indexer).
Consider using pd.IndexSlice to construct such a tuple.
In [ ]:
df.style.apply(highlight_max, subset=['B', 'C', 'D'])
For row and column slicing, any valid indexer to .loc
will work.
In [ ]:
df.style.applymap(color_negative_red,
subset=pd.IndexSlice[2:5, ['B', 'D']])
Only label-based slicing is supported right now, not positional.
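Because a subset is resolved by label, it can help to preview the selection with .loc before styling. The following sketch uses a stand-in frame (not the df built earlier) to show which cells a pd.IndexSlice value picks out; note that label slices include both endpoints.

```python
import numpy as np
import pandas as pd

# Stand-in frame with the default integer row labels 0..9.
frame = pd.DataFrame(np.arange(40).reshape(10, 4), columns=list('BCDE'))

subset = pd.IndexSlice[2:5, ['B', 'D']]

# .loc accepts the same (row_indexer, column_indexer) tuple.
selected = frame.loc[subset]
print(selected)
```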
If your style function uses a subset
or axis
keyword argument, consider wrapping your function in a functools.partial
, partialing out that keyword.
my_func2 = functools.partial(my_func, subset=42)
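A slightly fuller sketch of the same idea, using a hypothetical style function whose own subset keyword would otherwise clash with the keyword of the same name on Styler.apply and Styler.applymap:

```python
import functools


def highlight_members(val, subset=(), color='yellow'):
    """Hypothetical style function with its own ``subset`` keyword."""
    return 'background-color: {}'.format(color) if val in subset else ''


# Bind the clashing keyword up front; the result only needs the cell value.
highlight_small = functools.partial(highlight_members, subset={1, 2, 3})

member_style = highlight_small(2)
other_style = highlight_small(10)
print(repr(member_style))
print(repr(other_style))
```

The bound function can then be passed along without its subset argument being mistaken for the Styler's.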
We distinguish the display value from the actual value in Styler.
To control the display value, the text that is printed in each cell, use Styler.format. Cells can be formatted according to a format spec string or a callable that takes a single value and returns a string.
In [ ]:
df.style.format("{:.2%}")
Use a dictionary to format specific columns.
In [ ]:
df.style.format({'B': "{:0<4.0f}", 'D': '{:+.2f}'})
Or pass in a callable (or dictionary of callables) for more flexible handling.
In [ ]:
df.style.format({"B": lambda x: "±{:.2f}".format(abs(x))})
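The format spec strings above follow Python's str.format mini-language, so their effect can be tried on plain numbers first. The values below are arbitrary and chosen only to show what each spec does.

```python
# Percentage with two decimals.
as_percent = '{:.2%}'.format(0.1234)

# Always show the sign, two decimals.
with_sign = '{:+.2f}'.format(0.1234)

# Round to zero decimals, then left-align and pad with '0' to width 4.
padded = '{:0<4.0f}'.format(1.329)


def plus_minus(x):
    """A callable formatter: takes one value and returns a string."""
    return '±{:.2f}'.format(abs(x))


from_callable = plus_minus(-1.329)
print(as_percent, with_sign, padded, from_callable)
```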
Finally, we expect certain styling functions to be common enough that we've included a few "built-in" to the Styler
, so you don't have to write them yourself.
In [ ]:
df.style.highlight_null(null_color='red')
You can create "heatmaps" with the background_gradient
method. These require matplotlib, and we'll use Seaborn to get a nice colormap.
In [ ]:
import seaborn as sns
cm = sns.light_palette("green", as_cmap=True)
s = df.style.background_gradient(cmap=cm)
s
Styler.background_gradient
takes the keyword arguments low
and high
. Roughly speaking these extend the range of your data by low
and high
percent so that when we convert the colors, the colormap's entire range isn't used. This is useful so that you can actually read the text still.
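The arithmetic described above can be sketched in a few lines. This is an illustration of the idea rather than the pandas implementation: gradient_positions is a hypothetical helper, and it assumes low and high are fractions of the data range by which the range is widened before values are mapped onto the colormap's 0 to 1 scale.

```python
def gradient_positions(values, low=0.0, high=0.0):
    """Map values to colormap positions in [0, 1] after widening the data range."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    lower = vmin - span * low    # extend the range below the minimum
    upper = vmax + span * high   # extend the range above the maximum
    return [(v - lower) / (upper - lower) for v in values]


data = [0.0, 5.0, 10.0]
full = gradient_positions(data)
compressed = gradient_positions(data, low=0.5, high=0)
print(full)
print(compressed)
```

With low=0.5 the smallest value no longer lands at the very start of the colormap, which is what keeps the darkest or lightest colors away from the text.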
In [ ]:
# Uses the full color range
df.loc[:4].style.background_gradient(cmap='viridis')
In [ ]:
# Compress the color range
(df.loc[:4]
.style
.background_gradient(cmap='viridis', low=.5, high=0)
.highlight_null('red'))
There's also .highlight_min
and .highlight_max
.
In [ ]:
df.style.highlight_max(axis=0)
Use Styler.set_properties
when the style doesn't actually depend on the values.
In [ ]:
df.style.set_properties(**{'background-color': 'black',
'color': 'lawngreen',
'border-color': 'white'})
You can include "bar charts" in your DataFrame.
In [ ]:
df.style.bar(subset=['A', 'B'], color='#d65f5f')
New in version 0.20.0 is the ability to customize the bar chart further: df.style.bar can now be centered on zero or on the midpoint value (in addition to the existing behavior of placing the minimum value at the left side of the cell), and you can pass a list of [color_negative, color_positive].
Here's how you can change the above with the new align='mid'
option:
In [ ]:
df.style.bar(subset=['A', 'B'], align='mid', color=['#d65f5f', '#5fba7d'])
The following example highlights the behavior of the new align options:
In [ ]:
import pandas as pd
from IPython.display import HTML
# Test series
test1 = pd.Series([-100, -60, -30, -20], name='All Negative')
test2 = pd.Series([10, 20, 50, 100], name='All Positive')
test3 = pd.Series([-10, -5, 0, 90], name='Both Pos and Neg')
head = """
<table>
<thead>
<th>Align</th>
<th>All Negative</th>
<th>All Positive</th>
<th>Both Neg and Pos</th>
</thead>
<tbody>
"""
aligns = ['left', 'zero', 'mid']
for align in aligns:
    # One table row per align option, one cell per test series
    row = "<tr><th>{}</th>".format(align)
    for serie in [test1, test2, test3]:
        s = serie.copy()
        s.name = ''
        row += "<td>{}</td>".format(s.to_frame().style.bar(align=align,
                                                           color=['#d65f5f', '#5fba7d'],
                                                           width=100).render())
    row += '</tr>'
    head += row
head += """
</tbody>
</table>"""
HTML(head)
Say you have a lovely style built up for a DataFrame, and now you want to apply the same style to a second DataFrame. Export the style with df1.style.export, and import it on the second DataFrame with df2.style.use.
In [ ]:
df2 = -df
style1 = df.style.applymap(color_negative_red)
style1
In [ ]:
style2 = df2.style
style2.use(style1.export())
style2
Notice that you're able to share the styles even though they're data aware. The styles are re-evaluated on the new DataFrame they've been used upon.
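The re-evaluation can be pictured without a Styler at all: what gets shared is the function, not the colors it produced. In this sketch the two Series are illustrative stand-ins for a column of each DataFrame.

```python
import pandas as pd


def color_negative_red(val):
    """Same rule as the style function used above."""
    return 'color: red' if val < 0 else 'color: black'


first = pd.Series([1.0, -2.0])
second = -first

# The same function gives different styles because the data differ.
first_styles = first.map(color_negative_red).tolist()
second_styles = second.map(color_negative_red).tolist()
print(first_styles)
print(second_styles)
```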
You've seen a few methods for data-driven styling.
Styler
also provides a few other options for styles that don't depend on the data.
Each of these can be specified in two ways:
- A keyword argument to Styler.__init__
- A call to one of the .set_ or .hide_ methods, e.g. .set_caption or .hide_columns
The best method to use depends on the context. Use the Styler constructor when building many styled DataFrames that should all share the same properties. For interactive use, the .set_ and .hide_ methods are more convenient.
You can control the precision of floats using pandas' regular display.precision
option.
In [ ]:
with pd.option_context('display.precision', 2):
    html = (df.style
              .applymap(color_negative_red)
              .apply(highlight_max))
html
Or through a set_precision
method.
In [ ]:
df.style\
.applymap(color_negative_red)\
.apply(highlight_max)\
.set_precision(2)
Setting the precision only affects the printed number; the full-precision values are always passed to your style functions. You can always use df.round(2).style
if you'd prefer to round from the start.
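A tiny plain-Python sketch of why that distinction matters: a value can display as zero at two decimals while the style function still sees the full-precision negative number. The value -0.004 is chosen only to make the effect visible.

```python
def color_negative_red(val):
    """Same rule as the style function used above."""
    return 'color: red' if val < 0 else 'color: black'


value = -0.004

# What a precision of 2 would display for this value.
shown = '{:.2f}'.format(value)

# Styling the full-precision value versus a value rounded up front.
style_full = color_negative_red(value)
style_rounded = color_negative_red(round(value, 2))
print(shown, style_full, style_rounded)
```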
Regular table captions can be added in a few ways.
In [ ]:
df.style.set_caption('Colormaps, with a caption.')\
.background_gradient(cmap=cm)
The next option you have is "table styles".
These are styles that apply to the table as a whole, but don't look at the data.
Certain stylings, including pseudo-selectors like :hover, can only be used this way.
In [ ]:
from IPython.display import HTML
def hover(hover_color="#ffff99"):
    return dict(selector="tr:hover",
                props=[("background-color", "%s" % hover_color)])
styles = [
    hover(),
    dict(selector="th", props=[("font-size", "150%"),
                               ("text-align", "center")]),
    dict(selector="caption", props=[("caption-side", "bottom")])
]
html = (df.style.set_table_styles(styles)
          .set_caption("Hover to highlight."))
html
table_styles
should be a list of dictionaries.
Each dictionary should have the selector
and props
keys.
The value for selector
should be a valid CSS selector.
Recall that all the styles are already attached to an id
, unique to
each Styler
. This selector is in addition to that id
.
The value for props
should be a list of tuples of ('attribute', 'value')
.
table_styles
are extremely flexible, but not as fun to type out by hand.
We hope to collect some useful ones either in pandas, or preferably in a new package that builds on top of the tools here.
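To make the selector and props structure concrete, the sketch below turns one table-style dictionary into a CSS rule scoped to a table id. The helper to_css_rule and the id T_example are hypothetical, and the exact markup pandas emits may differ; this only shows how the pieces relate.

```python
def to_css_rule(table_id, style):
    """Render one table-style dict as a CSS rule scoped to a table id."""
    declarations = '; '.join('{}: {}'.format(attr, value)
                             for attr, value in style['props'])
    return '#{} {} {{ {}; }}'.format(table_id, style['selector'], declarations)


hover = dict(selector='tr:hover', props=[('background-color', '#ffff99')])
header = dict(selector='th', props=[('font-size', '150%'),
                                    ('text-align', 'center')])

hover_rule = to_css_rule('T_example', hover)
header_rule = to_css_rule('T_example', header)
print(hover_rule)
print(header_rule)
```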
The index can be hidden from rendering by calling Styler.hide_index
. Columns can be hidden from rendering by calling Styler.hide_columns
and passing in the name of a column, or a slice of columns.
In [ ]:
df.style.hide_index()
In [ ]:
df.style.hide_columns(['C','D'])
Certain CSS classes are attached to cells.
- Index and column names include index_name and level<k>, where k is its level in a MultiIndex
- Index label cells include
  - row_heading
  - row<n>, where n is the numeric position of the row
  - level<k>, where k is the level in a MultiIndex
- Column label cells include
  - col_heading
  - col<n>, where n is the numeric position of the column
  - level<k>, where k is the level in a MultiIndex
- Blank cells include blank
- Data cells include data
Known limitations include that styling is available for DataFrames only (for a Series, use Series.to_frame().style).
Some of these limitations will be addressed in the future.
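- Style function: a function that is passed into Styler.apply or Styler.applymap and returns values like 'css attribute: value'
- Built-in style functions: style functions that are methods on Styler
- Table style: a dictionary with the two keys selector and props. selector is the CSS selector that props will apply to. props is a list of (attribute, value) tuples. A list of table styles is passed into Styler.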
In [ ]:
# IPython.html.widgets has moved to the separate ipywidgets package
from ipywidgets import widgets
@widgets.interact
def f(h_neg=(0, 359, 1), h_pos=(0, 359), s=(0., 99.9), l=(0., 99.9)):
    return df.style.background_gradient(
        cmap=sns.palettes.diverging_palette(h_neg=h_neg, h_pos=h_pos, s=s, l=l,
                                            as_cmap=True)
    )
In [ ]:
def magnify():
    return [dict(selector="th",
                 props=[("font-size", "4pt")]),
            dict(selector="td",
                 props=[('padding', "0em 0em")]),
            dict(selector="th:hover",
                 props=[("font-size", "12pt")]),
            dict(selector="tr:hover td:hover",
                 props=[('max-width', '200px'),
                        ('font-size', '12pt')])
            ]
In [ ]:
np.random.seed(25)
cmap = sns.diverging_palette(5, 250, as_cmap=True)
bigdf = pd.DataFrame(np.random.randn(20, 25)).cumsum()
bigdf.style.background_gradient(cmap, axis=1)\
.set_properties(**{'max-width': '80px', 'font-size': '1pt'})\
.set_caption("Hover to magnify")\
.set_precision(2)\
.set_table_styles(magnify())
New in version 0.20.0
*Experimental: This is a new feature and still under development. We'll be adding features and possibly making breaking changes in future releases. We'd love to hear your feedback.*
Some support is available for exporting styled DataFrames to Excel worksheets using the OpenPyXL or XlsxWriter engines. CSS2.2 properties handled include:
- background-color
- border-style, border-width, border-color and their {top, right, bottom, left} variants
- color
- font-family
- font-style
- font-weight
- text-align
- text-decoration
- vertical-align
- white-space: nowrap
Only CSS2 named colors and hex colors of the form #rgb or #rrggbb are currently supported.
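If style functions build colors programmatically, a quick check for the two supported hex forms can catch values that would not carry over to Excel. The helper below is a hypothetical sketch, not part of pandas, and it covers only the hex forms, not the named colors.

```python
import re

# Exactly three or exactly six hexadecimal digits after the '#'.
_HEX_COLOR = re.compile(r'#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})')


def is_supported_hex(color):
    """Return True for colors written as #rgb or #rrggbb."""
    return _HEX_COLOR.fullmatch(color) is not None


checks = {c: is_supported_hex(c)
          for c in ['#d65f5f', '#fff', '#ffff', 'rgb(214, 95, 95)']}
print(checks)
```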
In [ ]:
df.style.\
applymap(color_negative_red).\
apply(highlight_max).\
to_excel('styled.xlsx', engine='openpyxl')
A screenshot of the output:
The core of pandas is, and will remain, its "high-performance, easy-to-use data structures".
With that in mind, we hope that DataFrame.style
accomplishes two goals
If you build a great library on top of this, let us know and we'll link to it.
If the default template doesn't quite suit your needs, you can subclass Styler and extend or override the template. We'll show an example of extending the default template to insert a custom header before each table.
In [ ]:
from jinja2 import Environment, ChoiceLoader, FileSystemLoader
from IPython.display import HTML
from pandas.io.formats.style import Styler
In [ ]:
%mkdir templates
This next cell writes the custom template.
We extend the template html.tpl
, which comes with pandas.
In [ ]:
%%file templates/myhtml.tpl
{% extends "html.tpl" %}
{% block table %}
<h1>{{ table_title|default("My Table") }}</h1>
{{ super() }}
{% endblock table %}
Now that we've created a template, we need to set up a subclass of Styler
that
knows about it.
In [ ]:
class MyStyler(Styler):
    env = Environment(
        loader=ChoiceLoader([
            FileSystemLoader("templates"),  # contains ours
            Styler.loader,  # the default
        ])
    )
    template = env.get_template("myhtml.tpl")
Notice that we include the original loader in our environment's loader. That's because we extend the original template, so the Jinja environment needs to be able to find it.
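The loader lookup can be tried in isolation with in-memory templates. This sketch swaps the file-based loaders for jinja2's DictLoader, and the one-line "html.tpl" here is a stand-in for the real pandas template, so only the extends and super() mechanics carry over.

```python
from jinja2 import ChoiceLoader, DictLoader, Environment

# Stand-in for the default template that ships with pandas.
base_loader = DictLoader({
    "html.tpl": "{% block table %}<table></table>{% endblock table %}",
})

# Our template extends the base one and adds a heading before the table.
custom_loader = DictLoader({
    "myhtml.tpl": (
        '{% extends "html.tpl" %}'
        '{% block table %}'
        '<h1>{{ table_title|default("My Table") }}</h1>'
        '{{ super() }}'
        '{% endblock table %}'
    ),
})

# ChoiceLoader tries each loader in order, so both templates can be found.
env = Environment(loader=ChoiceLoader([custom_loader, base_loader]))
template = env.get_template("myhtml.tpl")

rendered_default = template.render()
rendered_titled = template.render(table_title="Extending Example")
print(rendered_default)
print(rendered_titled)
```

Dropping base_loader from the list makes the extends lookup fail, which is the situation the original loader is included to avoid.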
Now we can use that custom styler. Its __init__ takes a DataFrame.
In [ ]:
MyStyler(df)
Our custom template accepts a table_title
keyword. We can provide the value in the .render
method.
In [ ]:
HTML(MyStyler(df).render(table_title="Extending Example"))
For convenience, we provide the Styler.from_custom_template
method that does the same as the custom subclass.
In [ ]:
EasyStyler = Styler.from_custom_template("templates", "myhtml.tpl")
EasyStyler(df)
Here's the template structure:
In [ ]:
with open("template_structure.html") as f:
    structure = f.read()
HTML(structure)
See the template in the GitHub repo for more details.