Bar Plot
A bar plot is a graphical representation of data that uses rectangular bars to display the values of different categories or groups. Bar plots aggregate the response variable across the entire dataset for each category, so that the y-axis represents the sum of the response variable per category.
Bar plots are appropriate when the data contain a continuous response variable that is directly related to a categorical explanatory variable. Additionally, if the response variable is a cumulative total of contributions from different subcategories, each bar can be broken up to demonstrate those contributions.
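The aggregation described above can be sketched in plain Python. This is only an illustration of the idea, not how the library computes bar heights; the sample rows and the helper name sum_by_category are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows of (category, response value); not taken from any real dataset.
rows = [("Sat", 20.0), ("Sun", 15.5), ("Sat", 10.0), ("Fri", 8.25), ("Sun", 4.5)]


def sum_by_category(pairs):
    """Sum the response value for each category, one total per bar."""
    totals = defaultdict(float)
    for category, value in pairs:
        totals[category] += value
    return dict(totals)


# Each key corresponds to one bar; each value is that bar's height.
bar_heights = sum_by_category(rows)
print(bar_heights)
```

Every row contributes to exactly one bar, so the bar heights together add up to the total of the response variable.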
What are bar plots useful for?
- Comparing categorical data: Bar plots are ideal for comparing the quantities or frequencies of different categories. The height of each bar represents the value of each category, making it easy to compare them at a glance.
- Decomposing data by category: When the data belong to several independent categories, bar plots make it easy to visualize the relative contributions of each category to the overall total. The bar segments are colored by category, making it easy to identify the contribution of each.
- Tracking trends: If the categorical explanatory variable can be ordered left-to-right (like day of week), then bar plots provide a visualization of how the response variable changes as the explanatory variable evolves.
Examples
A basic bar plot
Visualize the relationship between a continuous variable and a categorical or discrete variable by passing the column names to the x and y arguments.
import deephaven.plot.express as dx
tips = dx.data.tips()
bar_plot = dx.bar(tips, x="Day", y="TotalBill")
Change the x-axis ordering by sorting the dataset by the categorical variable.
import deephaven.plot.express as dx
tips = dx.data.tips()
# sort the dataset to get a specific x-axis ordering; sort() orders string values alphabetically
ordered_bar_plot = dx.bar(tips.sort("Day"), x="Day", y="TotalBill")
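Because an alphabetical sort does not match calendar order for day names, it can help to see the difference in isolation. The plain-Python sketch below contrasts the two orderings; the day labels and the calendar_rank lookup are assumptions made for illustration, so check the actual values in your own table.

```python
# Day labels assumed for illustration only.
days = ["Sun", "Thur", "Sat", "Fri"]

# Alphabetical order, which is what a plain sort on a string column produces.
alphabetical = sorted(days)

# A hypothetical rank lookup that encodes calendar order instead.
calendar_rank = {"Thur": 0, "Fri": 1, "Sat": 2, "Sun": 3}
calendar = sorted(days, key=lambda day: calendar_rank[day])

print(alphabetical)
print(calendar)
```

To get a calendar ordering on the x-axis, the table would need to be sorted by a rank like this rather than by the day name itself.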
Partition bars by group
Break bars down by group by passing the name of the grouping column(s) to the by argument.
import deephaven.plot.express as dx
tips = dx.data.tips()
sorted_tips = tips.sort("Day")
# group by smoker / non-smoker
bar_plot_smoke = dx.bar(sorted_tips, x="Day", y="TotalBill", by="Smoker")
# group by male / female
bar_plot_sex = dx.bar(sorted_tips, x="Day", y="TotalBill", by="Sex")
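To make the grouping concrete, the following plain-Python sketch computes the per-group segments that make up each bar. It illustrates the concept only and is not the library's implementation; the rows and the segment_totals helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows of (category, group, response value).
rows = [
    ("Sat", "Yes", 12.0),
    ("Sat", "No", 18.0),
    ("Sun", "No", 20.0),
    ("Sat", "Yes", 5.0),
    ("Sun", "Yes", 7.5),
]


def segment_totals(records):
    """Sum the response value for every (category, group) pair."""
    totals = defaultdict(lambda: defaultdict(float))
    for category, group, value in records:
        totals[category][group] += value
    return {category: dict(groups) for category, groups in totals.items()}


# One entry per bar, holding one segment per group.
segments = segment_totals(rows)

# When segments are stacked, the full bar height is the sum of its segments.
bar_totals = {category: sum(groups.values()) for category, groups in segments.items()}
print(segments)
print(bar_totals)
```

With stacked bars, the segments of a bar add up to the same height the ungrouped bar would have.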
Frequency of categories
Visualize the frequency of categories in a column by passing the column name to either the x or y argument.
import deephaven.plot.express as dx
tips = dx.data.tips()
# count the number of occurrences of each day with a vertical bar plot
bar_plot_vertical = dx.bar(tips, x="Day")
# count the number of occurrences of each day with a horizontal bar plot
bar_plot_horizontal = dx.bar(tips, y="Day")
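When only one column is given, the bar lengths are occurrence counts. A minimal plain-Python sketch of that counting step follows, using made-up values rather than the tips dataset.

```python
from collections import Counter

# Hypothetical category values, one per row.
days = ["Sat", "Sun", "Sat", "Fri", "Sat", "Sun"]

# Count how many rows fall into each category; each count is one bar's length.
counts = Counter(days)
print(counts)
```

The same counts are drawn either way; passing the column as x or y only changes whether the bars are vertical or horizontal.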
API Reference
Returns a bar chart
Returns: DeephavenFigure
A DeephavenFigure that contains the bar chart
Parameters | Type | Default | Description |
---|---|---|---|
table | PartitionedTable | Table | DataFrame | | A table to pull data from. |
x | str | list[str] | None | None | A column or list of columns that contain x-axis values. If only x is specified, the y-axis values are the count of each unique x value. |
y | str | list[str] | None | None | A column or list of columns that contain y-axis values. If only y is specified, the x-axis values are the count of each unique y value. |
by | str | list[str] | None | None | A column or list of columns that contain values to plot the figure traces by. All values or combination of values map to a unique design. The variable by_vars specifies which design elements are used. This is overridden if any specialized design variables such as color are specified. |
by_vars | str | list[str] | 'color' | A string or list of strings that contain design elements to plot by. Can contain color and pattern_shape. If associated maps or sequences are specified, they are used to map by column values to designs. Otherwise, default values are used. |
color | str | list[str] | None | None | A column or list of columns that contain color values. If only one column is passed, and it contains numeric values, the value is used as a value on a continuous color scale. Otherwise, the value is used for a plot by on color. See color_discrete_map for additional behaviors. |
pattern_shape | str | list[str] | None | None | A column or list of columns that contain pattern shape values. The value is used for a plot by on pattern shape. See pattern_shape_map for additional behaviors. |
error_x | str | None | None | A column with x error bar values. These form the error bars in both the positive and negative direction if error_x_minus is not specified, and the error bars in only the positive direction if error_x_minus is specified. None can be used to specify no error bars on the corresponding series. |
error_x_minus | str | None | None | A column with x error bar values. These form the error bars in the negative direction, and are ignored if error_x is not specified. |
error_y | str | None | None | A column with y error bar values. These form the error bars in both the positive and negative direction if error_y_minus is not specified, and the error bars in only the positive direction if error_y_minus is specified. None can be used to specify no error bars on the corresponding series. |
error_y_minus | str | None | None | A column with y error bar values. These form the error bars in the negative direction, and are ignored if error_y is not specified. |
text | str | None | None | A column that contains text annotations. |
hover_name | str | None | None | A column that contains names to bold in the hover tooltip. |
labels | dict[str, str] | None | None | A dictionary of labels mapping columns to new labels. |
color_discrete_sequence | list[str] | None | None | A list of colors to sequentially apply to the series. The colors loop, so if there are more series than colors, colors will be reused. |
color_discrete_map | str | tuple[str, dict[str | tuple[str], dict[str | tuple[str], str]]] | dict[str | tuple[str], str] | None | None | If dict, the keys should be strings of the column values (or a tuple of combinations of column values) which map to colors. If "identity", the values are taken as literal colors. If "by" or ("by", dict) where dict is as described above, the colors are forced to by |
pattern_shape_sequence | list[str] | None | None | A list of patterns to sequentially apply to the series. The patterns loop, so if there are more series than patterns, patterns will be reused. |
pattern_shape_map | str | tuple[str, dict[str | tuple[str], dict[str | tuple[str], str]]] | dict[str | tuple[str], str] | None | None | If dict, the keys should be strings of the column values (or a tuple of combinations of column values) which map to patterns. If "identity", the values are taken as literal patterns. If "by" or ("by", dict) where dict is as described above, the patterns are forced to by |
color_continuous_scale | list[str] | None | None | A list of colors for a continuous scale |
range_color | list[float] | None | None | A list of two numbers that form the endpoints of the color axis |
color_continuous_midpoint | float | None | None | A number that is the midpoint of the color axis |
opacity | float | None | None | Opacity to apply to all markers. 0 is completely transparent and 1 is completely opaque. |
orientation | Literal['v', 'h'] | None | None | The orientation of the bars. If 'v', the bars are vertical. If 'h', the bars are horizontal. Defaults to 'v' if only x is specified. Defaults to 'h' if only y is specified. Defaults to 'v' if both x and y are specified unless x is passed only numeric columns and y is not. |
barmode | str | 'relative' | If 'relative', bars are stacked. If 'overlay', bars are drawn on top of each other. If 'group', bars are drawn next to each other. |
log_x | bool | False | A boolean or list of booleans that specify if the corresponding axis is a log axis or not. The booleans loop, so if there are more series than booleans, booleans will be reused. |
log_y | bool | False | A boolean or list of booleans that specify if the corresponding axis is a log axis or not. The booleans loop, so if there are more series than booleans, booleans will be reused. |
range_x | list[int] | None | None | A list of two numbers or a list of lists of two numbers that specify the range of the x axes. None can be specified for no range. The ranges loop, so if there are more axes than ranges, ranges will be reused. |
range_y | list[int] | None | None | A list of two numbers or a list of lists of two numbers that specify the range of the y axes. None can be specified for no range. The ranges loop, so if there are more axes than ranges, ranges will be reused. |
text_auto | bool | str | False | If True, display the value at each bar. If a string, specifies a plotly texttemplate. |
title | str | None | None | The title of the chart |
template | str | None | None | The template for the chart. |
unsafe_update_figure | Callable | <function default_callback> | An update function that takes a plotly figure as an argument and optionally returns a plotly figure. If a figure is not returned, the plotly figure passed will be assumed to be the return value. Used to add any custom changes to the underlying plotly figure. Note that the existing data traces should not be removed. This may lead to unexpected behavior if traces are modified in a way that break data mappings. |
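
The default rules in the orientation row can be hard to follow in prose, so here they are restated as a small decision function. This is a hypothetical helper written from the table text, not the library's code. Two points are assumptions: that the "unless" case resolves to 'h', and that passing neither x nor y is an error.

```python
def infer_orientation(
    x_given: bool,
    y_given: bool,
    x_all_numeric: bool = False,
    y_all_numeric: bool = False,
) -> str:
    """Restate the documented orientation defaults (hypothetical helper)."""
    if x_given and not y_given:
        return "v"  # only x: vertical bars
    if y_given and not x_given:
        return "h"  # only y: horizontal bars
    if x_given and y_given:
        # Assumption: when x is entirely numeric and y is not, the default flips to 'h'.
        if x_all_numeric and not y_all_numeric:
            return "h"
        return "v"
    # Assumption: the table does not describe this case, so treat it as an error here.
    raise ValueError("at least one of x or y must be given")


print(infer_orientation(True, False))
print(infer_orientation(True, True, x_all_numeric=True, y_all_numeric=False))
```

Passing orientation explicitly bypasses these defaults.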