Violin Plot

A violin plot is a data visualization that combines a box plot with a rotated kernel density plot to provide a comprehensive representation of the data distribution. It offers a detailed view of the data’s central tendency, spread, and density.

Violin plots are appropriate when the data contain a continuous variable of interest. If the variable of interest depends on an additional categorical variable, use the by argument to draw side-by-side violins, one per category.

What are violin plots useful for?

  • Comparing distributions: Violin plots are effective for visually comparing and contrasting the distribution of multiple datasets or categories, allowing for quick identification of differences in data patterns.
  • Assessing central tendency and spread: Violin plots provide insights into the central tendencies and variability of data, including the median, quartiles, and potential outliers.
  • Identifying multimodal data: They are particularly useful when dealing with data that exhibits multiple modes or peaks, as they can reveal these underlying patterns effectively.
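
As a rough illustration of the multimodality point above, the following sketch uses only the Python standard library, not Deephaven. It assumes a hand-picked Gaussian kernel bandwidth of 0.4 and a small made-up sample with two clusters; the helper names gaussian_kde and find_peaks are hypothetical, and the bandwidth a Deephaven violin plot uses may differ. It shows that the median lands in the gap between the clusters, while a density curve (the outline a violin draws) exposes both peaks.

```python
import math
import statistics


def gaussian_kde(values, bandwidth):
    """Return a function that estimates the density at x with a Gaussian kernel."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


def find_peaks(xs, ys):
    """Return the x positions where ys has a strict local maximum."""
    return [xs[i] for i in range(1, len(ys) - 1) if ys[i - 1] < ys[i] > ys[i + 1]]


# Made-up sample with two clusters (illustrative data, not the iris dataset).
sample = [4.8, 4.9, 5.0, 5.0, 5.1, 5.2, 6.8, 6.9, 7.0, 7.0, 7.1, 7.2]

# The bandwidth is an assumption chosen by hand for this sample.
density = gaussian_kde(sample, bandwidth=0.4)

# Evaluate the density on a grid from 3.0 to 9.0 in steps of 0.1.
grid = [x / 10 for x in range(30, 91)]
curve = [density(x) for x in grid]

median = statistics.median(sample)
peaks = find_peaks(grid, curve)

print("median:", median)
print("density peaks near:", peaks)
```

A box plot of this sample would center on the median, where no observations lie; the two peaks in the density curve are what a violin makes visible.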

Examples

A basic violin plot

Visualize the distribution of a single variable by passing the column name to the x or y argument.

import deephaven.plot.express as dx
iris = dx.data.iris()

# subset to get a specific group
versicolor = iris.where("Species == `versicolor`")

# control the plot orientation using `x` or `y`
violin_plot_x = dx.violin(versicolor, x="SepalLength")
violin_plot_y = dx.violin(versicolor, y="SepalLength")

Distributions for multiple groups

Create separate violins for each group of data by passing the name of the grouping column(s) to the by argument.

import deephaven.plot.express as dx
iris = dx.data.iris()

violin_plot_group = dx.violin(iris, x="SepalLength", by="Species")

API Reference

Returns a violin chart.

Returns: DeephavenFigure. A DeephavenFigure that contains the violin chart.

Parameters (name, type, default, description)

  • table (PartitionedTable | Table | DataFrame): A table to pull data from.
  • x (str | list[str] | None; default None): A column or list of columns that contain x-axis values. If both x and y are specified, one should be numerical and the other categorical. If x is numerical, the violins are drawn horizontally.
  • y (str | list[str] | None; default None): A column or list of columns that contain y-axis values. If both x and y are specified, one should be numerical and the other categorical. If y is numerical, the violins are drawn vertically.
  • by (str | list[str] | None; default None): A column or list of columns that contain values to plot the figure traces by. Each value or combination of values maps to a unique design. The by_vars parameter specifies which design elements are used. This is overridden if any specialized design variables, such as color, are specified.
  • by_vars (str | list[str]; default 'color'): A string or list of strings that contain design elements to plot by. Can contain color. If associated maps or sequences are specified, they are used to map by column values to designs. Otherwise, default values are used.
  • color (str | list[str] | None; default None): A column or list of columns that contain color values. The value is used for a plot by on color. See color_discrete_map for additional behaviors.
  • hover_name (str | None; default None): A column that contains names to bold in the hover tooltip.
  • labels (dict[str, str] | None; default None): A dictionary of labels mapping columns to new labels.
  • color_discrete_sequence (list[str] | None; default None): A list of colors to sequentially apply to the series. The colors loop, so if there are more series than colors, colors will be reused.
  • color_discrete_map (dict[str | tuple[str], str] | None; default None): If dict, the keys should be strings of the column values (or a tuple of combinations of column values) which map to colors.
  • violinmode (str; default 'group'): Either 'group', which draws the violins next to each other, or 'overlay', which draws them on top of each other.
  • log_x (bool; default False): A boolean that specifies whether the x-axis is a log axis.
  • log_y (bool; default False): A boolean that specifies whether the y-axis is a log axis.
  • range_x (list[int] | None; default None): A list of two numbers that specify the range of the x-axis.
  • range_y (list[int] | None; default None): A list of two numbers that specify the range of the y-axis.
  • points (bool | str; default 'outliers'): 'outliers' draws points outside the whiskers. 'suspectedoutliers' draws points below 4*Q1-3*Q3 or above 4*Q3-3*Q1. 'all' draws all points, and False draws no points.
  • box (bool; default False): Draw boxes inside the violin if True.
  • title (str | None; default None): The title of the chart.
  • template (str | None; default None): The template for the chart.
  • unsafe_update_figure (Callable; default <function default_callback>): An update function that takes a plotly figure as an argument and optionally returns a plotly figure. If a figure is not returned, the plotly figure passed in is assumed to be the return value. Used to add any custom changes to the underlying plotly figure. Note that the existing data traces should not be removed. Modifying traces in a way that breaks data mappings may lead to unexpected behavior.
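
The 'suspectedoutliers' thresholds listed for the points parameter can be read as fences three interquartile ranges beyond the quartiles, since 4*Q1-3*Q3 equals Q1 - 3*(Q3-Q1) and 4*Q3-3*Q1 equals Q3 + 3*(Q3-Q1). The sketch below works through that arithmetic with the Python standard library only. The function name suspected_outlier_fences and the sample values are hypothetical, and the quartile method used here (statistics.quantiles with method="inclusive") is an assumption that may not match how the plotting library computes quartiles.

```python
import statistics


def suspected_outlier_fences(values):
    """Return the (lower, upper) fences 4*Q1 - 3*Q3 and 4*Q3 - 3*Q1."""
    # Quartile method is an assumption; other methods give slightly different cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    lower = 4 * q1 - 3 * q3  # equivalent to q1 - 3 * (q3 - q1)
    upper = 4 * q3 - 3 * q1  # equivalent to q3 + 3 * (q3 - q1)
    return lower, upper


# Made-up sample with one extreme value.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]

lower, upper = suspected_outlier_fences(values)
flagged = [v for v in values if v < lower or v > upper]

print("fences:", lower, upper)
print("points beyond the fences:", flagged)
```

With this sample only the value 50 falls beyond the fences; every other point sits well inside them.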