3D Scatter Plot

A 3D scatter plot is a type of data visualization that displays data points in three-dimensional space. Each data point is represented as a marker or point, and its position in the plot is determined by the values of three different variables, one for each axis (x, y, and z). This plot allows for the visualization of relationships and patterns among three continuous variables simultaneously.

3D scatter plots are appropriate when a continuous response variable depends on two continuous explanatory variables. If there is an additional categorical variable that the response variable depends on, shapes or colors can be used in the scatter plot to distinguish the categories.

What are 3D scatter plots useful for?

Visualizing multivariate data: When you have three variables of interest, a 3D scatter plot allows you to visualize and explore their relationships in a single plot. It enables you to see how changes in one variable affect the other two, providing a more comprehensive understanding of the data.
Identifying clusters and patterns: In some datasets, 3D scatter plots can reveal clusters or patterns that might not be evident in 2D scatter plots. The added dimensionality can help identify complex structures and relationships that exist in the data.
Outlier detection: Outliers, which are data points that deviate significantly from the general pattern, can be more easily spotted in a 3D scatter plot. They may appear as isolated points away from the main cluster, drawing attention to potentially interesting observations or anomalies.

Examples

A basic 3D scatter plot

Visualize the relationship between three variables by passing their column names to the x, y, and z arguments. Click and drag on the resulting chart to rotate it for new perspectives.

import deephaven.plot.express as dx
iris = dx.data.iris()

scatter_plot_3D = dx.scatter_3d(iris, x="SepalWidth", y="SepalLength", z="PetalWidth")

Create a bubble plot

Use the size of the markers in a 3D scatter plot to visualize a fourth quantitative variable. Such a plot is commonly called a bubble plot, where the size of each bubble corresponds to the value of the additional variable.

The size argument interprets the values in the given column as pixel size, so you may consider scaling or normalizing these values before creating the bubble chart.

import deephaven.plot.express as dx
iris = dx.data.iris()

bubble_plot_3D = dx.scatter_3d(iris, x="SepalWidth", y="SepalLength", z="PetalWidth", size="PetalLength")
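The scaling step mentioned above can be sketched in plain Python. This is a minimal illustration of min-max scaling, not part of the Deephaven API: the helper name `scale_to_pixels` and the 4 to 24 pixel range are assumptions chosen for the example.

```python
def scale_to_pixels(values, min_px=4.0, max_px=24.0):
    """Linearly rescale numeric values into the range [min_px, max_px]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so fall back to the midpoint size.
        return [(min_px + max_px) / 2.0 for _ in values]
    span = hi - lo
    return [min_px + (v - lo) / span * (max_px - min_px) for v in values]


# Hypothetical raw values standing in for a column such as PetalLength.
petal_lengths = [1.0, 2.5, 4.0, 7.0]
sizes = scale_to_pixels(petal_lengths)
print(sizes)  # [4.0, 9.0, 14.0, 24.0]
```

In a Deephaven session, the same arithmetic would typically be written into a new table column before that column is passed to `size`; the exact query expression depends on your data and is not shown here.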

Color markers by group

Denote groups of data by using the color of the markers as group indicators. Pass the name of the grouping column(s) to the by argument.

import deephaven.plot.express as dx
iris = dx.data.iris()

scatter_plot_3D_groups = dx.scatter_3d(iris, x="SepalWidth", y="SepalLength", z="PetalWidth", by="Species")

Customize these colors using the color_discrete_sequence or color_discrete_map arguments. Any CSS color name, hexadecimal color code, or set of RGB values will work.

import deephaven.plot.express as dx
iris = dx.data.iris()

# set custom colors using color_discrete_sequence
scatter_3D_custom_1 = dx.scatter_3d(
    iris,
    x="SepalWidth",
    y="SepalLength",
    z="PetalWidth",
    by="Species",
    # A list of colors to sequentially apply to one or more series
    # The colors loop if there are more series than colors
    color_discrete_sequence=["salmon", "#fffacd", "rgb(100,149,237)"]
)

# use a dictionary to specify custom colors
scatter_3D_custom_2 = dx.scatter_3d(
    iris,
    x="SepalWidth",
    y="SepalLength",
    z="PetalWidth",
    by="Species",
    # set each series to a specific color
    color_discrete_map={"virginica":"lemonchiffon", "setosa": "cornflowerblue", "versicolor":"#FA8173"}
)

# or, create a new table with a column of colors, and use that column for the color values
iris_with_custom_colors = iris.update(
    "ExampleColors = `rgb(` + Math.round(Math.random() * 255) + `,` + Math.round(Math.random() * 255) + `,`  + Math.round(Math.random() * 255) +`)`"
)

scatter_3D_custom_3 = dx.scatter_3d(
    iris_with_custom_colors,
    x="SepalWidth",
    y="SepalLength",
    z="PetalWidth",
    by="ExampleColors",
    # When set to `identity`, the column data passed to the
    # color parameter will be used as the actual color
    color_discrete_map="identity"
)
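The comment in the first example above notes that colors loop when there are more series than colors. The sketch below illustrates that looping in plain Python. It is not the library's implementation; the helper `assign_colors`, the fourth series name, and the series order are all assumptions made for illustration.

```python
from itertools import cycle


def assign_colors(series_names, color_sequence):
    """Pair each series with a color, restarting the sequence when it runs out."""
    return dict(zip(series_names, cycle(color_sequence)))


# "hypothetical_fourth" is an invented series used to show the wrap-around.
series = ["setosa", "versicolor", "virginica", "hypothetical_fourth"]
colors = ["salmon", "#fffacd", "rgb(100,149,237)"]

assignment = assign_colors(series, colors)
for name, color in assignment.items():
    print(name, "->", color)
```

When a specific series must always get a specific color, color_discrete_map is the safer choice, because a sequence depends on the order in which series are encountered.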

Color markers by a continuous variable

Markers can also be colored by a continuous value. Pass a numeric column to the color argument and choose the color scale with the color_continuous_scale argument.

import deephaven.plot.express as dx
iris = dx.data.iris()

# use the `color` argument to specify the value column, and the `color_continuous_scale` to specify the color scale
scatter_3D_color = dx.scatter_3d(
    iris,
    x="SepalWidth",
    y="SepalLength",
    z="PetalWidth",
    color="PetalLength",
    # use any plotly express built in color scale name
    color_continuous_scale="viridis"
)

Or, supply your own custom color scale to color_continuous_scale.

import deephaven.plot.express as dx
iris = dx.data.iris()

scatter_3D_custom_color = dx.scatter_3d(
    iris,
    x="SepalWidth",
    y="SepalLength",
    z="PetalWidth",
    color="PetalLength",
    # custom scale colors can be any valid browser css color
    color_continuous_scale=["lemonchiffon", "#FA8173", "rgb(201, 61, 44)"]
)
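To build intuition for how a list of colors becomes a continuous scale, the sketch below maps a value to a color by linear interpolation between evenly spaced stops. This is a simplified illustration under stated assumptions (even spacing, linear interpolation in RGB, colors pre-converted to RGB tuples), not the renderer's actual code; `interpolate_scale` is a hypothetical helper.

```python
def interpolate_scale(stops, value, vmin, vmax):
    """Map value in [vmin, vmax] to an RGB tuple along evenly spaced stops.

    Expects at least two stops, each given as an (r, g, b) tuple.
    """
    if vmax == vmin:
        return stops[0]
    # Normalize to [0, 1] and clamp values that fall outside the range.
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    # Locate the segment between two adjacent stops.
    position = t * (len(stops) - 1)
    index = min(int(position), len(stops) - 2)
    frac = position - index
    start, end = stops[index], stops[index + 1]
    return tuple(round(a + (b - a) * frac) for a, b in zip(start, end))


# lemonchiffon, #FA8173, and rgb(201, 61, 44) written as RGB tuples.
stops = [(255, 250, 205), (250, 129, 115), (201, 61, 44)]

low = interpolate_scale(stops, 1.0, vmin=1.0, vmax=7.0)
middle = interpolate_scale(stops, 4.0, vmin=1.0, vmax=7.0)
high = interpolate_scale(stops, 7.0, vmin=1.0, vmax=7.0)
print(low, middle, high)
```

Under these assumptions the smallest value takes the first color, the largest takes the last, and a value halfway through the range lands exactly on the middle stop.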

API Reference

Returns a 3D scatter chart

Returns: DeephavenFigure. A DeephavenFigure that contains the 3D scatter chart.

Parameters	Type	Default	Description
table	PartitionedTable \| Table \| DataFrame		A table to pull data from.
x	str \| None	None	A column that contains x-axis values.
y	str \| None	None	A column that contains y-axis values.
z	str \| None	None	A column that contains z-axis values.
by	str \| list[str] \| None	None	A column or list of columns that contain values to plot the figure traces by. All values or combination of values map to a unique design. The variable by_vars specifies which design elements are used. This is overridden if any specialized design variables such as color are specified.
by_vars	str \| list[str]	'color'	A string or list of strings that contain design elements to plot by. Can contain size, line_dash, width, color, and symbol. If associated maps or sequences are specified, they are used to map by column values to designs. Otherwise, default values are used.
filter_by	str \| list[str] \| bool \| None	None	A column or list of columns that contain values to filter the chart by. If a boolean is passed and the table is partitioned, all partition key columns used to create the partitions are used. If no filters are specified, all partitions are shown on the chart.
required_filter_by	str \| list[str] \| bool \| None	None	A column or list of columns that contain values to filter the chart by. Values set in input filters or linkers for the relevant columns determine the exact values to display. If a boolean is passed and the table is partitioned, all partition key columns used to create the partitions are used. All required input filters or linkers must be set for the chart to display any data.
color	str \| list[str] \| None	None	A column or list of columns that contain color values. If only one column is passed, and it contains numeric values, the value is used as a value on a continuous color scale. Otherwise, the value is used for a plot by on color. See color_discrete_map for additional behaviors.
symbol	str \| list[str] \| None	None	A column or list of columns that contain symbol values. The value is used for a plot by on symbol. See symbol_map for additional behaviors.
size	str \| None	None	A column or list of columns that contain size values. If only one column is passed, and it contains numeric values, the value is used as a size. Otherwise, the value is used for a plot by on size. See size_map for additional behaviors.
error_x	str \| None	None	A column with x error bar values. These form the error bars in both the positive and negative direction if error_x_minus is not specified, and the error bars in only the positive direction if error_x_minus is specified.
error_x_minus	str \| None	None	A column with x error bar values. These form the error bars in the negative direction, and are ignored if error_x is not specified.
error_y	str \| None	None	A column with y error bar values. These form the error bars in both the positive and negative direction if error_y_minus is not specified, and the error bars in only the positive direction if error_y_minus is specified.
error_y_minus	str \| None	None	A column with y error bar values. These form the error bars in the negative direction, and are ignored if error_y is not specified.
error_z	str \| None	None	A column with z error bar values. These form the error bars in both the positive and negative direction if error_z_minus is not specified, and the error bars in only the positive direction if error_z_minus is specified.
error_z_minus	str \| None	None	A column with z error bar values. These form the error bars in the negative direction, and are ignored if error_z is not specified.
text	str \| None	None	A column that contains text annotations.
hover_name	str \| None	None	A column that contains names to bold in the hover tooltip.
labels	dict[str, str] \| None	None	A dictionary of labels mapping columns to new labels.
color_discrete_sequence	list[str] \| None	None	A list of colors to sequentially apply to the series. The colors loop, so if there are more series than colors, colors will be reused.
color_discrete_map	str \| tuple[str, dict[str \| tuple[str], dict[str \| tuple[str], str]]] \| dict[str \| tuple[str], str] \| None	None	If dict, the keys should be strings of the column values (or a tuple of combinations of column values) which map to colors. If "identity", the values are taken as literal colors. If "by" or ("by", dict) where dict is as described above, the colors are forced to by
symbol_sequence	list[str] \| None	None	A list of symbols to sequentially apply to the markers in the series. The symbols loop, so if there are more series than symbols, symbols will be reused.
symbol_map	str \| tuple[str, dict[str \| tuple[str], dict[str \| tuple[str], str]]] \| dict[str \| tuple[str], str] \| None	None	If dict, the keys should be strings of the column values (or a tuple of combinations of column values) which map to symbols. If "identity", the values are taken as literal symbols. If "by" or ("by", dict) where dict is as described above, the symbols are forced to by
size_sequence	list[int] \| None	None	A list of sizes to sequentially apply to the markers in the series. The sizes loop, so if there are more series than sizes, sizes will be reused. This is overridden if "size" is specified.
size_map	str \| tuple[str, dict[str \| tuple[str], dict[str \| tuple[str], str]]] \| dict[str \| tuple[str], str] \| None	None	If dict, the keys should be strings of the column values (or a tuple of combinations of column values) which map to sizes. If "identity", the values are taken as literal sizes. If "by" or ("by", dict) where dict is as described above, the sizes are forced to by
color_continuous_scale	list[str] \| None	None	A list of colors for a continuous scale
range_color	list[float] \| None	None	A list of two numbers that form the endpoints of the color axis
color_continuous_midpoint	float \| None	None	A number that is the midpoint of the color axis
opacity	float \| None	None	Opacity to apply to all markers. 0 is completely transparent and 1 is completely opaque.
log_x	bool	False	A boolean that specifies if the corresponding axis is a log axis or not.
log_y	bool	False	A boolean that specifies if the corresponding axis is a log axis or not.
log_z	bool	False	A boolean that specifies if the corresponding axis is a log axis or not.
range_x	list[int] \| None	None	A list of two numbers that specify the range of the x axis.
range_y	list[int] \| None	None	A list of two numbers that specify the range of the y axis.
range_z	list[int] \| None	None	A list of two numbers that specify the range of the z axis.
title	str \| None	None	The title of the chart
template	str \| None	None	The template for the chart.
unsafe_update_figure	Callable	<function default_callback>	An update function that takes a plotly figure as an argument and optionally returns a plotly figure. If a figure is not returned, the plotly figure passed will be assumed to be the return value. Used to add any custom changes to the underlying plotly figure. Note that the existing data traces should not be removed. This may lead to unexpected behavior if traces are modified in a way that breaks data mappings.
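
As an illustration of the unsafe_update_figure parameter described in the table, the sketch below defines a callback that changes only layout settings and leaves the data traces alone. The function name `tighten_layout` and the chosen margin and aspect settings are assumptions for the example; `update_layout` is the standard plotly figure method.

```python
def tighten_layout(fig):
    """Callback sketch: adjust layout only, leaving the data traces untouched."""
    fig.update_layout(
        # Trim the whitespace around the chart, keeping room for a title.
        margin=dict(l=0, r=0, t=40, b=0),
        # Draw the 3D scene with equal-length axes.
        scene=dict(aspectmode="cube"),
    )
    return fig


# Hypothetical usage, assuming a Deephaven session where `dx` and `iris` exist:
# figure = dx.scatter_3d(
#     iris, x="SepalWidth", y="SepalLength", z="PetalWidth",
#     unsafe_update_figure=tighten_layout,
# )
```

Returning the figure is optional according to the description above, but returning it explicitly makes the callback easier to reuse and test on its own.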