Funnel Plot

A funnel plot is a data visualization that represents a process with various stages and allows multiple stacked categories, showing the quantitative values or counts at each stage in a funnel shape. It is a useful tool for tracking the progression or attrition of data through different stages, providing a visual overview of data distribution within the process. The data must be ordered by the response variable, or the “funnel” shape will not be guaranteed.

Funnel plots differ from funnel area plots in that they display the absolute count of data points in each category, while funnel area plots display the percentage of data points that belong to each category. Funnel plots count each data point as belonging to at least one category, so the categories are represented as subsets of each other. Funnel area plots, on the other hand, count each data point as belonging to exactly one category and display the categories as mutually exclusive.
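
The distinction can be made concrete with a small plain-Python sketch. The stage names and counts are made-up illustrative values (not the `marketing` dataset), and `to_exclusive_shares` is a hypothetical helper, not part of any library. It converts nested funnel counts, where each stage is a subset of the previous one, into the mutually exclusive groups and percentage shares that a funnel area plot would display.

```python
# Hypothetical nested counts: everyone in a later stage is also counted in the earlier ones.
nested_counts = [("Visited", 1000), ("Signed up", 400), ("Purchased", 100)]


def to_exclusive_shares(stages):
    """Split nested stage counts into mutually exclusive groups with percentage shares.

    Each data point is assigned to the last stage it reached, so the groups
    no longer overlap and their shares sum to 100%.
    """
    total = stages[0][1]  # the first stage contains every data point
    result = []
    for i, (name, count) in enumerate(stages):
        next_count = stages[i + 1][1] if i + 1 < len(stages) else 0
        exclusive = count - next_count  # reached this stage but not the next one
        result.append((name, exclusive, 100 * exclusive / total))
    return result


exclusive_shares = to_exclusive_shares(nested_counts)
for name, exclusive, share in exclusive_shares:
    print(f"{name}: {exclusive} data points ({share:.1f}%)")
```

The nested counts are what a funnel plot draws directly, while the exclusive shares correspond to the funnel area view of the same data.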

Funnel plots are appropriate when the data contain a categorical variable where the frequencies of each category can be computed, and the categories can be ordered. Additionally, funnel plots assume a particular relationship between levels of the categorical variable, where each category is a proper subset of the previous category. If the data contain an unordered categorical variable, or the categories are better conceptualized as parts of a whole, consider a pie plot instead of a funnel plot.
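
As a rough way to check these assumptions before plotting, the following plain-Python sketch orders hypothetical stage counts from largest to smallest and tests whether they strictly decrease. The helper name `order_and_check` and the data are assumptions made for illustration. Strictly decreasing counts are necessary for each stage to be a proper subset of the previous one, but they do not prove it, because counts alone cannot show that the same data points carry through.

```python
def order_and_check(stage_counts):
    """Sort stages by count, largest first, and report whether they strictly decrease."""
    ordered = sorted(stage_counts.items(), key=lambda item: item[1], reverse=True)
    counts = [count for _, count in ordered]
    # A proper subset must be strictly smaller than the set that contains it.
    strictly_decreasing = all(a > b for a, b in zip(counts, counts[1:]))
    return ordered, strictly_decreasing


# Hypothetical, deliberately unordered input.
ordered_stages, looks_like_funnel = order_and_check(
    {"Purchased": 100, "Visited": 1000, "Signed up": 400}
)
print(ordered_stages, looks_like_funnel)
```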

What are funnel plots useful for?

Visualizing sequential data: Data that are staged or sequential in some way are often visualized with funnel plots, yielding insight into the absolute changes between each stage.
Comparing categories: Funnel plots can be broken down into categories to produce insights into the distribution of data at each stage within a process.
Evaluating efficiency: Funnel plots make it easy to assess the efficiency and effectiveness of a process or workflow, particularly the attrition or conversion at each stage.
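
The conversion and attrition figures mentioned above can also be computed directly from the stage counts. This is a minimal sketch in plain Python, with made-up counts and a hypothetical helper named `stage_rates`. It assumes the stages are already ordered and that every count is non-zero.

```python
def stage_rates(stages):
    """Compute step-to-step conversion and overall retention for ordered stage counts."""
    first_count = stages[0][1]
    previous_count = first_count
    rows = []
    for name, count in stages:
        rows.append(
            {
                "stage": name,
                "count": count,
                # Share of the previous stage that reached this stage.
                "step_conversion": count / previous_count,
                # Share of the first stage that reached this stage.
                "overall_retention": count / first_count,
            }
        )
        previous_count = count
    return rows


# Hypothetical ordered stage counts.
rates = stage_rates([("Visited", 1000), ("Signed up", 400), ("Purchased", 100)])
for row in rates:
    print(row)
```

Attrition at a stage is one minus its `step_conversion` value.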

Examples

A basic funnel plot

Visualize the trend in consecutive stages of a categorical variable by passing column names to the x and y arguments.

import deephaven.plot.express as dx
marketing = dx.data.marketing()

# `Count` is the frequency/value column, and `Stage` is the category column
funnel_plot = dx.funnel(marketing, x="Count", y="Stage")

API Reference

Returns a funnel chart.

Returns: DeephavenFigure A DeephavenFigure that contains the funnel chart.

Parameters	Type	Default	Description
table	Table \| DataFrame		A table to pull data from.
x	str \| list[str] \| None	None	A column or list of columns that contain x-axis values.
y	str \| list[str] \| None	None	A column or list of columns that contain y-axis values.
by	str \| list[str] \| None	None	A column or list of columns that contain values to plot the figure traces by. All values or combinations of values map to a unique design. The variable by_vars specifies which design elements are used. This is overridden if any specialized design variables such as color are specified.
by_vars	str \| list[str]	'color'	A string or list of strings that contain design elements to plot by. Can contain color and pattern_shape. If associated maps or sequences are specified, they are used to map by column values to designs. Otherwise, default values are used.
filter_by	str \| list[str] \| bool \| None	None	A column or list of columns that contain values to filter the chart by. If a boolean is passed and the table is partitioned, all partition key columns used to create the partitions are used. If no filters are specified, all partitions are shown on the chart.
required_filter_by	str \| list[str] \| bool \| None	None	A column or list of columns that contain values to filter the chart by. Values set in input filters or linkers for the relevant columns determine the exact values to display. If a boolean is passed and the table is partitioned, all partition key columns used to create the partitions are used. All required input filters or linkers must be set for the chart to display any data.
color	str \| list[str] \| None	None	A column or list of columns that contain color values. If only one column is passed, and it contains numeric values, the value is used as a value on a continuous color scale. Otherwise, the value is used for a plot by on color. See color_discrete_map for additional behaviors.
text	str \| None	None	A column that contains text annotations.
hover_name	str \| None	None	A column that contains names to bold in the hover tooltip.
labels	dict[str, str] \| None	None	A dictionary of labels mapping columns to new labels.
color_discrete_sequence	list[str] \| None	None	A list of colors to sequentially apply to the series. The colors loop, so if there are more series than colors, colors will be reused.
color_discrete_map	dict[str \| tuple[str], str] \| None	None	If dict, the keys should be strings of the column values (or a tuple of combinations of column values) which map to colors.
opacity	float \| None	None	Opacity to apply to all markers. 0 is completely transparent and 1 is completely opaque.
orientation	str \| None	None	"h" for horizontal or "v" for vertical.
log_x	bool	False	A boolean that specifies if the corresponding axis is a log axis or not.
log_y	bool	False	A boolean that specifies if the corresponding axis is a log axis or not.
range_x	list[int] \| None	None	A list of two numbers that specify the range of the x-axis.
range_y	list[int] \| None	None	A list of two numbers that specify the range of the y-axis.
title	str \| None	None	The title of the chart.
template	str \| None	None	The template for the chart.
unsafe_update_figure	Callable	<function default_callback>	An update function that takes a plotly figure as an argument and optionally returns a plotly figure. If a figure is not returned, the plotly figure passed will be assumed to be the return value. Used to add any custom changes to the underlying plotly figure. Note that the existing data traces should not be removed. Modifying traces in a way that breaks data mappings may lead to unexpected behavior.
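
As an illustration of the `unsafe_update_figure` parameter, here is a minimal sketch of a callback that changes only layout properties and leaves the data traces alone. The function name `add_axis_titles` and the axis titles are hypothetical. The sketch assumes the object passed in is a plotly figure, which provides an `update_layout` method.

```python
def add_axis_titles(figure):
    """Callback sketch for `unsafe_update_figure`.

    Only layout properties are changed; the existing data traces are not
    removed or modified, so data mappings stay intact.
    """
    figure.update_layout(
        xaxis_title="Number of users",  # hypothetical axis title
        yaxis_title="Funnel stage",  # hypothetical axis title
    )
    return figure


# Assumed usage inside a Deephaven session, reusing the table from the basic example:
# funnel_plot = dx.funnel(
#     marketing, x="Count", y="Stage", unsafe_update_figure=add_axis_titles
# )
```

Returning the figure is optional according to the parameter description above, but doing so makes the callback easier to test on its own.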