How to select, view, and update data in tables
Analysis of data in queries often requires creating new tables from some or all of the columns in existing tables. The Deephaven API offers a variety of methods that can achieve this. In this guide, five are discussed:
The methods in this guide store results in different ways that can have a huge impact on query performance. The purpose of this guide is to show how to use these methods, not how to pick the best one for your needs. See our concept guide Choosing the right selection method for more insight.
The examples in this guide use a table called students
created using new_table
. The students
table contains data on four students in a class. If you are unfamiliar with the method, check out our guide Create a new table.
from deephaven import new_table
from deephaven.column import int_col, double_col, string_col
students = new_table([
string_col("Name", ["Andy", "Claire", "Jane", "Steven"]),
int_col("StudentID", [1, 2, 3, 4]),
int_col("TestGrade", [85, 95, 88, 72]),
int_col("HomeworkGrade", [85, 95, 90, 95]),
double_col("GPA", [3.0, 4.0, 3.7, 2.8])
])
- students
Create a table from columns of a source table
The select
and view
methods allow the user to create a new table containing columns derived from columns in a source table. The examples below show how to use these methods to select two columns from the source table and create a third to determine if a student passed the class.
Notice how the tables studentsPassedSelect
and studentsPassedView
contain the same data. The select
and view
methods return tables that appear identical, but they differ in how they store their results.
select
computes and stores the result in memory.view
saves formulas that are recomputed from data in the source table every time a cell is accessed.
select
students_passed_select = students.select(formulas=["Name", "GPA", "Passed = GPA >= 3.0"])
- students_passed_select
view
students_passed_view = students.view(formulas=["Name", "GPA", "Passed = GPA >= 3.0"])
- students_passed_view
Add columns to a table
The update
, update_view
, and lazy_update
methods allow the user to add one or more columns to a source table. The examples below show how these three methods keep all of the data from the source table and add an additional column.
Notice how students_passed_update
, students_passed_update_view
, and students_passed_lazy_update
contain all columns from the students
table and one additional column. The update
, update_view
, and lazy_update
methods return tables that appear identical, but they differ in how they store their results.
update
computes and stores the new columns in memory.update_view
saves the new columns as formulas that are recomputed from data in the source table every time a cell is accessed.lazy_update
caches new column formula evaluations so that each set of formula inputs is computed at most once.
update
students_passed_update = students.update(formulas=["Passed = GPA >= 3.0"])
- students_passed_update
update_view
students_passed_update_view = students.update_view(formulas=["Passed = GPA >= 3.0"])
- students_passed_update_view
lazy_update
students_passed_lazy_update = students.lazy_update(formulas=["Passed = GPA >= 3.0"])
- students_passed_lazy_update