Drop, move, and rename columns
Queries often call for dropping, moving, or renaming columns in a table. This guide covers doing so programmatically.
The following methods drop columns from a table:
The following methods move columns in a table:
The following methods rename columns in a table:
The examples in this guide will use the following table:
from deephaven import new_table
from deephaven.column import int_col, double_col, string_col
source = new_table(
[
string_col("Name", ["Andy", "Claire", "Jane", "Steven"]),
int_col("StudentID", [1, 2, 3, 4]),
int_col("TestGrade", [85, 95, 88, 72]),
int_col("HomeworkGrade", [85, 95, 90, 95]),
double_col("GPA", [3.0, 4.0, 3.7, 2.8]),
]
)
Remove columns from a table
drop_columns
drop_columns
removes one or more columns from a table based on the column names it's given.
result_drop_one_col = source.drop_columns(["GPA"])
result_drop_two_cols = source.drop_columns(["TestGrade", "HomeworkGrade"])
view
view
removes any columns not given as input that are present in the source table. If view
is not given any input, it returns a table the same as the source table but with all formula columns.
Important
Formula columns initially store only the formulas used to create columnar data. Values are calculated on the fly as needed.
The following code block uses view
to remove the StudentID
column from the source
table.
result_no_student_id = source.view(["Name", "TestGrade", "HomeworkGrade", "GPA"])
select
select
is similar to view
in that it removes any columns not given as input that are present in the source table. However, unlike view
,select
creates in-memory columns.
Important
In-memory columns store all of their data in memory. This includes columns of any type, such as formula or memoized columns. Be careful when using select
with large tables, as it can consume a lot of memory.
The following code block uses select
to remove the Name
column from the source
table.
result_no_name = source.select(["StudentID", "TestGrade", "HomeworkGrade", "GPA"])
Move columns in a table
move_columns
move_columns
moves one or more columns to a specified column index in the resulting table.
The following code block uses move_columns
twice: once to move a single column further right, and once to move multiple columns further right.
result_move_one = source.move_columns(3, ["StudentID"])
result_move_two = source.move_columns(2, ["Name", "StudentID"])
move_columns_up
move_columns_up
moves one or more columns to the zeroth column index (the left) in the resulting table.
The following example moves StudentID
to the leftmost position in the table.
result_student_id_first = source.move_columns_up(["StudentID"])
move_columns_down
move_columns_down
moves one or more columns to the last column index (the right) in the resulting table.
The following example moves TestGrade
and HomeworkGrade
to the rightmost position in the table.
result_grades_last = source.move_columns_down(["TestGrade", "HomeworkGrade"])
Rename columns in a table
rename_columns
rename_columns
renames one or more columns in a table. Renaming a column follows the syntax NewColumnName = OldColumnName
.
The following example renames StudentID
to ID
and GPA
to GradePointAverage
.
result_renamed = source.rename_columns(["ID = StudentID", "GradePointAverage = GPA"])
view
and select
Renaming columns with view
and select
follows the same syntax as rename_columns
(NewColumnName = OldColumnName
). As we saw previously when removing columns, select
and view
take a list of column names as input.
The following example uses both select
and view
to rename the Name
column to StudentName
.
Important
select
creates in-memory columns. Be careful when using select
with large tables, as it can consume a lot of memory.
result_select_renamed = source.select(
["StudentName = Name", "StudentID", "TestGrade", "HomeworkGrade", "GPA"]
)
result_view_renamed = source.view(
["StudentName = Name", "StudentID", "TestGrade", "HomeworkGrade", "GPA"]
)