Drop, move, and rename columns
Queries often call for dropping, moving, or renaming columns in a table. This guide covers doing so programmatically.
The following methods drop columns from a table:
The following methods move columns in a table:
The following methods rename columns in a table:
The examples in this guide will use the following table:
from deephaven import new_table
from deephaven.column import int_col, double_col, string_col
source = new_table(
[
string_col("Name", ["Andy", "Claire", "Jane", "Steven"]),
int_col("StudentID", [1, 2, 3, 4]),
int_col("TestGrade", [85, 95, 88, 72]),
int_col("HomeworkGrade", [85, 95, 90, 95]),
double_col("GPA", [3.0, 4.0, 3.7, 2.8]),
]
)
- source
Remove columns from a table
drop_columns
drop_columns
removes one or more columns from a table based on the column names it's given.
result_drop_one_col = source.drop_columns(["GPA"])
result_drop_two_cols = source.drop_columns(["TestGrade", "HomeworkGrade"])
- result_drop_one_col
- result_drop_two_cols
view
view
removes any columns not given as input that are present in the source table. If view
is not given any input, it returns a table the same as the source table but with all formula columns.
Formula columns initially store only the formulas used to create columnar data. Values are calculated on the fly as needed.
The following code block uses view
to remove the StudentID
column from the source
table.
result_no_student_id = source.view(["Name", "TestGrade", "HomeworkGrade", "GPA"])
- result_no_student_id
select
select
is similar to view
in that it removes any columns not given as input that are present in the source table. However, unlike view
,select
creates in-memory columns.
In-memory columns store all of their data in memory. This includes columns of any type, such as formula or memoized columns. Be careful when using select
with large tables, as it can consume a lot of memory.
The following code block uses select
to remove the Name
column from the source
table.
result_no_name = source.select(["StudentID", "TestGrade", "HomeworkGrade", "GPA"])
- result_no_name
Move columns in a table
move_columns
move_columns
moves one or more columns to a specified column index in the resulting table.
The following code block uses move_columns
twice: once to move a single column further right, and once to move multiple columns further right.
result_move_one = source.move_columns(3, ["StudentID"])
result_move_two = source.move_columns(2, ["Name", "StudentID"])
- result_move_one
- result_move_two
move_columns_up
move_columns_up
moves one or more columns to the zeroth column index (the left) in the resulting table.
The following example moves StudentID
to the leftmost position in the table.
result_student_id_first = source.move_columns_up(["StudentID"])
- result_student_id_first
move_columns_down
move_columns_down
moves one or more columns to the last column index (the right) in the resulting table.
The following example moves TestGrade
and HomeworkGrade
to the rightmost position in the table.
result_grades_last = source.move_columns_down(["TestGrade", "HomeworkGrade"])
- result_grades_last
Rename columns in a table
rename_columns
rename_columns
renames one or more columns in a table. Renaming a column follows the syntax NewColumnName = OldColumnName
.
The following example renames StudentID
to ID
and GPA
to GradePointAverage
.
result_renamed = source.rename_columns(["ID = StudentID", "GradePointAverage = GPA"])
- result_renamed
view
and select
Renaming columns with view
and select
follows the same syntax as rename_columns
(NewColumnName = OldColumnName
). As we saw previously when removing columns, select
and view
take a list of column names as input.
The following example uses both select
and view
to rename the Name
column to StudentName
.
result_select_renamed = source.select(
["StudentName = Name", "StudentID", "TestGrade", "HomeworkGrade", "GPA"]
)
result_view_renamed = source.view(
["StudentName = Name", "StudentID", "TestGrade", "HomeworkGrade", "GPA"]
)
- result_select_renamed
- result_view_renamed