Skip to main content
Version: Python

How to import data with date-times

It's common to import data with date and date-time columns. This guides shows how to work with imported data of this type.

Before running these examples, make sure you've set your date/time settings to use an appropriate display time zone. You can adjust it at any time, and all the displayed date-time columns should adjust to use the new time zone.

note

All the screenshots in this guide assume the time zone is set to New York UTC-5.

We will work with a simple CSV file with example dates and date-times. Assuming you have the examples directory created in your Deephaven directory, create a file data/examples/testDates.csv with the following data:

DateTimeKnownFormat,DateUnknownFormat1,DateUnknownFormat2,DateTimeUnknownFormat
1948-07-01T12:00:00 ET,19480701,1956-07-01,1948-07-01 12:00:00-0400
1948-07-02T16:00:00 ET,19480702,1956-07-02,1948-07-02 16:00:00-0400
1948-07-03T13:59:59 ET,19480703,1956-07-03,1948-07-03 13:59:59-0400
1948-07-04T09:00:00 ET,19480704,1956-07-04,1948-07-04 09:00:00-0400

This CSV has four columns.

  • DateTimeKnownFormat - a date/time in the standard format that Deephaven understands (e.g., 1948-07-01T12:00:00 ET)
  • DateUnknownFormat1 - a date without separators (e.g., 19480701 for July 1, 1948)
  • DateUnknownFormat2 - a date in a standard ISO 8061 format (e.g., 1956-07-01), which we'll convert to a date-time after the import
  • DateTimeUnknownFormat - a date/time in an unknown-to-Deephaven format, with a built-in timezone offset (e.g., 1948-07-02 16:00:00-0400, meaning 16:00:00 on July 2, 1948, time zone offset -4 hours from UTC)

Import the file into a Deephaven table.

from deephaven import read_csv

testDates = read_csv("/data/examples/testDates.csv")

The result will be a small table.

img

We'll cover each column individually. You can see the column type by hovering over its name.

The easiest is DateTimeKnownFormat. This was imported as a date-time because it was in Deephaven's standard date-time format, and the CSV import utility automatically parsed it. Nothing needs to be done here.

Because of its format, DateUnknownFormat1 gets imported as an int column. These int values can be converted to strings in the yyyy-MM-dd format by using the Java date/time class SimpleDateFormat.

To achieve this transformation:

  • int values are converted to strings.
  • string values are parsed into Java dates.
  • Java Dates are formatted in yyyy-MM-dd format.
SimpleDateFormat = jpy.get_type("java.text.SimpleDateFormat")

simpleParse = SimpleDateFormat("yyyyMMdd")
simpleFormat = SimpleDateFormat("yyyy-MM-dd")

testDatesUpdated1 = testDates.update_view(formulas=["Date=simpleFormat.format(simpleParse.parse(Integer.toString(DateUnknownFormat1)))"])

The resulting table has a new string column containing the date in the target format.

img

DateUnknownFormat2 was imported as a string column because it has data that's not numeric and not in a known format. Let's convert it to a date-time. Because there is no associated time zone in the string, we need to specify one. Let's assume the data is for the Asia/Tokyo time zone.

  • We'll define a SimpleDateFormat in the column's format yyyy-MM-dd.
  • We'll specify the time zone for that formatter.
  • We'll call the parse method, and pass the result of that into millis_to_datetime (after calling getTime to get the millis from epoch from the parse method's result).
SimpleDateFormat = jpy.get_type("java.text.SimpleDateFormat")
TimeZone = jpy.get_type("java.util.TimeZone")

dateFormat = SimpleDateFormat("yyyy-MM-dd")
dateFormat.setTimeZone(TimeZone.getTimeZone("Asia/Tokyo"))

testDatesUpdated2 = testDates.update_view(formulas=["DateTimeUpdated = millis_to_datetime(dateFormat.parse(DateUnknownFormat2).getTime())"])

The resulting table contains a new DateTimeUpdated column with a date-time. Note that in this screenshot, we're displaying in the New York time zone, which is earlier than the Tokyo time zone we used for the conversion, so the dates are earlier.

img

DateTimeUnknownFormat was imported as a string column because it couldn't be interpreted by Deephaven. However, it already contains everything needed to parse the date/time - the date, time, and a time zone offset from UTC. This will look similar to the DateUnknownFormat example, but we don't need to specify the time zone.

SimpleDateFormat = jpy.get_type("java.text.SimpleDateFormat")

dateFormat2 = SimpleDateFormat("yyyy-MM-dd HH:mm:ssZZZZ")

testDateTimesUpdated = testDates.update_view(formulas=["DateTimeUpdated = millisToTime(dateFormat2.parse(DateTimeUnknownFormat).getTime())"])

Notice that the hours in the new column are different by one hour from the original string. This is because the data is being shown in a different time zone.

img