Solve WORDLE with Deephaven

· 6 min read
Colin Alworth

Impress your friends and outsmart WORDLE using word frequency data

If you haven't heard of WORDLE, then you probably haven't been on the internet lately. Chances are your friends are routinely posting their outcomes. Each day, a new WORDLE goes live, challenging players to beat their prior scores and, of course, show off on social media. Everyone gets the same word each day, so no spoilers, please!

This video demo is just one example of how to use Deephaven. We love WORDLE and wanted to see if we could beat the normal human score. Sure, there are other ways to automatically solve the puzzle. This is just a toy example to show how flexible Deephaven is, but we think you'll find it easy and effective. In our video, we solve WORDLE without any programming. We've also provided another example below if you prefer to enter and edit code.

Prereqs

  1. To get started, follow our Quickstart guide or run the following commands to download all the code and dependencies to work with Docker.
export VERSION=edge
curl https://raw.githubusercontent.com/deephaven/deephaven-core/main/containers/python/docker-compose.yml -O
docker-compose pull
docker-compose up -d
  2. In your internet browser, navigate to http://localhost:10000/ide.

  3. We use word-frequency data from Kaggle. It includes a count for each word, so we can guess more common words before rarer ones. Download unigram_freq.csv to follow along below.

Follow along with the video

Get the data

In the top right of the Deephaven console, click on the More Actions menu (three dots) to upload the file from Kaggle. You can also drag and drop the CSV into the browser.

The table unigram_freq will load in the IDE with two columns: word and count to represent the word and its frequency.

WORDLE always has 5 letters, so first, we want to filter the data to only words with a length of 5.

  • Each Deephaven table has a Table Options menu on the right of its header (three lines). Open it and select Manage Custom Columns.
  • Make a new column called length with the formula: word.length(). This is a query formula in disguise. For each row, this will ask "how long is this word?" and return the number of characters in that word. The answer is returned in the new column.
  • On the length column, click the Quick Filters option and enter 5. Now only five-letter words are shown.

Next, we split the words into letters so we can search on a letter's location.

  • Again, go to Manage Custom Columns. This time we add 5 columns, one for each letter. The column names and formulas are:
Name    Formula
_0      word.charAt(0)
_1      word.charAt(1)
_2      word.charAt(2)
_3      word.charAt(3)
_4      word.charAt(4)

All of this is before we get to the WORDLE. We are left with a table of all the possible 5 letter words in this dictionary, with each letter split out.
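
For readers who think in code, the same preparation can be sketched in plain Python, outside Deephaven. This is only an illustration: the rows are made-up sample data (not values from the Kaggle file), and prepare is a hypothetical helper name.

```python
# Illustrative rows in the same shape as unigram_freq: (word, count).
# The counts are invented for this sketch.
rows = [("the", 900), ("about", 500), ("other", 450), ("which", 400), ("people", 300)]

def prepare(rows):
    """Keep five-letter words and split each into position columns _0.._4."""
    table = []
    for word, count in rows:
        if len(word) != 5:
            continue  # same effect as filtering the length column to 5
        record = {"word": word, "count": count, "length": len(word)}
        for i, letter in enumerate(word):
            record[f"_{i}"] = letter  # mirrors word.charAt(i)
        table.append(record)
    return table

five_letter = prepare(rows)
print([r["word"] for r in five_letter])
```

Each record ends up with the same columns as the table built in the UI: word, count, length, and _0 through _4.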

Defeat WORDLE

Go to WORDLE. If you already did it for the day, try a private or incognito tab.

Enter a guess for the word into the WORDLE, which could be anything that is in your table, since right now you have no clue what it could be!

As players know, you'll get some color-coded results:

  • grey: that letter is not in the word
  • yellow: that letter is in the word, but in the wrong location
  • green: that letter is in the right location
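
To make the colors concrete, here is a rough Python sketch of how a guess could be scored against an answer. It is not taken from the game itself. The score function is a hypothetical name, and the handling of repeated letters (a letter is marked yellow at most as many times as it still appears unmatched in the answer) is an assumption.

```python
from collections import Counter

def score(guess, answer):
    """Return one color per letter of guess: 'green', 'yellow', or 'grey'."""
    colors = ["grey"] * len(guess)
    # Answer letters that were not matched exactly remain available for yellows.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            colors[i] = "green"
    for i, g in enumerate(guess):
        if colors[i] != "green" and remaining[g] > 0:
            colors[i] = "yellow"
            remaining[g] -= 1  # assumption: each answer letter backs one yellow
    return colors

print(score("crane", "react"))
```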

You can now apply that information to your table:

  • Grey letters:
    • Exclude from your table. In the word column, filter out specific letters with !~t (for the letter T) in your Quick Filter field.
  • Yellow letters:
    • Include each yellow letter in the table. In the word column, filter to include specific letters with ~a (for the letter A).
    • Exclude them from position columns. For each position where a yellow letter appeared in the WORDLE, filter out that letter in the appropriate _# column. So, to filter A from a given column, use !a.
  • Green letters:
    • Type any green letter in the corresponding column. If you know R is the fifth letter, filter the column in that position to R.

As you enter grey, yellow, and green letters, watch your table get smaller and smaller!
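
The three rules above can also be sketched as a small plain-Python filter. This is an illustration rather than Deephaven code: narrow is a hypothetical helper, the candidate list is made up, and the sketch assumes a grey letter never also shows up as yellow or green in the same game.

```python
def narrow(words, grey="", yellow=None, green=None):
    """Filter candidate words using WORDLE feedback.

    grey   -- letters that are not in the word
    yellow -- {position: letters} seen in yellow at that position
    green  -- {position: letter} confirmed at that position
    """
    yellow = yellow or {}
    green = green or {}
    kept = []
    for word in words:
        # Grey: the word must not contain the letter at all.
        if any(letter in word for letter in grey):
            continue
        # Yellow: the letter must appear, but not where it was guessed.
        if any(l not in word or word[pos] == l
               for pos, letters in yellow.items() for l in letters):
            continue
        # Green: the letter must sit exactly at that position.
        if any(word[pos] != letter for pos, letter in green.items()):
            continue
        kept.append(word)
    return kept

candidates = ["crane", "trace", "brick", "grime", "price", "drive", "irked"]
print(narrow(candidates, grey="bt", yellow={0: "i"}, green={1: "r"}))
```

Positions are zero-based, matching the _0 through _4 column names.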

Eventually, you are left with the right answer! We never needed more than 4 tries with this method. What is your best attempt? We'd love to hear about it on Gitter!

Code solution

If you prefer to type in some code rather than engage with the UI, this is the section for you.

In the console of the IDE, import the CSV file. Make sure the file is in your /data/ directory. For more information, see our Docker data volumes guide.

from deephaven import read_csv

words = read_csv('data/unigram_freq.csv').sortDescending("count")

Next, limit the words to only those that are 5 letters.

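import jpy

# Look up Deephaven's Java RegexFilter class so it can be used from Python
dh_regex_filter = jpy.get_type("io.deephaven.engine.table.impl.select.RegexFilter")
# "....." matches any word of exactly five characters
possible = words.where(dh_regex_filter("word", "....."))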

In the daily WORDLE, enter any word that is in your Deephaven table. As you get green, yellow, and grey values, replace the three lines in the query with those values.

  • Replace the ..... with green letters in the right location. A . means any letter in that spot.
  • Add yellow letters to replace the _ in the (?=.*_) for the second filter, then uncomment that line.
  • Add grey letters to replace the _ in [^_], then uncomment that line.
possible = possible.where(
    dh_regex_filter("word", "....."),
    # dh_regex_filter("word", "^(?=.*_).*"),
    # dh_regex_filter("word", "^[^_]*$"),
)

For example, if I get a yellow I and R, a grey B, a grey T, and a green R in the second spot of the word, I would filter the table as shown:

possible = possible.where(
    dh_regex_filter("word", ".r..."),
    dh_regex_filter("word", "^(?=.*r)(?=.*i).*"),
    dh_regex_filter("word", "^[^bt]*$"),
)

The table will update to show you only the words that are available as possible solutions.
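
If you would rather not edit the patterns by hand, they can be assembled from the letters. The sketch below is a hypothetical helper (build_patterns is not part of Deephaven) and checks the result with Python's re module as a stand-in for the Java regex engine behind RegexFilter. It assumes the filter has to match the whole word, and that the letters passed in are plain a-z characters.

```python
import re

def build_patterns(green=".....", yellow="", grey=""):
    """Return the regex strings for the green, yellow, and grey filters."""
    patterns = [green]
    if yellow:
        # One lookahead per yellow letter: the letter must occur somewhere.
        patterns.append("^" + "".join(f"(?=.*{c})" for c in yellow) + ".*")
    if grey:
        # A negated character class: none of the grey letters may occur.
        patterns.append(f"^[^{grey}]*$")
    return patterns

patterns = build_patterns(green=".r...", yellow="ri", grey="bt")
print(patterns)

# Try the patterns on a few made-up sample words.
sample = ["price", "brick", "trial", "crane", "grime"]
matches = [w for w in sample if all(re.fullmatch(p, w) for p in patterns)]
print(matches)
```

The strings in patterns are the same three used in the example query, so each one can be passed to dh_regex_filter("word", ...) in turn.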
