This Week I Learned
Learning is a fundamental part of our daily lives as software engineers. I've got so used to it that I often don't even notice when I learn something new. That's why I've created the "This Week I Learned" journal. Check it out below - or better yet, start your own!
Week of March 17, 2025: Playwright MCP
The Playwright team has just released an MCP server that lets LLMs navigate the web: https://github.com/microsoft/playwright-mcp Under the hood, it exposes the web page's accessibility tree, which LLMs can easily consume.
Week of March 10, 2025: JSON_DOCUMENT_MAX_DEPTH in MySQL
This week I’ve proudly exceeded MySQL’s JSON_DOCUMENT_MAX_DEPTH limit of 100 (code) 🌀
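To catch this before the database does, the nesting depth can be measured on the application side. Here is a minimal Python sketch; `json_depth` and `nested` are hypothetical helpers, and the counting convention (scalars and empty containers have depth 1, each level of nesting adds 1) is an assumption that may not match MySQL's internal check exactly.

```python
import json

MAX_DEPTH = 100  # the JSON_DOCUMENT_MAX_DEPTH limit mentioned above


def json_depth(value) -> int:
    """Return the nesting depth of a decoded JSON value.

    Assumed convention: scalars and empty containers have depth 1,
    and each level of nesting adds 1.
    """
    deepest = 0
    stack = [(value, 1)]  # iterative walk, so deep documents don't hit the recursion limit
    while stack:
        node, depth = stack.pop()
        deepest = max(deepest, depth)
        if isinstance(node, dict):
            stack.extend((child, depth + 1) for child in node.values())
        elif isinstance(node, list):
            stack.extend((child, depth + 1) for child in node)
    return deepest


def nested(levels: int):
    """Hypothetical helper: wrap a leaf value in `levels` nested objects."""
    doc = "leaf"
    for _ in range(levels):
        doc = {"child": doc}
    return doc


small = json.loads('{"a": [1, {"b": []}]}')
too_deep = nested(MAX_DEPTH)
print(json_depth(small), json_depth(too_deep) > MAX_DEPTH)
```

A check like `json_depth(doc) > MAX_DEPTH` could run before an INSERT to fail early with a clearer error.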
Week of March 3, 2025: Vitest Browser Mode
Week of February 24, 2025: ulid
Universally Unique Lexicographically Sortable Identifier: https://github.com/ulid/spec
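As I understand the spec, a ULID is 128 bits: a 48-bit millisecond timestamp followed by 80 bits of randomness, written as 26 characters of Crockford's Base32. Below is a minimal Python sketch of that encoding; `encode_ulid` and `new_ulid` are hypothetical names, and the sketch skips the spec's monotonicity rules for ULIDs generated within the same millisecond.

```python
import os
import time

# Crockford's Base32 alphabet (no I, L, O, U), in ascending ASCII order.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_ulid(timestamp_ms: int, randomness: bytes) -> str:
    """Encode a 48-bit timestamp and 80 random bits as a 26-character string."""
    if not 0 <= timestamp_ms < 2**48:
        raise ValueError("timestamp must fit in 48 bits")
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes (80 bits)")
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):  # 26 characters x 5 bits covers the 128-bit value
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def new_ulid() -> str:
    """Generate a ULID from the current time and OS-provided randomness."""
    return encode_ulid(time.time_ns() // 1_000_000, os.urandom(10))


print(new_ulid())
```

Because the timestamp occupies the most significant bits and the alphabet is in ascending order, sorting the strings also sorts them by creation time (to millisecond precision).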
Week of February 17, 2025: Screenplay Pattern
Writing tests at scale requires more thoughtful code organization to keep the codebase maintainable. The most popular pattern in testing is the Page Object Model, or POM. The Screenplay Pattern is an alternative inspired by Domain-Driven Design and popularized by Serenity/JS.
The Screenplay Pattern in one line: Actors use Abilities to perform Interactions.
For example, an Actor may be given an Ability to browse the Web using a specific browser like Chrome. The Ability would hold a reference to a Chrome WebDriver instance. Then, the Actor could call a Task to load a login page, a second Task to enter username and password, and a final Task to click the “login” button. Each task would access the WebDriver instance through the calling Actor’s Ability to control the Chrome browser.
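The login flow above can be sketched in a few lines of Python. This is not the Serenity/JS API: every name here (`FakeBrowser`, `BrowseTheWeb`, `who_can`, `attempts_to`, and the locators) is hypothetical, and `FakeBrowser` only records commands in place of a real WebDriver so the sketch runs anywhere.

```python
from dataclasses import dataclass, field


class FakeBrowser:
    """Hypothetical stand-in for a WebDriver instance; it only records commands."""

    def __init__(self):
        self.log = []

    def open(self, url):
        self.log.append(f"open {url}")

    def type_into(self, locator, text):
        self.log.append(f"type {locator}={text}")

    def click(self, locator):
        self.log.append(f"click {locator}")


@dataclass
class BrowseTheWeb:
    """Ability: holds the reference to the browser driver."""
    browser: FakeBrowser


@dataclass
class Actor:
    name: str
    abilities: dict = field(default_factory=dict)

    def who_can(self, ability):
        self.abilities[type(ability)] = ability
        return self

    def ability_to(self, ability_type):
        return self.abilities[ability_type]

    def attempts_to(self, *tasks):
        for task in tasks:
            task.perform_as(self)


@dataclass
class OpenLoginPage:
    url: str

    def perform_as(self, actor):
        actor.ability_to(BrowseTheWeb).browser.open(self.url)


@dataclass
class EnterCredentials:
    username: str
    password: str

    def perform_as(self, actor):
        browser = actor.ability_to(BrowseTheWeb).browser
        browser.type_into("#username", self.username)
        browser.type_into("#password", self.password)


class ClickLogin:
    def perform_as(self, actor):
        actor.ability_to(BrowseTheWeb).browser.click("#login")


browser = FakeBrowser()
alice = Actor("Alice").who_can(BrowseTheWeb(browser))
alice.attempts_to(
    OpenLoginPage("https://example.com/login"),
    EnterCredentials("alice", "s3cret"),
    ClickLogin(),
)
print(browser.log)
```

Note that the Tasks never hold a browser themselves; they reach it through the Actor's Ability, so the same Tasks can be performed by a different Actor with a different browser.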
Week of February 10, 2025: spurious correlations
Week of February 3, 2025: stringsAsFactors
Statisticians use the term “factors” to describe categorical variables, or enums. They are so essential that, before version 4.0.0, R coerced character strings to factors by default when building data frames.
Why do we need factor variables to begin with? Because of modeling functions like ‘lm()’ and ‘glm()’. Modeling functions need to expand categorical variables into individual dummy variables, so that a categorical variable with 5 levels will be expanded into 4 different columns in your modeling matrix. There’s no way for R to know it should do this unless it has some extra information in the form of the factor class. From this point of view, setting ‘stringsAsFactors = TRUE’ when reading in tabular data makes total sense. If the data is just going to go into a regression model, then R is doing the right thing.
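To make the "5 levels become 4 columns" arithmetic concrete, here is a minimal Python sketch of dummy (treatment) coding. It is an illustration, not R's implementation: `dummy_code` is a hypothetical helper, and it assumes the alphabetically first level is the baseline that gets no column of its own.

```python
def dummy_code(values, levels=None):
    """Expand a categorical column with k levels into k - 1 indicator columns.

    Assumption: the first level (alphabetically, unless `levels` is given)
    is the baseline and is represented by all zeros.
    """
    if levels is None:
        levels = sorted(set(values))
    baseline, *others = levels
    return {
        level: [1 if value == level else 0 for value in values]
        for level in others
    }


colors = ["red", "green", "blue", "white", "black", "red"]
columns = dummy_code(colors)

for level, indicator in columns.items():
    print(level, indicator)
```

With the five colors above, "black" becomes the baseline, so the result has four columns, and a row holding "black" is all zeros.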
Week of January 27, 2025: ANTLR
https://www.antlr.org/ (plus a ton of community grammars)
Week of January 20, 2025: Common Table Expressions (CTEs)
Complex SQL queries can be broken down into smaller parts using Common Table Expressions (CTEs):
WITH FilteredOrders AS (
    SELECT order_id, customer_id, total_amount
    FROM orders
    WHERE total_amount > 100
),
TopCustomers AS (
    SELECT customer_id, COUNT(*) AS order_count
    FROM FilteredOrders
    GROUP BY customer_id
    HAVING COUNT(*) > 3
)
SELECT customer_id, order_count
FROM TopCustomers;
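Common Table Expressions (CTEs) have been part of the SQL standard since SQL:1999. Beware that a database may materialize a CTE into a temporary table instead of merging it into the outer query, which can introduce performance issues; in MySQL, the optimizer chooses between the two strategies.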
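The query can be tried without a database server using Python's built-in sqlite3 module and an in-memory database. The table definition and sample rows below are made up for illustration, and SQLite's planner behaves differently from MySQL's, so this only demonstrates the query's result, not its performance.

```python
import sqlite3

QUERY = """
WITH FilteredOrders AS (
    SELECT order_id, customer_id, total_amount
    FROM orders
    WHERE total_amount > 100
),
TopCustomers AS (
    SELECT customer_id, COUNT(*) AS order_count
    FROM FilteredOrders
    GROUP BY customer_id
    HAVING COUNT(*) > 3
)
SELECT customer_id, order_count
FROM TopCustomers
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, total_amount REAL)"
)

# Made-up sample data as (customer_id, total_amount) pairs.
sample_rows = (
    [(1, 150.0)] * 4    # customer 1: four large orders, so it qualifies
    + [(2, 500.0)] * 3  # customer 2: only three large orders
    + [(3, 50.0)] * 5   # customer 3: many orders, all too small
    + [(1, 20.0)]       # a small order, filtered out by the first CTE
)
conn.executemany(
    "INSERT INTO orders (customer_id, total_amount) VALUES (?, ?)", sample_rows
)

result = conn.execute(QUERY).fetchall()
print(result)
conn.close()
```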
Week of January 13, 2025: Big Data is Dead
In 2004, when the Google MapReduce paper was written, it would have been very common for a data workload to not fit on a single commodity machine. […] Today, however, a standard instance on AWS uses a physical server with 64 cores and 256 GB of RAM. That’s two orders of magnitude more RAM. […]
One definition of “Big Data” is “whatever doesn’t fit on a single machine.” By that definition, the number of workloads that qualify has been decreasing every year.
On a separate note, it’s a lot of fun to debug memory leaks in 256 GB RAM machines.
Week of January 6, 2025: Nemawashi
To ensure you have everyone’s support, it’s helpful to spend time on a consensus-building practice called nemawashi, a process of seeking approval from each significant person on a proposed project before committing to a group decision.