Nothing is difficult, only unfamiliar 💡
Application and real creation with LLMs are the next hill to climb
I’ve been working in and around software for the majority of my professional life. At the beginning it was intimidating.
It still can be.
But it should be obvious to you and me that barriers to entry have dramatically decreased.
At GitHub in 2019, our Data Science team built a few models to predict when there would be “100 Million Developers” on the platform. The target was 2025. It was crossed in 2023[1].
The new goal is 1B or approximately 12.5% of the Earth’s population.
Why this matters
We all benefit as more problems are brought into the domain of software[2].
And computational literacy is required.
As abstractions have improved[3], building conceptual familiarity keeps getting easier, and at an accelerating rate.
I’ve always learned best by doing. Through trial and error. Ideally when working against an objective with real stakes.
But knowing what questions to ask is rooted in fundamentals and exposure.
If you’ve never built a working app, trained a discriminative model on a dataset or worked with a database — the breadth of topics and subtopics you first need to understand is daunting. It’s the barrier that impedes your ability to start.
And starting is often the largest roadblock.
When I was a more hands-on keyboard practitioner in Data Science a decade ago, I’d often get asked by junior team members how best to get started with modeling.
R was more popular than Python at the time, despite its idiosyncratic nature. I’d recommend two books[4]:
R for Data Science (R4DS) by Wickham and,
Introduction to Statistical Learning by James, Witten, Hastie and Tibshirani
Their pedagogical approach follows constructivism and is agile.
In the first chapter of R4DS after its introduction you’re immediately plotting the body mass vs. flipper length of penguins.
You might not understand all of the libraries, functions and datasets you’re using, but watching a plot appear in your console window almost instantaneously is immensely rewarding.
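The book does this in R with ggplot2. As a rough Python analogue of that first exercise, assuming the palmerpenguins and matplotlib packages (my sketch, not the book’s code), it looks something like this:

```python
# A rough Python analogue of R4DS's first plot (the book itself uses R and ggplot2).
# Assumes the palmerpenguins and matplotlib packages are installed.
from palmerpenguins import load_penguins
import matplotlib.pyplot as plt

penguins = load_penguins()  # pandas DataFrame, one row per penguin

# Scatter plot of flipper length vs. body mass, one color per species
for species, group in penguins.dropna().groupby("species"):
    plt.scatter(group["flipper_length_mm"], group["body_mass_g"],
                label=species, alpha=0.7)

plt.xlabel("Flipper length (mm)")
plt.ylabel("Body mass (g)")
plt.legend(title="Species")
plt.show()
```

A dozen lines, and the species cluster on screen within seconds: the same payoff the book is after.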
The specter, haunting
Since GenAI’s public epoch 0[5], there has been so much bombast in the news and on social channels about what a “post-GPT world” means for capital-J Jobs.
The multi-trillion dollar question as myopically framed: Is it a tool and complement (dare we say co-pilot) or the catalyst that will lead to widespread structural unemployment?
And of course, when will all this CAPEX investment in chips, data centers and associated infrastructure pay off[6]?
An academic paper published in June 2024 by Giordana et al. poses a handful of research questions that attempt to quantitatively estimate which skills and tasks may be most impacted. The authors harvest and analyze tweets, and their sentiment, from real users leveraging LLMs. The degree of substitution for these skills or their bundles — jobs — is not estimated. They found that 185 of 408 applicable ESCO skills would face some impact.
Goldman’s report has contrasting views. On one hand, a bearish take from Daron Acemoglu: only 5% of tasks would be impacted over the next decade, leading to a paltry 0.5% increase in US labor productivity. On the other, a bullish take from GS senior economist Joseph Briggs: up to 25% of tasks could be automated, leading to a 9% increase in US labor productivity over the same period.
What you and I do know: prediction is very difficult, especially if it's about the future. And especially if the best fit function turns out to be exponential.
I’m personally irked by prognosticators that:
Haven’t been practitioners in a meaningful sense.
Extrapolate linearly and point generalize based on historical trends — especially as it relates to past waves of digitization and workflow automation (e.g. process mining and robotic process automation).
Can’t imagine shifts in existing paradigms that lead to consequential outcomes on “real productivity” with new and different incentives.
Said simply: the way knowledge work happens will dramatically change. And will continue to do so at increasing rates.
Nature’s job: to shift things elsewhere, to transform them, to pick them up and move them here and there. Constant alteration. But not to worry: there’s nothing new here. Everything is familiar. Even the proportions are unchanged. —VIII. 6
Building rapid familiarity
The impending Cambrian explosion of non-farm labor productivity — what the BLS computes — has much to do with our willingness, uptake and adoption of the capabilities already present and available to us for $20/month or less[7].
And how our own domain expertise and knowledge is embedded, remixed and used to augment (and eventually replace) workflows that keep entering the domain of software.
Because what really matters are workflows[8] — not independent tasks: dealing with unstructured inputs, data and risks, and making judgments about them.
But let’s snap back to the practical for a moment.
Even for those in the mix, it’s become increasingly challenging to keep pace with how foundational models are being updated, let alone the scaffolding around operationalizing them: extraction, retrieval, evaluation, multistep reasoning, etc.
On August 1st, Nicholas Carlini of DeepMind published what quickly became a viral and quite refreshing post titled “How I use AI”.
With clarity and depth, it walks through 50 use cases with full prompt histories:
Building complete applications
As a tutor
Getting started with new programming languages
Simplifying code
Automating tasks (he asserts this is where the utility is higher for non-experts vs. experts)
As an API reference
Solving one-offs and already solved problems
Etc
The kicker? By his estimate, these use cases represent only 2% of what he’s already leveraged LLMs for, and he subjectively puts his productivity increase at 50%.
The takeaway: it’s up to our collective imagination and willingness to start to gain the benefits.
Create, don’t tell
Inspired to go deeper than I have previously, I ventured into using Claude 3.5 Sonnet and its amazing artifacts functionality to scaffold and build a “non-trivial” application last night.
Not because I think the world needs it, but to understand the current edges.
Early in the week, using Gemini 1.5, I built a pipeline to download 10-Ks from an input list of stock tickers, then summarize and extract parts of them with calls to ChatGPT’s API. Fun. But I know Python.
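For a sense of that pipeline’s shape, here is a minimal sketch, not the actual code. It assumes the openai Python package with an OPENAI_API_KEY in the environment, uses gpt-4o-mini as a stand-in model name, and leaves the filing download as a hypothetical placeholder rather than inventing an EDGAR client:

```python
# Minimal sketch of the 10-K pipeline's shape -- not the actual code.
# Assumptions: openai package installed, OPENAI_API_KEY set, gpt-4o-mini as the model.
from openai import OpenAI

client = OpenAI()

def download_10k(ticker: str) -> str:
    # Hypothetical placeholder: fetch the latest 10-K text for a ticker (e.g. from SEC EDGAR)
    return f"(full 10-K text for {ticker} would go here)"

def summarize_10k(filing_text: str) -> str:
    # One call to the chat completions API asking for a summary of the filing
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You summarize SEC 10-K filings concisely."},
            {"role": "user", "content": f"Summarize the key risk factors:\n\n{filing_text[:50000]}"},
        ],
    )
    return response.choices[0].message.content

for ticker in ["AAPL", "MSFT"]:  # input list of stock tickers
    print(ticker, "->", summarize_10k(download_10k(ticker)))
```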
Something different
At GitHub, our team created a beloved internal IDE named “data dot” for querying our warehouse.
With a spec for its UX burned into my brain, how fast could I get a prima facie “working mockup” of the functionality built in a language I’ve never really used?
35 minutes.
👉 You can play around with the Claude artifact here 👈
Why was I impressed?
First, it’s built completely via ReactJS.
Second, I was able to have Claude implement a very naive “SQL lexer and parser”[9] in pure JavaScript that supports basic DQL, despite Claude’s initial unwillingness to provide anything more than mocked code for “connecting to an actual database engine”.
~350 lines of generated code, described declaratively over a couple of iterations, let me reminisce — if only briefly — about halcyon days.
And serve, in a counterfactual world, as a much higher fidelity PRD than words or user stories ever could.
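To give a feel for what that naive lexer and parser does (footnote 9 describes the artifact’s JavaScript version), here is my own rough Python illustration of the same idea, not the generated code:

```python
# A rough Python illustration of the naive "lexer + parser over in-memory tables" idea;
# the artifact's version is ~350 lines of Claude-generated JavaScript, not this.
import re

# "Database": tables held as plain in-memory objects (the artifact has customers and orders)
TABLES = {
    "customers": [
        {"id": 1, "name": "Ada"},
        {"id": 2, "name": "Grace"},
    ],
}

def lex(query: str) -> list[str]:
    # Split the query into quoted strings, identifiers/numbers, and punctuation tokens
    return re.findall(r"'[^']*'|\w+|[*,=]", query)

def run(query: str) -> list[dict]:
    tokens = lex(query)
    upper = [t.upper() for t in tokens]
    assert upper[0] == "SELECT", "only SELECT is supported"

    from_idx = upper.index("FROM")
    cols = [t for t in tokens[1:from_idx] if t != ","]
    rows = TABLES[tokens[from_idx + 1]]

    # Optional single-predicate WHERE clause: WHERE <col> = <value>
    if "WHERE" in upper:
        w = upper.index("WHERE")
        col, val = tokens[w + 1], tokens[w + 3].strip("'")
        rows = [r for r in rows if str(r[col]) == val]

    if cols == ["*"]:
        return rows
    return [{c: r[c] for c in cols} for r in rows]

print(run("SELECT name FROM customers WHERE id = 2"))  # [{'name': 'Grace'}]
```

Tokenize, find the FROM and WHERE clauses, filter an in-memory list of records: no query planner, no engine, but enough to make the mockup feel alive.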
The only limits?
Claude’s current context window length and message rate limits per hour. Ironically, when he bonks, just telling him to continue where he left off is the fix!
Further proof that nothing is difficult, only unfamiliar.
Your imagination[10] is calling — go get started!
[2] The premise and conclusion of Andreessen’s seminal “Software is eating the world”, written 13 years ago.
[3] Example: binary → assembly → high-level languages → open source → GenAI (declarative and inquiry-based programming vs. imperative).
[4] Its original subtitle was: “With Applications in R”. For the Pythonistas, a complementary text was produced a few years ago.
[5] Let’s call this November 30th, 2022 — ChatGPT’s public debut.
[7] For foundational models like ChatGPT, Claude and Gemini, $20 seems to be the penetration price they’ve all anchored to — for now. You can get Llama 3.1 8B up and running locally on commodity hardware in a few minutes with Ollama.
[8] Platforms that make agentic systems easy to compose, modify and evaluate are still very early. IMHO, these will have the most profound effect on workflows. Examples: anakin.ai, llama agents.
[9] It really isn’t implementing SQL at all — there is no query planning or execution involved. And the data tables (“customers” and “orders”) are stored as a complex JS object in memory. But it does work!
[10] More Claude artifact inspiration has been catalogued here: https://github.com/NVO-2021/Awesome-Claude-Artifacts