Living in the inflection point
AI, career development, software engineering

I don’t know why this week became the tipping point, but nearly every software engineer I’ve talked to is experiencing some degree of mental health crisis… Many people assumed I meant job-loss anxiety, but that’s just one presentation. I’m seeing near-manic episodes triggered by watching software shift from scarce to abundant. Compulsive behaviors around agent usage. Dissociative awe at the temporal compression of change. It’s not fear necessarily — just the cognitive overload from living in an inflection point.
What Wall Street Gets Wrong About SaaS – Roger Wong
AI, LLMs, software engineering

Last week, B2B software companies tumbled in the stock market, dropping over 10%. Software stocks have been trending down since September 2025, now down 30% according to the IGV software index. The prevailing sentiment is that AI tools like Anthropic’s Claude are now capable of doing things companies used to pay thousands of dollars for.
But investors are wrong. Nvidia CEO Jensen Huang calls this line of thinking “illogical.” Mark Murphy, an analyst at JPMorgan, uses the same word to describe the “expectation that every company will hereby write and maintain a bespoke product to replace every layer of mission-critical enterprise software they have ever deployed.”
Does AI already have human-level intelligence? The evidence is clear

In 1950, in a paper entitled ‘Computing Machinery and Intelligence’1, Alan Turing proposed his ‘imitation game’. Now known as the Turing test, it addressed a question that seemed purely hypothetical: could machines display the kind of flexible, general cognitive competence that is characteristic of human thought, such that they could pass themselves off as humans to unaware humans?
Three-quarters of a century later, the answer looks like ‘yes’. In March 2025, the large language model (LLM) GPT-4.5, developed by OpenAI in San Francisco, California, was judged by humans in a Turing test to be human 73% of the time — more often than actual humans were2. Moreover, readers even preferred literary texts generated by LLMs over those written by human experts3.
Hamish Songsmith – Blog & Links
AI, LLMs, software engineering

A growing chasm separates those building around AI from those still debating it—and it has nothing to do with model size or vendor choice.
On one side: people who aren’t fixated on measuring or justifying it first. They have experimented enough and understand that, applied well, AI generally means more productivity. They’re already building tools like Ralph loops, OpenClaw, and AI factories such as GSD and Gas Town.
If those names sound like “random internet projects,” that’s exactly the problem: capability is moving outside your organisation faster than you recognise.
The Coherence Premium

In 1937, the British economist Ronald Coase asked a question that seems almost embarrassingly simple: why do firms exist at all? If markets are so efficient at allocating resources, why don’t we just have billions of individuals contracting with each other for every task? Why do we need these hulking organizational structures called companies?
His answer, which eventually won him a Nobel Prize, was transaction costs. It’s expensive to negotiate contracts and coordinate with strangers, to monitor performance and enforce agreements. Firms exist because sometimes it’s cheaper to bring activities inside an organization than to contract for them on the open market. The boundary of the firm, Coase argued, sits wherever the cost of internal coordination equals the cost of external transaction.
We’re in a Coasean inversion. The economics that made large firms necessary are reversing. But most people are looking at this transformation through the wrong lens. They see AI as a productivity tool, a way to do more faster. They measure success in hours saved or output multiplied, and this misses the point entirely.
Hierarchical memory management in agent harnesses | LinkedIn
AI, coding agent, LLMs, software engineering

We’ve seen incredible momentum toward files as the memory layer for agents, and this has accelerated significantly over the last year. But why use the file system, and why use Unix commands? What are the advantages these tools provide over alternatives like semantic search, databases, and simply very long context windows?
What a file system provides for an agent, along with tools to search and access it, is the ability to make a fixed context feel effectively infinite in size.
Bash commands are powerful for agents because they provide composable tools that can be piped together to accomplish surprisingly complex tasks. They also remove the need for tool definition JSON, since bash commands are already known to the LLM.
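To make that concrete, here’s a rough sketch (mine, not from the post) of the kind of single generic shell tool an agent harness might expose; the pipeline and paths are hypothetical, but it shows how one composable command keeps only a small slice of a large workspace in the model’s context:

```python
import subprocess

def run_shell(command: str, max_bytes: int = 4_000) -> str:
    """Run a shell pipeline in the agent's workspace and return truncated output."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=30
    )
    output = result.stdout or result.stderr
    # Truncate so a huge file system never overflows the fixed context window.
    return output[:max_bytes]

# Hypothetical usage: the agent narrows a large codebase to a handful of lines
# instead of loading whole files into its context.
print(run_shell("grep -rn 'def handle_request' src/ | head -20"))
```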
Is Learning CSS a Waste of Time in 2026? – DEV Community

With modern frameworks, component libraries, and utility-first CSS, it’s a fair question.
Most frontend developers today rarely write “real” CSS. Layouts come prebuilt. Responsiveness is handled for us. Accessibility is supposed to be baked in. If something needs styling, we tweak a variable, add a utility class, or override a component token.
AI open models have benefits. So why aren’t they more widely used? | MIT Sloan

A new paper co-authored by Frank Nagle, a research scientist at the MIT Initiative on the Digital Economy, found that users largely opt for closed, proprietary AI inference models, namely those from OpenAI, Anthropic, and Google. Those models account for nearly 80% of all AI tokens that are processed on OpenRouter, the leading AI inference platform. In comparison, less-expensive open models from the likes of Meta, DeepSeek, and Mistral account for only 20% of AI tokens processed. (A token is a unit of input or output to an AI model, roughly equivalent to one word in a prompt to an AI chatbot.)
Open models achieve about 90% of the performance of closed models when they are released, but they can quickly close that gap — and the price of running inference is 87% less on open models. Nagle and co-author Daniel Yue at the Georgia Institute of Technology found that optimal reallocation of demand from closed to open models could cut average overall spending by more than 70%, saving the global AI economy about $25 billion annually.
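A quick back-of-the-envelope check of that claim (the 87% price gap is from the article; the share of traffic reallocated is my hypothetical figure):

```python
# Illustrative arithmetic only, not the paper's actual methodology.
closed_price = 1.00                     # normalized cost per token on a closed model
open_price = closed_price * (1 - 0.87)  # open inference is roughly 87% cheaper
share_moved_to_open = 0.80              # hypothetical fraction of tokens reallocated

blended = (1 - share_moved_to_open) * closed_price + share_moved_to_open * open_price
print(f"blended cost: {blended:.2f}, savings: {1 - blended:.0%}")  # about 70% cheaper
```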
How Product Discovery changes with AI – by David Hoang

In Jenny Wen’s talk at Hatch Conference in 2025, “Don’t Trust the Process,” she raises an important point: the processes we’ve established are rapidly becoming lagging indicators. Process is important, but it should work for you, not the other way around.
People worshipped the process artifacts, not the final result. We’re at a point where a process becomes irrelevant the moment you document it. I don’t believe it’ll be like this forever, but until software is completely rewritten with AI as a core capability, it’s going to be like this for a while.
So, where does Product Discovery change? Let’s revisit those four risks.
One Human + One Agent = One Browser From Scratch
AI, coding agent, LLMs, software engineering

One Human + One Agent = One Browser From Scratch (via) embedding-shapes was so infuriated by the hype around Cursor’s FastRender browser project – thousands of parallel agents producing ~1.6 million lines of Rust – that they were inspired to have a go at building a web browser using coding agents themselves.
The result is one-agent-one-browser and it’s really impressive. Over three days they drove a single Codex CLI agent to build 20,000 lines of Rust that successfully renders HTML+CSS with no Rust crate dependencies at all – though it does (reasonably) use Windows, macOS and Linux system frameworks for image and text rendering.
The Five Levels: from Spicy Autocomplete to the Dark Factory – Dan Shapiro’s Blog
AI, LLMs, software engineering

If you are just using ChatGPT to write your regex, you aren’t really getting the benefits of deflation. You’re just typing faster.
I’ve now seen dozens of companies struggling to put AI to work writing code, and each one has moved through five clear tiers of automation. That felt familiar, and I realized that the federal government had been there first – but for cars.
This is a Time of Technical Deflation – Dan Shapiro’s Blog
AI, economics, realms, software engineering

Now economists have this thing they call deflation. For an economy, it’s a nightmare. Prices drop day after day, creating a psychological trap where consumers stop spending. Why buy a washing machine today when it will be cheaper tomorrow?1 The whole economy grinds to a halt.
But what is a ‘trap’ for a nation is a miracle for a codebase. Usually, deflation is bad for debtors because money becomes harder to come by. But technical debt is different: you don’t owe money, you owe work. And the cost of work is what’s deflating. The cost to pay off your debt – the literal dollars and hours required to fix the mess – is diminishing. It is cheaper to clean up your code today than it has ever been. And if you put it off? It becomes cheaper still. This leads to a striking reversal: technical debt2 becomes a wise investment3.
Why Designers Can No Longer Trust the Design Process
AI, Design, LLMs, Product Design

For years, designers have been told to trust the process.
Research first. Personas. Journey maps. Problem statements. Then solutions. In this talk from Hatch Conference, Jenny Wen, Design Lead at Anthropic and former Director of Design at Figma, explains why that model no longer fits the reality of modern design work.
With AI accelerating prototyping, smaller teams doing more, and craft becoming a key differentiator, rigid processes are failing designers. Jenny shares real examples from Figma and Anthropic that show how great work actually gets made today: starting from solutions, caring deeply about details, building intuition, skipping steps, and designing for delight.
This is not a rejection of research or strategy.
It is a call to stop worshipping process artifacts and start trusting designer judgment again.
the browser is the sandbox | AI Focus

This got me thinking about the browser. Over the last 30 years, we have built a sandbox specifically designed to run incredibly hostile, untrusted code from anywhere on the web, the instant a user taps a URL. I think it’s incredible that we have this way to run code that you’ve no clue what it will do, the moment you see a little blue link or a piece of text that looks like https://paul.kinlan.me/ – I mean, who would trust that guy?
Could you build something like Cowork in the browser? Maybe. To find out, I built a demo called Co-do that tests this hypothesis. In this post I want to discuss the research I’ve done to see how far we can get, and determine if the browser’s ability to run untrusted code is useful (and good enough) for enabling software to do more for us directly on our computer.
How To Use AI for the Ancient Art of Close Reading – fast.ai

Close reading is a technique for careful analysis of a piece of writing, paying close attention to the exact language, structure, and content of the text. As Eric Ries described it, “close reading is one of our civilization’s oldest and most powerful technologies for trying to communicate the gestalt of a thing, the overall holistic understanding of it more than just what can be communicated in language because language is so limited.” It was (and in some cases still is) practiced by many ancient cultures and major religions.
It might come as a surprise that a technique associated with such a long history could now see a revival with the use of Large Language Models (LLMs). With an LLM, you can pause after a paragraph to ask clarifying questions, such as ‘What does this term mean?’ or ‘How does this connect to what came before?’
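Here’s a minimal sketch of that workflow in Python (my illustration, not code from the article); ask_llm is a placeholder for whichever chat-model client you use, and the questions mirror the examples above:

```python
def ask_llm(prompt: str) -> str:
    """Placeholder: call whatever chat model you use (OpenAI, Claude, a local model)."""
    raise NotImplementedError

def close_read(text: str) -> None:
    """Walk a text paragraph by paragraph, pausing to ask clarifying questions."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    seen = []
    for i, paragraph in enumerate(paragraphs, start=1):
        seen.append(paragraph)
        for question in (
            "What does the key term in this paragraph mean?",
            "How does this paragraph connect to what came before?",
        ):
            context = "\n\n".join(seen)
            print(ask_llm(f"{context}\n\nQuestion about paragraph {i}: {question}"))
```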
Electricity use of AI coding agents | Simon P. Couch – Simon P. Couch
AI, environmental impact, LLMs

Throughout 2025, we got better estimates of electricity and water use of AI chatbots. There are all sorts of posts I could cite on this topic, but a favorite is this blog post from Our World in Data’s Hannah Ritchie. On the water front, for example:
The average American uses 1600 liters of water per day, so even if you make 100 prompts per day, at 2ml per prompt, that’s only 0.01% of your total water consumption. Using a shower for one second would use far more.
Generally, these analyses guide my own thinking about the environmental impacts of my individual usage of LLMs; if I’m interested in reducing my personal carbon footprint, I’m much better off driving a couple miles less a week or avoiding one flight each year. This is indeed the right conclusion for users of chat interfaces like chatgpt.com or claude.ai.
Some Thoughts on the Open Web

“The Open Web” means several things to different people, depending on context, but recently discussions have focused on the Web’s Openness in terms of access to information — how easy it is to publish and obtain information there without barriers.
…
In other words, we have to create an Internet where people want to publish content openly – for some definition of “open.” Doing that may challenge the assumptions we’ve made about the Web as well as what we want “open” to be. What’s worked before may no longer create the incentive structure that leads to the greatest amount of content available to the greatest number of people for the greatest number of purposes.
jordanhubbard/nanolang: A tiny experimental language designed to be targeted by coding LLMs
computer science, software engineering

A tiny experimental language designed to be targeted by coding LLMs
I was a top 0.01% Cursor user. Here’s why I switched to Claude Code 2.0. | Silen
AI Native Dev, software engineering

You have 6-7 articles bookmarked about Claude Code. You’ve seen the wave. You want to be a part of it. Here’s a comprehensive guide from someone who’s been using coding AI since 2021 and read all those Claude Code guides so you don’t have to.
Scaling long-running autonomous coding
AI Native Dev, LLMs, software, software engineering

In my predictions for 2026 the other day I said that by 2029:
I think somebody will have built a full web browser mostly using AI assistance, and it won’t even be surprising. Rolling a new web browser is one of the most complicated software projects I can imagine[…] the cheat code is the conformance suites. If there are existing tests, it’ll get so much easier.
I may have been off by three years, because Cursor chose “building a web browser from scratch” as their test case for their agent swarm approach:
Code Reviewing AI-Generated JavaScript: What I Found – Schalk Neethling – Open Web Engineer
AI Native Dev, debugging, JavaScript, software engineering

I recently had an AI agent build a JavaScript utility for calculating road distances using a third-party API. The task was complex: batch multiple API requests, validate inputs and outputs, handle errors gracefully, and manage timeouts. The agent delivered working code that passed its own tests.
The code worked, but it had several issues, ranging from minor inefficiencies to a critical bug that would break production. Here’s what I found and how we fixed each one.
The year everything changed – Network Games

In popular imagination, “AI” has come to mean the cheap version of ChatGPT, prattling in a grating tone with too many emojis, variously misleading and making things up.
AI, in this view, is a stupid machine that makes stupid text. LLMs can certainly be this thing.
Software circles aren’t much better: LLM-enabled development is about code generation. Tell it to extrude code for a purpose, and maybe it will, and maybe it will work.
The truth of things is far, far stranger than either conception. By the close of 2025, it was possible to know the true purpose of LLMs: to act as the engines for a previously-impossible category of software.
If you asked me in 2024 what my biggest fear was, I’d have told you: I was afraid my best years were behind me, as someone who builds things.
That speaks to me. I've made and built things with software for the better part of 40 years, professionally for 30 or more. Even as my primary focus increasingly became communicating, organising conferences, and connecting people, it remained important to me to make and build things. Often they were tools we used internally to run our conferences and other systems better, but sometimes they were just ideas I wanted to explore.
Clearly something has shifted over the last few weeks. Perhaps people took time off over the holiday period to spend a bit more time working with ChatGPT, Claude, or Google's large language model offerings, all of which released significantly improved models in terms of capabilities—particularly when it comes to code—in the latter part of last year. People who gave these models a go 6, 12, or 18 months ago and found them underwhelming came back and realised how much more capable they've become.
My timeline across social media, in blog posts, and on podcasts is full of people now thinking more deeply about the implications of all this. So over the coming weeks you'll see here, as you have in recent days, a collection of pieces I think are valuable in helping explore these ideas.
Ralph Wiggum Loop Explained
AI, AI Native Dev, software engineering

The Ralph Wiggum Loop is getting a lot of attention in the AI agent space, but there’s still confusion about what it actually is and what problem it’s trying to solve. In this video, we break down the real failure mode behind long-running agent loops: context accumulation, why retries make agents worse over time, and why common fixes like compaction can be lossy. The Ralph Wiggum Loop is one response to that problem: not a magic trick, but a pattern that resets context while preserving progress through external memory and review. This video uses goose as the concrete environment to explore the idea, but the concepts apply broadly to agentic workflows.
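For anyone who wants the shape of the pattern without watching the video, here’s a bare-bones sketch (mine, not from the video); the agent CLI and file names are placeholders, but the essential moves are there: every pass starts with a fresh context, and anything worth remembering must live in files the agent reads and updates:

```python
import subprocess
from pathlib import Path

PROMPT = Path("PROMPT.md")      # fixed instructions, re-read on every iteration
PROGRESS = Path("PROGRESS.md")  # external memory the agent is told to keep updated

# Ralph-style loop: each pass spawns a fresh agent process with an empty
# context window, so nothing accumulates; continuity lives in files and git.
for _ in range(20):             # cap the number of passes instead of looping forever
    task = PROMPT.read_text()
    if PROGRESS.exists():
        task += "\n\nProgress so far:\n" + PROGRESS.read_text()
    # "agent" stands in for whichever coding-agent CLI you drive (goose, Codex, Claude Code, ...).
    subprocess.run(["agent", "run", task], check=False)
    if PROGRESS.exists() and "ALL TASKS DONE" in PROGRESS.read_text():
        break
```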
As AI coding agents take flight, what does this mean for jobs?
AI, LLMs, software engineering

But if AI is doing more of the software building grunt work, what does that mean for the humans involved? It’s a question that’s front-of-mind for just about everyone in the industry. Anthony Goto, a staff engineer at Netflix, addressed this matter directly on TikTok a few weeks back. The most common question that he hears from new graduates and early-career engineers, he said, is whether they’ve made a mistake entering software development just as AI tools are accelerating. “Are we cooked?” is how he succinctly summed up the concern.
Code as Commodity

Thanks to generative AI, code is following a similar pattern. Projects that would have been uneconomic through traditional software development are now just a prompt away. Those 500+-products-per-day on Product Hunt? Not all of them are good, but that’s what abundance brings.
But this doesn’t mean developers are obsolete. It means that the locus of value is broadening as this exclusive skill becomes more widely accessible.
Thus, the real question isn’t “will we be replaced?” but, “what becomes valuable when code itself is cheap?”
AI-Assisted Development at Block | Block Engineering Blog
AI, AI Native Dev, LLMs, software engineering

About 95% of our engineers are regularly using AI to assist with their development efforts. The largest population is at Stage 5, running a single agent mostly outside of an IDE. The second largest population is at Stage 6 and is running 3-5 agent instances in parallel. Then there’s a small population that is actively building our internal agent orchestrator in preparation for the inevitable.
So how does an engineering organization move from Stage 1, where engineers are just starting their AI-assisted coding journey, to an advanced stage where they are managing so many parallel agents that they now need an orchestrator? Here’s how we’re doing it at Block.
Porting MiniJinja to Go With an Agent
AI Native Dev, software engineering

Turns out you can just port things now. I already attempted this experiment in the summer, but it turned out to be a bit too much for what I had time for. However, things have advanced since. Yesterday I ported MiniJinja (a Rust Jinja2 template engine) to native Go, and I used an agent to do pretty much all of the work. In fact, I barely did anything beyond giving some high-level guidance on how I thought it could be accomplished.
Mike Olson – Managing AI Like You Manage People
AI Native Dev, software engineering

With the release of Claude Opus 4.5 (and the pace of improvement in frontier models), we’ve reached the point where AI coding assistants can handle tasks that would previously require a team of skilled developers. The capabilities are impressive enough that drawing parallels between managing AI agents and managing human teams feels less like a thought experiment and more like practical advice.
As someone who has spent years in engineering management and who has more recently been working extensively with AI coding tools, I’ve noticed that the skills transfer remarkably well in both directions. Here are some patterns that work for both.
The Economics of AI Coding: A Real-World Analysis
AI Native Dev, software engineering

My whole stream in the past months has been about AI coding. From skeptical engineers who say it creates unmaintainable code, to enthusiastic (or scared) engineers who say it will replace us all, the discourse is polarized. But I’ve been more interested in a different question: what does AI coding actually cost, and what does it actually save?
I recently had Claude help me with a substantial refactoring task: splitting a monolithic Rust project into multiple workspace repositories with proper dependency management. The kind of task that’s tedious, error-prone, and requires sustained attention to detail across hundreds of files. When it was done, I asked Claude to analyze the session: how much it cost, how long it took, and how long a human developer would have taken.
The answer surprised me. Not because AI was faster or cheaper (that’s expected), but because of how much faster and cheaper.
Agent Guardrails and Controls | Block Engineering Blog
AI, AI Engineering, LLMs, MCP, security

In our previous blog post, Securing the Model Context Protocol, we detailed the Model Context Protocol (MCP) system and discussed some security concerns and mitigations. As a brief recap, MCP provides agents with a means to accomplish tasks using defined tools, reducing the burden of using complex and varied APIs and integrations on the agent.
However, in our prior blog post we did not cover mitigations for injection attacks against LLMs that are performed by MCPs themselves. At the time, this was because we didn’t have any security advice we believed was helpful to offer.
That is the focus of this post, where we outline a way of modelling this attack using the established threat model of browser security, specifically CSRF (Cross-Site Request Forgery), to provide insights into novel mitigations we believe could dramatically reduce the attack’s likelihood.
Why AI is pushing developers toward typed languages – The GitHub Blog
AI Native Dev, computer science, software engineering

It’s a tale as old as time: tabs vs. spaces, dark mode vs. light mode, typed languages vs. untyped languages. It all depends!
But as developers use AI tools, not only are they choosing the more popular libraries and languages (and thus the ones better represented in the models’ training data), they are also using tools that reduce risk. When code comes not just from developers, but also from their AI tools, reliability becomes a much bigger part of the equation.
Dynamic languages like Python and JavaScript make it easy to move quickly when building, and developers who argue for those languages push for the speed and flexibility they provide. But that agility lacks the safety net you get with typed languages.
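A small illustration of that safety net (my example, with a hypothetical User type, not one from the GitHub post): the annotations turn a bug an AI tool could easily introduce into a type-checker error instead of a production crash.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class User:
    email: Optional[str]  # the email may be missing

def email_domain(user: User) -> str:
    # mypy or pyright flags this line: user.email may be None, so .split() can
    # crash at runtime. In untyped code this ships silently; with annotations
    # it becomes a build-time failure instead.
    return user.email.split("@")[1]
```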
All I Want for Christmas is a Better Alt Text – Part 1

Earlier this year, I built the backend for the local alt text generation feature in Firefox. Nearly half of the images on the web still lack alternative text, creating a major accessibility barrier for screen reader users. The goal of this work is straightforward but ambitious: generate high-quality alt text entirely on device, preserving user privacy while improving access to visual content.
Testing Pyramid of AI Agents | Block Engineering Blog
AI Engineering, LLMs, software engineering, testing

I’m a huge advocate for software testing and have written and spoken quite a bit about the testing pyramid. Unit tests at the bottom. Integration tests in the middle. UI tests at the top. Fewer tests as you go up, because they’re slower, flakier, and more expensive.
That model worked really well as it gave teams a shared mental model for how to think about confidence, coverage, and tradeoffs. It helped people stop writing brittle UI tests and start investing where it mattered.
But now that I work on an AI agent and have to write tests for it, that pyramid has stopped making sense, because agents change what “working” even means.
Introducing beginners to the mechanics of machine learning – Miriam Posner

Every year, I spend some time introducing students to the mechanics of machine learning with neural nets. I definitely don’t go into great depth; I usually only have one class for this. But I try to unpack at least some of the major concepts, so that ML isn’t quite such a black box.
Whether you’re an AI critic or enthusiast, I find that conversations can be much more specific and productive if the participants have a basic understanding of how the tools work. That way, if students hear some kind of outlandish claim—like, that ChatGPT loves them—they can compare the claim to a mental image of how the tool actually works.
Block red-teamed its own AI agent to run an infostealer • The Register

“Being CISO is very much about being okay with ambiguity and being uncomfortable in situations,” Nettesheim said. “We are balancing risk constantly, and having to make trade-offs – in the AI space in particular. Like: What is a bigger risk right now? Not taking advantage of the technology enough? Or the security downsides of it? LLMs and agents are introducing a new, very rapidly evolving space.”
“AI” is bad UX

This is in many ways a worst case scenario for user experience. An application where clicking “save” deletes your files. An icon where clicking and dragging it makes thousands of copies. A sliding control where every time you move it something different happens. Perhaps the best pre-LLM analogy for the LLM user experience is the browser game QWOP, where something immediately intuitive (in the game’s case, running) is rendered dramatically and hilariously unintuitive by the mode of interaction (in the game’s case, this is fun).
This mismatch, between this incredibly powerful user metaphor and the actual abilities of these systems, is at the heart of most of the emerging problems with ‘AI’. For most people, it is functionally impossible to break the mental connection between “this is a person you talk to” and inferences about internal states and goals. So-called ‘AI psychosis’, where the funhouse mirror agreeableness of a mindless chatbot sends people into ratiocinatory spirals of delusional ideation, stems from this failed metaphor. If somebody else is agreeing with you and expanding on and elaborating what you’re saying, it must make sense, right? They sound like they know what they’re talking about.
Attention? Attention!

Attention is, to some extent, motivated by how we pay visual attention to different regions of an image or correlate words in one sentence. Take the picture of a Shiba Inu in Fig. 1 as an example.
Human visual attention allows us to focus on a certain region with “high resolution” (i.e. look at the pointy ear in the yellow box) while perceiving the surrounding image in “low resolution” (i.e. now how about the snowy background and the outfit?), and then adjust the focal point or do the inference accordingly. Given a small patch of an image, pixels in the rest provide clues what should be displayed there. We expect to see a pointy ear in the yellow box because we have seen a dog’s nose, another pointy ear on the right, and Shiba’s mystery eyes (stuff in the red boxes). However, the sweater and blanket at the bottom would not be as helpful as those doggy features.
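The post goes on to formalise this intuition as scaled dot-product attention; as a compact reference, here’s the core computation in numpy (my sketch, not code from the post):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query scores all keys; softmax weights decide which values get
    'high resolution' focus, and the output is the weighted mix of values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # query/key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over the keys
    return weights @ V

# Toy example: 2 queries attending over 4 key/value positions of dimension 8.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(2, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)    # (2, 8)
```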
Getting started with Claude for software development
AI Native Dev, coding agent, software engineering

2025 was an interesting year in many ways. One way in which it was interesting for me is that I went from an AI hater to a pretty big user. I’ve had a few requests for a “using Claude” guide, so I figured: new year, why not give it a shot? The lack of this kind of content was something that really frustrated me starting out, so it feels like a good thing to contribute to the world.
Origin Story: A Tale of Two Ralphs

To understand the “Ralph” tool is to understand a new approach toward improving autonomous AI coding performance — one that relies on brute force, failure, and repetition as much as it does on raw intelligence and reasoning.
Because Ralph Wiggum is not merely a Simpsons character anymore; it is a methodology born on a goat farm and refined in a San Francisco research lab, a divergence best documented in the conversations between its creator and the broader developer community.
The story begins in roughly May 2025 with Geoffrey Huntley, a longtime open source software developer who pivoted to raising goats in rural Australia.
LLM predictions for 2026, shared with Oxide and Friends
AI Native Dev, realms, software engineering

In 2023, saying that LLMs write garbage code was entirely correct. For most of 2024 that stayed true. In 2025 that changed, but you could be forgiven for continuing to hold out. In 2026 the quality of LLM-generated code will become impossible to deny.
I base this on my own experience—I’ve spent more time exploring AI-assisted programming than most.
The key change in 2025 (see my overview for the year) was the introduction of “reasoning models” trained specifically against code using Reinforcement Learning. The major labs spent a full year competing with each other on who could get the best code capabilities from their models, and that problem turns out to be perfectly attuned to RL since code challenges come with built-in verifiable success conditions.
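Those “built-in verifiable success conditions” are the crux: a coding task comes with a reward signal you can compute mechanically. A toy sketch of such a reward function (mine, not from the post):

```python
import subprocess

def reward_for_patch(test_command: tuple[str, ...] = ("pytest", "-q")) -> float:
    """Binary reward for a model-generated patch: run the project's test suite
    and score 1.0 on pass, 0.0 on fail. The success condition is mechanically
    verifiable, which is what makes code such a good fit for reinforcement learning."""
    result = subprocess.run(list(test_command), capture_output=True, text=True)
    return 1.0 if result.returncode == 0 else 0.0
```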