Porting MiniJinja to Go With an Agent

    AI Native Dev, software engineering

    Turns out you can just port things now. I already attempted this experiment in the summer, but it turned out to be a bit too much for what I had time for. However, things have advanced since. Yesterday I ported MiniJinja (a Rust Jinja2 template engine) to native Go, and I used an agent to do pretty much all of the work. In fact, I barely did anything beyond giving some high-level guidance on how I thought it could be accomplished.

    Source

    I've been posting a few pieces like this from really experienced and knowledgeable software developers, sharing how they have been working with code generation and the lessons they've learned. I think there are a couple of reasons to pay attention to things like this: 1. These are not randos on LinkedIn pontificating about the future of coding; they are seasoned software engineers who, like Armin, have often expressed scepticism in the past about the value of these tools. 2. These pieces are about real-world work they are doing, and at least some of the lessons they've learned along the way of doing it. So if you need proof points or convincing about the capability of these technologies, or if you are already using them, you could learn valuable lessons from people on a similar journey.

    Mike Olson – Managing AI Like You Manage People

    AI Native Dev, software engineering

    With the release of Claude Opus 4.5 (and the pace of improvement in frontier models), we’ve reached the point where AI coding assistants can handle tasks that would previously require a team of skilled developers. The capabilities are impressive enough that drawing parallels between managing AI agents and managing human teams feels less like a thought experiment and more like practical advice.

    As someone who has spent years in engineering management and who has more recently been working extensively with AI coding tools, I’ve noticed that the skills transfer remarkably well in both directions. Here are some patterns that work for both.

    Source

    Working with agentic coding systems like Claude Code is a significant change, particularly for software engineers used to writing most, if not all, of the code they work with. As an increasing number of people have observed, working with these systems is much closer to managing developers than being the developer. This piece takes lessons from working as an engineering manager of people and applies them to working with Claude Code.

    The Economics of AI Coding: A Real-World Analysis

    AI Native Dev, software engineering

    My whole stream in the past months has been about AI coding. From skeptical engineers who say it creates unmaintainable code, to enthusiastic (or scared) engineers who say it will replace us all, the discourse is polarized. But I’ve been more interested in a different question: what does AI coding actually cost, and what does it actually save?

    I recently had Claude help me with a substantial refactoring task: splitting a monolithic Rust project into multiple workspace repositories with proper dependency management. The kind of task that’s tedious, error-prone, and requires sustained attention to detail across hundreds of files. When it was done, I asked Claude to analyze the session: how much it cost, how long it took, and how long a human developer would have taken.

    The answer surprised me. Not because AI was faster or cheaper (that’s expected), but because of how much faster and cheaper.

    Source

    I've seen very few analyses like this, and I think it's really important. Here Tarek Ziadé, a very experienced software engineer working on the Mozilla codebase, documents in detail his work on a real-world piece of software engineering using Claude Code, the impact it had in terms of cost and time saved, and the things that went well and the things that perhaps went less well.

    Agent Guardrails and Controls | Block Engineering Blog

    AI, AI Engineering, LLMs, MCP, security

    In our previous blog post, Securing the Model Context Protocol, we detailed the Model Context Protocol (MCP) system and discussed some security concerns and mitigations. As a brief recap, MCP provides agents with a means to accomplish tasks using defined tools; reducing the burden of using complex and varied APIs and integrations on the agent.

    However, in our prior blog post we did not cover mitigations for injection attacks against LLMs that are performed by MCPs themselves. At the time, this was because we didn’t have any security advice we believed was helpful to offer.

    However, that is the focus of this post where we outline a way of modelling this attack using the established threat model of browser security, and specifically CSRF (Cross-Site Request Forgery), to provide insights into novel mitigations we believe could help dramatically reduce the attack’s likelihood.

    Source

    More thoughts from Block's engineering team about security models for MCP-based systems, particularly cross-site request forgery (CSRF) attacks and how to mitigate them.

    Why AI is pushing developers toward typed languages – The GitHub Blog

    AI Native Dev, computer science, software engineering

    It’s a tale as old as time: tabs vs. spaces, dark mode vs. light mode, typed languages vs. untyped languages. It all depends!

    But as developers use AI tools, not only are they choosing the more popular (thus more trained into the model) libraries and languages, they are also using tools that reduce risk. When code comes not just from developers, but also from their AI tools, reliability becomes a much bigger part of the equation.

    Dynamic languages like Python and JavaScript make it easy to move quickly when building, and developers who argue for those languages push for the speed and flexibility they provide. But that agility lacks the safety net you get with typed languages.

    Source

    My instinct, like that of far better and more knowledgeable developers such as Bret Taylor, is that in time large language models and coding agents won't write in languages that have been developed for humans (whether that's JavaScript or Python or whatever). Instead, they will develop in languages that are optimised for the output of a large language model. In the meantime, it's certainly the case that we are using languages developed for humans. The question is: which languages should we be using? There's certainly debate about whether typed or untyped languages work better with large language models. Here is a discussion of the issues around that.

    All I Want for Christmas is a Better Alt Text – Part 1

    Earlier this year, I built the backend for the local alt text generation feature in Firefox. Nearly half of the images on the web still lack alternative text, creating a major accessibility barrier for screen reader users. The goal of this work is straightforward but ambitious: generate high-quality alt text entirely on device, preserving user privacy while improving access to visual content.

    Source

    There has been vocal pushback against Firefox for adopting AI in the browser, even when, as is currently the case, that adoption is experimental, often behind feature flags or otherwise not widely available. Here, I think, is one very powerful and positive use case, one that I use myself, though not in Firefox but in tools that I have built: the generation of alt text for images that lack it. Firefox is working to implement this directly in the browser, helping maintain a user's privacy and lowering the impact on resources and the environment, two things that people are often highly critical about when it comes to AI. Here Tarek Ziadé talks about the process of implementing this in the browser. As we've covered elsewhere, various browser developers are looking at similar approaches to incorporating AI models in the browser directly and exposing them through APIs. I think this is an excellent use case for the technology.

    Testing Pyramid of AI Agents | Block Engineering Blog

    AI Engineering, LLMs, software engineering, testing

    I’m a huge advocate for software testing and have written and spoken quite a bit about the testing pyramid. Unit tests at the bottom. Integration tests in the middle. UI tests at the top. Fewer tests as you go up, because they’re slower, flakier, and more expensive.

    That model worked really well as it gave teams a shared mental model for how to think about confidence, coverage, and tradeoffs. It helped people stop writing brittle UI tests and start investing where it mattered.

    But now that I work on an AI agent and have to write tests for it, that pyramid stopped making sense because agents change what “working” even means.

    Source

    Testing AI-based systems, or large language model-based systems, is challenging in no small part because, unlike traditional software systems, their output is not deterministic. So how do we go about testing our AI-based systems? Here are some thoughts from Block's Angie Jones.

    Introducing beginners to the mechanics of machine learning – Miriam Posner

    AI, LLMs

    Every year, I spend some time introducing students to the mechanics of machine learning with neural nets. I definitely don’t go into great depth; I usually only have one class for this. But I try to unpack at least some of the major concepts, so that ML isn’t quite such a black box.

    Whether you’re an AI critic or enthusiast, I find that conversations can be much more specific and productive if the participants have a basic understanding of how the tools work. That way, if students hear some kind of outlandish claim—like, that ChatGPT loves them—they can compare the claim to a mental image of how the tool actually works.

    Source

    I spent the holiday period trying to get a deeper understanding of what's going on with machine learning and large language models, going all the way back to the 1950s and the origins of artificial neurons. This is a great list of videos, articles, and so on that could help provide a broader understanding of what these technologies are doing and how they work.

    Block red-teamed its own AI agent to run an infostealer • The Register

    coding agent, security

    “Being CISO is very much about being okay with ambiguity and being uncomfortable in situations,” Nettesheim said. “We are balancing risk constantly, and having to make trade off – in the AI space in particular. Like: What is a bigger risk right now? Not taking advantage of the technology enough? Or the security downsides of it? LLMs and agents are introducing a new, very rapidly evolving space.”

    Source

    Block has taken a real leadership position in the agentic coding space, particularly with their open-source project Goose. But like anyone working deeply in this area, they recognise the potential security implications of letting an agentic system loose. Here their CISO talks about some of the implications for security and some tactics for addressing those challenges.

    “AI” is bad UX

    AI, Design, UX

    This is in many ways a worst case scenario for user experience. An application where clicking “save” deletes your files. An icon where clicking and dragging it makes thousands of copies. A sliding control where every time you move it something different happens. Perhaps the best pre-LLM analogy for the LLM user experience is the browser game QWOP, where something immediately intuitive (in the game’s case, running), is rendered dramatically and hilariously unintuitive by the mode of interaction (in the game’s case, this is fun).

    This mismatch, between this incredibly powerful user metaphor and the actual abilities of these systems, is at the heart of most of the emerging problems with ‘AI’. For most people, it is functionally impossible to break the mental connection between “this is a person you talk to” and inferences about internal states and goals. So-called ‘AI psychosis’, where the funhouse mirror agreeableness of a mindless chatbot sends people into ratiocinatory spirals of delusional ideation, stems from this failed metaphor. If somebody else is agreeing with you and expanding on and elaborating what you’re saying, it must make sense, right? They sound like they know what they’re talking about.

    Source

    This is a very thoughtful essay on user experience and AI. It's very hard to summarise, but I highly recommend everyone read it and give it some thought.

    Attention? Attention!

    AI, generative AI, LLMs

    Attention is, to some extent, motivated by how we pay visual attention to different regions of an image or correlate words in one sentence. Take the picture of a Shiba Inu in Fig. 1 as an example.

    Human visual attention allows us to focus on a certain region with “high resolution” (i.e. look at the pointy ear in the yellow box) while perceiving the surrounding image in “low resolution” (i.e. now how about the snowy background and the outfit?), and then adjust the focal point or do the inference accordingly. Given a small patch of an image, pixels in the rest provide clues what should be displayed there. We expect to see a pointy ear in the yellow box because we have seen a dog’s nose, another pointy ear on the right, and Shiba’s mystery eyes (stuff in the red boxes). However, the sweater and blanket at the bottom would not be as helpful as those doggy features.

    Source

    I spent the last few weeks trying to get a deeper understanding of the technologies and theories that underlie modern machine learning. One tremendous source I highly recommend is "Why Machines Learn", a fantastic book about the mathematics of machine learning. Don't be put off by that: you can skip a lot of the mathematics, still understand the broad ideas, and get a great deal of benefit from it. I couldn't recommend it more highly. I also spent some time going through a reading list from Elicit, a company I admire, which they give new hires to get them up to speed with the broad landscape of modern machine learning. One of the key concepts behind large language models, perhaps the key to making them work, is attention. While this article is from several years ago now, it does a great job of giving you a sense of what attention is and how it works.
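    For the mathematically inclined, the idea the article builds up to is usually summarised in one formula. As a rough sketch, using the standard notation from the Transformer literature rather than from the article itself, scaled dot-product attention is:

    ```latex
    % Q, K, V are the query, key, and value matrices;
    % d_k is the dimensionality of the key vectors.
    \mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
    ```

    Each output row is a weighted average of the value vectors, with weights determined by how well each query matches each key. That weighting is the "high resolution" focus on some regions, and "low resolution" awareness of the rest, that the article describes.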

    Getting started with Claude for software development

    AI Native Dev, coding agent, software engineering

    2025 was an interesting year in many ways. One way in which it was interesting for me is that I went from an AI hater to a pretty big user. And so I’ve had a few requests for a “using Claude” guide, so I figure new year, why not give it a shot? The lack of this kind of content was something that really frustrated me starting out, so feels like a good thing to contribute to the world.

    Source

    If, as it seems a lot of people did over the holiday period, you gave some serious thought to working with an agentic coding system like Claude Code, this could be a really good introduction for you. I highly recommend it. I also recommend getting a US$20-a-month Claude account, giving you access to Claude Code, and downloading the desktop app and working with it that way. At that price point you will likely run out of tokens from time to time and have to wait an hour or two while working with it, but you can still get a tremendous amount of valuable work done. If you're a software engineer, I think now really is the time to be working with these tools.

    Origin Story: A Tale of Two Ralphs

    AI, software engineering

    To understand the “Ralph” tool is to understand a new approach toward improving autonomous AI coding performance — one that relies on brute force, failure, and repetition as much as it does on raw intelligence and reasoning.

    Because Ralph Wiggum is not merely a Simpsons character anymore; it is a methodology born on a goat farm and refined in a San Francisco research lab, a divergence best documented in the conversations between its creator and the broader developer community.

    The story begins in roughly May 2025 with Geoffrey Huntley, a longtime open source software developer who pivoted to raising goats in rural Australia.

    Source

    Geoff Huntley, friend of Web Directions and speaker at our conferences, is, along with The Simpsons' Ralph Wiggum, having a moment. If you had the privilege of attending our conferences last year, our Code conference or our Engineering AI conference, you would have heard Geoff speak about his experience of working with large language models to develop software. Not much more than 12 months ago, he was quite sceptical about them. A little over 12 months later, he is one of the highest-profile people in the field. And here he is getting attention from none other than VentureBeat.

    LLM predictions for 2026, shared with Oxide and Friends

    AI Native Dev, LLMs, software engineering

    In 2023, saying that LLMs write garbage code was entirely correct. For most of 2024 that stayed true. In 2025 that changed, but you could be forgiven for continuing to hold out. In 2026 the quality of LLM-generated code will become impossible to deny.

    I base this on my own experience—I’ve spent more time exploring AI-assisted programming than most.

    The key change in 2025 (see my overview for the year) was the introduction of “reasoning models” trained specifically against code using Reinforcement Learning. The major labs spent a full year competing with each other on who could get the best code capabilities from their models, and that problem turns out to be perfectly attuned to RL since code challenges come with built-in verifiable success conditions.

    Source

    Here’s a series of predictions by Simon Willison and others about what we might see from large language models in the context of software engineering in 2026. That's not to say that Simon and co. are right about these things, but I think it is increasingly imperative, if we're software engineers, to think about what might be coming, because that will shape our choices and actions. The timeframes here are relatively short: months, maybe a couple of years. And typically we don't have to respond in such short timeframes to such significant changes in, for example, the practice of software engineering.

    Don’t fall into the anti-AI hype –

    AI Native Dev, LLMs, software engineering

    Anyway, back to programming. I have a single suggestion for you, my friend. Whatever you believe about what the Right Thing should be, you can't control it by refusing what is happening right now. Skipping AI is not going to help you or your career. Think about it. Test these new tools, with care, with weeks of work, not in a five minutes test where you can just reinforce your own beliefs. Find a way to multiply yourself, and if it does not work for you, try again every few months.

    Source

    This is a pattern we're seeing increasingly: people who have been sceptical of or even quite resistant to the use of large language models for coding, recognising through using the tools how significant they actually are.

    Believe the Checkbook

    AI, AI Native Dev, LLMs, software engineering

    Anthropic’s AI agent was the most prolific code contributor to Bun’s GitHub repository, submitting more merged pull requests than any human developer. Then Anthropic paid millions to acquire the human team anyway. The code was MIT-licensed; they could have forked it for free. Instead, they bought the people.

    Publicly, AI companies talk like engineering is being automated away. Privately, they deploy millions of dollars to acquire engineers who already work with AI at full tilt. That contradiction is not a PR mistake. It is a signal.

    So what do you do with this as a technical leader?

    Stop using AI as an excuse to devalue your best knowledge workers. Use it to give them more leverage.

    Source

    No one knows exactly how LLMs will impact the practice of software engineering. But I think there are things we can take as given: 1. There will be, and already are, significant transformations. 2. Very little will ultimately look like what came before. It's hard even to imagine which of the significant transformations in the practice of software engineering over the last 75 years the current one most closely rhymes with. In terms of physical form factor, the change from punch cards to screen-based coding is one place to start; the transformation to IDEs was similarly profound. But what we're seeing may actually dwarf both of those, and the difference between what happens next and what came before may be more significant than either of those earlier transformations.

    Opening and Closing Dialogs Without JavaScript Using HTML Invoker Commands

    dialog, HTML, invokers

    The native <dialog> element was a huge step forward for web developers. It gave us a standardized way to create modal dialogs with built-in backdrop handling, focus management, and keyboard interactions—no more reinventing the wheel with div containers and mountains of JavaScript.

    But we still needed JavaScript for one fundamental task: opening the dialog.

    The HTML Invoker Commands API changes that. Now you can create fully functional dialogs with just HTML.

    Source

    Having started as a declarative platform with HTML and then CSS, the web has become increasingly imperative with JavaScript and DOM APIs over the last 20 years or so. But in recent years the declarative approach has made a comeback, even where traditionally we needed JavaScript. One area in which this is particularly true is focus management, which we've covered quite a bit here and at our conferences. There's a new frontier for the declarative web: now we can open a dialog without any JavaScript.
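    As a minimal sketch of the pattern the article describes, here is a dialog opened and closed with no script at all. The `command` and `commandfor` attributes are per the Invoker Commands API as currently specified; browser support is still rolling out, so check before relying on it:

    ```html
    <!-- A button that opens the dialog declaratively.
         "commandfor" points at the target element's id;
         "command" names the built-in action to invoke. -->
    <button commandfor="demo-dialog" command="show-modal">Open dialog</button>

    <dialog id="demo-dialog">
      <p>Hello from a declaratively opened dialog.</p>
      <!-- A button inside the dialog that closes it again, also without JavaScript. -->
      <button commandfor="demo-dialog" command="close">Close</button>
    </dialog>
    ```

    The `show-modal` command maps to the dialog's `showModal()` behaviour, so you get the backdrop, focus trapping, and Escape-to-close for free.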

    The rise of industrial software

    AI, LLMs, software engineering

    Downward-sloping green curve on a graph with "Cost" on the vertical axis and no labeled horizontal axis, showing costs decreasing over time; orange text and arrows indicate "Efficiencies Drive Costs Lower."

    Traditionally, software has been expensive to produce, with expense driven largely by the labour costs of a highly skilled and specialised workforce. This workforce has also constituted a bottleneck for the possible scale of production, making software a valuable commodity to produce effectively.

    Industrialisation of production, in any field, seeks to address both of these limitations at once, by using automation of processes to reduce the reliance on human labour, both lowering costs and also allowing greater scale and elasticity of production. Such changes relegate the human role to oversight, quality control, and optimisation of the industrial process.

    Source: The rise of industrial software | Chris Loy

    As the cost of producing something decreases, what happens to that product or commodity? If it's something produced by human experts, whether that be cloth in the early 19th century or software in the early 21st, what are the implications for the producers, the labour that produces that commodity? It's a particularly compelling question if you are a software engineer, as many of our readers are. Do we end up like the weavers of the Industrial Revolution, who went from highly paid, skilled artisans to essentially non-existent in a decade or so? Or will the industrialisation of the production of software bring real benefits for software engineers? What happens to the profession? What happens to our practices? All of these are very much open to debate. It's not clear whether this question will be settled any time soon, but I think it's one we should all consider very carefully.

    davidbau.com Vibe Coding

    Two Kinds of Vibe Coding

    There are two categories of vibe coding. One is when you delegate little tasks to a coding LM while keeping yourself as the human “real programmer” fully informed and in control.

    The second type of vibe coding is what I am interested in. It is when you use a coding agent to build towers of complexity that go beyond what you have time to understand in any detail. I am interested in what it means to cede cognitive control to an AI. My friend David Maymudes has been building some serious software that way, and he compares the second type of vibe coding to managing a big software team. Like the stories you’ve heard of whole startups being created completely out of one person vibe coding.

    Source: davidbau.com Vibe Coding

    The term "vibe coding" is often used disparagingly, and debates about the practice feel very much centred on the capability of today's models, something that is changing incredibly rapidly. From personal experience, having worked with these models for more than three years now, the improvements in code generation over that time are almost unimaginable. There are many concerns about simply prompting a code generation model to produce an output and letting it run: being little concerned, if concerned at all, about the quality of the code, and then using that output without checking, understanding, or even inspecting it. Here David Bau writes in praise of vibe coding, more or less in this sense, although he does suggest providing guardrails and comprehensive testing to ensure the quality of the code.

    Believe the Checkbook

    Black and white sketch of a seesaw with a stack of papers on one end and a megaphone on the other, illustrating the balance between written content and amplified speech.

    Everyone’s heard the line: “AI will write all the code; engineering as you know it is finished.” Boards repeat it. CFOs love it. Some CTOs quietly use it to justify hiring freezes and stalled promotion paths.

    The Bun acquisition blows a hole in that story.

    Here’s a team whose project was open source, whose most active contributor was an AI agent, whose code Anthropic legally could have copied overnight. No negotiations. No equity. No retention packages. Anthropic still fought competitors for the right to buy that group.

    Publicly, AI companies talk like engineering is being automated away. Privately, they deploy millions of dollars to acquire engineers who already work with AI at full tilt. That contradiction is not a PR mistake. It is a signal.

    Source: Believe the Checkbook | Robert Greiner

    One thing I've heard repeatedly over the last year or two, when people are critical of code generation using large language models, is something along the lines of: "But writing the code is not the bottleneck when it comes to software engineering." And there's some validity to that. The question is: well, what is the bottleneck? People might say testing. People might say architectural decisions. Quality assurance. All of those are clearly choke points in delivering software. But here Robert Greiner observes that "The bottleneck isn’t code production, it is judgment." I certainly think there's something to this, but sometimes we stop with an observation like that. What's really important is to think through the next steps and the consequences. If judgement is the bottleneck, not code generation, then what are the implications for engineering leaders, which Robert Greiner explores here? For software engineers themselves, whether junior, mid-career, or senior? For companies and organisations, and more broadly? And is this true only of code, or of other outputs of generative AI too? My working hypothesis is that it is, and so organisations and individuals should be developing, and encouraging the development of, judgement, what some people might call taste. It's that discernment, that judgement, that taste, which is certainly valuable in software development but will, I think, become increasingly valuable in other fields as well, because the models can already generate a lot of code, a lot of copy, a lot of images, a lot of legal advice. A key question will be: what is the value of any particular generation from a model? That's where expertise comes in, that's where taste comes in, that's where discernment and judgement come in. So develop those, and continue to develop those.

    What has long differentiated a person in terms of capability, in many respects, is not the ability to recite vast bodies of knowledge; it is the ability to know, among all that vast knowledge, what is the appropriate knowledge to deploy in a particular situation.

    Your job is to deliver code you have proven to work

    AI, LLMs, software engineering

    In all of the debates about the value of AI-assistance in software development there’s one depressing anecdote that I keep on seeing: the junior engineer, empowered by some class of LLM tool, who deposits giant, untested PRs on their coworkers—or open source maintainers—and expects the “code review” process to handle the rest.

    This is rude, a waste of other people’s time, and is honestly a dereliction of duty as a software developer.

    Your job is to deliver code you have proven to work.

    As software engineers we don’t just crank out code—in fact these days you could argue that’s what the LLMs are for. We need to deliver code that works—and we need to include proof that it works as well. Not doing that directly shifts the burden of the actual work to whoever is expected to review our code.

    Source: Your job is to deliver code you have proven to work

    There's not much more to add to this observation by Simon Willison. Software engineers have a responsibility to deliver tested, verified, quality-assured code. If we use code generation to YOLO it, then what we're doing is not software engineering. There is a time and place for such code. I use it extensively myself, because what matters is the job it does: it doesn't necessarily have to be particularly secure or performant, or even bug-free, because I'm using it internally within a sandbox environment to achieve a productivity gain. But it's another thing entirely to create something public-facing, that people rely on, that manages people's details, and YOLO that.

    What happens when the coding becomes the least interesting part of the work

    AI, LLMs, software engineering

    That judgment is the job of a senior engineer. As far as I can tell, nobody is replacing that job with a coding agent anytime soon. Or if they are, they’re not talking about it publicly. I think it’s the former, and one of the main reasons is that there is probably not that much spelled-out practical knowledge about how to do the job of a senior engineer in frontier models’ base knowledge, the stuff that they get from their primary training by ingesting the whole internet.

    Source: What happens when the coding becomes the least interesting part of the work | by Obie Fernandez | Dec, 2025 | Medium

    Thoughts from an experienced software engineer on working with large language models. It's an irony that, traditionally, the more experienced we become as software engineers, the less software we write. This is a trend that is perhaps changing as large language models become increasingly capable of generating code.

    How I wrote JustHTML using coding agents

    AI, LLMs, software engineering

    Writing a full HTML5 parser is not a short one-shot problem. I have been working on this project for a couple of months on off-hours.

    Tooling: I used plain VS Code with Github Copilot in Agent mode. I enabled automatic approval of all commands, and then added a blacklist of commands that I always wanted to approve manually. I wrote an agent instruction that told it to keep working, and don’t stop to ask questions. Worked well!

    Here is the 17-step process it took to get here:

    Source: How I wrote JustHTML using coding agents – Friendly Bit

    A few weeks back, Simon Willison coined the term "vibe engineering", trying to create a distinction between using large language models to generate code that we simply run as-is, and using large language models as part of the software engineering process. This piece he links to is an excellent example of vibe engineering. Emil Stenström has written an HTML parser, which, if you know anything about HTML, is much more complex than it might initially appear. Here, Emil details his approach to working with large language models to produce a very complex piece of software. Emil is a software engineer, but he observes that:
    Yes. JustHTML is about 3,000 lines of Python with 8,500+ tests passing. I couldn't have written it this quickly without the agent. But "quickly" doesn't mean "without thinking." I spent a lot of time reviewing code, making design decisions, and steering the agent in the right direction. The agent did the typing; I did the thinking. That's probably the right division of labor.

    The Bet On Juniors Just Got Better

    AI, LLMs, software engineering

    Hand-drawn graph labeled "PROFIT" showing three curves starting below the x-axis and rising; one blue curve rises steeply, one red curve rises moderately, and one orange curve levels off.

    Junior developer—obsolete accessory or valuable investment? How does the genie change the analysis?

    Folks are taking knee-jerk action around the advent of AI—slowing hiring, firing all the juniors, cancelling internship programs. Instead, let’s think about this a second.

    The standard model says junior developers are expensive. You pay senior salaries for negative productivity while they learn. They ask questions. They break things. They need code review. In an augmented development world, the difference between juniors & seniors is just too large & the cost of the juniors just too high.

    Wrongo. That’s backwards. Here’s why.

    Source: The Bet On Juniors Just Got Better – by Kent Beck

    Kent Beck is a legend in the world of software engineering, the originator of XP (Extreme Programming) and very well known and highly regarded when it comes to design patterns. Here he addresses an issue that has been of concern to many people: what impact will AI have on junior developers? Will they simply not exist anymore? And if so, will we ever get senior developers if we haven't got any new junior developers? Kent has a different take, and I think it's well worth considering.

    UX Is Your Moat (And You’re Ignoring It) – Eleganthack

    AI, Design

    If you’re building an AI product, your interface isn’t a nice-to-have. It’s your primary competitive advantage.

    Here’s what that means in practice:

    Make the first five minutes seamless. Users decide whether they’re staying or leaving almost immediately. If they have to think about where to click, you’ve already lost. Netflix auto-plays. TikTok starts scrolling. What does your product do the moment someone opens it?

    Source: UX Is Your Moat (And You’re Ignoring It) – Eleganthack

    Technologists often default to the idea that the best technology always wins. Over the years, we've seen endless debates about the technical specifications of a product and why they make that product better. But what we should have learned by now is that technology is only one part of why something becomes successful, category-defining, dominant. Here, Christina Wodtke brings her many years of experience to the question of what will make AI products successful, with lessons not just for the biggest technology companies, but for any company, whether they use AI or not.

    How to Run a 90-Minute AI Design Sprint (with prompts)

    AI, Design

    3D-rendered coral-like structure in gradient colors from yellow to blue, overlaid on a beige grid background with blue anchor points and outlines indicating selection or manipulation in a design interface.

    Most teams still run ideation sessions with a whiteboard, a problem statement, and a flurry of post-its. To be honest, I’ve always loved a good Design sprint, especially in person and I hope those don’t go away for anyone because they’re an awesome way to learn and connect together.

    But with AI, the way we generate, evaluate, and shape ideas has fundamentally shifted. You can collapse days of thinking into a focused 90-minute sprint if you know how to structure it well.

    This is the format designed to move fast without losing the depth. It blends design thinking, systems thinking, and agent-era AI capabilities into a repeatable flow you can run any time your team needs clarity.

    Here’s the 90-minute AI Design Sprint, step by step with prompts you can copy, paste, and use today.

    Source: How to Run a 90-Minute AI Design Sprint (with prompts)

    As we've recently observed elsewhere, while a lot of the focus on generative AI and LLMs is on customer-facing features or generated content (be that text, images, or video), there is one place where large language models can have a really valuable impact: on processes. Here M.C. Dean reimagines the design sprint, a staple of the design process, using large language models, with some suggested prompts that she uses.

    What I learned building an opinionated and minimal coding agent

    AI, coding agent, LLMs, software engineering

    Table displaying performance metrics for the agent "pi (claude-opus-4-5)" on the "terminal-bench" dataset, including 428 trials, 71 errors, a mean score of 0.479, reward distribution with 213 successes and 215 failures, and a breakdown of exception types and counts.

    I’ve also built a bunch of agents over the years, of various complexity. For example, Sitegeist, my little browser-use agent, is essentially a coding agent that lives inside the browser. In all that work, I learned that context engineering is paramount. Exactly controlling what goes into the model’s context yields better outputs, especially when it’s writing code. Existing harnesses make this extremely hard or impossible by injecting stuff behind your back that isn’t even surfaced in the UI.

    Source: What I learned building an opinionated and minimal coding agent

    Mario Zechner built his own minimal coding agent. Think of it as a lightweight version of Claude Code or OpenAI's Codex. You can follow along here.

    Useful patterns for building HTML tools

    AI, LLMs

    I’ve started using the term HTML tools to refer to HTML applications that I’ve been building which combine HTML, JavaScript, and CSS in a single file and use them to provide useful functionality. I have built over 150 of these in the past year, almost all of them written by LLMs. This article presents a collection of useful patterns I’ve discovered along the way.

    Source: Useful patterns for building HTML tools

    One incredibly valuable use case for code generation, and a good way to explore, experiment, and develop intuitions and capabilities with it, is building little utility tools for your own use, as Simon Willison has been doing for several years. I too have been doing this. I've taken spreadsheets, Bash scripts, and little pieces of JavaScript that I had cobbled together over the years to help in producing our sites and content, and even printing for our conferences, and built special-purpose mini web applications to solve the same problems much more efficiently and enjoyably. So I highly recommend trying it for yourself if you're not already. Here Simon lists a whole bunch of patterns he has gleaned from his extensive development of such tools.

    The /llms.txt file – llms-txt

    AI, front end development, LLMs

    Markdown template with example syntax including a title (# Title), optional description in italic blockquote, placeholder text, section headers (## Section name, ## Optional), and link entries using markdown link format with optional details.

    We propose adding a /llms.txt markdown file to websites to provide LLM-friendly content. This file offers brief background information, guidance, and links to detailed markdown files.

    llms.txt markdown is human and LLM readable, but is also in a precise format allowing fixed processing methods (i.e. classical programming techniques such as parsers and regex).

    We furthermore propose that pages on websites that have information that might be useful for LLMs to read provide a clean markdown version of those pages at the same URL as the original page, but with .md appended. (URLs without file names should append index.html.md instead.)

    Source: The /llms.txt file – llms-txt

    llms.txt is one of a number of proposals for how best to expose the content of a web page, site, or app to large language models. It's a proposal initially from Jeremy Howard, well known in the Python and AI communities as the founder of fast.ai (as well as FastMail).
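    Following the template the proposal describes, a minimal /llms.txt might look something like the sketch below. The project name, section links, and URLs here are placeholders of my own, not taken from the spec:

```markdown
# Example Project

> A short description of what this site or project is about.

A few sentences of background and guidance to help an LLM make use of the links below.

## Docs

- [Quick start](https://example.com/quickstart.md): how to get up and running
- [API reference](https://example.com/api.md): detailed endpoint documentation

## Optional

- [Changelog](https://example.com/changelog.md): release history
```

    Because each linked page also has a clean markdown twin at the same URL with .md appended, a crawler can process the whole thing with nothing fancier than a markdown parser.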

    AI and variables: Building more accessible design systems faster

    AI, Design, Design Systems

    Website navigation interface with a dark maroon header featuring menu items: Foundation, Components, Patterns, Resources & tools, and a Search icon; below are colorful content cards labeled Patterns and Resources & Tools.

    When people talk about AI in design, they often picture flashy visuals or generative art. But my own lightbulb moment happened at a less glamorous place: in an effort to solve this accessibility challenge under pressure.

    At UX Scotland earlier this year, I shared how AI helped me transform a messy, time-consuming process into something lean, structured, and scalable. Instead of spending weeks tweaking palettes and testing contrast, I had an accessible design system up and running in just a few days. In this article, I’ll explain how I did it and why it matters.

    Source: AI and variables: Building more accessible design systems faster – zeroheight

    When it comes to AI, we over-index on output and user-facing features, and I think we're somewhat asleep on workflow and process, which can be made more efficient using large language models. Here's a great case study from Tobi Olowu on how he and his team used LLMs to help streamline the process of improving the accessibility of an existing design system.

    A ChatGPT prompt equals about 5.1 seconds of Netflix

    AI, environmental impact

    In June 2025 Sam Altman claimed about ChatGPT that “the average query uses about 0.34 watt-hours”.

    In March 2020 George Kamiya of the International Energy Agency estimated that “streaming a Netflix video in 2019 typically consumed 0.12-0.24kWh of electricity per hour” – that’s 240 watt-hours per Netflix hour at the higher end.

    Source: A ChatGPT prompt equals about 5.1 seconds of Netflix

    A widely quoted study found that 95% of all AI implementations had no ROI. I haven't really read the study, and I don't think many of the people who quote it have read it either. We also see numbers bandied about regarding the amount of water used by large language models, at times for single queries, and similarly the amount of energy required for a single query. And then yesterday I saw on a toilet that the average flush from that toilet used 3.4 litres of water. It's good to see things like this from Simon Willison, where he tries to provide some broader context for the energy use and environmental impact of large language models. It would be even better to see more solid figures from OpenAI, Google, and the other hyperscalers, but at least it's a start.
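    The headline figure follows directly from the two numbers quoted in the excerpt; a quick sanity check:

```python
# Compare Sam Altman's claimed per-query energy with the IEA's
# upper-end estimate for an hour of Netflix streaming.
prompt_wh = 0.34             # watt-hours per ChatGPT query (Altman, June 2025)
netflix_wh_per_hour = 240.0  # watt-hours per streaming hour (IEA, upper end)

# Energy per second of streaming, then how many such seconds one prompt equals.
netflix_wh_per_second = netflix_wh_per_hour / 3600
seconds_of_netflix = prompt_wh / netflix_wh_per_second
print(round(seconds_of_netflix, 1))  # → 5.1
```

    Using the lower-end streaming estimate (120 Wh/hour) would double the figure to about 10 seconds, which doesn't change the broader point.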

    Is your tech stack AI ready?

    AI, architecture, LLMs, software engineering

    We’re at the same inflection point we saw with mobile and cloud, except AI is more sensitive to context quality. Loose contracts, missing examples, and ambiguous guardrails don’t just cause bugs. They cause agents to confidently explore the negative space of your system.

    The companies that win this transition will be the ones that treat their specs as executable truth, ship golden paths that agents can copy verbatim, and prove zero-trust by default at every tool boundary.

    Your tech stack doesn’t need to be rebuilt for AI. But your documentation, contracts, and boundaries? Those need to level up.

    Source: Is your tech stack AI ready? | Appear Blog

    A speaker at our recent Engineering AI conference, Jakub Reidl looks at some of the key areas where your tech stack needs to get ready for AI.

    If You’re Going to Vibe Code, Why Not Do It in C?

    AI, LLMs, software engineering

    So my question is this: Why vibe code with a language that has human convenience and ergonomics in view? Or to put that another way: Wouldn’t a language designed for vibe coding naturally dispense with much of what is convenient and ergonomic for humans in favor of what is convenient and ergonomic for machines? Why not have it just write C? Or hell, why not x86 assembly?

    Source: If You’re Going to Vibe Code, Why Not Do It in C?

    It may seem like a facetious or ironic question, but why stop at vibe coding? If we're going to develop software with large language models, why not use C? Or, more specifically, why use any particular language? Here, Stephen Ramsey observes that programming languages are designed for human, which is to say developer, convenience. But if a large language model is generating the code, why generate it in a language that is essentially an intermediary humans are rarely, if ever, actually going to read? This is a question Bret Taylor asked in a podcast we linked to a few months back, and it's one that really interests me. Just the other day, Geoff Huntley, in another piece we linked to, talked about working with rather than against the grain of large language models, and I think this fits into that way of thinking. If we are going to increasingly rely on large language models to do tasks for us, even if we restrict our focus to programming, it would seem to make sense to find what they are best at, rather than trying to get them, as Geoff Huntley observes, to conform to approaches that humans have developed for our own convenience.

    AI companies want a new internet — and they think they’ve found the key

    AI, LLMs, MCP, open source

    3D illustration of a small robot with a digital face displaying zeros interacting with a laptop, with two speech bubbles between them, set against an orange background.

    Over the past 18 months, the largest AI companies in the world have quietly settled on an approach to building the next generation of apps and services — an approach that would allow AI agents from any company to easily access information and tools across the internet in a standardized way. It’s a key step toward building a usable ecosystem of AI agents that might actually pay off some of the enormous investments these companies have made, and it all starts with three letters: MCP.

    Source: AI companies want a new internet — and they think they’ve found the key | The Verge

    In 12 months or so, MCP has gone from an internal project at Anthropic to being extremely widely used, and it has now found a home at the Linux Foundation alongside other related technologies such as Goose. This Verge story will give you an overview of the set of technologies and what's happening next.

    Migrating Dillo from GitHub

    front end development, performance, react

    However, it has several problems that make it less suitable to develop Dillo anymore. The most annoying problem is that the frontend barely works without JavaScript, so we cannot open issues, pull requests, source code or CI logs in Dillo itself, despite them being mostly plain HTML, which I don’t think is acceptable. In the past, it used to gracefully degrade without enforcing JavaScript, but now it doesn’t. Additionally, the page is very resource hungry, which I don’t think is needed to render mostly static text.

    Source: Migrating Dillo from GitHub

    GitHub has been undertaking a long process of re-implementing its front end using React. This is not the only story I've read suggesting that turns out, perhaps, not to have been the best decision. Many people have observed that with large repos it becomes unworkably slow, even on state-of-the-art MacBook Pros. This was eminently predictable, and is one of the many reasons I have found myself, of late, pessimistic about the future of front end as a vibrant, dynamic ecosystem.

    Is It a Bubble?

    AI

    Table comparing the top TMT (technology, media, and telecom) stocks by S&P 500 weight and next twelve months price-to-earnings (NTM P/E) ratios in December 1999 and the current period, showing a median NTM P/E of 41x in 1999 versus 31x currently, with notable changes in company composition and overall valuations.

    Before diving into the subject at hand – and having read a great deal about it in preparation – I want to start with a point of clarification. Everyone asks, “Is there a bubble in AI?” I think there’s ambiguity even in the question. I’ve concluded there are two different but interrelated bubble possibilities to think about: one in the behavior of companies within the industry, and the other in how investors are behaving with regard to the industry. I have absolutely no ability to judge whether the AI companies’ aggressive behavior is justified, so I’ll try to stick primarily to the question of whether there’s a bubble around AI in the financial world.

    Source: Is It a Bubble?

    Not infrequently in my conversations with people, the issue of whether or not we are in a bubble comes up. Is there an AI bubble? Will we have an AI bubble? It's probably something we should think about, even if there's little, if anything, we as individuals can do about it; perhaps we can make different decisions about how and where we invest, or consider what might happen if we had a significant downturn of the kind seen in the early 2000s or after the global financial crisis. In future, I think I'll just point people to this. It's a very solid read, and not only a thoughtful thesis; it draws on quite a range of historical experience.

    How I Shipped 100k LOC in 2 Weeks with Coding Agents

    AI, AI Native Dev, LLMs, software engineering

    When we onboard developers, we give them documentation, coding standards, proven workflows, and collaboration tools. When we “deploy” AI agents, we give them nothing. They start fresh every time. No project context, no memory of patterns, no proven workflows.

    So I compiled AI Coding Infrastructure, the missing support layer that agents need. Five components:

    Autonomous Execution (Ralph): Continuous loops for overnight autonomous development

    Project Memory (AGENTS.md): Your tech stack, patterns, conventions that agents read automatically before every response

    Proven Workflows (Skills): Battle-tested TDD, debugging, code review patterns agents MUST follow

    Specialization (Sub-Agents): 114+ domain experts working in parallel, not one generalist

    Planning Systems (ExecPlans): Self-contained living docs for complex features

    Source: How I Shipped 100k LOC in 2 Weeks with Coding Agents | Blog

    I think we're very much in the early stages of developing patterns, practices, and approaches to working with agentic systems. I think, too, that different systems will likely have at least somewhat different approaches that tend to get the best from them. In the meantime, I'm finding it interesting to read about how various individuals and teams go about working with these systems. I hope you might find that valuable too.
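    To make the "Project Memory" component above concrete, an AGENTS.md file of the kind described might look like the sketch below. The stack and rules are invented for illustration, not taken from the article:

```markdown
# AGENTS.md

## Tech stack
- TypeScript, Node 20, pnpm workspaces

## Conventions
- Run `pnpm test` before committing; all tests must pass
- Prefer small, focused modules; avoid files over ~300 lines

## Workflow
- Write a failing test first, then the implementation
- Never commit directly to main; open a PR for review
```

    The point is that the agent reads this automatically at the start of every session, so project context survives between otherwise stateless runs.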

    Horses

    AI, LLMs

    Line graph titled "Anthropic technical Q&A" showing monthly counts of questions answered from mid-2024 to mid-2025; human-answered questions (pink line) gradually decline over time, while AI-answered questions (black line) rise sharply starting October 2024, surpassing human answers by early 2025.

    Engines, steam engines, were invented in 1700. And what followed was 200 years of steady improvement, with engines getting 20% better a decade.

    For the first 120 years of that steady improvement, horses didn’t notice at all.

    Then, between 1930 and 1950, 90% of the horses in the US disappeared.

    Progress in engines was steady. Equivalence to horses was sudden.

    Source: Horses

    A couple of years back, Mark Pesce gave a fantastic keynote at our summit using the analogy of the history of steam power to try to understand where we were at, and what was happening, when it came to large language models and generative AI. While historical analogies can be misleading, they can also be useful in helping us get some sense of a transformation. Humans are really not intuitively great at understanding exponential change. I often quote a line from Hemingway where one character asks another how he went bankrupt, and the reply is, "Two ways: gradually, then suddenly." We saw during the initial outbreak of COVID that humans really weren't great at exponential reasoning, especially when looking at logarithmic graphs.

    What this piece tries to get at is how transformations, such as the move from human and animal power to steam power that essentially drove the Industrial Revolution, take time. In the case of that transformation, it took a century or so, from the mid-18th to the mid-19th century. For a lot of that time, if the growth is exponential, there's seemingly very little apparent change. But then some tipping point is reached and everything happens at once. In the UK that was perhaps around 1820, and between 1820 and 1850 we saw an enormous increase in the productive output of Britain's industrial capability.

    So I really recommend reading this article. It's relatively short, very entertaining and engaging, and will help you develop an intuition about how the growing capability of generative AI may impact various kinds of human endeavour.

    10 Years of Let’s Encrypt Certificates – Let’s Encrypt

    HTTPS, security, TLS

    Line chart showing the growth from 2016 to 2025 of active SSL certificates (dotted orange line), fully-qualified domains (solid dark blue line), and registered domains (dashed green line), with fully-qualified domains and certificates rising sharply from 2022 onward, while registered domains grow more gradually.

    On September 14, 2015, our first publicly-trusted certificate went live. We were proud that we had issued a certificate that a significant majority of clients could accept, and had done it using automated software. Of course, in retrospect this was just the first of billions of certificates. Today, Let’s Encrypt is the largest certificate authority in the world in terms of certificates issued, the ACME protocol we helped create and standardize is integrated throughout the server ecosystem, and we’ve become a household name among system administrators. We’re closing in on protecting one billion web sites.

    Source: 10 Years of Let’s Encrypt Certificates – Let’s Encrypt

    A decade ago, very few websites in the scheme of things used HTTPS. At that stage I'd had websites for more than 20 years and had never had a secure website in that sense. Why was this the case? Well, it was typically expensive and, above all, technically really painful to provision certificates for a website. So unless you were very large or conducting commerce directly and required a secure connection, you almost certainly didn't implement it. In the last decade that's completely changed: you can now provision a certificate for a site at no cost, probably without even thinking about it. So ubiquitous are secure connections that when you occasionally visit an insecure site in a modern browser, it will provide copious warnings about that insecurity. And all this is thanks to Let's Encrypt, a project that made it much easier and, most importantly, free to enable HTTPS for any web page. So happy anniversary; if anything, I thought it had been longer.

    Building a Social Media Agent | goose

    AI, LLMs, MCP

    Screenshot of a Bluesky social media post by user "goose @opensource.block.xyz" with the caption "vibe code with me test" and an image reading "how i used goose to migrate my codebase" on a black background with a white megaphone and colorful button.

    The Game Plan

    Here’s what we’re building: two MCP servers that work together to handle all our social media promotion automatically.

    MCP Server : Content Fetcher
    This one goes out and grabs all our content from:

    • YouTube videos
    • Blog posts
    • GitHub release notes

    Then it compares everything to a last_seen.json file to figure out what’s actually new. If nothing is new it proceeds to check an evergreen.json file and randomly pick old content to socialize.

    MCP Server : Sprout Social Integration
    Once we have new content, this server takes over and:

    • Generates captions for each platform
    • Uploads media (videos, images, or just links)
    • Creates draft posts in Sprout Social

    The goal? Wake up to social posts ready to go, without lifting a finger. Well, almost, more on that later.

    Source: Building a Social Media Agent | goose

    If, like me, you find the best way to learn something is to build it, then this tutorial from Ebony Louis at Block might be the best way for you to get up to speed with building your own MCP server.
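    The Content Fetcher's freshness logic described above (compare fetched items against last_seen.json, fall back to random evergreen content) can be sketched in a few lines of Python. This is an illustrative reconstruction under my own assumptions, not goose's actual implementation; the file names follow the article, but the function name and item format are invented:

```python
import json
import random

def pick_content_to_post(fetched, last_seen_path="last_seen.json",
                         evergreen_path="evergreen.json"):
    """Return item IDs to promote: anything new since the last run,
    or, if nothing is new, one randomly chosen evergreen item."""
    # Load the set of items we have already socialized.
    try:
        with open(last_seen_path) as f:
            last_seen = set(json.load(f))
    except FileNotFoundError:
        last_seen = set()

    new_items = [item for item in fetched if item not in last_seen]
    if new_items:
        # Record the newly seen items so the next run skips them.
        with open(last_seen_path, "w") as f:
            json.dump(sorted(last_seen | set(new_items)), f)
        return new_items

    # Nothing new: fall back to a random piece of evergreen content.
    with open(evergreen_path) as f:
        evergreen = json.load(f)
    return [random.choice(evergreen)] if evergreen else []
```

    The second MCP server would then take whatever this returns and draft the per-platform captions and posts in Sprout Social.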