Experiences working with Claude Code
For about two months now, I have been using Claude Code here and there, both at work and privately.
The background? About 25 years of work experience, first as a consultant and later in product development. Most of my career I have been working with C#, HTML, CSS, JavaScript and TypeScript. Our product is now almost 8 years old, and for many things that happen in the system there is prior art.
First assignment
The first assignment I gave Claude, to see how it fares, was an Excel importer for our product’s new capability of importing users who are authenticated through our own IdP rather than Entra. The task: add a new command to our admin console that combines building blocks we already have in place, so that we can create users from a customer-provided Excel file that follows a template.
Whatever Claude generates gets a full code review as well as testing before I commit the output.
This first usage went fairly well. One thing I am happy about is that the tool can take away certain chores that I like thinking about but, at 50, don’t particularly look forward to coding out.
I can ask the tool to perform refactorings like “extract the part where we provide a row object for each line into its own class named ExcelFileReader”.
The truth is, the initial code is almost always not up to the standard of quality I expect in our codebase. It usually does what you wanted it to do, but it is up to me to bring it into the right shape. This would be annoying if I had to do it all by myself, but with a combination of JetBrains Rider/ReSharper and telling Claude which refactoring to perform, I get where I want. The refactoring tooling in the TypeScript part of the codebase is not as strong, and there Claude can take over such tasks.
Side note on speed of working
Am I faster now, having this tool? I don’t know, probably not. But that to me is somewhat beside the point. Claude can be quite slow, and for very small tasks I may end up just doing them myself, as the shuffling of tokens feels wholly inefficient in that situation.
However, the tool may lead me to tackle things that I deem useful but am too lazy to start.
What is the point then?
In my position, I actually end up reviewing far more code than I write myself. I am involved in a number of activities around the software lifecycle, and coding itself is often only a small part of it. So, when I tell Claude what to do, the end result exercises pretty much the same muscles I am already using: code review and giving feedback on what to improve.
Intermezzo: Many customers want access to LLM tools
Truth is: customers ask for AI tools. Our product is what in another era was called an intranet. Many customers have tried ChatGPT privately or at work and are often satisfied with the results.
How can this be?
Those people often work with facts well known to them (intra-company facts), so fact-checking is trivial. An LLM can, for example, help people who aren’t used to writing craft a decent article in the tone they’re aiming for.
A proof of concept on how we can leverage LLMs in our product
On this POC I use Claude regularly, especially for the stuff I don’t care about but that is necessary for a nice POC, while I focus on the parts I do care about.
One comical mistake I made right at the start of the POC was asking for too much in one go, something like:

> Set up an Aspire project containing a minimal-API ASP.NET project, a React SPA frontend with Vite, and a YARP-based BFF layer on top of the two.
This completely overwhelmed Claude and at some point it even forgot that the whole solution was meant to be set up in Aspire, deleted all Aspire dependencies and wanted to start orchestrating the processes with custom-written code. Hilarious!
Now I know better. The tasks need to stay crisp and clear and focus on specific outcomes. Makes absolute sense, right? I find Claude pretty bad at “architecting”. And you know what? That’s fine, that’s my job after all.
Also, some hard coding work, e.g. a nested async iterator in the frontend that streams tokens into one message and starts a new message when a function call request comes in, I need to do myself. Also fine; I mean, I’m paid for the hard stuff, right?
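To give a flavor of the pattern, here is a minimal sketch, not the actual POC code; the event shape and the helper names in the usage comment are assumptions:

```typescript
// Assumed shape of the stream coming from the BFF: either a token
// or a function call request that ends the current message.
type StreamEvent =
  | { type: "token"; text: string }
  | { type: "function_call"; name: string };

// Outer generator: yields one async iterable of tokens per assistant message.
async function* messages(events: AsyncIterable<StreamEvent>) {
  const it = events[Symbol.asyncIterator]();
  let res = await it.next();
  while (!res.done) {
    // Inner generator: streams tokens until the message boundary.
    async function* tokens() {
      while (!res.done && res.value.type === "token") {
        yield res.value.text;
        res = await it.next();
      }
    }
    yield tokens();
    if (!res.done && res.value.type === "function_call") {
      res = await it.next(); // consume the boundary event
    }
  }
}

// Usage: one chat bubble per message. The consumer must drain each inner
// iterator before asking for the next message.
// for await (const message of messages(eventStream)) {
//   const bubble = startNewBubble(); // hypothetical UI helper
//   for await (const token of message) bubble.append(token);
// }
```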
Writing and using MCP tools
In between, I also wrote two MCP tools that:
- Start a work item in Azure DevOps and use the work item’s content as task input.
- Start a PR on a given branch and work item, describing what changed.
This is the sort of thing where an LLM can lower the barrier to entry. Looking up the necessary queries, APIs and dependencies to talk to Azure DevOps myself? Not in this lifetime, I don’t think so. Meanwhile, Claude was able to set this part up for me. I fully expect a connoisseur of the Azure DevOps API to look at the code and say things like “this can be simplified, there is a better API for that”, but I understand the code, and the general quality is sufficient for its intended use and maintainability.
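For illustration, here is roughly what the skeleton of the first tool could look like with the official TypeScript MCP SDK; the tool name, the organization and project placeholders, and the PAT handling are assumptions, not our actual code:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "azdo-tools", version: "0.1.0" });

// Hypothetical tool: fetch a work item and hand its content back as task input.
server.tool(
  "start_work_item",
  { id: z.number().describe("Azure DevOps work item id") },
  async ({ id }) => {
    const pat = process.env.AZDO_PAT!; // assumed: PAT provided via the environment
    const res = await fetch(
      `https://dev.azure.com/my-org/my-project/_apis/wit/workitems/${id}?api-version=7.1`,
      { headers: { Authorization: `Basic ${Buffer.from(":" + pat).toString("base64")}` } }
    );
    const item = await res.json();
    return {
      content: [{
        type: "text",
        text: `Work item #${id}: ${item.fields["System.Title"]}\n\n` +
              (item.fields["System.Description"] ?? ""),
      }],
    };
  }
);

await server.connect(new StdioServerTransport());
```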
I would like to mention the MCP tool Context7, which has helped me perform certain code changes based on current documentation instead of whatever the model in use was trained on.
What else have I used Claude for?
- Code extraction refactorings
- Tasks with prior art I can point to. A well-factored codebase is beneficial to good outcomes. “More of the same” works pretty well.
- Writing certain tests of high complexity. Yes, some of our tests are ugly and complex, e.g. around sign-in. And yes, sometimes I code first and perform the testing steps afterwards; it happens. Claude can help me write tests, especially when I say what to test for.
- Finding things again when I know there is prior art. “Code that fits in your head” is a good mantra, but the totality of ahead’s code does not fit in anyone’s head at once, so sometimes the question is where we did a certain thing before.
What is crap?
I already mentioned some limitations. What is quite annoying is that, in order to stay sharp, it makes sense to reset the context every so often. Then Claude has to spend time, again and again, “relearning” what it needs to know.
What is also often quite sad is that the technology is simply unable to learn from its mistakes. When it systematically fails at a specific problem, you can only hope that it surfaces to the developers at Anthropic at some point; you yourself can’t do much about it. One could try to place a “reminder” in the claude.md file, but, well, that is most certainly a big limitation of this technology.
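Such a reminder is nothing more than an instruction in that file, which Claude Code pulls into context at the start of a session. A made-up example:

```markdown
<!-- claude.md at the repository root -->
- The solution is orchestrated with Aspire; never remove the Aspire wiring.
- Excel parsing goes through the existing ExcelFileReader; do not add new dependencies for it.
```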
In conclusion
There is an unfortunate tension: while LLM tools are useful, that usefulness is dwarfed by the investments big tech has undertaken. The limitations are clear by now, and once more, LLMs are no silver bullet. Why some in big tech keep pretending otherwise is probably dictated more by the high financial stakes involved than by the actual outcomes of using the technology.
From my perspective, the monthly subscription cost seems appropriate. I can use the tech when
- I have clear directions to give
- There is prior art
- The code needs to do something concrete rather than mainly satisfy maintainability or other non-functional requirements
I don’t get good results when
- The directives are too vague
- The thing the code needs to do is non-trivial