Experiences working with Claude Code
For about two months now, I have been using Claude Code here and there, both at work and privately.
The background? About 25 years of work experience, first as a consultant and later in product development. Most of my career I have been working with C#, HTML, CSS, JavaScript and TypeScript. Our product is now almost 8 years old, and for many things that happen in the system there is prior art.
First assignment
The first assignment I gave Claude, to see how it fares, was an Excel importer for our product’s new capability of importing users who are authenticated through our own IdP rather than Entra. The task: add a new command to our admin console that combines building blocks we already have in place, so that we can create users from a customer-provided Excel file that follows a template.
Whatever Claude generates gets a full code review as well as testing before I commit the output.
This first usage went fairly well. One thing I am happy about is that the tool can take away certain chores that I like thinking about but, at 50, don’t particularly look forward to coding out.
I can ask the tool to perform refactorings like “extract the part where we provide a row object for each line into its own class named ExcelFileReader”.
The truth is, the initial code is almost always not up to the standard of quality I expect in our codebase. It usually does what you wanted it to do, but it is up to me to bring it into the right shape. This would be annoying if I had to do it all by myself, but with a combination of JetBrains Rider/ReSharper and telling Claude which refactoring to perform, I get where I want. The refactoring tooling in the TypeScript part of the codebase is not as strong, and there Claude can take over such tasks.
Side note on speed of working
Am I faster now, having this tool? I don’t know, probably not. But that to me is somewhat beside the point. Claude can be quite slow, and for very small tasks I may end up just doing them myself, as the shuffling of tokens feels wholly inefficient in that situation.
However, the tool may lead me to tackle things that I deem useful but am too lazy to start.
What is the point then?
In my position, I actually end up reviewing far more code than I write myself. I am involved in a number of activities around the software lifecycle, and coding itself is often only a small part of it. So, when I tell Claude what to do, the end result exercises pretty much the same muscles I am already using: code review and giving feedback on what to improve.
Intermezzo: Many customers want access to LLM tools
Truth is: customers ask for AI tools. Our product is what in another era was called an intranet. Many customers have tried ChatGPT privately or at work and are often satisfied with the results.
How can this be?
Those people often work with facts well known to them (intra-company facts), so fact-checking is trivial. An LLM can, for example, help people who aren’t used to writing craft a decent article in the tone they’re aiming for.
A proof of concept on how we can leverage LLMs in our product
On this POC I use Claude regularly, especially for the stuff I don’t care about but that is necessary for a nice POC, while I focus on the parts I do care about.
One comical mistake I made right at the start of the POC was asking for too much in one go, something like:

> Set up an Aspire project containing a minimal-API ASP.NET project, a React SPA frontend with Vite, and a YARP-based BFF layer on top of the two.
This completely overwhelmed Claude and at some point it even forgot that the whole solution was meant to be set up in Aspire, deleted all Aspire dependencies and wanted to start orchestrating the processes with custom-written code. Hilarious!
Now I know better. The tasks need to stay crisp and clear and focus on specific outcomes. Makes absolute sense, right? I find Claude pretty bad at “architecting”. And you know what? That’s fine, that’s my job after all.
Also, some hard coding work, e.g. a nested async iterator in the frontend that streams tokens into one message and starts a new message when a function call request comes in, I need to do myself. Also fine; I mean, I’m paid for the hard stuff, right?
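To give a flavor of the pattern, here is a minimal sketch, not the actual POC code; the event shape and the helper names in the usage comment are assumptions:

```typescript
// Assumed shape of the stream coming from the BFF: either a token
// or a function call request that ends the current message.
type StreamEvent =
  | { type: "token"; text: string }
  | { type: "function_call"; name: string };

// Outer generator: yields one async iterable of tokens per assistant message.
async function* messages(events: AsyncIterable<StreamEvent>) {
  const it = events[Symbol.asyncIterator]();
  let res = await it.next();
  while (!res.done) {
    // Inner generator: streams tokens until the message boundary.
    async function* tokens() {
      while (!res.done && res.value.type === "token") {
        yield res.value.text;
        res = await it.next();
      }
    }
    yield tokens();
    if (!res.done && res.value.type === "function_call") {
      res = await it.next(); // consume the boundary event
    }
  }
}

// Usage: one chat bubble per message. The consumer must drain each inner
// iterator before asking for the next message.
// for await (const message of messages(eventStream)) {
//   const bubble = startNewBubble(); // hypothetical UI helper
//   for await (const token of message) bubble.append(token);
// }
```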
Writing and using MCP tools
In between, I also wrote two MCP tools that:
- Start a work item in Azure DevOps and use the work item’s content as task input.
- Start a PR on a given branch and work item, describing what changed.
This is the sort of thing where an LLM can lower the barrier to entry. Looking up the necessary queries, APIs and dependencies to talk to Azure DevOps myself? Not in this lifetime, I don’t think so. Meanwhile, Claude was able to set this part up for me. I fully expect a connoisseur of the Azure DevOps API to look at the code and say things like “this can be simplified, there is a better API for that”, but I understand the code, and the general quality is sufficient for its intended use and maintainability.
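For illustration, here is roughly what the skeleton of the first tool could look like with the official TypeScript MCP SDK; the tool name, the organization and project placeholders, and the PAT handling are assumptions, not our actual code:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "azdo-tools", version: "0.1.0" });

// Hypothetical tool: fetch a work item and hand its content back as task input.
server.tool(
  "start_work_item",
  { id: z.number().describe("Azure DevOps work item id") },
  async ({ id }) => {
    const pat = process.env.AZDO_PAT!; // assumed: PAT provided via the environment
    const res = await fetch(
      `https://dev.azure.com/my-org/my-project/_apis/wit/workitems/${id}?api-version=7.1`,
      { headers: { Authorization: `Basic ${Buffer.from(":" + pat).toString("base64")}` } }
    );
    const item = await res.json();
    return {
      content: [{
        type: "text",
        text: `Work item #${id}: ${item.fields["System.Title"]}\n\n` +
              (item.fields["System.Description"] ?? ""),
      }],
    };
  }
);

await server.connect(new StdioServerTransport());
```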
I would like to mention the MCP tool Context7, which has helped me perform certain code changes based on current documentation instead of whatever the model in use was trained on.
What else have I used Claude for?
- Code extraction refactorings
- Tasks with prior art I can point to. A well-factored codebase is beneficial to good outcomes. “More of the same” works pretty well.
- Writing certain tests of high complexity. Yes, some of our tests are ugly and complex, e.g. around sign-in. And yes, sometimes I code first and perform the testing steps afterwards; it happens. Claude can help me write tests, especially when I say what to test for.
- Finding things again when I know there is prior art. “Code that fits in your head” is a good mantra, but the totality of ahead’s code does not fit in anyone’s head at once, so sometimes the question is where we did a certain thing before.
What is crap?
I already mentioned some limitations. What is quite annoying is that, in order to stay sharp, it makes sense to reset the context every so often. Then Claude has to spend time, again and again, “relearning” what it needs to know.
What is also often quite sad is that the technology is simply unable to learn from its mistakes. When it systematically fails at a specific problem, you can only hope that it surfaces to the developers at Anthropic at some point; you yourself can’t do much about it. One could try to place a “reminder” in the claude.md file, but, well, that is most certainly a big limitation of this technology.
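Such a reminder is nothing more than an instruction in that file, which Claude Code pulls into context at the start of a session. A made-up example:

```markdown
<!-- claude.md at the repository root -->
- The solution is orchestrated with Aspire; never remove the Aspire wiring.
- Excel parsing goes through the existing ExcelFileReader; do not add new dependencies for it.
```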
In conclusion
There is an unfortunate tension: while LLM tools are useful, that usefulness is dwarfed by the investments big tech has undertaken. The limitations are clear by now, and once more, LLMs are no silver bullet. Why some in big tech keep pretending otherwise is probably dictated more by the high financial stakes involved than by the actual outcomes of using the technology.
From my perspective, the monthly subscription cost seems appropriate. I can use the tech when
- I have clear directions to give
- There is prior art
- The code needs to do something concrete rather than mainly satisfy maintainability or other non-functional requirements
I don’t get good results when
- The directives are too vague
- The thing the code needs to do is non-trivial