I wonder how much it's changing the learning curve vs just making the experience more comfortable.
>For someone who spent a decade as a “Ruby developer,” becoming a multi-language developer in less than a year feels revolutionary.
Revolutionary? They've snitched on themselves: they have no frame of reference to make that claim. It would have taken "less than a year" with or without AI. They just spent 10 years not trying.
Everyone's first language learning experience is also learning to program. Learning a new language once you have years of professional programming practice is completely different.
Same here. Reading the article, I could not really relate to the experience of being a single-language developer for 10 years.
In my early days, I identified strongly with my chosen programming language, but people way more experienced than me taught me that a programming language is a tool, and that this approach is akin to saying "well, I don't know about those pliers, I am a hammerer."
My personal feeling from working across a wide range of programming languages is that it expands your horizons in a massive way that is hard to describe qualitatively, and I'm happy that I did this.
The idiosyncrasies of Ruby, like Perl and JavaScript, lead to a certain kind of brain damage that makes it difficult to build correct mental models of computing that can then generalize to other languages.
Unless you’re writing instructions for a Turing machine, the impedance mismatch between the real world and “computing” is always going to have idiosyncrasies. You don’t have to like a language to understand its design goals and trade-offs. There are some very popular languages with constraints or designs that I feel are absurd, redundant, or counterproductive, but I cannot think of a (mainstream) language where I haven’t seen someone much smarter than me do amazing things.
The language I consider the lamest, biggest impediment to learning computer science is used by some of the smartest people on the planet to build amazing things.
What you may have missed, from the perspective of your vertically scaled horse, is that you compare learning certain models to a mental disability. It makes calling my comment racist similar to the whole pot/kettle thing.
However, I do appreciate reading about such opinions because it offers a peek into the elitism that surrounds programming languages.
Also, as a person from a non-traditional and non-privileged background, I'm a little unsure about how to proceed. Shall we cut our losses and move on?
<< It would have taken "less than a year" with or without AI. They just spent 10 years not trying.
I suppose we can mark this statement as technically true. I can only attest to my experience using o4 for Python mini projects (popular, so lots of functional code to train on).
The thing I found is that without it, all the interesting little curve balls I encountered likely would have thrown a serious wrench into the process (yesterday, it was the Unraid-specific way of handling XML VM definitions). All of a sudden, I am not learning how to program, but learning how QEMU actually works, and it is a lot more seamless than having to explore it 'on my own'. And that little detour took half a day when all was said and done. There was another little detour at Docker (again, Unraid-specific issues), but all was overcome, because now I had 4o to guide me.
It is scary, because it can work and work well (even when correcting for randomness). FWIW, my first language was BASIC way back when.
Counterpoint: AI makes mainstream languages (for which a lot of data exists in the training data) even more popular, because those are the languages it knows best (i.e., has the lowest error rate in), regardless of whether they are typed or not (in fact, many are dynamic, like Python, JS, Ruby).
The end result? Non-mainstream languages don't get much easier to get into, because the average Joe isn't already proficient enough in them to catch the AI's bugs.
People often forget the bitter lesson of machine learning which plagues transformer models as well.
It’s good at matching patterns. If you can frame your problem so that it fits an existing pattern, good for you. It can show you good idiomatic code in small snippets. The more unusual and involved your problem is, the less useful it is. It cannot reason about the abstract moving parts in a way the human brain can.
>It cannot reason about the abstract moving parts in a way the human brain can.
Just found 3 race conditions in 100 lines of code. From the UTF-8 emojis in the comments I'm pretty certain it was AI generated. The "locking" was just abandoning the work if another thread had started something, the "locking" mechanism also had TOCTOU (time-of-check to time-of-use) issues, and the "locking" also didn't actually lock concurrent access to the resource that actually needed it.
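To make the TOCTOU part concrete, here is a minimal bash sketch. It is not the code from that review; the lock paths and function names are made up for illustration, and it assumes a local filesystem where mkdir is atomic. The first function checks and then creates the lock in two separate steps, so two processes can both pass the check. The second leans on mkdir, which either creates the directory or fails in a single step.

```bash
#!/usr/bin/env bash
# Hypothetical lock locations, for illustration only.
LOCK_ROOT="$(mktemp -d)"
RACY_LOCK="$LOCK_ROOT/racy.lock"
ATOMIC_LOCK="$LOCK_ROOT/atomic.lock.d"

# Racy: the existence check and the creation are two separate steps
# (time of check vs. time of use), so two processes can both "win".
acquire_racy() {
    if [[ ! -e "$RACY_LOCK" ]]; then
        # Another process can create the file right here.
        touch "$RACY_LOCK"
        return 0
    fi
    return 1
}

# Atomic: mkdir either creates the directory or fails, in a single step.
acquire_atomic() {
    mkdir "$ATOMIC_LOCK" 2>/dev/null
}

release_atomic() {
    rmdir "$ATOMIC_LOCK"
}

# Wait for the lock instead of silently abandoning the work.
until acquire_atomic; do
    sleep 0.1
done
echo "lock held, touching the shared resource"
release_atomic
```

Even the atomic version only helps if every code path that touches the shared resource takes the same lock, which was the third problem in that review.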
Yes, that was my point. Regardless of the programming language, LLMs are glorified pattern matchers. A React/Node/MongoDB address book application exposes many such patterns and they are internalised by the LLM. Even complex code like a B-tree in C++ forms a pattern because it has been done many times. Ask it to generate some hybrid form of a B-tree with specific requirements, and it will quickly get lost.
"Glorified pattern matching" does so much work for the claim that it becomes meaningless.
I've copied thousands of lines of complex code into an LLM asking it to find complex problems like race conditions and it has found them (and other unsolicited bugs) that nobody was able to find themselves.
Oh it just pattern matched against the general concept of race conditions to find them in complex code it's never seen before / it's just autocomplete, what's the big deal? At that level, humans are glorified pattern matchers too and the distinction is meaningless.
> The counter point is how LLMs can't find a missing line in a poem when they are given the original.
True, but describing a limitation of the tech can't be used to make the sort of large dismissals we see people make wrt LLMs.
The human brain has all sorts of limitations like horrible memory (super confident about wrong details) and catastrophic susceptibility to logical fallacies.
Have you not had this issue with LLMs? Because I have. Even with the latest models.
I think someone upthread was making an attempt at
> describing a limitation of the tech
but you keep swatting them down. I didn’t see their comments as a wholesale dismissal of AI. They just said they aren’t great at sufficiently complex tasks. That’s my experience as well. You’re just disagreeing on what “sufficiently” and “complex” mean, exactly.
> humans are glorified pattern matchers too and the distinction is meaningless.
I'm still convinced that this is true. The more advances we make in "AI", the more I expect we'll discover that we're not as creative and unique as we think we are.
I suspect you're right. The more I work with AI, the clearer the trajectory is.
Humans generally have a very high opinion of themselves and their supposedly unique creative skills. They are not eager to have this illusion punctured.
Whether or not we have free will is not a novel concept. I simply side on us being more deterministic than we realize, that our experiences and current hormone state shape our output drastically.
Even our memories are mutable. We will with full confidence recite memories or facts we've learned just moments ago which are entirely fictional. Normal, healthy adults.
Well, how do you verify any bug? You listen to someone's explanation of the bug and double check the code. You look at their solution pitch. Ideally you write a test that verifies the bug and again the solution.
There are false positives, and they mostly come from the LLM missing relevant context like a detail about the priors or database schema. The iterative nature of an LLM convo means you can add context as needed and ratchet into real bugs.
But the false positives involve the exact same cycle you do when you're looking for bugs yourself. You look at the haystack and you have suspicions about where the needles might be, and you verify.
Not suggesting you are doing any of that, just curious what's going on and how you are finding it useful.
> But the false positives involve the exact same cycle you do when you're looking for bugs yourself.
In my 35 years of programming I never went just "looking for bugs".
I have a bug and I track it down. That's it.
Sounds like your experience is similar to using deterministic static code analyzers but more expensive, time consuming, ambiguous and hallucinating up non-issues.
And that you didn't get a report to save and share.
Oh, I go bug hunting all the time in sensitive software. It's the basis of test synthesis as well. Which tests should you write? Maybe you could liken that to considering where the needles will be in the haystack: you have to think ahead.
It's a hard, time consuming, and meandering process to do this kind of work on a system, and it's what you might have to pay expensive consultants to do for you, but it's also how you beat an expensive bug to the punchline.
An LLM helps me run all sorts of considerations on a system that I didn't think of myself, but that process is no different than what it looks like when I verify the system myself. I have all sorts of suspicions that turn into dead ends because I can't know what problems a complex system is already hardened against.
What exactly stops two in-flight transfers from double-spending? What about when X? And when Y? And what if Z? I have these sorts of thoughts all day.
I can sense a little vinegar at the end of your comment. Presumably something here annoys you?
> It can show you good idiomatic code in small snippets.
That's not really true for things that are changing a lot. I had a terrible experience the last time I tried to use Zig, for example. The code it generated was an amalgamation of two or three different versions.
And I've even hit this same style of problem in Go, where sometimes the LLM generates a for loop in the "old style" (pre Go 1.22).
In the end, LLMs are a great tool if you know what needs to be done; otherwise they will trip you up.
It's not scaffolding if the intelligence itself is adding it. Humans can make their own diagrams and maps to help them; LLM agents need humans to scaffold for them. That's the setup for the bitter lesson.
Yes, I worry that we're in for an age of stagnation, where people are hesitant to adopt radically new languages or libraries or frameworks because the models will all be bad at them, and that disadvantage will swamp any benefit you might get from adopting an improved language/library/framework.
Alternatively every new release will have to come with an MCP for its documentation and any other aspects that might make it easier for an LLM to talk about it and use it accurately.
From what I can tell, LLMs tend to hallucinate more with minor languages than with popular ones. I'm saying this as a Scala dev. I suspect most opinions in discussions about LLM usefulness depend on the language the person uses. Maybe it's useful for JS devs.
I write primarily Scala too. The frontier models seem reasonably good at it to me. One approach I sometimes use is to ask for it to write something complicated in Python, using only the standard library, and then once I have the Python refined I ask for a translation into Scala. This trick may not be applicable to your problems, but it works for me when I find an interesting algorithm in a paper and I'd like to use it but there is no existing Java/Scala implementation.
I use AI assistance to generate code and review code, but I haven't had success trying to use it to update a substantial existing code base in Scala. I have tried using both Claude Code and Cursor and the handful of times I tried there were so many oversights and mistakes that resolving the mess was more effort than doing it manually. I'll probably try again after the next big model releases.
Current frontier models have been the least useful to me when I'm asking them to review performance-critical code inside Scala. These are bits of my code base that I have written to use a lot of mutable variables, mutable data structures, and imperative logic to avoid allocations and other performance impediments. I only use this style for functions highlighted by profiling. I can ask for a code review and even include the reasoning for the unusual style in the docstring, but Claude and Gemini have still tried to nudge me back to much slower standard Scala idioms.
Most people who work in non-mainstream languages are, to some extent, making a statement. They care more about X than mere "popularity". (Sometimes X is money, which is why I still have Anki flashcards in rotation on OCaml, InterSystems Caché and PowerShell.)
If they do want "popularity" then the counter-counter-point is that it should be easier to get than ever. Just have one proficient person write a lot of idiomatic, relatively isolatable code, and then have an AI generate terabytes upon terabytes of public domain licensed variations and combinations on that code. If you make programming in the small a breeze, people will flock to your language, and then they can discover how to program in the large with it on their own time.
I’m having a good time with Claude and Elm. The correctness seems to help a lot. I mean, it still goes wonky sometimes, but I assume that’s the case with everyone.
I'm not sure, I have a custom config format that combines a CSV schema with processing instructions that I use for bank CSVs and Claude was able to generate a perfect one for a new bank only based on one config plus CSV and the new bank's CSV.
I'm optimistic that most new programming languages will only need a few "real" programmers to write a small amount of example code for the AI training to get started.
> I'm optimistic that most new programming languages will only need a few "real" programmers to write a small amount of example code for the AI training to get started.
CSV is not a complex format.
Why do you reach this conclusion from toying with CSV?
More people who are not traditionally programmers are now writing code with AI assistance (great!) but this crowd seems unlikely to pick up Clojure, Haskell, OCaml etc... so I agree this is a development in favor of mainstream languages.
Even for small projects, the optimisation criteria are different if the human's role in the equation shifts from authoring to primarily reviewing.
Imo there's been a big disconnect between people who view code as work product and those who view it as a liability/maintenance burden. AI is going to cause an explosion in the production of code. I'm not sure it's going to have the same effect on long-term maintenance, and I don't think rewriting the whole thing with AI again is a solution.
> Otoh, if you wanted to create an entirely new programming language in 2025, you might be shit outta luck.
This just made me really sad. That effectively means that we'll plateau indefinitely as a civilization (not just on programming languages, but anything where the LLM can create an artificial Lindy effect).
goimports makes everything look the same, the compiler is a nitpicky asshole that won’t let the program even compile if there is an unused variable etc.
Interesting, my experience learning Zig was that Claude was really bad at the language itself, to the point that it wrote obvious syntax errors and I had to touch up almost everything.
Syntax and type errors get instantly picked up by the type checker and corrected, and as long as these failures stay in context, the LLM doesn’t make those mistakes again. Not something I ever have to pay attention to.
See this is what kills me about these things. They say they built this system that will build apps for you, yet they advertise it using a website that chews through my CPU and GPU. All this page does is embed a YouTube video, why is my laptop's fan going full blast? And I'm supposed to trust the code that emanates from their oracle coding agent? What are we doing here people??
I taught Digital Design this semester - all models output nonsensical VHDL. The only exception is reciting “canonical” components available in the technical and scientific literature (e.g., [1]).
By what metric? I still see vastly more Python and Typescript being generated, and hell, even more golang. I suppose we are all in our own language bubbles a bit.
Python code generated by an LLM is like a landmine; it may compile, but there could be runtime errors lurking that will only detonate when the code is executed at some undetermined point in the future.
Rust code has the property that if it compiles, it usually works. True, there are still runtime errors that can occur in Rust, but they're less likely to be due to LLM hallucinations, which would be caught at compile time.
I mean, that is true for any interpreted language. That's why we have type checkers, LSPs, tests and so on. Still not bulletproof, but also not the complete time bomb some commenters make it out to be. Hallucinations are not an issue in my day to day; stupid architecture decisions and overly defensive coding practices, those more so.
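As a small illustration of both halves of that (the landmine and the test that defuses it), here is a bash sketch with made-up helper names. The script passes a syntax-only check such as `bash -n`, and the common path runs fine; the typo in the rare branch only surfaces when that branch is actually executed.

```bash
#!/usr/bin/env bash
# Hypothetical helper names, for illustration only.
log_failure() {
    echo "FAILURE: $1" >&2
}

report() {
    local status="$1"
    if [[ "$status" == "ok" ]]; then
        echo "all good"
    else
        # Typo: the helper above is log_failure, not log_failrue.
        log_failrue "status was $status"
    fi
}

report ok    # the common path works

# The rare path only blows up when it is actually executed,
# so a test has to exercise it to catch the typo.
if ! report broken 2>/dev/null; then
    echo "caught the broken branch"
fi
```

A linter or a test that covers the failure branch finds this in seconds, which is the point: the tooling matters more than whether a human or an LLM typed the typo.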
Can you elaborate a bit here? In my experience, most code I come into contact with isn't nearly defensive enough. Is AI generated code more defensive than the median?
Right, that's why good language design is still relevant in 2025. E.g., type checking only saves you if the language design and ecosystem are amenable to type checking. If the LLM can leverage typing information to yield better results, then languages with more type annotations throughout the code and ecosystem will be able to extract more value from LLMs in the long term.
I don’t have hard data to back it up, but LLMs make writing code super easy now. If the code compiles, you’ve basically filtered out the hallucinations. That’s why writing in Python or TypeScript feels kind of pointless. Rust gives you memory safety, no garbage collector, and just overall makes more sense, way better than Go. Honestly, choosing anything other than Rust feels like a risky gamble at this point.
Rust only really makes sense in settings where you would have otherwise used C or C++, i.e. you need the best possible performance and/or you can't afford garbage collection. Otherwise just use Go, Java or C#. There is no gamble with picking any of these.
Rust is fantastic for writing HTTP servers, microservices, and desktop applications.
OpenAI uses Rust for their service development as do a lot of other big companies.
It's a lot like Python/Flask, or even a bit like Go. It's incredibly easy to author [1] and deploy, and it runs super fast with no GC spikes or tuning. Super predictable five nines.
Desktop apps sing when written in Rust. A lot of AI powered desktop apps are being written in Rust now.
If you're going to reach for Go or Java (gRPC or Jetty or something) or Python/Flask, Rust is a super viable alternative. It takes the same amount of time to author, and will likely be far more defect free since the language encourages writing in a less error prone way and checks for all kinds of errors. Google did a study on this [2,3].
[1] 99.9% of the time you never hit the borrow checker/lifetimes when writing server code as it's linear request scoped logic. You get amazing error handling syntax and ergonomics and automatic cleanup of everything. You also have powerful threading and async tools if you need your service to do work on the side, and those check for correctness.
[2] "When we've rewritten systems from Go into Rust, we've found that it takes about the same size team about the same amount of time to build it," said Bergstrom. "That is, there's no loss in productivity when moving from Go to Rust. And the interesting thing is we do see some benefits from it. So we see reduced memory usage in the services that we've moved from Go ... and we see a decreased defect rate over time in those services that have been rewritten in Rust – so increasing correctness." https://www.theregister.com/2024/03/31/rust_google_c/
If you use an LLM with C or C++, stuff like pointer arithmetic or downcasting can be tricky. The code might compile just fine, but you could run into problems at runtime. That's why Rust is the only way...
> The code might compile just fine, but you could run into problems at runtime.
Obviously, this can also happen with Rust, but certainly less so than with C or C++, I'll give you that. But how is Rust the only way when there's also Java, Go, C#, Swift, Kotlin, etc.?
Does nobody write business logic in Rust? All you ever hear is “if it compiles it works” but you can write a compiling Rust program that says “1 + 1 = 3”. Surely an LLM can still hallucinate.
I’m blown away by how good Gemini Pro 2.5 is with Rust. Claude I’ve found somewhat disappointing, although it can do focused edits okay. Haven’t tried any of the o-series models.
Can confirm, you can do some good vibe coding with JavaScript (or TypeScript) and Claude Code.
I once vibe coded a test suite for a complex OAuth token expiry issue while working on someone else's TypeScript code.
Also, I had created a custom Node.js/JavaScript BaaS platform with custom Web Components and wanted to build apps with it. I gave it the documentation as an attachment and, surprisingly, it was able to modify an existing app to add entire new features. This app had multiple pages and Claude just knew where to make the changes. I was building a kind of marketplace app. One time it implemented the review/rating feature in the wrong place and I told it "This rating feature is meant for buyers to review sellers, not for sellers to review buyers" and it fixed it exactly right.
I think my second experience (plain JavaScript) was much more impressive and was essentially frictionless. I can't remember it making a single major mistake. I think only once it forgot to add the listener to handle the click event to highlight when a star icon was clicked but it fixed it perfectly when I mentioned this. With TypeScript, it sometimes got confused; I had to help it a lot more because I was trying to mock some functions; the fact that the TypeScript source code is separate from the build code created some confusion and it was struggling to grep the codebase at times. Though I guess the code was also more complicated and spread out over more files. My JavaScript web components are intended to be low-code so it's much more succinct.
I treat it as a partner that has a "wide and shallow" initial base, but the ability to "dive deep," when I need it. Basically, I do a "shallow triage," to figure out what I want to focus on, then I ask it to "dive deep," on my chosen topic.
I haven't been using it to learn new languages, but I have been using it to learn new concepts and techniques.
Right now, I'm reading up on implementing a WebAuthn backend and passkey integration into my app. It's been instrumental. Coming along great. I hadn't had any previous experience, and it's helping me to learn.
I should note that it has given me wrong examples; notably, it assumed a deprecated dependency version, that I had to debug and figure out a fix. That was actually a good thing, as it helped me to learn the "ins and outs" a bit better.
I'm still not convinced that I'd let AI just go ahead and ship an application from scratch, without any intervention on my part. It often makes mistakes; not serious ones, but ones that would be bad, if they shipped.
I've been diving into a (new language to me) Swift codebase over the last week and AI has been incredibly helpful in answering my questions and speeding up my learning
But meaningfully contributing to a complex project without the skills? Not a chance I'd put my name on the contributions it makes. I know how many mistakes these tools make in the languages I know well - it also makes them in the ones I don't. Only now I can't review its output
Yeah, I don’t understand how people feel confident with that at all.
Whenever I dive into a programming language I don’t know, I realize the amount of stuff I need to get used to before I feel confident reviewing code in it.
Also supposedly language barriers are smaller than ever, yet WhatsApp is killing its desktop app on Windows in favor of using the Web-based version.
Am I the only one that remembers how Microsoft tried to convince everyone to adopt .Net because this way you could have teams where one member could use J#, another use Fortran.Net (or whatever the name was) and old chaps could still contribute by writing Cobol# and everything would just magically work together and you would quadruple productivity just by leveraging the untapped pool of #Intercal talent out there?
I think AI will push programming languages in the direction of stronger Hindley-Milner-style type checking. Haskell is brutally hard to learn, but with enough of a data set to learn from, it's the perfect target language for a coding agent. It's high level, can be formally verified using well-known algorithms, and a language server could easily be connected to the AI agent via some MCP interface.
IMO, Haskell is less helpful for an LLM because of its advanced language features. The LLM is reasoning about the language textually. Since Haskell is very terse, the LLM would need a very strong model of how the language works.
I think languages with more minimal features and really good compile time errors would work well with LLMs. In particular, I've heard multiple people say how good LLMs are at generating Go.
Personally, I like languages with type inference so this wouldn't be my preference.
I wish but the opposite seems to be coming - Haskell will have less support from coding AIs than mainstream languages.
I think people, who care about FP, should think about what is appealing about coding in natural language and is missing from programming in strongly typed FP languages such as Haskell and Lean. (After all, what attracted me to Haskell compared to Python was that the typechecking is relatively cheap thanks to type inference.)
I believe that natural language in coding has allure because it can express the outcome in fuzzy manner. I can "handwave" certain parts and the machine fills them out. I further believe, to make this work well with formal languages, we will need to use some kind of fuzzy logic, in which we specify the programs. (I particularly favor certain strong logics based on MTL but that aside.) Unfortunately, this line of research seems to have been pretty much abandoned in AI in favor of NNs.
I used a LSP MCP tool for a LLM and was so far a bit underwhelmed. The problem is that LSP is designed for human consumption and LLMs have different constraints.
LLMs don't use the LSP in an exploratory way to learn the API; you just give it to them as context or as an MCP tool. LLMs are really good at pattern matching and won't make type errors as long as the type structure and constructs are simple.
If they are not simple, there is no guarantee that the LLM can solve the problem or that the user can understand the result.
Is there any large formally verified project written in Haskell? The most well known ones are C (seL4 microkernel) and Coq+OCaml (CompCert verified C compiler).
Well, Haskell has GADTs, newtype wrappers and type interfaces, which can be (and often are) used to implement formal verification using metaprogramming, so I get the point he was making.
You pretty much don’t need to plug another language into Haskell to be satisfied about certain conditions if the types are designed correctly.
Those can all encode only very simplistic semantics of the code. You need either a model checker or dependent types to actually verify any kind of interesting semantics (such as "this sort function returns the numbers in sorted order", or "this monad obeys the monad laws"). GADTs, newtypes and type interfaces are not significantly more powerful than what you'd get in, say, a Java program in terms of encoding semantics into your types.
Now, I believe GHC also has support for dependent types, but the question stands: are there any major Haskell projects that actually use all of these features to formally verify their semantics? Is any part of the Haskell standard library formally verified, for example?
And yes, I do understand that type checking is a kind of formal verification, so in some sense even a C program is "formally verified", since the compiler ensures that you can't assign a float to an int. But I'm specifically asking about formal verification of higher level semantics - sorting, monad laws, proving some tree is balanced, etc.
We might see wider adoption of dependently typed languages like Agda. But limited corpus might become the limiting factor, I’m not sure how knowledge transfers as the languages get more different.
It's getting cheaper and cheaper to generate corpora by the day, and Agda has the advantage of being verifiable like Lean. So you can simulate large amounts of programs and feed these back into the model. I think this is a major reason why we're seeing remarkable improvements in formal sciences like the recent IMO golds, and yet LLMs are still struggling to generate aesthetically pleasing and consistent CSS. Imagine a high schooler who can win an IMO gold medal but can't center a div!
It seems like "generating" a corpus in that situation is more like a search process guided by prompts and more critically the type checker, rather than a straight generation process right? You need some base reality or you'll still just have garbage in, garbage out.
This is great, and I think this is the right way to use AI: treat it as a pair programming partner and learn from it. As the human learns and becomes better at both programming and the domain in question (eg. a Ruby JIT compiler), the role of the AI partner shifts: at the beginning it's explaining basic concepts and generating/validating smaller snippets of code; in later stages the conversations focus on advanced topics and the AI is used to generate larger portions of code, which now the human is more confident to review to spot bugs.
> The real breakthrough came when I stopped thinking of AI as a code generator and started treating it as a pairing partner with complementary skills.
I think this is the most important thing mentioned in the post. In order for the AI to actually help you with languages you don't know you have to question its solutions. I have noticed that asking questions like why are we doing it like this and what will happen in the x,y,z scenario, really helps.
My experience is that each question I ask or point I make produces an answer that validates my thinking. After two or three iterations in a row in this style I end up distrusting everything.
This is a good point. Lately I have been experimenting with phrasing the question in a way that it makes it believe that I prefer what I am suggesting, while the truth is that I don't.
For example:
- I implement something.
- Then I ask it to review it and suggest alternatives. Where it will likely say my solution is the best.
- Then I say something like "Isn't the other approach better for __reason__ ?". Where the approach might not even be something it suggested.
And it seems that sometimes it gives me some valid points.
This is very true. Constant insecurity for me. One thing that helps a little is asking it to search for sources to back up what it's saying. But Claude has hallucinated those as well. Perplexity seems to be good at being true to sources, but idk how good it is at coding itself.
yes, this. biggest problem and danger in my daily work with llms. my entire working method with them is shaped around this problem. instead of asking it to give me answers or solutions, i give it a line of thought or logical chain, and then ask it to continue down the path and force it to keep explaining the reasoning while i interject, continuing to introduce uncertainty. suspicion is one of the most valuable things i need to make any progress. in the end it's a lot of work and very much reading and reasoning.
In addition, I frequently tell it to ask clarifying questions. Those often reveal gaps in understanding or just plain misunderstanding that you can then nip in the bud before it has generated a million tokens.
I wanted to test Gemini's code generation so I asked it for a bash script iterating through an array of directory names and executing a command for each one.
It got it wrong. The command was generated outside of the for loop and never updated inside it, effectively making the script useless.
Luckily I know bash so I spotted it immediately. But I wonder how it's "removing programming language barriers" when one that does not know its output language can not spot even such a glaring issue.
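The script Gemini actually produced isn't shown, so the following is only a guess at the shape of the bug being described: a command string built once before the loop, which therefore never picks up the current directory. The directory names and the echo command are placeholders, not anything from the original exchange.

```shell
#!/usr/bin/env bash
# Hypothetical reconstruction; the array contents and the command are made up.
dirs=("alpha" "beta" "gamma")

# Buggy shape: $dir is still empty when cmd is built, so every
# iteration runs the exact same string.
buggy() {
    local dir cmd
    cmd="echo processing ${dir}"
    for dir in "${dirs[@]}"; do
        $cmd
    done
}

# Fixed shape: the command is expanded inside the loop, once per element.
fixed() {
    local dir
    for dir in "${dirs[@]}"; do
        echo "processing ${dir}"
    done
}
```

Calling buggy prints the same line three times with no directory name in it, while fixed prints one line per directory. Someone who doesn't read bash would have a hard time telling the two apart by inspection.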
I’ve done similar tools with Go and haven’t had any issues.
One goes through a tree of dirs, finds zip files with images, unzips, benchmarks which image format is the most compact and creates a new zip with that.
I think this is more of a criticism of bash than of Gemini.
Also, I out-of-hand reject any criticism of an AI that specifies only the brand ("ChatGPT") and not the specific model version, settings, etc...
So, for example, Gemini 2.5 Pro (Jan 2025) with temperature 0.15 prompted with: "Write a short, idiomatic bash script that takes a file containing a list of directory names, changes to each one, runs a command, and then returns to the original working directory."
Spits out this:
#!/bin/bash
# A script to run a command in a list of directories.
# Exit if the wrong number of arguments are provided.
if [[ $# -lt 2 ]]; then
    echo "Usage: $0 <file_with_dirs> <command> [args...]" >&2
    exit 1
fi
DIR_FILE="$1"
shift # The rest of the arguments ("$@") are now the command.
# Ensure the directory list file exists and is readable.
if [[ ! -r "$DIR_FILE" ]]; then
    echo "Error: Cannot read file '$DIR_FILE'" >&2
    exit 1
fi
# Read the file line by line.
while IFS= read -r dir; do
    # Skip empty lines or non-existent directories.
    if [[ -n "$dir" && -d "$dir" ]]; then
        echo "==> Entering '$dir'..."
        # Run command in a subshell to keep the 'cd' local.
        # '&&' ensures the command only runs if 'cd' succeeds.
        (cd "$dir" && "$@")
        echo "<== Finished with '$dir'."
    else
        echo "--> Skipping invalid directory: '$dir'" >&2
    fi
done < "$DIR_FILE"
echo "All done."
That worked for me, but I was testing it in WSL 2 where I got a gibberish error... which was because I edited the file in Windows Notepad and the line endings were confusing bash. Gemini helpfully told me how to fix that too!
Something that I found amusing, and again, is a criticism of bash instead of the AI, is that this fails to process the last line if it isn't terminated with a \n character.
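Both problems mentioned here (Windows line endings and a missing final newline) can be handled in the read loop itself. A minimal sketch, assuming a helper I'm calling read_dirs (a made-up name); this is a common bash idiom, not what Gemini suggested:

```shell
#!/usr/bin/env bash
# read_dirs FILE: print each non-empty line of FILE, tolerating CRLF line
# endings and a last line with no trailing newline. (Hypothetical helper.)
read_dirs() {
    local line
    # read returns non-zero at EOF even when it has filled $line with a
    # partial last line, so the extra -n test keeps that line from being lost.
    while IFS= read -r line || [[ -n "$line" ]]; do
        line=${line%$'\r'}   # strip a trailing carriage return, if any
        [[ -n "$line" ]] && printf '%s\n' "$line"
    done < "$1"
    return 0
}
```

The same `|| [[ -n "$dir" ]]` addition could be dropped into the while line of the script above without changing anything else.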
PS: This is almost a one-liner in PowerShell, and works with or without the final terminator character:
Yes, well... are you "anyone", or an IT professional? Are you using the computer like my mother, or like someone that knows how LLMs work?
This is a very substantial difference. There's just no way "anyone" is going to get useful code out of LLMs as they are now, in most circumstances.
However, I've seen IT professionals (not necessarily developers!) get a lot of utility out of them, but only after switching to specific models in "API playgrounds" or some similarly controlled environment.
> Yes, well... are you "anyone", or an IT professional? Are you using the computer like my mother, or like someone that knows how LLMs work?
I have more than 15 years of programming experience. I do not trust the output of LLMs a single bit. This just proved my point. I honestly don't care if I used the "wrong" model or the "wrong" query, which was already quite descriptive of what I wanted anyway.
No need to get super defensive, you can keep spending your time playing code golf with Gemini if you want. My experience just corroborates what I already thought; code generation is imprecise and error prone.
AI tools make it so much easier to shift gears between two or more languages. Before this year, it would take me at least a week to adjust going from Python to Rust to TS. Now, AI will just fill in the gaps and I know enough to recognize poor AI patterns.
LLMs learn and apply patterns. You can always give some source code examples and language docs as context and it will apply those adapted patterns to the new language.
Context windows are pretty large (Gemini 2.5 Pro has the largest, at 1 million tokens, roughly 750k words), so it does not really matter.
What about the part of programming and software development that relies on programmatic/systemic thinking? How much is the language syntax itself part of any 'program' solution?
>We can start contributing meaningfully from day one, learning what we need as we go.
Can you though? Or is it just not bad enough for your coworkers to bother telling you how bad it is?
I use AIs daily. But that doesn't mean I don't get mad when I'm reviewing a coworker's work and have to fight whatever bullshit an AI convinced them. I can't just brush it off as AI nonsense because 1) it might be their honest attempt at work without AI and 2) if it is AI they've already proven they don't know how to improve it.
Take a look at this Haskell program that an LLM wrote. I do not write Haskell, but I can review the code just fine and say that it is doing what I want.
-- Simple multiplication function
multiply :: Num a => a -> a -> a
multiply x y = x * y
-- Main function for running the program
main :: IO ()
main = do
  putStrLn "Enter the first number:"
  input1 <- getLine
  putStrLn "Enter the second number:"
  input2 <- getLine
  let num1 = read input1 :: Double
  let num2 = read input2 :: Double
  let result = multiply num1 num2
  putStrLn $ "Result: " ++ show num1 ++ " * " ++ show num2 ++ " = " ++ show result
If I had to write this by myself, it would have taken at least 20 minutes. First I would have to learn how the main function is set up, how type definitions work, what putStrLn is, how to get input, how to define a multiply function, etc.
I've noticed this at work where I use Python frameworks like Flask/FastAPI/Django and Go, which has the standard library handlers but within that people are much less likely to follow specific patterns and where there are various composable bits as add ons.
If you ask an LLM to generate a Go handler for a REST endpoint, it often does something a bit out of step with the rest of the code base. If I do it in Python, it's more idiomatic.
Agree. My team and I were just discussing that the biggest productivity unlock from AI in the dev workflow is that it enables people to more easily break out of their box. If you're an expert backend developer, you may not see huge lift when you write backend code. But when you need to do work on infrastructure or front-end, you can now much more easily unblock yourself. This unlocks a lot of productivity, and frankly, makes the work a lot more enjoyable.
I wonder, are some programming languages more suitable for AI coding agents (or rather, LLMs) than others? For example, are syntax-heavy languages at a disadvantage? Is being verbose a good thing or a bad thing?
P.S. Maybe we will finally see M-expressions for Lisp developed some day? :)
true. doing pair programming with AI for the last 10 months, I got my skills from zero to sufficient proficiency (not expert yet) in a totally new language: Swift. the entry barrier is much lower now. researching advanced topics is much faster. typing code (unit tests, etc.) is much faster. code review is automated. it does indeed make the barrier for new languages and tools lower.
I don't think I've ever seen an experienced software engineer struggling to adapt to a new language.
I have worked in many, many languages in the past and I've always found it incredibly easy to switch, to the point where you're able to contribute right away and be efficient after a few hours.
I recently had to do some updates on a Kotlin project, having never used it (and not used Java in a few years either), and there was absolutely no barrier.
They don't struggle to write code, but they struggle to write idiomatic code.
An experienced programmer from a different language introduced to another will write lots of code that works, but in a style idiomatic to their favoured language, be that C, C++, rust, python, etc.
Every language has its quirks, and mastery is less about being able to write a for loop or any given algorithm in any given language, but more about knowing which things you should write, and which things you should be using the standard libraries for.
I've literally seen C# consultants waste time writing their favourite .NET LINQ methods into a javascript library, so they can .Select(), .Where(), etc, rather than using .filter, .map, etc.
Likewise I've seen people coming from C struggle to be as productive as they ought to be in C#, because they'd rather write a bunch of static methods with loops than get to grips with LINQ.
Fully understanding a runtime (or compiler behaviour for AOT languages) and what is or isn't in standard libraries isn't something that can be mastered in a few hours.
We learn natural languages by listening and trying things to see what responses we get. Some people have tried to learn programming the same way too. They'd just randomly try stuff, see if it compiles then see if it gives what they were expecting when they run it. I've seen it with my own eyes. These are the worst programmers in existence.
I fear that this LLM stuff is turning this up to 11. Now you're not even just doing trial and error with the compiler, it's trial and error with the LLM and you don't even understand what it outputs. Writing C or assembly without fully reasoning about what's going on is going to be a really bad time... No, the LLM does not have a working model of computer memory, it's a language model, that's it.
This is why the most seasoned of engineers will be employed back to clean up the mess that these AI agents and vibe-coders have created.
I suggest that the author properly read up on the technical details of these compiled languages before becoming fully dependent on an AI bot which, by his own admission, can lead both him and the chatbot in the wrong direction.
These languages all have different semantics and differ completely from one another, especially compiled languages like C/C++ and Rust versus Ruby and JavaScript (yuck).
I’ve been enjoying doing a bunch of assembly language programming, something I previously never had the experience, the capability, or the time to learn to competence.
I was thinking the same the other day. No need for high-level languages anymore. AI, assuming it gets better and replaces human coders, has eliminated the labour constraint. The death of Moore's law will no longer be a problem as performance gains are realised in software. The days of bloated Electron apps are finally behind us.
https://m.youtube.com/watch?v=d696t3yALAY
What you may have missed, from the perspective of your vertically scaled horse, is that you compare learning certain models to a mental disability. It makes calling my comment racist similar to the whole pot/kettle thing.
However, I do appreciate reading about such opinions because it offers a peek into the elitism that surrounds programming languages.
Also, as a person from a non-traditional and non-privileged background, I'm a little unsure about how to proceed. Shall we cut our losses and move on?
I suppose we can mark this statement as technically true. I can only attest to my experience using o4 for python mini projects ( popular, so lots of functional code to train on ).
The thing I found is that without it, all the interesting little curve balls I encountered likely would have thrown a serious wrench into the process ( yesterday, it was the unraid-specific way of handling xml for VMs ). All of a sudden, I am not learning how to program, but learning how qemu actually works, and it is a lot more seamless than having to explore it 'on my own'. And that little detour took half a day when all was said and done. There was another little detour at docker ( again unraid-specific issues ), but all was overcome, because now I had 4o guide me.
It is scary, because it can work and work well ( even when correcting for randomness). FWIW, my first language was basic way back when.
The end result? Non-mainstream languages don't get much easier to get into because average Joe isn't already proficient in them to catch AI's bugs.
People often forget the bitter lesson of machine learning which plagues transformer models as well.
Just found 3 race conditions in 100 lines of code. From the UTF-8 emojis in the comments I'm really certain it was AI generated. The "locking" was just abandoning the work if another thread had started something, the "locking" mechanism also had TOCTOU issues, and the "locking" also didn't actually lock concurrent access to the resource that actually needed it.
This is one of the "here be demons" type signatures of LLM code generation age, along with comments like
// define the payload
struct payload {};
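For anyone unfamiliar with the term, a TOCTOU (time-of-check to time-of-use) bug in a lock looks roughly like the first function below. The code under review isn't shown, so this is a generic sketch in bash with hypothetical function names and a lock path passed in as an argument. The second variant uses mkdir because it either creates the directory or fails in a single step.

```shell
#!/usr/bin/env bash
# Racy: another process can create the lock between the test and the touch,
# so two callers can both believe they hold it.
acquire_racy() {
    local lock=$1
    if [[ ! -e "$lock" ]]; then
        touch "$lock"
        return 0
    fi
    return 1
}

# Safer: mkdir either creates the directory or fails; there is no gap
# between checking and acting.
acquire_atomic() {
    local lock=$1
    mkdir "$lock" 2>/dev/null
}

# Release a lock taken with acquire_atomic.
release_atomic() {
    rmdir "$1"
}
```

Even the atomic version only helps if it guards the resource that is actually shared, which is the third problem described above.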
I've copied thousands of lines of complex code into an LLM asking it to find complex problems like race conditions and it has found them (and other unsolicited bugs) that nobody was able to find themselves.
Oh it just pattern matched against the general concept of race conditions to find them in complex code it's never seen before / it's just autocomplete, what's the big deal? At that level, humans are glorified pattern matchers too and the distinction is meaningless.
The counter point is how LLMs can't find a missing line in a poem when they are given the original.
PAC learning is basically existential quantification...has the same limits too.
But being a tool to find a needle is not the same as finding all needles, or even reliably finding a specific needle.
Being a general programming agent requires much more than just finding a needle.
True, but describing a limitation of the tech can't be used to make the sort of large dismissals we see people make wrt LLMs.
The human brain has all sorts of limitations like horrible memory (super confident about wrong details) and catastrophic susceptibility to logical fallacies.
Have you not had this issue with LLMs? Because I have. Even with the latest models.
I think someone upthread was making an attempt at
> describing a limitation of the tech
but you keep swatting them down. I didn’t see their comments as a wholesale dismissal of AI. They just said they aren’t great at sufficiently complex tasks. That’s my experience as well. You’re just disagreeing on what “sufficiently” and “complex” mean, exactly.
I'm still convinced that this is true. The more advances we make in "AI" the more I expect we'll discover that we're not as creative and unique as we think we are.
Humans generally have a very high opinion of themselves and their supposedly unique creative skills. They are not eager to have this illusion punctured.
Even our memories are mutable. We will with full confidence recite memories or facts we've learned just moments ago which are entirely fictional. Normal, healthy adults.
“Pattern matching” is thought of as linear but LLMs are doing something more complex, it should be appreciated as such.
[1]https://ai.vixra.org/pdf/2506.0065v1.pdf
The paper is satire, but it's a pretty funny read.
How did you evaluate this? Would be interested in seeing results.
I am specifically interested in the amount of false issues found by the LLM, and examples of those.
There are false positives, and they mostly come from the LLM missing relevant context like a detail about the priors or database schema. The iterative nature of an LLM convo means you can add context as needed and ratchet into real bugs.
But the false positives involve the exact same cycle you do when you're looking for bugs yourself. You look at the haystack and you have suspicions about where the needles might be, and you verify.
You do or you don't.
Recently we've seen many "security researchers" doing exactly this with LLMs [1]
1: https://www.theregister.com/2025/05/07/curl_ai_bug_reports/
Not suggesting you are doing any of that, just curious what's going on and how you are finding it useful.
> But the false positives involve the exact same cycle you do when you're looking for bugs yourself.
In my 35 years of programming I never went just "looking for bugs".
I have a bug and I track it down. That's it.
Sounds like your experience is similar to using deterministic static code analyzers but more expensive, time consuming, ambiguous and hallucinating up non-issues.
And that you didn't get a report to save and share.
So is it saving you any time or money yet?
It's a hard, time consuming, and meandering process to do this kind of work on a system, and it's what you might have to pay expensive consultants to do for you, but it's also how you beat an expensive bug to the punchline.
An LLM helps me run all sorts of considerations on a system that I didn't think of myself, but that process is no different than what it looks like when I verify the system myself. I have all sorts of suspicions that turn into dead ends because I can't know what problems a complex system is already hardened against.
What exactly stops two in-flight transfers from double-spending? What about when X? And when Y? And what if Z? I have these sorts of thoughts all day.
I can sense a little vinegar at the end of your comment. Presumably something here annoys you?
Thanks for your responses.
Really sorry about the vinegar, not intentional. I may have such personality disorder idk. Being blunt, not very great communication skills.
That's not really true for things that are changing a lot. I had a terrible experience the last time I tried to use Zig, for example. The code it generated was an amalgamation of two or three different versions.
And I've even got this same style of problem in golang where sometimes the LLM generates a for loop in the "old style" (pre go 1.22).
In the end LLMs are a great tool if you know what needs to be done, otherwise it will trip you up.
Things that most teams don’t do or half-ass
Alternatively every new release will have to come with an MCP for its documentation and any other aspects that might make it easier for an LLM to talk about it and use it accurately.
I use AI assistance to generate code and review code, but I haven't had success trying to use it to update a substantial existing code base in Scala. I have tried using both Claude Code and Cursor and the handful of times I tried there were so many oversights and mistakes that resolving the mess was more effort than doing it manually. I'll probably try again after the next big model releases.
Current frontier models have been the least useful to me when I'm asking them to review performance-critical code inside Scala. These are bits of my code base that I have written to use a lot of mutable variables, mutable data structures, and imperative logic to avoid allocations and other performance impediments. I only use this style for functions highlighted by profiling. I can ask for a code review and even include the reasoning for the unusual style in the docstring, but Claude and Gemini have still tried to nudge me back to much slower standard Scala idioms.
If they do want "popularity" then the counter-counter-point is that it should be easier to get than ever. Just have one proficient person write a lot of idiomatic, relatively isolatable code, and then have an AI generate terabytes upon terabytes of public domain licensed variations and combinations on that code. If you make programming in the small a breeze, people will flock to your language, and then they can discover how to program in the large with it on their own time.
I'm optimistic that most new programming languages will only need a few "real" programmers to write a small amount of example code for the AI training to get started.
CSV is not a complex format.
Why do you reach this conclusion from toying with CSV?
And why do you trust a LLM for economic planning?
When the code is done, it's not like the LLM can secretly go flip columns at random.
Even for small projects the optimisation criteria is different if the human's role in the equation shifts from authoring to primarily a reviewing based one.
But I'm not noticing that anymore, at least with Elixir. The gap has closed; Claude 4 and Gemini 2.5 both write it excellently.
Otoh, if you wanted to create an entirely new programming language in 2025, you might be shit outta luck.
This just made me really sad. That effectively means that we'll plateau indefinitely as a civilization (not just on programming languages, but anything where the LLM can create an artificial Lindy effect).
Strong typing drastically reduces hallucinations and wtf bugs that slip through code review.
So it’ll probably be the strongly typed languages that receive the proportionally greatest boost in popularity from LLM-assisted coding.
goimports makes everything look the same, the compiler is a nitpicky asshole that won’t let the program even compile if there is an unused variable etc.
That is a really big advantage in the AI era. LLMs are pretty bad at identifying what is and what isn't relevant in the context.
For developers this decision is pretty annoying, but it makes sense if you are using LLMs.
Zig changes a lot. So LLMs reference outdated data, or no data at all, and resort to making a lot of 50% confidence guesses.
With Rust OTOH Claude feels like a great teacher.
I also use coding agents with Elixir daily without issues.
some HDLs should fit the bill: VHDL, Verilog or SystemC
[1] https://docs.amd.com/r/en-US/ug901-vivado-synthesis/Flip-Flo...
Rust code has the property that if it compiles, it usually works. True there are still runtime errors that can occur in Rust, but they're less likely going to be due to LLM hallucinations, which would be caught at compile time.
Can you elaborate a bit here? In my experience, most code I come into contact with isn't nearly defensive enough. Is AI generated code more defensive than the median?
OpenAI uses Rust for their service development as do a lot of other big companies.
It's a lot like Python/Flask, or even a bit like Go. It's incredibly easy to author [1] and deploy, and it runs super fast with no GC spikes or tuning. Super predictable five nines.
Desktop apps sing when written in Rust. A lot of AI powered desktop apps are being written in Rust now.
If you're going to reach for Go or Java (gRPC or Jetty or something) or Python/Flask, Rust is a super viable alternative. It takes the same amount of time to author, and will likely be far more defect free since the language encourages writing in a less error prone way and checks for all kinds of errors. Google did a study on this [2,3].
[1] 99.9% of the time you never hit the borrow checker/lifetimes when writing server code as it's linear request scoped logic. You get amazing error handling syntax and ergonomics and automatic cleanup of everything. You also have powerful threading and async tools if you need your service to do work on the side, and those check for correctness.
[2] "When we've rewritten systems from Go into Rust, we've found that it takes about the same size team about the same amount of time to build it," said Bergstrom. "That is, there's no loss in productivity when moving from Go to Rust. And the interesting thing is we do see some benefits from it. So we see reduced memory usage in the services that we've moved from Go ... and we see a decreased defect rate over time in those services that have been rewritten in Rust – so increasing correctness." https://www.theregister.com/2024/03/31/rust_google_c/
[3] https://news.ycombinator.com/item?id=39861993
In which world is Rust fantastic for writing desktop applications? Where are the mature Rust UI frameworks?
> Desktop apps sing when written in Rust.
What does this even mean?
> A lot of AI powered desktop apps are being written in Rust now.
For example? And what do you mean by "AI powered desktop apps"?
Obviously, this can also happen with Rust, but certainly less so than with C or C++, I'll give you that. But how is Rust the only way when there's also Java, Go, C#, Swift, Kotlin, etc.?
Also, I had created a custom Node.js/JavaScript BaaS platform with custom Web Components and wanted to build apps with it. I gave it the documentation as an attachment and, surprisingly, it was able to modify an existing app to add entire new features. This app had multiple pages and Claude just knew where to make the changes. I was building a kind of marketplace app. One time it implemented the review/rating feature in the wrong place and I told it "This rating feature is meant for buyers to review sellers, not for sellers to review buyers" and it fixed it exactly right.
I think my second experience (plain JavaScript) was much more impressive and was essentially frictionless. I can't remember it making a single major mistake. I think only once it forgot to add the listener to handle the click event to highlight when a star icon was clicked but it fixed it perfectly when I mentioned this. With TypeScript, it sometimes got confused; I had to help it a lot more because I was trying to mock some functions; the fact that the TypeScript source code is separate from the build code created some confusion and it was struggling to grep the codebase at times. Though I guess the code was also more complicated and spread out over more files. My JavaScript web components are intended to be low-code so it's much more succinct.
That's how I've been using it.
I treat it as a partner that has a "wide and shallow" initial base, but the ability to "dive deep," when I need it. Basically, I do a "shallow triage," to figure out what I want to focus on, then I ask it to "dive deep," on my chosen topic.
I haven't been using it to learn new languages, but I have been using it to learn new concepts and techniques.
Right now, I'm learning up on implementing a webauthn backend and passkey integration into my app. It's been instrumental. Coming along great. I hadn't had any previous experience, and it's helping me to learn.
I should note that it has given me wrong examples; notably, it assumed a deprecated dependency version, that I had to debug and figure out a fix. That was actually a good thing, as it helped me to learn the "ins and outs" a bit better.
I'm still not convinced that I'd let AI just go ahead and ship an application from scratch, without any intervention on my part. It often makes mistakes; not serious ones, but ones that would be bad, if they shipped.
But meaningfully contributing to a complex project without the skills? Not a chance I'd put my name on the contributions it makes. I know how many mistakes these tools make in the languages I know well - it also makes them in the ones I don't. Only now I can't review its output
Whenever I dive into a programming language I don’t know, I realize the amount of stuff I need to get used to before I feel confident reviewing code in it.
Also supposedly language barriers are smaller than ever, yet WhatsApp is killing its desktop app on Windows in favor of using the Web-based version.
I think languages with more minimal features and really good compile time errors would work well with LLMs. In particular, I've heard multiple people say how good LLMs are at generating Go.
Personally, I like languages with type inference so this wouldn't be my preference.
I think people who care about FP should think about what is appealing about coding in natural language and is missing from programming in strongly typed FP languages such as Haskell and Lean. (After all, what attracted me to Haskell compared to Python was that the typechecking is relatively cheap thanks to type inference.)
I believe that natural language in coding has allure because it can express the outcome in fuzzy manner. I can "handwave" certain parts and the machine fills them out. I further believe, to make this work well with formal languages, we will need to use some kind of fuzzy logic, in which we specify the programs. (I particularly favor certain strong logics based on MTL but that aside.) Unfortunately, this line of research seems to have been pretty much abandoned in AI in favor of NNs.
LLMs don't use the LSP in an exploratory way to learn the API; you just give the API to them as context or as an MCP tool. LLMs are really good at pattern matching and won't make type errors as long as the type structure and constructs are simple.
If they are not simple, there is no guarantee that the LLM can solve the problem or that the user can understand the result.
Is there any large formally verified project written in Haskell? The most well known ones are C (seL4 microkernel) and Coq+OCaml (CompCert verified C compiler).
You pretty much don’t need to plug another language into Haskell to be satisfied about certain conditions if the types are designed correctly.
Now, I believe GHC also has support for dependent types, but the question stands: are there any major Haskell projects that actually use all of these features to formally verify their semantics? Is any part of the Haskell standard library formally verified, for example?
And yes, I do understand that type checking is a kind of formal verification, so in some sense even a C program is "formally verified", since the compiler ensures that you can't assign a float to an int. But I'm specifically asking about formal verification of higher level semantics - sorting, monad laws, proving some tree is balanced, etc.
Got it right the first go (hehe pun)
Also, I out-of-hand reject any criticism of an AI that specifies only the brand ("ChatGPT") and not the specific model version, settings, etc...
So, for example, Gemini 2.5 Pro (Jan 2025) with temperature 0.15 prompted with: "Write a short, idiomatic bash script that takes a file containing a list of directory names, changes to each one, runs a command, and then returns to the original working directory."
Spits out this:
That worked for me, but I was testing it in WSL 2 where I got a gibberish error... which was because I edited the file in Windows Notepad and the line endings were confusing bash. Gemini helpfully told me how to fix that too!
Something that I found amusing, and again, is a criticism of bash instead of the AI, is that this fails to process the last line if it isn't terminated with a \n character.
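For anyone curious about the trailing-newline and line-ending behaviour mentioned above, here is a minimal sketch. It is not the script Gemini produced (that is not reproduced here); the function names `count_plain` and `count_robust` are hypothetical, and the functions just count non-empty input lines to make the difference visible.

```shell
#!/usr/bin/env bash
# Illustrative only; not the script discussed in the thread.

# Plain `while read`: read returns non-zero at EOF, so a final line without
# a trailing newline lands in $line but the loop body never runs for it.
count_plain() {
  local line n=0
  while IFS= read -r line; do
    n=$((n + 1))
  done
  echo "$n"
}

# Common workaround: also run the body when read fails but $line is non-empty.
# Stripping a trailing \r copes with files saved with Windows (CRLF) endings.
count_robust() {
  local line n=0
  while IFS= read -r line || [[ -n "$line" ]]; do
    line=${line%$'\r'}
    if [[ -n "$line" ]]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}
```

Feeding `printf 'a\nb'` (no final newline) into `count_plain` misses the last line, while `count_robust` sees both lines.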
PS: This is almost a one-liner in PowerShell, and works with or without the final terminator character:
Gemini also helped me code-golf this down to:
I can write correct bash; Gemini in this instance could not.
> Also, I out-of-hand reject any criticism of an AI that specifies only the brand ("ChatGPT") and not the specific model version
Honestly I don't care, I opened the browser and typed my query just like anyone would.
> PS: This is almost a one-liner in PowerShell, and
Wonder how this is related to "I asked Gemini to generate a script and it was severely bugged"
Yes, well... are you "anyone", or an IT professional? Are you using the computer like my mother, or like someone that knows how LLMs work?
This is a very substantial difference. There's just no way "anyone" is going to get useful code out of LLMs as they are now, in most circumstances.
However, I've seen IT professionals (not necessarily developers!) get a lot of utility out of them, but only after switching to specific models in "API playgrounds" or some similarly controlled environment.
I have more than 15 years of programming experience. I do not trust the output of LLMs a single bit. This just proved my point. I honestly don't care if I used the "wrong" model or the "wrong" query, which was already quite descriptive of what I wanted anyway.
No need to get super defensive, you can keep spending your time playing code golf with Gemini if you want. My experience just corroborates what I already thought; code generation is imprecise and error prone.
It almost never misses on explaining how certain syntax works.
Context windows are pretty large (Gemini 2.5 Pro has the largest, at 1 million tokens, or roughly 750k words), so it does not really matter.
AI is a much better, so in some cases worse, language lawyer than humans could ever be.
The robot responds, "Can you?"
I still spend a few days studying Rust to grasp the basic things.
Can you though? Or is it just not bad enough for your coworkers to bother telling you how bad it is?
I use AIs daily. But that doesn't mean I don't get mad when I'm reviewing a coworker's work and have to fight whatever bullshit an AI convinced them of. I can't just brush it off as AI nonsense because 1) it might be their honest attempt at work without AI and 2) if it is AI they've already proven they don't know how to improve it.
Take a look at this Haskell program that an LLM wrote. I do not write Haskell, but I can review the code just fine to say that this is doing what I want.
If I had to write this by myself, it'd have taken at least 20 minutes. First I have to learn how the main function is set up, how type definitions work, what putStrLn is, how to get an input, how to define a multiple function etc etc.
It really is an NP problem, come to think of it.
AI code review helps you catch issues you've forgotten about and eliminates the repetitive work.
These tools are helping developers create quality software, not replacing them.
If you ask an LLM to generate a Go handler for a REST endpoint, it often does something a bit out of step with the rest of the code base. If I do it in Python, it's more idiomatic.
Compare it yourself: let it generate JS/Python or something it was trained a lot on, versus something more esoteric, like brainfuck.
And even in a common language, you'll hit brick walls when the LLM confuses different versions of the library you are using, or whatever.
I had issues with getting AI-generated Rust code to even compile.
It's simple: the less mainstream the language, the less exposure it has in the training set, which leads to worse output.
P.S. Maybe we will finally see M-expressions for Lisp developed some day? :)
If AI really were a multiplying factor here, I'd expect you to BE an expert.
I have worked in many, many languages in the past and I've always found it incredibly easy to switch, to the point where you're able to contribute right away and be efficient after a few hours.
I recently had to do some updates on a Kotlin project, having never used it (and not used Java in a few years either), and there was absolutely no barrier.
They don't struggle to write code, but they struggle to write idiomatic code.
An experienced programmer from one language introduced to another will write lots of code that works, but in a style idiomatic to their favoured language, be that C, C++, Rust, Python, etc.
Every language has its quirks, and mastery is less about being able to write a for loop or any given algorithm in any given language, but more about knowing which things you should write, and which things you should be using the standard libraries for.
I've literally seen C# consultants waste time writing their favourite .NET LINQ methods into a JavaScript library, so they can .Select(), .Where(), etc, rather than using .filter, .map, etc.
Likewise I've seen people coming from C struggle to be as productive as they ought to be in C#, because they'd rather write a bunch of static methods with loops than get to grips with LINQ.
Fully understanding a runtime (or compiler behaviour for AOT languages) and what is or isn't in standard libraries isn't something that can be mastered in a few hours.
Just shellcheck the hell out of it until it passes all tests.
I fear that this LLM stuff is turning this up to 11. Now you're not even just doing trial and error with the compiler, it's trial and error with the LLM and you don't even understand its output. Writing C or assembly without fully reasoning about what's going on is going to be a really bad time... No, the LLM does not have a working model of computer memory, it's a language model, that's it.
I suggest that the author should properly read up on the technicals of these compiled languages before becoming fully dependent on an AI bot which, by his own admission, can lead him and the chatbot in the wrong direction.
Each of these languages has different semantics, and they differ completely from one another, especially compiled languages like C/C++ and Rust versus Ruby and JavaScript (yuck).