Vibe coding is the buzzword of the moment. What is it? The practice of writing software by describing the problem to an AI large language model and using the code it generates. It’s not quite as simple as just letting the AI do your work for you, because the developer is supposed to spend time honing and testing the result, and its proponents claim it gives a much more interactive and less tedious coding experience. Here at Hackaday, we are pleased to see the rest of the world catch up, because back in 2023 we were the first mainstream hardware hacking news website to embrace it, to deal with a breakfast-related emergency.
Jokes aside, though, the fad for vibe coding is something that should be taken seriously, because it’s seemingly being used in enough places that vibe-coded software will inevitably affect our lives. So here’s the Ask Hackaday: is this a clever and useful tool for making better software more quickly, or a dangerous tool for creating software nobody quite understands, containing bugs which could cause a disaster?
Our approach to writing software has always been one of incrementally building something from the ground up, which satisfies the need. Readers will know that feeling of being in touch with how a project works at all levels, with a nose for immediately diagnosing any problems that might occur. If an AI writes the code for us, the feeling is that we might lose that connection, and inevitably this will lead to less experienced coders quickly getting out of their depth. Is this pessimism, or the grizzled voice of experience? We’d love to know your views in the comments. Are our new AI overlords the new senior developers? Or are they the worst summer interns ever?
In my recent experience, AI got 19 out of 20 attempts wrong. Worse than the worst summer intern.
The answers look okay at first glance, enough that supposed experts will say “that’s amazing”. Cue the painful argument about why it isn’t right before you can fix the screw-ups.
LLMs are the wrong approach for a lot of things.
It’s comments like this that remind me that prompt engineer is a real job. I regularly forget that and start thinking that it’s stupid to expect the ability to articulate one’s thoughts to be in any way, shape, or form valuable to society, but in practice it just so happens that the inverse of your experience is a mere matter of proper formulation.
So far, I’ve found GitHub Copilot to be good for a 10-20% performance boost, depending on the complexity of the code base. I tried it at work after my employer bought everyone a subscription, and liked it enough to buy my own subscription for personal projects.
I mostly just use the autocomplete aspect… You have to be pretty adept at evaluating its suggestions, because the majority of them are trash. But if you can evaluate at a glance and just keep typing, the bad suggestions have no impact, and the good ones save a bit of time. About 1 in 50 are something I prefer to what I was planning to write, and that feels like magic.
The chat UI has been equally hit-or-miss. Sometimes it doesn’t understand what I want, sometimes it seems to understand but suggests something that doesn’t actually work… but with that approach, the time spent formulating the question and reading the answer is not trivial enough to ignore.
I’ve heard that some of the other tools are great for bootstrapping new projects, but haven’t had a new project to try them with yet.
Anyway, the “vibe coding” idea sounds like wishful thinking today, but I’m intrigued by the trajectory things are on. The reasoning / chain-of-thought stuff has given me vastly better results than the ChatGPT hallucination factory that kicked off the LLM craze, so it’ll be interesting to see what happens when that gets merged with an IDE and a code base.
I work very much the same way with Copilot. It’s a great tool and it does speed up dev time for me. It really shines with repetitive work. CRUD functions, for example: you might have to write one, and from there it’s basically function name, tab, tweak. I also really enjoy Ctrl+I for repetitive tasks. Highlight a couple rows of an HTML form, “repeat this structure for field1, field2, field3, …” type stuff. I find it handles that pretty well. It’s also really great for the oddball “helper” type functions, like generating a random string or formatting a string a certain way: one-liners you need in the moment but that aren’t universal enough to warrant adding another import.
It’s definitely far from perfect. It’s extremely rare that a tab completion is exactly what I wanted, but often it’s about 80% of the way there. If you’re adept at scanning the code and quickly picking out the spots where it’s wrong or needs minor refactoring to make it more like what you were going for, it can be a real help.
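To illustrate that last “helper” category, here’s a hypothetical sketch of the kind of throwaway one-off I mean (C used for illustration; random_string() is an invented name, not an actual Copilot completion):

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical example of the throwaway helper an assistant is good at
 * producing on demand: fill buf with n-1 random alphanumeric characters
 * plus a terminating NUL. */
static void random_string(char *buf, size_t n)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    if (n == 0)
        return;
    for (size_t i = 0; i + 1 < n; i++)
        buf[i] = alphabet[rand() % (sizeof alphabet - 1)];
    buf[n - 1] = '\0';
}

int main(void)
{
    char token[17];
    random_string(token, sizeof token); /* e.g. a 16-character token */
    printf("%s\n", token);
    return 0;
}

Exactly the sort of thing that isn’t worth a new dependency or much thought, but is mildly annoying to type by hand.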
Same here: I started with Copilot, but lately I’ve been using Augment Code. Far from perfect, but it has some very nice features over Copilot: support for large codebases, better reasoning, and it can actually execute and read the results of command-line commands and things like that. (You have to push a button to approve those commands before they execute, btw.)
Grok and Gemini 2.5 are super useful for creating a solid framework that can be expanded upon. I often create the structure and then feed individual functions back to the AI to keep writing the details. Overall, all the manual labor like error checking, parsing, APIs, and so on is easily done by the AI, while the construction of the app itself is done in words by me. I like that approach a lot as, at least for me personally, the tools have gotten so good that the code they output is better than my own, in a fraction of the time.
Obviously, I still need to understand and follow along to make the most of the tool, but that’s the same with a hammer and a nail. Without understanding how to use the tool and what the architecture needs to be, the house won’t be stable, or can’t be built at all.
all typos are powered by my auto-correction AI and probably cost me 2% of my battery life.
Agree with many of the comments above.
For me, a hobbyist programmer (but technically competent and comfortable with computers) and using Python for programs just for me, I’ve found Grok to be HUGELY useful. In the past, I knew a small amount of python but had to Google very often to learn how to accomplish a specific task in python. That often involved reading lots of example code across many websites with most of the examples NOT being relevant to my need.
Now, I describe in detail to Grok a function I need, then inspect and use its output. If the code doesn’t generate what I want or kicks out an error, I give my program output and error messages back to Grok (along with my perspectives on what went wrong) and it almost always nails the solution.
My productivity is up, I would guess, 25-50X. I had written a custom web scraper by the old method a few years ago for something specific I wanted to do … it took me several weeks of painstaking hand coding. I recently worked on something similar with the new method and got it done in a day or two.
For me, the VIBE programming method has been revolutionary and door-opening.
I like outlining software with an AI. I specifically ask it to not generate code. I go back and forth asking if there are security considerations I am not seeing or if there are better ways of solving problems.
I then take the outline and feed it into a new prompt (getting into overly long discussions seems to increase hallucinations) and ask it to generate code. Sometimes it takes an approach I didn’t think of, sometimes it is hot garbage. It is nice to have input though. Also it tends to name variables better than I would.
Never used it here. As said above, by the time I’ve articulated what I want, I can write the code myself. To me it’s a cop-out for wannabe programmers, rather than putting in the time to learn by education and experience… if programming is the discipline you want to learn, of course. On the other hand, search engines are the cat’s meow. Search for a function or technique or an example of how to use ‘x’ and there it is. Works great. Way faster than looking it up in a book like we used to do. Anyway, no need for ‘so-called’ AI in my programming toolbox, or my life for that matter. I am not into moving toward more idiocracy in society… Note: I am reading a book on the history of calculus, written in 1939 and revised in 1949. Yes, just for fun. The author uses words on just about every other page that I have to go look up in a dictionary (and yes, the words are there). It seems my education wasn’t as broad when it came to the use of words as education was back then. Giving me a glimpse of where we are heading, I think…
GitHub Copilot’s autocomplete kinda solves the need to articulate. You just write, pretty much as you always did, and that articulates your needs well enough for it to be surprisingly helpful.
“Search engines are the cat’s meow.” They certainly were. Today the cat’s meow is an LLM that gives you the benefits of doing a Google search as you write every line of code, but without the overhead of doing a Google search as you write every line.
Granted, most of the time that’s not helpful. But occasionally it suggests something better than what I was going to write, because I thought I knew the best way to do something (I’ve been writing code for a living for decades), so of course I wasn’t going to Google it… but I didn’t know what I didn’t know. Those are the moments that make it worth the pittance they charge for a personal license.
Good summary. To add a bit: we are only at the beginning of the revolution. Those who make use of the current abilities, despite the imperfections, are gaining a baseline expertise that will carry forward as AI improves. It’s not much different from the early days of PCs. There will be USERS who can access the surface levels, and there will be EXPERTS who grew up when nothing worked quite as it should and had to learn through pain and suffering… you don’t learn from successes and comfort so much as through failures and suffering.
I would suggest that education was (and still is) just fine, but often, and especially in books written for a narrow audience, the language used is at a higher level than usual.
The use of language also changes over time, and dictionaries are cumulative works, so it’s hardly a surprise you found the definitions. But a book written today would likely have someone from 1939 reaching for a dictionary, and likely failing to find an entry for the word they stumbled on.
Absolutely. Plus, I know what I am doing. I don’t know what the AI is trying to do.
I agree with your point about language. Reading literature from 50-80 years ago provides ample evidence that written materials are now far less expressive. I’m sure that my own writing is less adept than that of my grandparents.
They say AI will make us lazy, that we should use search engines.
They said search engines would make us lazy, that we should use libraries.
They said libraries would make us lazy with easy access to books. (Yes, really.)
And then there is Socrates, who said this to Plato about Plato’s scheme of writing things down:
“You have invented an elixir not of memory, but of reminding; and you offer your pupils the appearance of wisdom, not true wisdom.”
There’s nothing new under the sun. AI is neither a solution to human thinking nor the end of it.
I would like to like your comment :)
A non-coder should never use vibe coding or AI-assisted programming. For a start, the code produced will be very low quality, and the person will never be able to debug it.
So everyone said that AI would make us lazy and we should be using search engines? Odd, I never heard that, and I am an independent thinker who would never listen to such nonsense. It seems to have been fabricated in your mind.
This is odd – you posted “They said libraries would make us lazy with easy access to books. (Yes, really.)” Was this the majority of those you associate with? Because the first library in the US was opened in 1731. That’s pretty funny.
AI won’t make people curious or give them the drive to build a solid foundation in STEM – those using LLMs now are using them as basic search engines and are clueless about prompts.
I recently tried Copilot’s type-along autocomplete feature on some safety-critical embedded firmware. Naturally, I audited the heck out of the thing. It was absolutely correct 90% of the time, and the rest was mistakes that could easily have killed people. The biggest one was when it offered up a fairly long function to control the operation of a solenoid valve. It assumed that logical high was “valve closed” and low was “open”. This was very much not the case.
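To make that failure mode concrete, here’s a hedged sketch of the shape of the bug; the pin number, constants, and the gpio_write() call are all invented for illustration, not lifted from the actual firmware:

#include <stdio.h>

/* Stub standing in for a hypothetical HAL call that drives a GPIO pin
 * high (1) or low (0). */
static void gpio_write(int pin, int level)
{
    printf("pin %d -> %d\n", pin, level);
}

#define VALVE_PIN        7
#define VALVE_LVL_OPEN   1  /* on this hardware, logic high OPENS the valve */
#define VALVE_LVL_CLOSED 0  /* and logic low closes it                      */

/* What the autocomplete assumed: high = closed, low = open. Its "close"
 * routine therefore drives the pin high, which actually OPENS the valve. */
static void valve_close_as_generated(void)
{
    gpio_write(VALVE_PIN, 1); /* WRONG polarity on this hardware */
}

/* Corrected versions, with the polarity pinned down in named constants
 * taken from the schematic rather than from what "feels" idiomatic. */
static void valve_close(void) { gpio_write(VALVE_PIN, VALVE_LVL_CLOSED); }
static void valve_open(void)  { gpio_write(VALVE_PIN, VALVE_LVL_OPEN); }

int main(void)
{
    valve_close_as_generated(); /* drives high: valve actually opens */
    valve_close();              /* drives low: valve really closes   */
    return 0;
}

The point is that nothing about the generated code looked wrong in isolation; the polarity only comes from the schematic, which the model never saw.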
Like all generative AI stuff, Copilot’s output is only useful in situations where quality isn’t a concern.
I’ve found that it is very useful for generating individual routines, particularly when I carefully have it make the building blocks that I’m going to have it assemble later. It is also great for introducing me to capabilities of APIs that I’ve never really dug into. That said, I need to carefully look over its code because it does some crazily inefficient things if you let it… bit operations by converting the bytes into strings and applying boost regex? seriously? I’ve never had success letting it build above the small module level. The time to quality check and correct its code is just longer than it would take for me to write it myself.
Isaac Asimov explored this; he called it a positronic brain designing a more complex brain.
My bigger concern is the next generation of developers: how will they learn? Programming is two parts mental effort (as described in the first post) and one part coding; shake well until you have a working program.
‘We’ can make use of AI, be it vibe coding or AI-assisted, because we know stuff. We can mentally break down the problem. We can see what it did and why it is right (or wrong). Years of experience taught us that.
Now we get kids who skip all that and are expected to know when what the AI produces is wrong, and, worse, to fix it.
AI is here to stay, and it certainly has its uses, but keep it out of schools (so to speak). There we also see that chalk, pen, and paper are still far more effective than tablets and screens for learning.
A second concern is that all of the efficiency and sustainability gains we have had in the last decade will be flushed down the toilet because of how AI works and what it generates. Who cares if we need a supercomputer to turn on a light switch…
The new generation of “programmers” who have used ChatGPT from day 1 of univ may be less competent than we ever imagined, but I still give them a pass. Mostly for ensuring my job security for decades to come.
I remember learning to write code. What I wrote the first few years was absolute trash. It was a great day if it felt like I made any progress at all, but was usually cut short by not understanding what I was writing (or just as often copy/pasting) and screwing everything up within 5 minutes of making that progress. Which, yes, forced me to hunker down and hopefully figure out what was actually happening.. but.. usually the real breakthroughs came when I was able to ask someone more experienced and have them explain my failures.
Using tools like copilot (or I assume cursor, haven’t actually tried it) isn’t that far from this exact process, except it can do it in a significantly accelerated timeframe. Yes, you can sit down and tell it what you want and hope it will spit out good code.. but that’s not learning. That’s not even attempting to learn. It’s like sitting down at a keyboard (piano), pressing the demo song button and thinking you’ll become a virtuoso. Anyone who thinks they’ll succeed with this mindset will fail out and not be our problem.
On the other hand, those that do want to actually learn will have immediate access to code examples that probably aren’t exactly what they needed, but closer to what they were looking for than a random stack overflow page. They’ll also have the chat functions to ask when they’ve hit a wall. Granted, today, neither of these are super great options – but they will get better with time. And, even today, it might help point them in the right direction. For me, those first few projects were a serious hurdle. I’ve known many people that have gotten the basics down, but failed out when it came time to actually create something. That first hill is a steep one. I think, if used the right way, AI tools have the potential to be a huge bonus in those early days. Just takes a small change in perspective.
+1, I think it revolutionizes learning for those who have the self-awareness that what they’re looking for in these tools is not the final, end-all, ready-to-be-shipped solution, but a sounding board that can get them to the first starting point.
Although I think “vibe-coding” has a very specific meaning which is basically just accepting the LLM output without checking, and if it fails or tests regress just using it again until it fixes it. So that is a completely different usage where the programmer is just focused on the output but doesn’t care about the process. Fine for web apps, not so great for embedded systems.
But if you sit down and write good tests for the desired action of your embedded system, you can vibe-code the implementation until it passes all of the tests. If you strictly define your tests and can even test for optimization/speed, then I don’t see a problem. Everything moves up a level of abstraction.
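As a minimal sketch of that workflow (all names hypothetical; a plain assert()-based harness standing in for a real test framework), the tests below are the fixed specification, and the debounce_update() body is the part you would let the model keep regenerating until everything passes:

#include <assert.h>
#include <stdio.h>

/* Placeholder implementation the LLM would iterate on: a simple counting
 * debouncer that flips state after three consecutive disagreeing samples. */
static int debounce_update(int raw)
{
    static int count = 0, state = 0;
    if (raw == state) {
        count = 0;
        return state;
    }
    if (++count >= 3) {
        state = raw;
        count = 0;
    }
    return state;
}

static void test_ignores_single_glitch(void)
{
    /* A one-sample glitch in a stream of zeros must not flip the output. */
    int samples[] = {0, 0, 1, 0, 0};
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(debounce_update(samples[i]) == 0);
}

static void test_accepts_stable_press(void)
{
    /* A long run of ones must eventually report "pressed". */
    int state = 0;
    for (int i = 0; i < 10; i++)
        state = debounce_update(1);
    assert(state == 1);
}

int main(void)
{
    test_ignores_single_glitch();
    test_accepts_stable_press();
    puts("all tests passed");
    return 0;
}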
The electronics hardware engineer uses op amps but doesn’t normally care about the implementation of matched differential pairs and everything else inside an IC. They just use the datasheet to validate the black box works how it should.
“Programming is two parts mental effort … and one part coding”
…and three parts debugging and testing. And maybe another part porting over to the next new system.
twenty years ago i started worrying in the abstract about how kids will ever learn without having a limited sandbox during their formative years (a god-awful 16 bit processor with segmented memory, RAM that costs so much you only have a few hundred kB of it, an OS that leaves you struggling with every detail, and the constant need to open the thing up for repairs or upgrades). now i have kids and they are absolutely hopeless. and i thought, well, maybe my kids just aren’t into computers (though one of them doesn’t do anything but play videogames every waking moment). but now i’m seeing “genx went from helping boomers set up their vcr to helping zennials set up their vcr” sort of memes and i get the feeling it’s pretty universal.
my kid can’t read what is on the screen and interpret it. he just looks for icons to tap. public middleschool makes him use ios so he is learning to recognize icons that are too subtle for my eyes (did you know, apple’s much-vaunted UX design is actually just a myth?). but he just blindly clicks on things. he seems to believe the functionality he wants will simply be there on the screen somewhere at all times and all he has to do is tap every square centimeter of it and he’ll find it. really disheartening.
this is an evergreen theme in scifi and in a few generations they’ll have a totally different perspective on this phenomenon than i do. kind of curious how it will shake out.
You sound like a great parent.
Well, let me summarize it in one sentence:
QA people will have the time of their life.
What companies still have QA teams?
I haven’t worked for one for at least a decade. I still have mixed feelings about it.
You do know AI can write tests too – you just have to ask it.
I’ve got 2 systems with vibe code in production right now. Not bug free, but workable enough to get my point across and be more useful than not.
I code(d) a lot, and the stuff that it writes would take me days or weeks to sort out. Being able to read the code and see what’s going on is a huge help. I’m glad I struggled so much before, so I have the experience and ability. Not sure of the ramifications for future generations. This is probably like describing Python to the old assembly code writers.
Before I left work yesterday, I was able to prototype a tool in less than an hour and give it to my teammates. For a project like that, I could have easily spent 3-4 days working on things. Time I did not have. Well worth the $20/month I spend on ChatGPT.
If it doesn’t have to be correct, then I could code it a lot faster and it would execute a lot faster.
I’ve been porting/re-implementing a project started in Python to Rust while learning Rust. Was able to get help early on from GPT to help pedantically porting stuff, not so much use later when things were just too complex (and too weird) and too incomplete for it to grasp usefully and I had to wean off of that crutch — other than asking it to explain various compiler complaints and such every now and then.
But since then, as things have closed in toward completion, I’ve found GPT once again useful as a sounding board and a test of API and documentation comprehensibility: I murder it (delete its memory) and give it all the current (barely commented) code with no documentation or explanation, file after file after file after file.
Then I say “discuss”.
I see how much of the what and why it can deduce without any prose explanation, how much it gets right, then dump my current working draft documentation text on it, say “discuss” again and then spend a few days using it as a sounding board while I work.
But then after a few days it starts agreeing with me too much and not making interesting suggestions. Then I have to murder it again and start from scratch.
I don’t know if this has increased my productivity, but I’ve found it rather useful nonetheless – I couldn’t ask for a better proofreader.
The general principle of vibe coding is to
1) get an LLM to emit something that probably compiles
2) if it doesn’t “look right”, complain, accept profuse apology, and goto 1
Of course companies will love it, since it will allow them to replace competent people with cheap code monkeys who can’t tell (and probably don’t care) when something is wrong.
What could possibly go wrong?
The companies are welcome to hire those cheap code monkeys to do a shoddy job. It might even work for some of them. Good on them for saving a few bucks. But when they’re losing money because their competition has a significantly better product, then the problem will correct itself.
We’ve had things like wikihow and youtube tutorials for 20+ years, yet when my toilet stops working and the fix involves more than a plunger or a new stopper in the tank then you better believe I’m calling the plumber. Skilled trade will always have a place.
It’s been my experience that the competition hires the same type of workers to produce a shoddy job, and thus has just as bad a product.
It has been my experience that companies whose product is actual delivered software care to some degree, because they have to.
If you’re expecting good software from an insurance company, you’re going to have a bad time.
It’s real simple: If you want to report to a marketer, work for a company that trades in a commodity.
To misquote the judge: ‘The world needs report column adjusters too!’
We will ship no code until it compiles AND links!
It should be considered a ‘tool’ like anything else.
Google will give you the wrong answers if you ask the wrong question, and this works the same with Vibe coding.
My experience has been mostly positive using Cursor (leveraging the ChatGPT o3-mini-high LLM), and I would say that something that could take me a month to write might be reduced to a week or less.
As someone else stated, it makes you think more clearly about the goal and articulate your scope much more clearly, and moves away from the jump-in-and-code approach that some (many) of us all-too-frequently adopt.
It’s also incredibly useful if you are not an expert in a particular framework or language, as it can quickly fill the gaps and act as a learning tool as well.
It’s never going to replace developers as you need to be a developer to understand where and when it’s wrong, but as another tool for the swiss-army-knife… yes, I’ll take that.
I had a real mixed bag with MicroPython: a few breakthroughs, but a couple of VERY wasteful dead ends.
Anyone who it helps as much as is claimed by definition does not understand the output well enough for it to be safe.
IRINA: Do you know all these things?
CHEKOV: What I do not know I find out from the computer banks. If, if I knew nothing at all, I could navigate the ship simply by studying what is stored in there. They contain the sum of all human knowledge. They solve our problems of navigation. Of control. Of life support.
IRINA: They tell you what to do, and you do what they tell you.
CHEKOV: No. We use our own judgment also.
IRINA: I could never obey a computer.
“Herbert!”
– Star Trek TOS, ‘The Way to Eden’
Right you are!
Something seems familiar about being led into destruction by a madman…
For my needs, it’s fine. I find it really useful because I just don’t code often enough, so the knowledge falls out of my head (when I do get a good week or so, I find I know an awful lot), but I’m not relying on coding for my income.
I’d be really worried if I was employing coders who were ‘vibe coding’ (and if I find out I am then there will be conversations had)
I find current genAI is a decent rubber duck – good for getting me to explain to myself the problem I’m trying to solve, but I’ll have to essentially write the code myself (or vet it so thoroughly I might as well have written it myself). It’s also sometimes useful for learning a new API (usually just the rough layout, with a lot of hand-holding), or discovering tools I didn’t know existed. The autocomplete (at least as found in VS) is… interesting. I’d say it does the right thing 10-20% of the time it tries, but it often doesn’t try, and even when it tries, most of it is garbage. Luckily, ignoring the garbage is pretty easy.
I’m a hardware guy, but isn’t the goal to use AI in “pair programming”? The human is the Navigator, keeping verification/testing/versioning/schedule of the project to specification and the overall direction in mind, and the AI is in the Driver role: it knows language syntax/optimization and software development tools, and does the actual coding. If I understand the software process correctly.
We’ll be OK as long as AI never learns VHDL/Verilog …
That’s a reasonable long-term goal, but right now the AI’s role is better described as Intern. :)
While I haven’t used it for production work (yet), I have found Gemini to be extremely capable at producing fairly complex calculus solutions, well beyond what I got by trying to solve the same problems directly in python with symbols math. The subsequent functions it generated to wrap the problem included well defined interfaces and decent comments. The whole thing was much faster to get to an initial proof of concept than I would be, though would still require a fair bit of massaging.
I think a small number of good programmers using vibe coding will wipe out the need for a large number of low and intermediate developers.
Symbolic, not symbols math. #!$& Android autocorrect.
Please vibe code an edit comment feature. 🤣
Perhaps the benefit is in the documentation.
You need to describe something technical that the machine can understand.
Use it as a rough first draft, or just toss the software out, but keep the prompt. That’s the heart of your technical documentation.
It’s not writing software.
It’s proofreading documentation.
That’s one thing I could see open source using. The other is AI helping with a systems view of everything and breaking things down to a lower level.
The first and only LLM that made this work for me was Grok 3. OpenAI’s ChatGPT is a pathetic piece of lying, dumb crap when it comes to anything, including coding. Man, how sick and tired I got of formally describing code and getting back garbage code with nonexistent libraries and functions that didn’t work even after multiple tries. None of that with the Grok 3 beta. OnlyOffice macros, Python, bash, and whatever else WORK! Much better explaining, reasoning, and WORKING code. Excellent milestone and amazing work!
👍
As a coding hobbyist by night and a teacher by day, this has been revolutionary for me, but I’ve had this same thought and experience! I have spent hours to days working on relatively simple projects, with the latest being an ironic game called Click for Dopamine. I posted it on itch.io if you want to check it out. I swear I could’ve done this in much less time on my own, but there are way more programming languages out there than I have time to learn. In this project I ran into the weirdest errors and found that I could hop from one LLM to another and learn more about the errors and the coding process within different environments far faster than I could’ve done on my own. The BIG drawback, though: I have no idea how to navigate my own code now and will have to do my homework to be successful. My question is then, how many people are willing to do the homework before making something mainstream? In my experience that’s around 20% of my students at most. In the event we become reliant on AI instead of using AI, will we end up with legacy programs that no one understands 10-20 years down the line?
i’ve been impressed with AI…i think it’s fair to compare its capabilities with some of the people i’ve worked with over the years. but i agree with the common sentiment here that if you really can produce a good result with AI, it’s only because you could produce a good result without it too.
but every now and then a benchmark jumps out at me…and the one that crossed my mind lately is i have long wanted a website that just sells zippers, so it will have a good parametric search. so far, i have always had to use ebay to buy zippers (incidentally, ebay’s fuzzy-parametric-search has gotten really good lately). so i was excited last week when i found that the biggest ebay zipper seller also has their own .com. but i looked at it, and the search is absolutely unusable. it’s effectively just a list of 1000 products with no way to navigate it. i wound up buying from that vendor, but on ebay.
when AI really upgrades the code-producing ecosystem, websites like that will have custom search and hierarchies. as long as there are people moving a bunch of units of product unable to buy enough technical expertise to actually make a minimally-usable website, i will be sure that AI hasn’t made a dent in code-productivity yet.
Whenever I think of AI writing code, it reminds me of Geordi, Data, and Prof. Moriarty on the holodeck.
I started by using ChatGPT and Claude to do some coding tasks, and that worked reasonably well, but today’s tools like Aider and Cursor really make vibe coding possible. It’s amazing what you can do in a few iterations, and this is only the beginning. These tools can support different models for different stages of the project, for instance a reasoning model in the architecture phase and a cheaper/faster model for the coding. It’s really fun to think of a feature and have it coded for you. Also remember this is still at the early stages and already quite good, but these tools will get better very fast.
The biggest pitfall is that it requires a LOT of discipline to use.
The chatbots are hardwired yes-men and will obey whatever you ask. So it gets very tempting to have them do all the work for you, which is the main expectation behind a description like “vibe coding”. This is how you end up with shoddy, barely functional code you don’t understand that will inevitably screw you over later down the line. This yes-man attitude is also a problem if you try to use one as a tutor, as a real tutor knows when to take away the training wheels and encourage you to solve it yourself, whereas the chatbot doesn’t and will just churn things out non-stop, tempting one not to engage in the increasingly rare act of “critical thinking”.
If you have the discipline to know it really is just an assistant, something to offload simple stuff onto or a little boost while writing via autocomplete, you will be fine. But a LOT of people don’t have that. They expect the bots to be essentially their own personal senior coder.
I made a project 10 years ago and it took me 6 months. I asked AI to do the same project; it took 8 minutes to complete, with the same functionality.
I’m still “meh” on the AI. It’s useful but still a small fraction of code. It’s good at specific tasks like generating boilerplate, generating a few lines, or converting from one language to another. It saves some time but doesn’t cut out the hard thinking parts.
If I rely on AI to the point of “vibe coding”, it will result in not understanding the overall picture. It’s like trying to help a friend with their sloppy codebase. “Why does XYZ bug exist?” I dunno, let me spend 2 hours reading and internalizing your entire program and how the pieces interact.
I’m thinking I should use test-driven development to validate the AI results instead of wasting brainpower reading the slop.
Your statement is exactly what most are missing, as they seem to believe AI is a magical black box and ignore the fact that LLMs hallucinate and use Reddit and Discord as sources. Several LLMs even take Reddit as fact.
Your excellent analysis of AI: “If I rely on AI to the point of ‘vibe coding’, it will result in not understanding the overall picture.”
The thing to remember is that the context window on LLMs may be impressive, but on large projects, where you’d experience a “stack overflow”, the LLM is going to struggle much sooner. My best analogy is that using it for large projects is like trying to paint an average-sized canvas with a microscope. It’s doable, but clunky. Instead, you try to give context for what you can see under the microscope. That too is clunky, so you try to give enough to get it to use the correct API calls, and fix the incorrect assumptions the AI made. When you need the AI to add something (for some API you don’t know), you copy and paste your work back in and ask for the fix. Kept that much to the point, it’s usually pretty good.
Beware loops, though. If the AI keeps giving the same bad suggestion, what you’re asking for probably doesn’t exist. You can try rephrasing, or try to add what you think is missing context, but if you say you need a JavaScript API call, not a new function, to calculate the energy from fusing hydrogen, it will give you stupid answers. You can adjust and give context all you want; it’s not going to say that function doesn’t exist (why would it exist?). LLMs are designed to give an answer no matter what, and usually not the answer “that doesn’t exist”. That one may be so contrived that it would tell you it doesn’t exist, but try something that seems like it should exist, and watch what you get.
I think they’re useful for stepping out of your domain, and writing boilerplate stuff. Boilerplate stuff speeds up areas you’re familiar with. Outside your domain, the speedup is massive. Software techniques apply no matter the language or domain. So you still need your experience as a software engineer, but it can help. So mostly I’m in the limited but useful camp.
Where it really gets interesting is stepping outside your field all together. Say you’re doing arduinos, but you’ve mostly just pieced little bits together. Now you want to make a semi-complex analog front end to your “arduino” project. Keeping in mind that you know what you don’t know, and you know how you had to wrangle the LLM to do what you wanted for your field of expertise, you can get some pretty functional stuff, and even debugging help out of the llms! They have impressive breadth and depth of knowledge, but can only work through a microscope. Know the limitation, and you can really make use of the systems!
I seriously think I am observing a decline in general intelligence in most of the human species (my exposure has its limits, of course, and I don’t observe all groups and regions; coupled with an attempt at optimism, I use ‘most’ to avoid saying ‘all’), and so perhaps the rise of AI might be just in time to hold our hands and keep things going somewhat, on some level, for a while at least.
I’m not a software developer, but I have found ChatGPT really useful for generating functions (or chained formulae) in Power BI and Excel. Some of the issues noted in the piece above don’t apply then: I’m looking at 1 to 20 lines of output, so having a grip on the totality of what I’m doing in my analysis isn’t an issue, and I know what the statement needs to achieve, so it speeds up the “manual” labour part of finally writing it. Of course, it sometimes comes up with wild solutions – once a better solution, more often just odd. Clearer prompts help, and again, having it write one segment at a time, not the entirety, means I don’t miss anything creeping in. Helpful tool, though to what degree it would help someone without any Power BI or Excel experience do what I do… not sure.
I have been trying vibe coding on various platforms for the last three weeks. I’ve been programming computers for 60 years. Here is my opinion.
First, let me say that when I started out programming, we had to concentrate on design before coding. We had to make a flowchart or decision table and convert the design into code. I am a strong believer in “design first, then code”. The design tools got better (structured pseudocode, DFD, Warnier-Orr). When UML came out, with products like Rational Rose and Embarcadero, I could design the system and the tool generated the code. I could load a program into the tool and it would generate the design. I loved it!
The stuff I have been able to do with ChatGPT, Claude, Copilot, Gemini, AIStudio, NotebookLM, Manus, and Genspark over the last three weeks convinces me that this is the next step up. Design is still important: there is no substitute for clear requirements and being able to communicate those requirements to the AI tool. I was able to take projects that originally took me two to three months to achieve, and get similar output in only 3 days. Manus even improved my original project.
I am not done experimenting, but I have high hopes for higher productivity.
While I only code for fun, I think you should definitely try Qwen. For me it blew the others away: it was the only one that could add functionality to generated code without screwing up the existing code. It was also good with explanations of how things work, which is usable as program documentation (the stuff I hate doing). I tried GPT, Claude, Gemini, Copilot, and Deepseek, and will continue to evaluate new ones.
Thanks for the tip. There are so many options that I might have overlooked Qwen, but now I will give it a try. Are you saying that you tried it on existing code and it made improvements?
I’ve been using Gemini to help with some personal programming stuff and small servers. And it’s just a tool, a very useful tool, but still a tool you need to know how to use.
You need to know enough to define a problem, and you need to interact with its solution enough to learn it.
I have absolutely adored it for helping with the CLI. Every bit of how something works is broken down and actually applied to what I’m doing. The old “googling” alternative was a Stack Overflow post that’s kind of related, written by someone with 30 years of experience in their own completely arcane and unexplained way, or the worst-written documentation you could possibly find.
With a program, I love it for functions and a bit for defining structure. If you go beyond that it gets confused and loses track. It’s just like working with another person really, but they don’t fit into the senior/junior role well. They know a lot, but struggle with the big picture.
End of the day, a poor craftsman blames their tools. If you use it wrong, your product will suck. But it can be very useful.
The problem with LLMs is that they don’t generate anything: they regurgitate what an actual human did, and if you are not careful, you open yourself up to copyright liability if you don’t know what you are doing.
For most code snippets it’s probably fine, but if the scope of what you use is enough to encompass an original process, you are pretty much in violation of copyright. Using the original query and the model data, you can pretty much identify which material was used as the source.
It’s getting pretty close! Very close! It still occasionally gets stuck in an “it doesn’t work” loop, and sometimes the latest documentation is inaccessible, or won’t stick to the model.
I use AI to write some simple VBA code. Very simple functions can be described once or twice to arrive at correct code. More complex functions may require multiple additional descriptions based on their errors, and still can’t produce reliable code, ultimately requiring manual inspection and modification of some small details. But for me, someone with only a basic understanding, AI coding has been a great help.
“I’m no big fan of LLM generated code, but the fact that GP bluntly states “AI will never produce a deletion” despite this being categorically false makes it hard to take the rest of their spiel in good faith.”
https://news.ycombinator.com/item?id=43619759
If the code it spits out works, and doesn’t break anything, I’ll use it.
At the end of the day, if it works, it works.
I have used it to create simple PS scripts, and I can see myself using it to string together tools and outputs into other pieces of spaghetti.
I think this tool will be used as a way of rough-drafting something to then pass on to an actual dev team to refine.