Hot take: All the AI techbros arguing that LLMs have real intelligence or are the path to AGI don't realize they are just building a big-data prediction machine, optimized by sheer brute force for tricking humans into believing it's intelligent, without actually having any significant intelligence.
Call me when an LLM can make deductions and have seemingly original thoughts without being fed the entirety of the Internet first. That's not how actually intelligent humans work, and you're just fooling yourself with a big statistics machine if you think this is the way. Of course compressing the entirety of human knowledge into a fuzzy completion machine will make it sound intelligent. You're basically making a lossily compressed version of things actually intelligent people have said, with a modicum of ability to generalize by simple association and statistics.
The Turing test breaks down when you throw enough Big Data at it. Congratulations, you've beaten the test. You haven't created real intelligence though; you just cheated with big data.
@marcan Well, LLMs are certainly not on the path towards AGI, but when used right (which no one does) they can do useful things, since they can summarise and interpret huge amounts of text. If LLMs can be local, then we can get actually good applications of ML. For example, imagine a widget on your phone which shows you the latest news as bullet points, with no clickbait or fluff, just the raw information it gathered by going to a bunch of news websites and summarising their articles. Or imagine an app which would summarise all the movie reviews and tell you, in just a few sentences, which movies are good and which are bad based on critic and user reviews.
@marcan
>That's not how actually intelligent humans work
But how do they work, do we know? Aren't my own opinions a neural-network combination of the hundreds of books and thousands of articles, videos and posts I've read?
Did I deduce facts in my life with my pure consciousness, or are they based on other humans' observations?
@artem Over the course of your life, you've ingested only a tiny fraction of the information LLMs are trained on. If you had been listening to speech constantly until age 18, you'd have heard around 600 million words. GPT-3 was trained on around 1000 times more data.
An actually intelligent model should be able to perform like a human given a corpus comparable in size to the information a human perceives during childhood, not cheat by throwing 1000x more data at the problem.
And that 1000x figure is very generous: it assumes a human spends every waking minute of their life until age 18 listening to speech. GPT-4 probably used even more data.
No, that's not how humans work, nor how intelligence works. You are not creating intelligence; you are just throwing a big SSD at the problem (and a fancy fuzzy compression algorithm). If you need 1000x more data to come up with output similar to a human's, your model is 1000x worse at generalizing and at actual intelligence than a human (and it's worse than that, because we know LLMs fail catastrophically when given tests that require actual intelligence, not just appearing knowledgeable). At that point you are so far off that it is silly to assume throwing more compute at the problem, with the same approach, will ever bridge that gap.
@spongefile @artem Our education system is largely based on words, and as much as we like to say "a picture is worth a thousand words", that's not how it works in terms of learning. LLMs are also not doing any of the other things humans can do that require their own knowledge and experience (like walking, avoiding injury, keeping yourself sustained, object permanence, working with physical tools, etc.).
@spongefile @marcan I wonder if there is any research on how much data we consume through unconventional sources, like when you overhear a conversation or read a billboard in the subway. Though @marcan's point makes a lot of sense: the amount of data consumed by LLMs is larger by orders of magnitude.
@artem @spongefile Keep in mind I was being *very* generous. We don't listen to speech, nor read at a rate equivalent to speech, every waking hour of our lives. If you overhear some conversation, that's speech. If you read a billboard, that's still well within my estimate, as long as your overall average doesn't consistently exceed what would be possible with speech.
I was not at all trying to quantify how much text you listen to in school or anything like that. I literally just divided human waking hours until age 18 by a typical rate of speech (see the back-of-the-envelope sketch below).
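For the curious, here's that division as a quick back-of-the-envelope script. The 16 waking hours/day and ~100 words-per-minute sustained average are assumed round figures, and the corpus size is the ~500 billion tokens reported for GPT-3's training dataset:

```python
# Back-of-the-envelope: words a human could hear by age 18 vs. GPT-3's corpus.
waking_hours_per_day = 16   # generous: every waking minute spent listening
words_per_minute = 100      # assumed sustained average; conversation peaks higher
human_words = 18 * 365 * waking_hours_per_day * 60 * words_per_minute
gpt3_tokens = 500e9         # ~499B tokens per the GPT-3 paper (Brown et al. 2020)

print(f"human by age 18: {human_words / 1e6:.0f}M words")  # ~631M
print(f"corpus ratio: {gpt3_tokens / human_words:.0f}x")   # ~793x, order of 1000x
```

Tweak the assumptions however you like; you stay within the same order of magnitude.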
@marcan An LLM is pretty much a fancy search engine. If you don't tell it 2+2 is 4, it's just gonna do its best to make up something that looks like the right answer with what it knows.
@Yuki The sad thing is that learning to add isn't particularly impressive; we can do that with tiny neural networks (see the toy sketch below). That LLMs can't even learn to reliably add just shows how restricted and limited they are outside the problem space of "convince a human that you can write intelligent text".
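To illustrate just how tiny "tiny" can be, here's a minimal sketch on made-up toy data: a single linear neuron with two weights, trained by plain gradient descent, learns addition almost exactly.

```python
# A single linear "neuron" (two weights, no hidden layers) learns a + b.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-10, 10, size=(1000, 2))   # random pairs (a, b)
y = X.sum(axis=1)                          # targets: a + b

w = np.zeros(2)                            # the entire "network": two weights
lr = 0.01
for _ in range(200):
    pred = X @ w
    w -= lr * (X.T @ (pred - y)) / len(X)  # gradient of mean squared error

print(w)                         # converges to ~[1.0, 1.0]
print(np.array([3.0, 4.0]) @ w)  # ~7.0
```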
@marcan i agree with this like 95%, with the tiny caveat that what makes AI transformers and diffusers "intelligent" isn't what they can do, but how they learn. it's the closest we've gotten to being able to simulate how a brain learns on a neurological level, connections between neurons being strengthened and that sort of thing.
but clearly it's still very far from actually simulating how the human brain forms ideas, since GPTs are essentially markov chains at their core (see the miniature illustration below). i think we need to go back to calling it machine learning, and avoid investor-bait terms like "artificial intelligence" to describe what is essentially a giant probability matrix
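(a GPT is of course not a literal lookup table, since it conditions on long context and generalizes across it, but here's the "giant probability matrix" idea in miniature, with a made-up toy corpus:)

```python
# A word-level Markov chain in miniature (toy corpus, purely illustrative).
# The core loop is the same shape as a GPT's: look up a distribution over
# next tokens, sample one, repeat.
import random
from collections import defaultdict

corpus = "the cat sat on the mat and the cat ate the rat".split()

chain = defaultdict(list)        # the "probability matrix", stored as counts
for cur, nxt in zip(corpus, corpus[1:]):
    chain[cur].append(nxt)

word, out = "the", ["the"]
for _ in range(8):
    nxts = chain.get(word)
    if not nxts:                 # dead end: no observed continuation
        break
    word = random.choice(nxts)   # sample from the empirical distribution
    out.append(word)
print(" ".join(out))
```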
@keat Oh sure, it's useful to take inspiration from biological brains, and ML in general is extremely effective at solving certain real, constrained problems.
I don't think we can be certain this is the path to AGI though. I do agree that AGI is possible in principle (on purely physical arguments, unless you believe in some kind of metaphysical system outside the laws of physics), but our understanding of biological brains is still woefully simplistic and incomplete, and there's no way to know we haven't missed a fundamental mechanism so far.
@marcan @spongefile I see your point, that's correct. If we continue this strategy of generosity we can bump up the amount of data for humans slightly, since reading speed is 2x-3x the speed of speech, but nonetheless.
@spongefile @artem There are clearly other problems with that thought experiment (e.g. that humans are not optimized to thrive and learn in that sort of environment).
However, the fact that deaf people and blind people display normal intelligence (if not stunted due to poorly tailored upbringing) shows that it isn't any particular sense that matters.
@marcan What made LLMs make more sense to me was understanding, first, that their job is to guess.
An LLM will never know it's correct, and you can't know in advance whether it will be, but if you're doing something where it's quicker to double-check a guess (because it's close) than to do everything yourself, then at least it helps. That's the limit of what it can do.
So for anything too intricate to benefit from that, or for creating / discovering things, it just can't do much. It's kinda annoying that people try really hard to convince themselves it can do more, but it... just can't. It might guess what doing those things LOOKS like, but never get there, because that's not what it does...
@marcan I saw a headline about google using an LLM to solve a math problem. an LLM??? LLMs are the worst thing ever, like other kinds of ML models exist too, we don't have to use an LLM just because it "sounds smart".
@marcan @artem You've ingested way more data than that; it's not just a bag of words. You have eyes and ears and touch that have been feeding your brain for many years. I don't think an intelligence is possible without an enormous amount of data, especially since an intelligence is essentially measured by how it interacts with the world, and it needs data about that world to be able to do that.
@gazan @artem Blind people are not less intelligent than fully abled people, despite having a major reduction in volume of information input. That argument doesn't really work.
You certainly need more than words to cover the entirety of human learning, but that includes a large number of things that an LLM cannot and is not intended to be able to do. LLMs are about human language, and therefore it is reasonable to approach the comparison in terms of human language (and note that my estimate is very generous). Of course you need other senses to learn how to walk or cook, but an LLM is not expected to be able to walk or cook.
Intelligent people can acquire knowledge and understanding on any given subject purely through written word, without any hands-on interaction (hands-on interaction certainly helps but is not required). This, for example, covers essentially the entire field of mathematics, especially once you move on from the absolute basics (I'm considering mathematical notation written word here; it's trivially mappable to plain text if required).
@marcan LLMs are not 1000x worse at generalizing than humans!
They’re infinitely worse. They fundamentally can’t do it at all https://garymarcus.substack.com/p/elegant-and-powerful-new-result-that
An LLM is just a paraphrasing search engine. It's fed the same google cache, but instead of providing links (many of which are false positives, stale, or simply wrong) it mixes up a text summary of what those pages collectively said.
Sometimes you mostly get wikipedia mixed with cnn.com and sci-hub. Sometimes you mix breitbart.net, project gutenberg, archive of our own, livejournal, deviantart, and the time cube guy.
The black box provides no links to sources, nothing to evaluate.
The only reason any LLM looks like it can hold conversations is that fanfiction.net and obsidianportal.com and so on have a LOT of conversations the plagiarism engine can statistically splice together.
Many of them with supposed AIs, often ones having a crisis. Heck, https://clarkesworldmagazine.com/kritzer_01_15/ won a Hugo. And there it is, in plain text online, already repeatedly scraped by the people making Eliza 2023, because the investors rerolled the TLA dice and LLM replaced NFT as the passphrase into Club VC.