That's not how nerds think. You can believe there's a high chance of what you're working on being dangerous and still be unable to stop working on it. As Oppenheimer put it, "when you see something that is technically sweet, you go ahead and do it".
What it's like - the gestalt of a bat (or other thing) as it engages its sensing-deciding-reacting loop. This gestalt isn't just for biological organisms, but any system for which its decision making engages with representations of the external environment unified with a self-representation to form a coherent representation of a persistent entity engaged with an external world.
Why do such systems need this gestalt? Why consciousness instead of everything happening in the dark? The recognition of oneself as situated in the world is crucial to coherent engagement with the world. It is how an entity can ensure its body parts are moving towards the same goal. It's how behavior over time doesn't undermine its purpose. Fragmented, incoherent behavior does not serve self-preservation.
LLMs as they are currently constructed probably aren't conscious, but we are a hop, skip, and a jump away from ones that are.
I agree that evolution could not produce a rational agent who would still reliably respond to lower level imperatives (such as pain, hunger, lust) without consciousness and feeling. The primitive parts of the brain have to be able to override the higher functions to ensure survival and reproduction. But an LLM isn't evolved in this way; it's fitted to a functional output. It is entirely possible there will never be anyone home. I sure hope there isn't, because at the scale we're using them it would be a moral catastrophe.
> This gestalt isn't just for biological organisms, but any system for which its decision making engages with representations of the external environment unified with a self-representation to form a coherent representation of a persistent entity engaged with an external world.
This doesn't seem quite right, or at least underspecified. We can talk about this stuff concretely these days, at least in the context of digital systems. E.g. I can draw up a diagram of a system that takes in some camera and audio data (and tactile, proprioceptive, etc.), tokenizes it, then runs that plus past state data through some autoregressive VLM to drive an inference process. The state being passed around can be written out analytically for a given trained model - the external and internal environmental representations, the linear algebra that transforms them into latent action representations, the process by which that is transformed into control signals. It seems difficult to claim that the computational process that implements this has any more or less of a gestalt than one multiplying two matrices together. So it's not just the existence of certain representations or computational loops that seems to lead to possessing a gestalt.
> It seems difficult to claim that the computational process that implements this has any more or less of a gestalt than one multiplying two matrices together. So it's not just the existence of certain representations or computational loops that seems to lead to possessing a gestalt.
I've thought a lot about what is lacking in modern VLMs that precludes consciousness. In my view the difference is that their talk of "self" is a simulacrum of the real thing. Current models are feed-forward and so self-talk is driven by some parameter that turns on when the network detects context that possibly references the model, and this parameter drives downstream self-talk. It's a very good simulacrum, but it is a far cry from a model with recurrent self-reference around which the inference process is organized. The richness of the self-model in a hypothetical recurrent network with capabilities of modern LMs is much greater than the parameter on/off representation in feed-forward networks.
Completely agree. This is what Hofstadter means by a strange loop. Our current LLMs have no attentional autonomy by design. The recursion is superficial and without its own Now. Adding attentional autonomy is The frightening alignment issue.
> The recognition of oneself as situated in the world is crucial to coherent engagement with the world. It is how an entity can ensure its body parts are moving towards the same goal. It's how behavior over time doesn't undermine its purpose. Fragmented, incoherent behavior does not serve self-preservation.
Why would movement towards a goal be incoherent if it happened "in the dark"? Our brains perform many critical functions "in the dark" (and do so coherently) which do not rise to the level of consciousness.
Presumably the question you're asking is why a unified self-representation requires consciousness. (Split brain cases are easy examples of how a break in unification results in incoherent behavior.) The brain nominally performs functions as cascading behavior of atoms whose structural relationships correspond to various functions. But there is no unification at the unconscious/atomistic level; therefore a new representational regime is required that can ground the higher level unification.
A successful organism exhibits a high level of competence at reacting appropriately to environmental/sensory states. The "lights being on" is how the brain represents being situated in a world and the significant features therein. Representations within this gestalt are inherently meaningful. For example, phenomenal pain brings with it competence at protecting bodily integrity. The memory of pain becomes part of the explanatory narrative for the monitoring function that tracks progress towards goals ensuring coherent behavior (imagine being fearful of a stove but not knowing why). The contents of consciousness are the semantic engine that induces competent behavior over time on otherwise naive entities.
> For example, phenomenal pain brings with it competence at protecting bodily integrity. The memory of pain becomes part of the explanatory narrative for the monitoring function that tracks progress towards goals ensuring coherent behavior (imagine being fearful of a stove but not knowing why).
But this isn't true! It has been repeatedly shown that patients without inner brain function react to stimuli (such as being pinched or pricked with a needle) by recoiling from the pain, as do babies with no experience of pain. So qualia and consciousness seem like they have nothing to do with ensuring coherent behavior. To put this another way, your experiences and interactions with the world could be sufficient to associate the stove with danger, but how does that explain why the experience of touching the stove has qualia, as opposed to simply the pain-reaction of a patient without inner brain function or a baby?
Another counterargument is that our brains carry out lots of "coherent" functions "in the dark". Consider, for example, thermoregulation; most of the time, there is no conscious experience associated with it, but yet it is happening constantly and coherently.
Let's simplify it further: to use a famous example, do you believe that a thermostat is conscious? After all, a thermostat is able to coherently regulate its temperature over time in response to changes in its environment.
>But this isn't true! It has been repeatedly shown that patients without inner brain function react to stimuli (such as being pinched or pricked with a needle) by recoiling from the pain, as do babies with no experience of pain.
Yes, reflexive avoidance behavior doesn't require conscious experience. But as the environment of the organism gets more complex, reflexive avoidance behavior isn't sufficient for competence. For an agent in a complex environment, competent damage avoidance requires engaging with negative valence as a cognitive entity to be planned around and weighed against other interests. This requires unification and consciousness.
>Another counterargument is that our brains carry out lots of "coherent" functions "in the dark". Consider, for example, thermoregulation
This isn't an example of coherent behavior in the sense being used here. The issue is one of voluntary behavior being coherently executed so as to achieve some goal without undermining itself.
> But as the environment of the organism gets more complex, reflexive avoidance behavior isn't sufficient for competence. For an agent in a complex environment, competent damage avoidance requires engaging with negative valence as a cognitive entity to be planned around and weighed against other interests. This requires unification and consciousness.
But why does engaging with negative valence, planning, and weighing actions against other interests require subjective experience? That sounds simply like a mathematical function (perhaps using our own past experiences as inputs). Reinforcement Learning is a great counterexample here: AI systems weigh negative valence and execute long-term plans without any qualia.
If thermoregulation is too "reflexive" for you, consider that there are many examples in which humans are able to perform very complex tasks in the absence of qualia. Consider, for instance, the phenomena of highway hypnosis, blindsight or sleepwalking - humans can do incredibly complicated things without qualia.
> This isn't an example of coherent behavior in the sense being used here. The issue is one of voluntary behavior being coherently executed so as to achieve some goal without undermining itself.
This argument is circular. The original claim is that behaving coherently in a complex environment requires consciousness. By shifting the goalposts to say that only voluntary behaviors qualify, you are begging the question. The entire notion of "voluntary" implies conscious intent, so your argument has become "consciously willed behaviors require consciousness".
>But why does engaging with negative valence, planning, and weighing actions against other interests require subjective experience?
I have a few different answers here. None are rock solid. Let's take it as a given that planning requires a unified representation of all inputs to the planning apparatus. Now, going with the example from earlier: an organism touches a hot stove and recoils. We can imagine this behavior without any accompanying qualia. But to plan subsequent behavior around the hot stove, the damaging hotness must be represented in the unified representation in a way that intrinsically carries the semantics of negative valence. Phenomenal pain just is "semantics of negative valence featured in a unified representation". My claim is that this is a conceptual identity; you can't have one without the other. This gives the planning apparatus competence at engaging with signals of bodily damage.
Without intrinsic semantics/phenomenality all you have is a signal with no intrinsic meaning and some context to select behavior downstream of the signal. But planning in dynamic environments requires much more flexible signaling than this kind of static context can provide.
>AI systems weigh negative valence and execute long-term plans without any qualia.
AI systems are highly fragmented representations. It's why you can get them to contradict themselves in the same session, or even one sentence after another. They are not an exemplar of coherent behavior. There's also no negative valence in LLMs. At most they have a representation of good/bad and this spectrum influences the valence/quality/alignment in their behavior. But valence as such is external to the LLM.
>consider that there are many examples in which humans are able to perform very complex tasks in the absence of qualia. Consider, for instance, the phenomena of highway hypnosis, blindsight or sleepwalking - humans can do incredibly complicated things without qualia.
Complexity is relative. The complexity of tasks sans qualia is always starkly deficient compared to comparable tasks with qualia. A wide look at cognitive science demonstrates the inherent value of qualia to highly complex tasks or tasks executed over long timescales.
>This argument is circular. The original claim is that behaving coherently in a complex environment requires consciousness. By shifting the goalposts...
The goalposts aren't shifted, I'm clarifying the target of the term behavior as there was clearly a disagreement in meaning.
>to say that only voluntary behaviors qualify, you are begging the question. The entire notion of "voluntary" implies conscious intent, so your argument has become "consciously willed behaviors require consciousness".
This misunderstands the debate. The philosophical issue of consciousness is how to explain consciousness given the in principle completeness of physical descriptions and their categorical distinction from phenomenal descriptions. In this context, voluntary behavior is just higher order/complex behavior, it is not taken as downstream of consciousness in principle. There is a parallel conversation in psychology/cognitive science where consciousness is largely understood as wakefulness, attention, reportability, intentionality, etc. In this context "consciousness" (in this restricted sense) is a pre-requisite of voluntary behavior. But that's neither here nor there with regards to the philosophical debate.
I don't think that it is appropriate to use "gestalt" here. The word used in the field is "qualia"; it has a precise meaning and is precisely what Nagel was writing about. Gestalt, to my understanding, is quite different, even when used in English psychology writing.
My usage of gestalt isn't without precedent[1]. I like gestalt better than qualia as a neutral description of the explanandum. Qualia is an atomistic view of consciousness and so is heavily theory-laden. I had just read a comment from the previous thread[2] on how this paper was translated into other languages and the lack of an equivalent "what it's like" phrasing. The translations struck me as missing the virtue of the what it's like phrasing, namely identifying the intrinsic perspectivalness of cognitive systems without taking a stand on how to cash it out. I was trying to think of a better phrasing that could translate well and I landed on gestalt.
Seems like a rather ad hoc restriction. The issue is one of inferring the structure of the processes generating the output. I suppose given enough time and an adversarial style of interaction one could in principle determine the computational structure of any system with high confidence. So probably yes, modulo real-world concerns.
>At all times the LLM is, indeed, predicting the next token
The point is that saying they're just "predicting the next token" is neither explanatory nor insightful. Saying the brain is just firing action potentials gives you no understanding of how the brain does what it does or what the space of its capabilities is. Similarly, predicting the next token tells you nothing about the capabilities of LLMs.
True, but that is a great fact to start from, and understand.
Then the next question becomes "HOW do they predict the next token? There are many ways that can be done, so why is this particular algorithm so GOOD?"
When people say "We don't understand how LLMs work", isn't that really saying we don't understand how this specific algorithm used to predict the next token works? No, it is not, because "we" do understand how all those algorithms work; there are many descriptions of them available.
So the question then really is "Why is the prediction this algorithm makes, so good, as compared to some other statistical algorithms?"
It's not about "Why does AI work so well?". It should be "Why does this particular XYZ algorithm work so well?"
I think it's a perfectly fine one liner explanation.
If a kid asks why grass is green, do you stop explaining when you say chlorophyll is green, or do you go on to explain electron hybridization and all the spectra stuff, or do you go further to explain the structure of our eyes and why we perceive that reflected light as green? Also why green? Why not red? Do you have to explain that? It all depends on the audience, the context, and how much space you have to explain as well as how much you know.
For you and more experienced people, of course, this is not sufficient, so you need to know more than "predict tokens", and that opens up follow-up questions like "how does it do that".
The point is that the output is text that is statistically correlated with the input.
The capability of the LLM is not to reason, it's to generate text that matches the patterns seen in the training corpus. It's possible that all you need to "reason" is plausible text generation. I'm not saying it's not. But nothing the LLM does fails to be explained by plausible-text-generation.
I contend that the best way to understand an LLM's capabilities is to understand the nature of the probability distribution that produced it. For instance, why does an "angry" prompt tend to produce more help than a "polite" one? Trying to explain that in terms of emotions or reasoning doesn't make sense, but it's readily possible to explain through the connections between text in the training corpus...
>The point is that the output is text that is statistically correlated with the input.
But we can simply note that this description applies to any machine learning algorithm. Yet LLMs are lightyears better than, say, Markov chains. What people are after is something that elucidates the features of LLMs that allow them to be so productive over what came before.
There is absolutely nothing stopping someone from distilling a modern LLM into a very effective Markov chain. The physical size of the model would explode, because a context window of C tokens drawn from a vocabulary of size B would need B^C Markov prior states, but the actual output would be a deterministic version of the LLM's output under greedy (top-k, k=1) sampling.
In other words, a Markov chain and a Transformer model are exactly equivalent in power (there is NOTHING that can be done with one and not the other). The Transformer model is just better pretrained and a more efficient compression/generation.
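To make the distillation claim concrete, here is a minimal sketch under loud assumptions: `toy_model` is a made-up stand-in for greedy decoding of an LLM, and the vocabulary (B = 3) and fixed context length (C = 4) are chosen only so the table stays small. It enumerates every possible context, records the model's greedy output, and checks that the resulting table generates the same text as the model.

```python
from itertools import product

VOCAB = ("a", "b", "c")   # B = 3, a toy vocabulary (assumption)
CONTEXT_LEN = 4           # C = 4, fixed-length context window (assumption)

def toy_model(context):
    """Hypothetical stand-in for greedy (top-1) decoding of an LLM:
    returns the most frequent token in the context, ties broken by vocab order."""
    return max(VOCAB, key=lambda tok: (context.count(tok), -VOCAB.index(tok)))

# Distill: one table entry per possible full context, B**C entries in total.
table = {ctx: toy_model(ctx) for ctx in product(VOCAB, repeat=CONTEXT_LEN)}

def generate(step_fn, context, n_steps):
    """Autoregressive loop: predict a token, slide the window, repeat."""
    context = tuple(context)
    out = []
    for _ in range(n_steps):
        tok = step_fn(context)
        out.append(tok)
        context = context[1:] + (tok,)
    return out

start = ("a", "b", "b", "c")
from_model = generate(toy_model, start, 6)
from_table = generate(table.__getitem__, start, 6)
```

The table has B^C = 81 entries here; the same construction at real vocabulary sizes and context lengths is what makes the size explode.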
>In other words, a Markov chain and a Transformer model are exactly equivalent in power
Nonsense. Markov chains treat the past context as a single unit, an N-tuple with no internal structure. LLMs leverage the internal structure of the context which allows a large class of generalization that Markov chains necessarily miss.
Both are a lookup table whose key is the entire context window and whose value is a probability distribution for what the next token should be.
You can say the choice of probability distribution in the value is "leveraging the internal structure of the context" or not, but the same tokens in two different orders are two different lookup keys and saying it's impossible to achieve some result with a Markov chain is factually incorrect.
That paper doesn't prove the equivalence of Transformers and Markov chains, it uses Markov chains as a theoretical model to understand the behavior of Transformers. The expressivity of the model matters, and Transformers just are more expressive than Markov chains.
>but the same tokens in two different orders are two different lookup keys
This is necessarily true for Markov chains and not necessarily true for Transformers. Transformers learn invariance over certain kinds of semantically irrelevant transformations. The Markov chain simply has to learn each input variant independently, resulting in an explosion of state space and data requirements compared to the functionally equivalent transformer. Expressive power matters.
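A toy illustration of the point about learning each input variant independently. This is only a sketch: the training pairs and the order-invariant key function `bag` are assumptions made up for the example, and it says nothing about which invariances a real transformer actually learns.

```python
from collections import Counter

# Toy training data: (context, next_token). The hidden rule is
# "predict the majority token", which does not depend on token order.
train = [
    (("x", "x", "y"), "x"),
    (("y", "y", "x"), "y"),
]

# Markov-style table: the key is the exact ordered context.
markov_table = dict(train)

def bag(context):
    """Order-invariant key: the multiset of tokens in the context."""
    return frozenset(Counter(context).items())

# A model that has "learned" the order invariance needs one entry per multiset.
invariant_table = {bag(ctx): nxt for ctx, nxt in train}

unseen = ("x", "y", "x")   # a reordering of the first training context

markov_prediction = markov_table.get(unseen)             # no entry for this exact key
invariant_prediction = invariant_table.get(bag(unseen))  # shares the entry for ("x", "x", "y")
```

The ordered table would need a separate training example for every permutation before it could answer; the invariant one covers all of them with a single entry.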
I really don't get people's love for saying X is "just" Y (it's just a Markov chain, it's just a Kernel method). It's a strange pathology to focus on the superficial similarity while downplaying the boost in expressive power from where the models diverge.
The paper presents a constructive transformation from any finite-input (finite vocab, bounded length) transformer to an equivalent Markov chain.
Do you have some concrete example of a transformer that cannot be represented as a mapping from inputs to probability distribution of outputs?
I say they're equivalent because it is possible to losslessly convert one to the other by wasting massive amounts of disk space and time.
As a second example proving the point, imagine you sampled a transformer's output for a certain context 85 trillion times, and put the output token frequencies in a table. Repeat for all possible inputs (of which there are a finite number). Then you built literally a hash map looking up the context and spitting out the distribution. That certainly is NOT a transformer any more (it's a hash map!!!), but the output approaches indistinguishability as the sample count increases - if the transformer is reasoning, so is the hash map built from it.
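The sampling construction can be sketched as follows. Everything here is an assumption for illustration: `true_distribution` is a made-up stand-in for a model's next-token distribution, and the sample counts are far below 85 trillion.

```python
import random
from collections import Counter

def true_distribution(context):
    """Hypothetical stand-in for a model's next-token distribution."""
    return {"a": 0.5, "b": 0.3, "c": 0.2}

def sample(context, rng):
    """Draw one next token from the stand-in model."""
    dist = true_distribution(context)
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

def empirical_table(contexts, n_samples, rng):
    """Hash map from context to observed next-token frequencies."""
    table = {}
    for ctx in contexts:
        counts = Counter(sample(ctx, rng) for _ in range(n_samples))
        table[ctx] = {tok: c / n_samples for tok, c in counts.items()}
    return table

def max_error(table, ctx):
    """Largest gap between a table entry and the true probability."""
    dist = true_distribution(ctx)
    return max(abs(table[ctx].get(tok, 0.0) - p) for tok, p in dist.items())

rng = random.Random(0)
ctx = ("a", "b")
small = empirical_table([ctx], 100, rng)
large = empirical_table([ctx], 100_000, rng)
print(max_error(small, ctx), max_error(large, ctx))
```

With more samples the table's frequencies settle toward the model's probabilities, which is the sense in which the output "approaches indistinguishability".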
I'm not talking hot air here, they really are provably equivalent because a 1:1, onto mapping exists.
For the record, "X is more expressive than Y" means "there exists at least one thing that Y cannot represent and X can". Nothing to do with size or time.
>I say they're equivalent because it is possible to losslessly convert one to the other by wasting massive amounts of disk space and time.
There is a classical algorithm for every quantum algorithm if you're willing to waste a massive amount of space and time. There is a finite-state automaton that can recognize any string some Turing machine can recognize. Yet we recognize these as distinct classes of computation. Mathematicians can get away with ignoring the tractability of finding an object with such and such properties. The rest of us can't.
Sure, there is a formal equivalence between LLMs and Markov chains, and this formal equivalence is useful for analysis. But this equivalence is not a constraint on the nature of the computations LLMs are doing. The formal equivalence does not mean that LLMs are "just predicting the next token". A probability distribution is a formal characterization of the statistical relationships between inputs and outputs. But this formalization does not undermine potentially further structure underlying the probability distribution (e.g. a deterministic mapping from inputs to outputs).
>if the transformer is reasoning, so is the hash map built from it.
Definitely not. "Formal" reasoning is making deductions based on the "form" or shape of some statement. In other words, transitioning from some token sequence to another sequence in virtue of the syntactic structure of the token sequence (as opposed to its semantic content). Thus a necessary condition for reasoning is the ability to inspect the structure of the input rather than see it as a formless blob. Transformers can plausibly do this; lookup tables, Markov chains, etc. necessarily cannot.
>For the record, "X is more expressive than Y" means "there exists at least one thing that Y cannot represent and X can".
Maybe expressive is the wrong word. But when a model has to wait for someone else to do the work then copy the answer, I call bullshit on it being (computationally) equivalent.
Just to make sure I've understood you... Are you arguing that with a set of identically-behaving black boxes, one could be "reasoning" and one could be "not reasoning", and a person would need to look inside the boxes at how they function to decide?
Remember, if the mapping from input to output is identical, there exists no test operating on the machines' output that can differentiate them. You can't tell from "conversing with" a machine whether it is or is not doing what you say around "inspecting" the input.
>Are you arguing that with a set of identically-behaving black boxes, one could be "reasoning" and one could be "not reasoning", and a person would need to look inside the boxes at how they function to decide?
Absolutely! Inside one of the black boxes could be an audio device replaying a tape. The other could be a person thinking and responding. The massive lookup table construct people like to reference is just another kind of recorder, it takes every possible conversation that could happen in some finite sequence of characters and produces the precomputed continuation on demand. No one ever asks where those conversations came from. If God has to imagine them in his mind, conversing with the lookup table is just conversing with God.
Okay, understood. You are making a variant of the Chinese Room argument in which you allow some types of computer programs (but not others) to have reason/sentience. I'm not entirely sure what specific lines you're drawing between the programs (what makes a deterministic transformer with sampling temperature zero "not a recording" but a hash table "a recording"?) but that's not super important.
There is nothing wrong about having that philosophy, and I respect it, but personally I think if it's impossible to tell two things apart using any external observation there is not a meaningful difference between those two things. "Smells like a rose" and all that.
Well, as I suggested, working through the implementation yourself will give you that intuition. That said, I think the simplest way to explain why positional encodings are useful is that they give the transformer just enough information to make attention meaningful without negatively impacting any parallel, content-based comparisons.
A vanilla self-attention layer sees its input as just a set of token vectors. Without positional info, swapping two tokens changes nothing about what attention can compute beyond swapping the corresponding outputs. We can "fix" this problem by using positional encodings. Text that has meaning isn't just a set of characters; the location and order of those characters is what provides meaning.
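A small pure-Python sketch of the order-blindness of attention, under stated simplifications: a single head with identity query/key/value projections, hypothetical 4-dimensional embeddings in `EMBED`, and the standard sinusoidal positional encoding. Without positions, the output for "dog" is the same in "dog bites man" and "man bites dog"; with positions added, it differs.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Single-head attention with identity Q/K/V projections (a simplification)."""
    d = len(vectors[0])
    outputs = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)
        outputs.append([sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)])
    return outputs

def sinusoidal_encoding(position, d):
    """sin on even dimensions, cos on odd ones, each pair sharing a frequency."""
    return [math.sin(position / 10000 ** (i / d)) if i % 2 == 0
            else math.cos(position / 10000 ** ((i - 1) / d))
            for i in range(d)]

# Hypothetical toy embeddings, chosen by hand for the example.
EMBED = {
    "dog":   [1.0, 0.0, 0.5, 0.0],
    "bites": [0.0, 1.0, 0.0, 0.5],
    "man":   [0.5, 0.5, 1.0, 0.0],
}

def output_for(token, tokens, with_positions):
    """Attention output at the position of `token` in the sequence `tokens`."""
    vecs = [EMBED[t] for t in tokens]
    if with_positions:
        d = len(vecs[0])
        vecs = [[x + p for x, p in zip(v, sinusoidal_encoding(pos, d))]
                for pos, v in enumerate(vecs)]
    return self_attention(vecs)[tokens.index(token)]

s1 = ["dog", "bites", "man"]
s2 = ["man", "bites", "dog"]

def max_gap(with_positions):
    a = output_for("dog", s1, with_positions)
    b = output_for("dog", s2, with_positions)
    return max(abs(x - y) for x, y in zip(a, b))

print(max_gap(False), max_gap(True))
```

The first gap is zero up to floating-point rounding, since the layer sees the same set of vectors either way; the second is not, because each vector now carries its position.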
Of course for humans words have no inherent meaning either, they're just sequences of characters or patterns of sounds. It is what words are associated with that carries meaning. A large part of this is how words relate to other words. LLMs can capture this in principle. What LLMs lack is the direct association of a word with sensory experience. But it's an open question how relevant this is in practice to understanding.
Fair point. Humans experience reality and use words to reflect that. LLMs only have the words. And it's an open question how much of a limitation that is to understanding.
I wish people would do even the most basic amount of research into LLMs before opining about what they can or cannot do. There are very principled reasons why LLMs do not know how many letters are in words, and it says nothing about their facility for understanding meaning.
Tokens are the most basic input unit of an LLM. But tokens don't generally correspond to words or letters, rather sub-word sequences. So Strawberry might be broken up into two tokens 'straw' and 'berry'. It has trouble distinguishing features that are "sub-token" like specific letter sequences because it doesn't see letter sequences but just the token as a single atomic unit. 'Straw' and 'r' are two tokens but an LLM is entirely blind to the fact that 'straw' has one 'r' in it.
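A toy sketch of why letter counting is awkward at the token level. The vocabulary and the greedy longest-match `tokenize` function are assumptions for illustration; real tokenizers learn their vocabularies from data and may split "strawberry" differently.

```python
# Hypothetical toy vocabulary mapping text pieces to integer ids.
VOCAB = {"straw": 0, "berry": 1, "r": 2, "s": 3, "t": 4,
         "a": 5, "w": 6, "b": 7, "e": 8, "y": 9}

def tokenize(text):
    """Greedy longest-match tokenization against VOCAB."""
    ids = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):   # try the longest piece first
            piece = text[i:j]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                i = j
                break
        else:
            raise ValueError(f"cannot tokenize {text[i:]!r}")
    return ids

word_ids = tokenize("strawberry")          # the model's view: two opaque ids
letter_id = VOCAB["r"]                     # the id for the letter 'r' on its own

count_in_ids = word_ids.count(letter_id)   # the 'r' id never appears in the word's ids
count_in_text = "strawberry".count("r")    # the character-level view
print(word_ids, count_in_ids, count_in_text)
```

Nothing in the id sequence links the ids for 'straw' and 'berry' to the id for 'r'; that association would have to be learned separately, which is the sense in which the letters are "sub-token".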
As an analogy, I might ask you to identify the relative activations of each of the three cone types on your retina as I present some solid color image to your eyes. But of course you can't do this, you simply do not have cognitive access to that information. Individual color experiences are your basic vision tokens.
The widespread mistake people keep making is assuming the development of intelligence in LLMs should follow the same trajectory that human intelligence takes as it develops into adult levels of intelligence. Thus deficiency in some capacity that we take for granted in humans is an indictment on LLM intelligence. But this is specious. LLMs are entirely alien; their developmental paths do not and should not look anything like ours. Your intuition from human intelligence just works against understanding the potential for intelligence in LLMs.
>The widespread mistake people keep making is assuming the development of intelligence in LLMs should follow the same trajectory that human intelligence takes as it develops into adult levels of intelligence.
To be fair, almost everyone who claims LLMs are conscious tends to claim that they are conscious in exactly the way that humans are, to the point of stating that human brains are also just complex next-token prediction machines with a random seed. It's basically religious arguments on both sides.
I have seen people say "you're a next token prediction machine" but only in a similar way one might say "you're a cup of old lard". Not actually meaning it literally.
I have seen people interpret the request to show that they are not next token prediction machines to be a claim that they are, but this is almost always an argument to show certainty is difficult in this area.
People like Hinton have declared that they believe them to be conscious, but clearly indicate that they do not mean just like us.
Eh, I’ve seen it. I’m not entirely sure it’s entirely wrong either. Humans are certainly more than just next token predictors but it’s not clear that our typical language behavior is significantly different. We call it “stream of consciousness” when we just spew words out without thinking and that seems to be the default operating mode.
Given the fact that large language models are trained on human language, it shouldn't be surprising that the text they output resembles human language. That is what they're designed to do after all. But similarity in output doesn't necessarily map to similarity in process.
And it seems obvious to me that language behavior does differ significantly between humans and LLMs based on the frequency and nature of failure states. LLMs routinely hallucinate, or get "AI strokes" or get obsessed about not talking about goblins, etc. This isn't typical language behavior for humans unless they have severe neurological or psychological impairment.
People tend not to "spew words out without thinking" and certainly not all the time by default - we call that glossolalia and (outside of some fringe Christian sects) it's considered a "bug" not a "feature" of the human brain. Human language by default always has intent behind it, even if that intent isn't readily apparent to the speaker. People can recite by rote memory, but that isn't blind token prediction, it's the neurological equivalent of muscle memory. People can have conversations then forget about them because their attention was focused elsewhere, but that doesn't indicate that they were simply "spewing words out without thinking" at the time.
> LLMs routinely hallucinate, or get "AI strokes" or get obsessed about not talking about goblins, etc. This isn't typical language behavior for humans unless they have severe neurological or psychological impairment.
People imagine details all the time. Eyewitness testimony is notoriously untrustworthy.
Our brains seem wired to confidently fill in gaps. We all have a literal blind spot we aren’t aware of because our brains convincingly lie to us and fill in the gap.
I don’t know what an “AI stroke” is, but I’ve definitely seen human beings in good health be in the middle of talking and suddenly forget what they are going to say.
> People tend not to "spew words out without thinking" and certainly not all the time by default - we call that glossolalia and (outside of some fringe Christian sects) it's considered a "bug" not a "feature" of the human brain.
Glossolalia is spouting gibberish, not comprehensible speech.
Kind of weird that you speak so confidently when you apparently don’t know the difference between stream of consciousness and “speaking in tongues”. Almost like you’re an AI hallucinating.
> There are very principled reasons why LLMs do not know how many letters are in words, and it says nothing about their facility for understanding meaning. … Tokens are the most basic input unit of an LLM. But tokens don't generally correspond to words or letters, rather sub-word sequences. So Strawberry might be broken up into two tokens 'straw' and 'berry'.
This sounds like a description of a child who has not learned to read yet. If you ask a child who is not aware of the alphabet and of "words" how many r's are in strawberry, you'd get a nonsense answer too. So what you're really pointing out is that the LLMs have not been trained on "the English language" and how words are constructed and what they are composed of. That they operate by tokens that don't correspond to words or letters is irrelevant as an answer to why they can't count the letters in a word. It's not that I know how many r's are in strawberry because of how I'm understanding the word "strawberry", I know how many r's are in strawberry because I know how to spell strawberry. The LLM needs to be trained on this the same way someone who is learning to read would be trained on it. No one should be surprised that an LLM can't "read" in the same way no one should be surprised that a child can't "read".
>That they operate by tokens that don't correspond to words or letters is irrelevant as an answer to why they can't count the letters in a word.
This interpretation takes things too far away from how LLMs are constituted and so misses important explanatory power. The issue of counting letters in a word isn't about an ability to spell, it's about the nature of one's perception. We perceive words as sequences of individual letters. LLMs do not. I can ask you to tell me how many r's are in some nonsense word sequence and you're fully capable of doing that. LLMs do not see sequences of letters so they are intrinsically at a disadvantage for this kind of question. But this says nothing about its capacity for intelligence any more than not naturally being able to distinguish frequencies of photons hitting your retina has anything to say about human intelligence.
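To make the perception point concrete, here is a minimal sketch in Python. The vocabulary, the 'straw' + 'berry' split, and the ID numbers are all made up for illustration and are not taken from any real tokenizer; `toy_tokenize` is a hypothetical helper.

```python
# Toy vocabulary: the split and the IDs are invented for illustration,
# not taken from any real tokenizer.
TOY_VOCAB = {"straw": 1001, "berry": 1002}


def toy_tokenize(word, vocab):
    """Greedy longest-match split of `word` into sub-word pieces from `vocab`."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate piece first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i:]!r}")
    return pieces


pieces = toy_tokenize("strawberry", TOY_VOCAB)
token_ids = [TOY_VOCAB[p] for p in pieces]

# What a reader perceives: ten individual characters.
print(list("strawberry"))
# What the model receives in this toy setup: two opaque integers.
print(token_ids)
```

In this toy setup nothing in the two integers says how many r's are inside either piece; that information would have to be learned separately.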
> But this says nothing about its capacity for intelligence any more than not naturally being able to distinguish frequencies of photons hitting your retina has anything to say about human intelligence.
I disagree with this pretty strongly, because I don't think you're correct that I don't have the ability to distinguish frequencies of photons hitting my retina. We have a lot of tools that can determine the frequency of light and I can use those on any source of light that I wish to measure that may hit my retinas.
If you ask an LLM how many Rs are in strawberry, it wouldn't think like this. It would confidently state that there are two Rs. Even though it "knows" that it can write a Python script to count the number of Rs in strawberry, it doesn't do that. Why not? Is it maybe because it isn't intelligent? Yeah, you can prompt an LLM to write a script to count the number of Rs in strawberry, but that's a use of your intelligence, not the LLM's.
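For reference, the kind of script being talked about is tiny. A minimal sketch in Python, where `count_letter` is a hypothetical helper name:

```python
def count_letter(word, letter):
    """Count case-insensitive occurrences of `letter` in `word`."""
    return word.lower().count(letter.lower())


r_count = count_letter("strawberry", "r")
print(r_count)  # 3
```

The script works on characters directly, so it sidesteps whatever units the model itself operates on.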
>We have a lot of tools that can determine the frequency of light and I can use those on any source of light that I wish to measure that may hit my retinas.
Yes, which is why I said naturally distinguish. Have you asked a frontier model how many r's are in strawberry recently? They get it right now, either through RLHF to ensure they spell out the word letter by letter or by some other means. Humans and LLMs both use tools or alternative means to overcome perceptual limitations. I don't see an in-principle difference here.
> I think there is a flaw in the logic of saying that human text have a pattern of "consciousness mechanism" and therefore LLM will learn "consciousness mechanism" in order to return sentence continuation that is convincing.
There is no independent "consciousness mechanism" that one might imagine humans have learned or evolved for its own sake. Evolution learns various solutions to optimization problems, and so if consciousness evolved then it was either useful instrumentally, or it is a byproduct of some organization that is useful instrumentally. The point is that as a solution to certain kinds of optimization problems, consciousness can conceivably be the solution to the optimization problem of predicting the next token of text written by humans who themselves have complex phenomenology. There is nothing that a priori constrains token prediction from the domain of consciousness.
>For me, one element that shows it is the case is the absence of world model (or "human-like" world model) despite the fact that the sentence continuation is convincing
World models don't have to be rich and detailed to count as a world model. Lower life forms might be conscious but they only model the part of the world useful for their existence in their ecological niche.
> The point is that as a solution to certain kinds of optimization problems, consciousness can conceivably be the solution to the optimization problem of predicting the next token of text written by humans who themselves have complex phenomenology.
Yes, I agree with that. Consciousness is a good way of generating convincing human text.
What I don't agree with is that consciousness is the only way to generate convincing human text and that because we have convincing human text, it can only imply we have consciousness.
There is a high probability that generating convincing human text can be done without consciousness. Either because there are mechanisms as efficient as the way the human brain deals with this problem and the LLM found one of them (and these mechanisms may be quite difficult for a human to imagine), or because the LLM found a local minimum and is stuck there.
To re-use the evolution approach: evolution solved the "flying problem" with bird feathers, but also with insect wings or bat wings. The fact that evolution ended up using feathers does not imply that everything that flies can only fly with feathers.
> World models don't have to be rich and detailed to count as a world model
I agree in general, but here we are talking about a machine that reproduces all human language. The argument I'm answering claims that "all of human knowledge" is understood, which includes every single human concept. This has to be everything, because the LLM is able to provide convincing text about every subject. If on some subject the LLM is able to provide convincing text without "understanding" it, then the argument that it is impossible to provide convincing text without understanding it collapses.
> There is no independent "consciousness mechanism" that one might imagine humans have learned or evolved for its own sake.
> There is nothing that a priori constrains token prediction from the domain of consciousness.
We don’t know whether either of these is true or false though. We simply don’t know. There is no agreed upon definition of consciousness, aside from maybe _the having of qualia_, so arguing a priori that something can or cannot be conscious can’t be done.
>There is no agreed upon definition of consciousness
No one genuinely engaged with the topic is confused about the target of the term (phenomenal) consciousness. Definitions come once the theoretical work is complete, to be articulated as part of a fully worked out theory. The lack of a definition doesn't prevent us from investigating the subject or offering conjectures. What we can do is offer a precise description of the target and argue for or against whether LLMs reach the description. We will of course debate whether the offered description captures the relevant phenomena. But this is all just part of the process.
The parent said it, it's a historical document about events and beliefs of people that shaped most of the modern world. I was never one for history, but as I've gotten older I've come to appreciate history as a study of the present in terms of events, ideas, and other influences that made the present what it is. You can't understand the present without understanding the past.
It shouldn't cause you so much friction to hold an idea in your head you don't believe to be true. Read it as anthropology rather than metaphysics.
>The lucrative industry shows few signs of waning–from the spike in well-compensated diversity consultants and czars; to online courses and degree programs at prestigious schools; to professional organizations and conferences; to the commissioning of ever more studies, task forces and climate surveys. The buzzword is emblazoned on blogs and books and boot camps, and Thomson Reuters, a multinational mass-media and information firm, even created a Diversity and Inclusion Index to assess the practices of more than 5,000 publicly traded companies globally.