I believe what is missed here is that the brain assimilates things outside of it (not in the physical sense, of course). Use a hammer for some time and the brain starts to dedicate some networks to simulating the hammer internally and integrating it as part of your body. The brain starts using the hammer the same way it uses your hand. It becomes part of what the higher-level processes use to read from and to manifest into the world.
Nothing new here, this is called tool embodiment. A little time after assimilation, you stop consciously thinking about the tool. You are the hammer, the hammer is you.
So what is being missed here is that the brain operates on, well, mental constructs. Ideas and ways of thinking. But those are not the world or the brain, those are tools.
The higher-level processes in your mind use ways of thinking the same way they use the body. It's unconscious (because the brain has been doing this for long enough) and automatic. The brain just gets guided by the tools. It wants to hammer that nail.
So, what does it have to do with crossing the street and not being able to transmit this knowledge?
You can’t transmit the incorporation. You can describe how to do things, how to think about things, but you can’t reconnect other people’s neurons to establish a way of thinking or a tool as part of the brain image of the self. Yet.
You can’t teach a baby how to embody his spine. You can’t teach someone how to become his thoughts. But you can certainly guide them in the use, and this usage will build the neural networks. Once established, they’ll get it.
When I was 17 I was hired by a startup to write a book. The end product was a complete disaster (don't hire a 17-year-old to write a book; also, don't enter into contracts with 17-year-old high school students w/o informing their parents).
The book was on 3d modeling in Rhino 3d. I was really good at Rhino3d at that time, to the extent that using it felt like a natural extension of my hands. IMHO every other 3d modeling program has a trash UI compared to the absolutely amazing UI that Rhinoceros 3d has.
I had to learn how to translate my absolute love of Rhino3D onto a page and explain it to other people. It was hard. It made my brain work in ways it was not used to, but it was an incredibly valuable experience.
The only remaining copy of the book sits behind me on a bit-rotted CD-R.
I have had 3 types of math teachers in life. American teachers, who generally teach rules from a book according to a curriculum. Russian teachers, who have a passion and a love for the field and who teach how to intuit the answer to math problems first before going all in on the formulas. And East Asian math teachers, who show off the beauty of the equations themselves.
I had one math teacher who couldn't speak English. He didn't need to, he had an incredible ability to communicate math through pure equations. It was lovely, one of the best math classes I've ever had. Math was truly used as a universal language.
I had another teacher (Russian) who got so excited solving equations and explaining DiffEq that he'd break his chalk in half and he'd go diving under desks to pick up the pieces.
But it is artists who are some of the best at transmitting intuitive knowledge. They have centuries of best practices for training students to rewrite their brains to literally see the world differently. (And yes, a lot of it does involve drawing boring still lifes of fruit bowls! But, hey, it works)
And this is also why all institutions rot over time. When the original experts and founders are gone, only the codified superficial knowledge remains. The result is bureaucracy coasting on the resources and reputation built by the original founders. Works until it encounters real problems that are not solved by superficial rules.
Right... Countless times I see people struggle with something, then complain about the fact that nobody wrote anything down ahead of time to hand-hold them through the problem. As if the old experts would've known to write that. And oftentimes they did write stuff, but it either wasn't read or it ran into this issue of instruction being incapable of transmitting everything you actually need.
Reminds me of when I was learning to drive. Used to get very creative with my mistakes. Also reminiscent of speaking a foreign language; you may know all the relevant grammar rules, but fail to apply them in real-time.
That said, procedural knowledge remains suspect because it could hinge on a cheap mental shortcut. A very experienced pedestrian may unconsciously make terrible decisions based on overfitted training data or causally irrelevant variables. Expressing the model "formally" can help expose those terrible decisions (e.g. I feel good about 1980s Fiat Pandas, so I cross more confidently when I see them...). Problem is that introspection does not always work well.
I would call the difference intuitive knowledge versus rational knowledge.
I've never seen the word calibration used this way:
> different modes of learning. The first is instruction: the transfer of explicit models, rules, and relationships from one person to another through language. The second is calibration: the development of internal models through repeated exposure to feedback in a specific environment.
> Judgement is learnable through calibration. It is not transmissible through instruction.
Unfortunately the word "intuition" has been debased.
You raise an interesting question. How do we keep the meanings of words from diverging so dramatically and so rapidly?
A little bit is natural and expected, but this kind of change in meaning feels like a consequence of a culture that in the last decade has accelerated the practice of re-framing specific words and concepts as something that's "actually a positive" or "actually quite negative if you think about it".
Part of this is a result of our (in the US) culture wars and hijacking of popular terms, but it's also a symptom of social media culture that's always seeking a hot take and creators who are looking to distinguish themselves with (what seems to me) clever re-framing.
The result is a culture that is increasingly fragmented and in which a word can have dramatically different meanings and insinuations depending on its use in certain social groups or intellectual cliques.
It increasingly feels like I need to download a massive amount of linguistic context before I step into the world of a niche online community because their tight-knit dialogues and shared experiences have now re-framed a word or concept that was largely understood to mean something else.
It's always been like this, just on a smaller scale. Every time you join a group, some people can read the room, learning and sensing the cultural implications, while others step in all the landmines and don't even hear the explosions. How do you do this? Not sure how to explain it, mostly calibration through experience!
> How do we keep the meanings of words from diverging so dramatically and so rapidly?
We don’t engage. It’s the only shot we have.
There was a useful article at 404 Media recently about our failure to prevent those on the extreme edges of culture from normalizing their language and behavior: We Have Learned Nothing About Amplifying Morons[0]. See the article, but essentially by engaging we cede ground. Sorta like how both-sides journalism gives space to anti-science nuts and lets them spread falsehoods.
0. https://www.404media.co/we-have-learned-nothing-about-amplif...
I believe the author was arguing that “calibration” is also rational but cannot be transmitted. You cannot learn it from reading or following a framework. Books and frameworks are too lossy. The author cited doctors in their residency as an example of this second mode of learning. They are learning from hands-on experience what other doctors also learned before them. With residency, there are others who oversee the residents.
You're arguing against something I wasn't trying to imply.
Choosing a good abstract dichotomy is hard (mine is also faulty, as you have noted).
They chose "instruction" versus "calibration" which I feel is a terrible splitting plane (muddying whatever they are trying to articulate).
I have been fascinated listening to a smart nursing friend of mine explain some of the intuitions they learnt through observation (not explicitly taught). I believe they had an outlier skill for noticing patterns. They might have been able to teach the patterns they saw, but they probably couldn't teach the skill of discovering patterns ≈intelligence.
Language changes, though. He's directionally correct about calibration. People have some intuition about how something works, and then calibrate this through feedback.
I think despite the presentation this has some good ideas in it:
1. Formally calling out a concept for judgment-based skills that cannot be easily taught. I think everybody understands this, but having a word for this would be useful.
2. Opening up the conversation on the topic of which types of skills can/should be codified and how.
That said, everything else in the article is suss. "Dimensionality" is largely a distraction to try to sound smart, most of the claims in the article are unwarranted (e.g. processes & checklists can be great, even for disciplines with true experts like airline pilots).
For example saying that skill learning cannot be accelerated is just patently false in many domains -- take something like learning chess. If you have a coach and other tools you'll learn a LOT faster. But certainly I've worked places that wished they could automate away reliance on experts because it gives organization power to a few non-management individuals.
> “Street smarts” refers to models that are too high-dimensional for linguistic transmission and were therefore acquired through calibrated experience. The street-smart person cannot explain why they know what they know, which makes them look inarticulate to the book-smart person, which leads the book-smart person to conclude that the street-smart person’s knowledge is inferior. This conclusion is precisely backwards in domains where judgement matters. The inability to articulate the model is not evidence of a crude model. It is evidence of a model too sophisticated for the transmission channel.
I disagree to a degree. Yes, what the author says is accurate about people dismissing street smarts as a lower level of intelligence than it deserves. But a sufficiently skilled communicator can absolutely articulate many of the factors being evaluated when they judge a situation and how their decision-making process works.
> They evaluate intelligence through the lens of articulacy
There was an earlier instance of the author using a word such as unability (or similar) and it should have been inability and I let it go, but this misuse of language is making my head hurt. However, I confess that I thought the word should have been articularity and it turns out that’s not a real word either. But I at least pay attention to spellcheck. I don’t understand how someone could take the time to write a long and thoughtful essay about intelligence and not use spellcheck to proof it.
> But a sufficiently skilled communicator can absolutely articulate many of the factors being evaluated when they judge a situation and how their decision-making process works
That sounds right but I suspect it is wrong. Watching smart intuition has been a personal interest of mine for years. Few people avoid the manifold traps.
1: people hallucinate their reasoning or are self-deceptive (or even intentionally deceptive). Watching AI has helped hone watching people.
2: you need to be sufficiently close in skills and language for someone to be able to communicate the nuances. E.g. sportspeople.
3: Judging whether an intuitive statement is true is hard. We need to identify a correct intuition (and ignore incorrect intuitions) before judging whether some explanation is valid.
What is wrong in that quoted sentence? Do you mean "articulacy" should instead be "articulateness"? "Articulacy" is also a word, and correct in this context.
I have terrible news for you. Linguistics is descriptive, not prescriptive. We will torment you with word game playing until such time as you loosen up.
This jibes with something that’s occupied my brain a couple times in the last year: the separation of art and science.
Science is empirical knowledge and processes which can be transferred, art is gut feeling and subconscious knowledge applied automatically, which can’t be transferred.
Roughly I think this corresponds to how our minds perform cognitive offloading of repeated tasks. New tasks that require instruction following occupy our attention, but the more we do them, the more our minds wire the behavior into our “muscle memory”. Practitioners of the arts (or even the art of science, one might say) have built a neural network that offloads tasks so that higher cognitive functions can focus on applying those tasks in expert ways.
It’s sort of like when we start out our brains have to bitbang all tasks (muscle movement, speech, etc.) but over time our brains develop their own TCP offloading, or UART peripherals. And you can’t just download a TCP offloading engine, it has to be built into the silicon. Hence why “expert knowledge” isn’t transmissible.
Which is why spaced repetition is an effective learning method. You’re hacking your brain to wire facts into the hardware.
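The "hack" is easy to make concrete. Here is a toy sketch of a spaced-repetition schedule (my own illustration, not any particular algorithm such as SM-2): each successful review multiplies the gap before the next one, so a handful of reviews ends up covering a long span of time.

```python
# Toy spaced-repetition schedule: each successful review widens the gap
# before the next one, rehearsing the fact just before it would fade.
def schedule(reviews, first_gap=1, factor=2):
    days, day, gap = [], 0, first_gap
    for _ in range(reviews):
        day += gap          # the next review lands after the current gap
        days.append(day)
        gap *= factor       # widen the gap after each successful review
    return days

print(schedule(5))  # -> [1, 3, 7, 15, 31]: five reviews span a month
```

The geometric growth is the whole trick: the per-day cost of keeping a fact shrinks toward zero while the retention interval keeps doubling.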
Thinking about this in the context of machine learning... We can discover the dimensions and the relationships between them through training over a set of examples.
What we are generally getting though is a network with extremely high dimensionality trained on many domains at once, at least as far as the commonly used ones like LLMs and VLMs.
We do have mixture of experts which I guess helps to compress things.
Going back to the idea that this stuff just can't be represented by language, I wonder if someday there could be a type of more concise representation than transmitting for example a LoRA with millions of bytes.
Maybe if we keep looking at distillation of different models over and over, we might come up with some highly compressed, standardized, hierarchical representation that optimizes subdomain or expert selection and combination to such a degree that the information for a type of domain expertise can be transmitted, maybe not orally between humans, but at least in a very compact and standard way between models.
I guess if you just take something like a 1B 1-bit model, build a LoRA for a very narrow domain, and then compress that; that's something like the idea. Or maybe a quantized NOLA.
But I wonder if someday there will be a representation that is more easily interpretable like language but is able to capture high dimensional complex functions in a standard and concise way.
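For a sense of scale, the parameter arithmetic behind a rank-r adapter is simple (the matrix sizes below are made up for illustration, not taken from any real model): instead of shipping a full d_out×d_in weight update, you ship two thin factors B (d_out×r) and A (r×d_in), and the compression ratio grows as the rank shrinks.

```python
# Parameter count of a dense weight update vs. a rank-r LoRA
# factorization. Dimensions are illustrative, not from a real model.
d_out, d_in, r = 4096, 4096, 8

full_params = d_out * d_in          # dense update: one big matrix
lora_params = r * (d_out + d_in)    # two thin factors, B and A

print(full_params, lora_params, full_params // lora_params)
# -> 16777216 65536 256: a 256x smaller artifact to transmit at rank 8
```

So even before any further compression or quantization, the low-rank form is already a drastically narrower channel than the full weights, which is what makes the "compact standard representation between models" idea at least plausible.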
This has me considering law school vs. something like officiating a sport. A foul in basketball or a balk in baseball is very much a judgement call by the respective official: kinda you know it when you see it. You can write down what a foul or a balk is, but if you play by the letter of the rules in place, you will likely have a miserable game and it will be entirely the official's fault. You can also have a game where clear "fouls" are being committed but not called unless they go beyond what has been set as the baseline acceptable level of play (not in the spirit of defense or whatever), and it could be considered a wonderful game by all parties involved (let them play, in action).
Kinda has me wondering about the implications of the bar exam being the be-all and end-all of law school. Contrast it with a doctor's residency and I think law school is very much crafting an overly binary right/wrong profession. Perhaps they should have something beyond it, more akin to officiating a sports game, where they see the potential implications of being too stringent in applying their rule system when there is certainly room for being charitable.
It is a complex issue though, because the charitable interpretation of a law gives way for bad actors to abuse that interpretation.
Now bridge this all with all the weird 1st-level and 2nd-level stuff surrounding medicine that is placed there by people outside the field and imposed on an expert in it. They have to apply their expertise to the patient, decide a course of action, and then describe that action in those 1st/2nd-level terms to a non-expert who for some reason is the deciding authority, despite downgrading the expert's actual thoughts by design. I know I'm all over the place, but it was a pretty good article that made me think about a lot of different applications of the ideas.
Fully agree, and this is something I only fully realised quite late in life.
One of the implications is that at any given point in time, the vast majority of human knowledge is living in people's brains and cannot be stored. The seemingly ineluctable and almost mechanical progression of technology is happening on a thin margin between generational losses.
isn't that also related to active vs passive knowledge?
active knowledge i can produce on command. passive knowledge only comes up when it is triggered from the outside.
a lot of things we know are only accessible when they are triggered. i could not describe the path through a forest, but i know it when i walk it. same for the process to solve a particular problem. i could not describe the whole way. only if i follow the steps can i remember the next steps. and if there are multiple possible steps at some point, only for the choices that are actually triggered will i remember what comes after.
you can follow me and learn the steps by observing me, but you will only learn the steps that we actually do. and i couldn't possibly give you a list of all other potential steps.
in the latest video by tom scott, he watches people creating bells. the guy observing the molten metal knows when it has the right temperature by just looking at it. he learned that from years of practice, and his future replacement will learn it by observing him.
Absolutely agree. And this is why America cannot simply bring outsourced industries like manufacturing back. Expertise is built through hard-won experience and there is no easy way to transfer or replicate it. And this is why the best forms of instruction are akin to apprenticeship. You can’t teach expertise through a book, but you can guide a person to develop their own expertise and speed up the process.
Saying certain things hurts them, like a scientist's knife that opens up an animal for study but also kills it. Thus certain knowledge about life cannot be expressed in an analytical way: it loses the very subject it tries to catch. But if you use the knife merely to point to the animal... that is, if you use words not as an explanation but as a means to guide the listener's attention, then there is a chance to convey that understanding. A parable works this way; so does a work of real art.
"Language is a serial, low-bandwidth channel. It transmits one proposition at a time, sequentially. Each proposition can relate a small number of variables: “if X and Y, then Z.” Complex conditionals can extend this to perhaps five or six variables before the sentence becomes unparseable: “if X and Y but not Z, unless W and V, then Q.”
...
"This is not a claim about human cognitive limitations. It is an information-theoretic claim about the channel capacity of natural language relative to the complexity of the models being transmitted."
No.
This definitely is about human cognitive limitations. Consider *why* that 6-variable rule is unparseable: Human working memory is typically about 7 items. His unparseable example has by a generous accounting 10 entities--6 variables and 4 relationships. (And by a strict accounting you need to count the "then", for 11 entities.) Very, very few humans could learn that. Knowledge must be chopped up into chunks small enough to fit into our working memory to be understood and incorporated into our models.
Once one chunk has been incorporated into our model we can then add another chunk to it, repeating until we have built up a complex model. And it's not just read, store, read another--each bit must be worked with to actually be modeled. (But this process does not need to be strictly linear--to take his stupid pedestrian, they can separately learn speed, distance, stopping distance etc. and tie them together. But if his pedestrian cares whether the driver is inattentive his model is wrong--why are you doing something that involves a moving driver being aware of you in the first place??)
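There is also a back-of-envelope way to see why the sentence becomes unparseable so quickly (my arithmetic, not the article's): a complete rule over n yes/no conditions is one function out of 2^(2^n) possibilities, so pinning one down exactly can take up to 2^n bits of truth table. Two or three variables fit in a sentence; six already do not.

```python
# How fast the space of possible "rules" over n yes/no conditions
# blows up. Specifying one rule exactly can require up to 2**n bits
# (the full truth table), far beyond one spoken conditional.
for n in (2, 4, 6):
    table_rows = 2 ** n              # condition combinations to cover
    possible_rules = 2 ** table_rows # distinct boolean functions of n vars
    print(n, table_rows, possible_rules)
# n=6 already gives 64 rows and 2**64 possible rules, which neither a
# single sentence nor ~7 working-memory slots can index into directly.
```

Either way you account for it, the chunking strategy is forced: the only way through the channel is one small, learnable piece at a time.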
"The second, more fundamental reason is that the relevant features are not given in advance. A large part of what the expert learns through experience is which features of the environment matter."
Yes.
"This is the deepest reason why experience cannot be compressed."
No. Finding relevant features and compressing their valuation is what ML systems do. Especially in vision systems. Early attempts at machine vision had human-chosen features and detectors for them. There used to be papers with suggested features - horizontal lines, vertical lines, diagonal lines, patterns of dots, colors, and such. That's where things were in the 1990s.
Those have now been replaced by learning-generated low level feature detectors, which work better.
This does need a lot of training content. The training process is inefficient. It definitely compresses, though.
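For contrast, here is roughly what one of those 1990s hand-chosen detectors amounted to: a fixed kernel someone wrote down, correlated with the image. (The kernel and toy image below are my own illustration of the style, not code from any particular paper.)

```python
# A hand-crafted horizontal-edge detector of the kind that learned
# low-level features replaced: a fixed 3x3 kernel correlated with
# the image, firing where a dark region meets a bright one.
KERNEL = [[-1, -1, -1],
          [ 0,  0,  0],
          [ 1,  1,  1]]

IMAGE = [[0, 0, 0, 0],   # dark upper rows
         [0, 0, 0, 0],
         [9, 9, 9, 9],   # bright lower rows: horizontal edge between
         [9, 9, 9, 9],
         [9, 9, 9, 9]]

def response(img, y, x):
    # correlation of KERNEL centered on pixel (y, x)
    return sum(KERNEL[i][j] * img[y + i - 1][x + j - 1]
               for i in range(3) for j in range(3))

print(response(IMAGE, 1, 1))  # strong response on the dark->bright edge
print(response(IMAGE, 3, 1))  # zero response on the flat bright region
```

The point of the comparison: here a human decided in advance that "horizontal edge" is the feature that matters, whereas a trained network discovers its own low-level features, and those discovered ones turned out to work better.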
ai;dr: experience is denser than words can articulate. Also (I don’t think it was mentioned) tacit knowledge is reflexive: it’s recalled in the kind of scenario where it was acquired, which is usually exactly when it’s needed, whereas book knowledge must be deliberately thought through to be recalled.
I agree with the article’s main point, but it’s flagrantly AI and (as usual with AI) way too verbose.
I think it read well; perhaps too intimidating for some to get into after the first couple paragraphs, but I think it's pretty much tailored to the audience here. I'm resistant to calling it obviously AI as well; the ideas are well thought out and it flows just fine. It's certainly not a "here's my article title and a 2-sentence overview of what it's about, now write it" level of assistance, which I think your comment kinda makes it seem as though it is.
>IMHO every other 3d modeling program has a trash UI compared to the absolutely amazing UI that Rhinoceros 3d has.
It's not just you. There is something about it that is qualitatively different.
Not being self-centered is helpful.
Intuition is just our brains' amazing pattern recognition ability at work.
Or maybe inductive vs deductive reasoning
Language changes though . He's directionally correct about calibration. People have some intuition about how something works, and then calibrate this tgrough feedback.
I think despite the presentation this has some good ideas in it:
1. Formally calling out a concept for judgment-based skills that cannot be easily taught. I think everybody understands this, but having a word for this would be useful.
2. Opening up the conversation on the topic of which types of skills can/should be codified and how.
That said, everything else in the article is suss. "Dimensionality" is largely a distraction to try to sound smart, most of the claims in the article are unwarranted (e.g. processes & checklists can be great, even for disciplines with true experts like airline pilots).
For example saying that skill learning cannot be accelerated is just patently false in many domains -- take something like learning chess. If you have a coach and other tools you'll learn a LOT faster. But certainly I've worked places that wished they could automate away reliance on experts because it gives organization power to a few non-management individuals.
> “Street smarts” refers to models that are too high-dimensional for linguistic transmission and were therefore acquired through calibrated experience. The street-smart person cannot explain why they know what they know, which makes them look inarticulate to the book-smart person, which leads the book-smart person to conclude that the street-smart person’s knowledge is inferior. This conclusion is precisely backwards in domains where judgement matters. The inability to articulate the model is not evidence of a crude model. It is evidence of a model too sophisticated for the transmission channel.
I disagree to a degree. Yes, what the author says about people dismissing street smarts as a lower level of intelligence than they deserve is accurate. But a sufficiently skilled communicator can absolutely articulate many of the factors being evaluated when they judge a situation, and how their decision-making process works.
> They evaluate intelligence through the lens of articulacy
There was an earlier instance of the author using a word like “unability” (or similar) when it should have been “inability”, and I let it go, but this misuse of language is making my head hurt. However, I confess that I thought the word should have been “articularity”, and it turns out that’s not a real word either. But I at least pay attention to spellcheck. I don’t understand how someone could take the time to write a long and thoughtful essay about intelligence and not use spellcheck to proof it.
> But a sufficiently skilled communicator can absolutely articulate many of the factors being evaluated when they judge a situation and how their decision-making process works
That sounds right but I suspect it is wrong. Watching smart intuition has been a personal interest of mine for years. Few people avoid the manifold traps.
1: People hallucinate their reasoning, or are self-deceptive (or even intentionally deceptive). Watching AI do this has helped me hone spotting it in people.
2: You need to be sufficiently close in skill and language for someone to communicate the nuances to you. E.g. sportspeople.
3: Judging whether an intuitive statement is true is hard. We need to identify a correct intuition (and ignore incorrect ones) before we can judge whether some explanation of it is valid.
What is wrong in that quoted sentence? Do you mean "articulacy" should instead be "articulateness"? "Articulacy" is also a word, and correct in this context.
Articulation. The lens of articulation. Or otherwise, "eloquence;" the lens of eloquence.
LLMs ain't gonna do sheeeeeeeit if this is still where we are at...
I have terrible news for you. Linguistics is descriptive, not prescriptive. We will torment you with word game playing until such time as you loosen up.
The torment is evidently yours [0]. However, I am ecstatic you reveal your own inner turbulence, which I have deliberately engineered.
By the way, how are those food prices working out for ya bud [1]? I understand that you are struggling, but please try not to get, quote, "violent".
Pause. Empathize. Apologize. Then post.
[0] https://news.ycombinator.com/item?id=47639747
[1] https://news.ycombinator.com/item?id=47636685
This jibes with something that’s occupied my brain a couple times in the last year: the separation of art and science.
Science is empirical knowledge and processes which can be transferred, art is gut feeling and subconscious knowledge applied automatically, which can’t be transferred.
Roughly I think this corresponds to how our minds perform cognitive offloading of repeated tasks. New tasks that require instruction following occupy our attention, but the more we do them, the more our minds wire the behavior into our “muscle memory”. Practitioners of the arts (or even the art of science, one might say) have built a neural network that offloads tasks so that higher cognitive functions can focus on applying those tasks in expert ways.
It’s sort of like when we start out our brains have to bitbang all tasks (muscle movement, speech, etc.) but over time our brains develop their own TCP offloading, or UART peripherals. And you can’t just download a TCP offloading engine, it has to be built into the silicon. Hence why “expert knowledge” isn’t transmissible.
Which is why spaced repetition is an effective learning method. You’re hacking your brain to wire facts into the hardware.
You might enjoy a 70s classic, Zen and the Art of Motorcycle Maintenance
Thinking about this in the context of machine learning.. We can discover the dimensions and relationships between them through training over a set of examples.
What we are generally getting though is a network with extremely high dimensionality trained on many domains at once, at least as far as the commonly used ones like LLMs and VLMs.
We do have mixture of experts which I guess helps to compress things.
Going back to the idea that this stuff just can't be represented by language, I wonder if someday there could be a type of more concise representation than transmitting for example a LoRA with millions of bytes.
Maybe if we keep looking at distillation of different models over and over we might come up with some highly compressed standardized hierarchical representation that optimizes subdomain or expert selection and combination to such a degree that the information for a type of domain expertise can be transmitted maybe not orally between humans but at least in very compact and standard way between models.
I guess you could take something like a 1B 1-bit model, build a LoRA for a very narrow domain, and then compress that; that's something like the idea. Or maybe a quantized NOLA.
But I wonder if someday there will be a representation that is more easily interpretable like language but is able to capture high dimensional complex functions in a standard and concise way.
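For a sense of scale on why a LoRA is already a compressed representation: a rank-r factorization stores far fewer numbers than the dense weight update it reconstructs. A toy numpy sketch (the layer size and rank here are invented purely for illustration, not from any real model):

```python
import numpy as np

# Hypothetical layer size and LoRA rank, chosen only for illustration.
d, k, r = 1024, 1024, 4

# Full fine-tune: a dense update over the whole d x k weight matrix.
full_params = d * k                 # 1,048,576 numbers

# LoRA: delta_W = B @ A, with B of shape (d, r) and A of shape (r, k).
lora_params = d * r + r * k         # 8,192 numbers

rng = np.random.default_rng(0)
B = rng.standard_normal((d, r))
A = rng.standard_normal((r, k))
delta_W = B @ A                     # rank-r, but same shape as the full update

print(full_params // lora_params)   # → 128x fewer numbers to transmit
assert delta_W.shape == (d, k)
```

At these made-up dimensions the factorized form is 128x smaller, and the gap grows with layer size, which is why further compressing or standardizing such factors is an appealing target.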
This has me considering law school versus something like officiating a sport. A foul in basketball or a balk in baseball is very much a judgement call by the official; kinda you-know-it-when-you-see-it. You can write down what a foul is, or what a balk is, but if you officiate by the letter of the rules you will likely have a miserable game, and it will be entirely the official's fault. You can also have a game where clear "fouls" are being committed but not called unless they go beyond the established baseline level of play (not in the spirit of defense, or whatever), and it could be considered a wonderful game by all parties involved ("let them play" in action).
Kinda has me wondering about the implications of the bar exam being the end-all be-all of law school. Contrast it with a doctor's residency, and I think law school is very much crafting an overly binary right/wrong profession. Perhaps it should have something beyond the exam, more akin to officiating a sports game, where lawyers learn the implications of applying their rule system too stringently when there is certainly room for being charitable. It is a complex issue, though, because a charitable interpretation of a law gives bad actors room to abuse that interpretation.
Now bridge all this with the weird first-level and second-level stuff surrounding medicine, placed there by people outside the field and imposed on experts in it. They have to apply their expertise to the patient, decide a course of action, and then describe that action in those first/second-level terms to a non-expert who for some reason is the deciding authority, despite that framing downgrading the expert's actual thinking by design. I know I'm all over the place, but it was a pretty good article that made me think about a lot of different applications of the ideas.
Fully agree, and this is something I only fully realised quite late in life.
One of the implications is that at any given point in time, the vast majority of human knowledge is living in people's brains and cannot be stored. The seemingly ineluctable and almost mechanical progression of technology is happening on a thin margin between generational losses.
isn't that also related to active vs passive knowledge?
active knowledge i can produce on command. passive knowledge only comes up when it is triggered from the outside.
a lot of things we know are only accessible when they are triggered. i could not describe the path through a forest, but i know it when i walk it. same for the process to solve a particular problem. i could not describe the whole way. only if i follow the steps can i remember the next steps. and if there are multiple possible steps at some point, only for the choices that are actually triggered will i remember what comes after.
you can follow me and learn the steps by observing me, but you will only learn the steps that we actually do. and i couldn't possibly give you a list of all other potential steps.
in the latest video by tom scott, he watches people creating bells. the guy observing the molten metal knows when it has the right temperature by just looking at it. he learned that from years of practice, and his future replacement will learn it by observing him.
Absolutely agree. And this is why America cannot simply bring outsourced industries like manufacturing back. Expertise is built through hard-won experience and there is no easy way to transfer or replicate it. And this is why the best forms of instruction are akin to apprenticeship. You can’t teach expertise through a book, but you can guide a person to develop their own expertise and speed up the process.
The Tao that can be spoken, is not the real Tao :-)
Saying certain things hurts them, like a scientist's knife that opens up an animal for study but also kills it. Thus certain knowledge about life cannot be expressed analytically: the analysis loses the very subject it tries to catch. But if you use the knife merely to point at the animal (that is, if you use words not as an explanation but as a means to guide the listener's attention), then there is a chance to convey that understanding. A parable works this way; so does a work of real art.
He's got it backwards.
"Language is a serial, low-bandwidth channel. It transmits one proposition at a time, sequentially. Each proposition can relate a small number of variables: “if X and Y, then Z.” Complex conditionals can extend this to perhaps five or six variables before the sentence becomes unparseable: “if X and Y but not Z, unless W and V, then Q.” ... "This is not a claim about human cognitive limitations. It is an information-theoretic claim about the channel capacity of natural language relative to the complexity of the models being transmitted."
No.
This definitely is about human cognitive limitations. Consider *why* that 6-variable rule is unparseable: Human working memory is typically about 7 items. His unparseable example has by a generous accounting 10 entities--6 variables and 4 relationships. (And by a strict accounting you need to count the "then", for 11 entities.) Very, very few humans could learn that. Knowledge must be chopped up into chunks small enough to fit into our working memory to be understood and incorporated into our models.
Once one chunk has been incorporated into our model we can add another chunk to it, repeating until we have built up a complex model. And it's not just read, store, read another: each bit must be worked with to actually be modeled. (But this process does not need to be strictly linear. To take his stupid pedestrian: they can separately learn speed, distance, stopping distance, etc., and tie them together. But if his pedestrian cares whether the driver is inattentive, his model is wrong: why are you doing something that requires a moving driver to be aware of you in the first place?)
This was a powerful read and a useful lens to apply to the legibility of the technology era we're in.
"The second, more fundamental reason is that the relevant features are not given in advance. A large part of what the expert learns through experience is which features of the environment matter."
Yes.
"This is the deepest reason why experience cannot be compressed."
No. Finding relevant features and compressing their valuation is what ML systems do. Especially in vision systems. Early attempts at machine vision had human-chosen features and detectors for them. There used to be papers with suggested features - horizontal lines, vertical lines, diagonal lines, patterns of dots, colors, and such. That's where things were in the 1990s. Those have now been replaced by learning-generated low level feature detectors, which work better.
This does need a lot of training content. The training process is inefficient. It definitely compresses, though.
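To make the 1990s-era approach concrete, here is a minimal numpy sketch of one of those hand-chosen feature detectors: a Sobel kernel that responds to vertical edges, applied by hand. (The kernel is the standard Sobel one; the toy image and helper are mine, just to illustrate the kind of fixed detector that learned features replaced.)

```python
import numpy as np

# A hand-chosen feature detector: a 3x3 Sobel kernel that responds
# to vertical edges. Modern vision models learn kernels like this
# (and better ones) directly from data.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def apply_kernel(image, kernel):
    """Slide a 3x3 kernel over the image (valid mode, cross-correlation)."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)
    return out

# A tiny image: dark on the left, bright on the right -> a vertical edge.
img = np.zeros((5, 5))
img[:, 3:] = 1.0

response = apply_kernel(img, sobel_x)
print(response)  # the columns containing the edge respond strongly
```

A learned first conv layer in a trained network ends up containing filters that look a lot like this one, except nobody had to write a paper proposing them.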
I loved this! thanks for sharing :)
I think of how trading firms cannot reveal their strategies, as the strategies would lose their effectiveness.
Or how many people know that many CEOs and leaders are idiots, but cannot say it, as they would face retribution.
ai;dr: experience is denser than words can articulate. Also (I don’t think it was mentioned) tacit knowledge is reflexive: it’s recalled in the scenario it was acquired, which is usually when it’s needed, whereas one must think through to recall book knowledge.
I agree with the article’s main point, but it’s flagrantly AI and (as usual with AI) way too verbose.
I think it read well; perhaps too intimidating for some to get into after the first couple of paragraphs, but I think it's pretty much tailored to the audience here. I'm resistant to calling it obviously AI as well: the ideas are well thought out and it flows just fine. It's certainly not a "here's my article title and a two-sentence overview of what it's about, now write it" level of assistance, which I think your comment kinda makes it seem like it is.