Hacker News | rspeele's comments

> "What have you got against machines?" said Buck.

> "They're slaves."

> "Well, what the heck," said Buck. "I mean, they aren't people. They don't suffer. They don't mind working."

> "No. But they compete with people."

> "That's a pretty good thing, isn't it--considering what a sloppy job most people do of anything?"

> "Anybody that competes with slaves becomes a slave," said Harrison thickly, and he left.

Kurt Vonnegut, Player Piano


It's afraid!

> Every 1 prompt of building tends to require 1-5 prompts of clean up. Simple, fast, clean good code.

I have found this to be very effective as well. However, it's so easy to do, I can't imagine they won't build it in.

The harnesses will improve and the loop of "self-review, judge what needs clean up, do the refactoring, repeat until clean" will get included in the one-shot. They are already doing this somewhat; they'll just get a lot better at it, and as the models get faster and cheaper to run, the refactoring churn at the end of each task won't even create a noticeable delay.

I do not think the high-level "taste" knowledge that I've built up -- when to break something off into its own service, what to put in the DB vs cache vs queues vs blob storage, how to isolate important logic in pure functional layers so it can be tested and validated independently -- is any more "unlearnable" to AI than the stuff I previously considered impressive that's now one-shottable like "write a Prolog implementation from scratch".


They have definitely built some of it in.

And yes, right now you still need the architectural and system design knowledge because the LLM will fuck that up. We'll all find out if that continues being needed in the future. From what I understand about LLMs and how they work, I doubt it, but also, yeah, I doubted it would've gotten this far when I think back 2+ years ago.

Also, maybe I should be clear, I pretty much never one-shot things. My sessions with Claude or other CLI tools always start with a bit of a conversation until we converge on a good plan, Claude builds the code, we discuss some more, then we iterate.


> I'd say SQL is a very high level language.

Yes indeed. When I learned SQL in college, the professor made a HUGE deal about how it was a "4th generation language" so it was so abstract you didn't have to think about how the computer would answer your query.

Even at that time I thought that was a massive overstatement of what using SQL was like. It didn't deliver on that promise very well. But it's very funny to see plain SQL now sometimes called "low level"!


Yes it did: you don't need to worry about whether the SQL execution engine is Spark, or Snowflake, or SQLite or what; you just reason about the top-level logic.

One thing something like AutoCodeBenchmark cannot demonstrate is what happens when you have human-written type definitions defining the domain before the LLM writes a line of code.

That is something I have found very effective in F#: I model the domain with types, I know what the type signatures of the functions I need are, and the LLM does the work of actually implementing those functions.

Here is a concrete example:

I have been playing around with a program to assist me with projects I make at home on my hobby-grade CNC router, which does not have an automatic toolchanger. I use a mix of Vectric VCarve and some older handwritten programs to generate GCode files. I end up with a USB drive with maybe 6 to 12 GCode files on it and a model in my head of "to make this product, I start with a board here, gotta install this square nose end mill and zero on this corner of the board, run files A and B. Then install a ball nose end mill and run file C. Then flip the board over lengthwise, switch to a smaller square nose end mill, zero here, run file D. etc. etc."

Although I try to name the GCode files in a self documenting way like 01_TopSide_25square.ngc, if I come back in 1 year and want to make the same thing again, I pretty much always have to open VCarve and eyeball what the hell all the files did and confirm where to zero, what size board to use, etc. So I'm making a tool where I can define those human-operator steps that go with the G-Code files, save it as a "project file", preview in 3d what each step will look like, and export to a printable PDF with screenshots and step-by-step instructions. Hopefully this will reduce the amount of rot that these projects suffer and the cognitive overhead of picking up an old one.

Modeling the steps as F# types was the very first step, like (small excerpt):

    type WorkpiecePlacement =
        {   Id : WorkpieceId
            /// Corner of the workpiece we'll attach to the machine.
            WorkpieceCorner : WorkpieceSpace.Corner3D
            /// Point in machine-space we'll anchor this corner to.
            MachinePoint : MachineSpace.Point
            /// Which face of the workpiece is on top.
            FaceUp : WorkpieceSpace.Face
            /// Rotation around the up-axis.
            Yaw : WorkpieceSpace.Yaw
        }

    type OperationType =
        | PlaceWorkpiece of placement : Operation.WorkpiecePlacement
        | InstallTool of id : ToolId * slot : int option
        | ZeroAt of point : MachineSpace.Point
        | RunGCode of source : GCode.Source
        | RemoveWorkpiece of id : WorkpieceId

For the GCode simulator I needed a parser for GCode files, which produces a type with 1:1 equivalence to the GCode instruction set:

    type GCodeInstruction =
        // --- Motion ---
        | G0_RapidMove of axisMoves : (Axis * float<gcodeunit>) array
        | G1_Move of feedRate : float<gcodeunit/minute> option * axisMoves : (Axis * float<gcodeunit>) array
        | G2_ClockwiseArc of ArcParams
        | G3_CounterClockwiseArc of ArcParams
        | G4_Dwell of seconds : double

        // --- Plane selection ---
        | G17_SelectXYPlane
        | G18_SelectXZPlane
        | G19_SelectYZPlane

        // --- Unit selection ---
        | G20_Inches
        | G21_Millimeters

        // --- Distance mode ---
        | G90_AbsoluteDistance
        | G91_RelativeDistance
        // ... etc truncated, more instructions in real code

But my tool supports doing transforms on toolpaths, like rotating 90 degrees or offsetting so I can easily define that I want to make tiling copies of the same project. To implement those transforms straight up as GCodeInstruction[] -> GCodeInstruction[] is a bad call. GCode is very stateful and lets you switch units, relative vs. absolute coordinate spaces, etc. in instructions. That makes the transform awkward and tricky to write.
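
To make the statefulness concrete, here is a minimal sketch (not my real code) with a made-up, one-axis instruction set. The `Instr` and `State` types, the `absoluteXs` function, and the starting state of "millimeters, absolute, X = 0" are all assumptions for illustration. The point is that the meaning of a move depends on every mode switch that came before it:

```fsharp
// Hypothetical, heavily reduced instruction set: one axis only.
type Instr =
    | Inches          // like G20
    | Millimeters     // like G21
    | Absolute        // like G90
    | Relative        // like G91
    | MoveX of float  // like G1 X...

// The modal state that has to be tracked while reading the program.
type State = { MmPerUnit : float; IsRelative : bool; X : float }

/// Resolve every MoveX to an absolute X position in millimeters.
let absoluteXs (program : Instr list) : float list =
    let step (state : State, acc : float list) (instr : Instr) =
        match instr with
        | Inches -> ({ state with MmPerUnit = 25.4 }, acc)
        | Millimeters -> ({ state with MmPerUnit = 1.0 }, acc)
        | Absolute -> ({ state with IsRelative = false }, acc)
        | Relative -> ({ state with IsRelative = true }, acc)
        | MoveX v ->
            let mm = v * state.MmPerUnit
            let x = if state.IsRelative then state.X + mm else mm
            ({ state with X = x }, x :: acc)
    // Assumed starting modes; a real controller's defaults may differ.
    let initial = { MmPerUnit = 1.0; IsRelative = false; X = 0.0 }
    program |> List.fold step (initial, []) |> snd |> List.rev
```

With `[ Millimeters; Absolute; MoveX 10.0; Relative; MoveX 5.0; Inches; MoveX 1.0 ]` the three moves land at 10 mm, 15 mm, and about 40.4 mm, even though the last instruction just says "1". Any transform written directly against the raw instructions has to carry that state around too.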

So I have a ToolPath type that makes the transforms clean. It normalizes the many ways of expressing the same toolpath in GCode to a single representation with all absolute coordinates in metric units.

    type ToolPathInstruction =
        | Rapid of From : Point * To : Point
        | Linear of From : Point * To : Point * Feed : FeedRate
        | Arc of
            From : Point *
            To : Point *
            Center : Point *
            Plane : Plane *
            Direction : ArcDirection *
            Feed : FeedRate
        // ... etc truncated

That is the appropriate level for the transforms like offset, rotate, scale, etc. to operate on.
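
As a rough sketch of why: with everything already absolute and in one unit, an offset is just "move every point". The types below are cut-down stand-ins for the real ones (no `Plane`, feed as a bare `float`, and the offset passed as a `Point` rather than a `Vector3`), so treat the names and shapes as assumptions:

```fsharp
// Simplified stand-ins for the real types (assumptions for this sketch).
type Point = { X : float; Y : float; Z : float }
type ArcDirection = Clockwise | CounterClockwise

type ToolPathInstruction =
    | Rapid of From : Point * To : Point
    | Linear of From : Point * To : Point * Feed : float
    | Arc of From : Point * To : Point * Center : Point * Direction : ArcDirection * Feed : float

/// Shift a single point by the offset d.
let translate (d : Point) (p : Point) : Point =
    { X = p.X + d.X; Y = p.Y + d.Y; Z = p.Z + d.Z }

/// Offset a whole toolpath. No modal state to track: each instruction
/// carries absolute points, so every case is handled independently.
let offset (d : Point) (path : ToolPathInstruction list) : ToolPathInstruction list =
    let move = translate d
    path
    |> List.map (fun instr ->
        match instr with
        | Rapid (a, b) -> Rapid (move a, move b)
        | Linear (a, b, feed) -> Linear (move a, move b, feed)
        | Arc (a, b, c, dir, feed) -> Arc (move a, move b, move c, dir, feed))
```

Tiling copies of a project is then a matter of calling `offset` with a different vector per copy and concatenating the results.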

Yet there is still ANOTHER level of toolpath-related operations that deserves its own type. When I'm doing simulation of material removal to check for crashes, or rendering the toolpath in 3d, I don't want to deal with arcs! The rendering/simulation is inherently an approximation. It will break down each arc into line segments. So sim code and rendering code shouldn't take a toolpath, it should take basically a line segment list, or in other words...

    type ApproxMove =
        {   From : Vector3
            To : Vector3
            FeedRate : double<m/minute>
            IsRapid : bool
        }

    type ToolPathApproximation =
        {   StartPosition : Vector3
            Moves : ApproxMove[]
        }
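
For a feel of what the arc-to-segments step looks like, here is a minimal 2D sketch, assuming a single plane and a fixed segment count. `Vec2` and `flattenArc` are hypothetical names, not the real `approximate`, which would also need to handle Z and pick the segment count from a tolerance:

```fsharp
open System

type Vec2 = { X : float; Y : float }

/// Break an arc in one plane into `segments` straight moves (segments >= 1).
/// Returns (from, to) pairs. If start and target coincide it is treated as a full circle.
let flattenArc (start : Vec2) (target : Vec2) (center : Vec2) (clockwise : bool) (segments : int) : (Vec2 * Vec2) list =
    let radius = sqrt ((start.X - center.X) ** 2.0 + (start.Y - center.Y) ** 2.0)
    let a0 = atan2 (start.Y - center.Y) (start.X - center.X)
    let a1 = atan2 (target.Y - center.Y) (target.X - center.X)
    let twoPi = 2.0 * Math.PI
    // Signed sweep angle: negative for clockwise, positive for counter-clockwise.
    let sweep =
        let raw = a1 - a0
        if clockwise then (if raw >= 0.0 then raw - twoPi else raw)
        else (if raw <= 0.0 then raw + twoPi else raw)
    let pointAt (i : int) : Vec2 =
        if i = segments then target   // land exactly on the programmed end point
        elif i = 0 then start
        else
            let a = a0 + sweep * float i / float segments
            { X = center.X + radius * cos a; Y = center.Y + radius * sin a }
    [ for i in 0 .. segments - 1 -> (pointAt i, pointAt (i + 1)) ]
```

Everything downstream of this only ever sees straight segments, which is exactly what the sim and rendering code want.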

Having defined all these types it's clear that I need operations like:

    parse : string -> GCode
    serialize : GCode -> string
    normalizeToToolPath : GCode -> ToolPath
    denormalizeToGCode : ToolPath -> GCode
    offset : Vector3 -> ToolPath -> ToolPath
    rotate90 : ToolPath -> ToolPath
    scale : Vector3 -> ToolPath -> ToolPath
    approximate : ToolPath -> ToolPathApproximation
    simulate : ToolPathApproximation -> MachineState -> MachineState
    renderToolPathWireframe : ToolPathApproximation -> VBO
    renderMachineState : MachineState -> VBO

And so on. An LLM is absolutely awesome at one-shotting the implementations.

I would find it quite frustrating trying to model the same domain without any types. Either all the methods would work on a single toolpathy data structure that's not really the right fit for any of the places it's used, or they would work on multiple data structures without any clear delineation of which layer expects which toolpathy thing, when those things are all subtly but importantly different.


> the errors I get at runtime are almost never type-based

That surprises me, but everyone's experiences are different. I've been in the statically typed language space for so long and enjoyed it so much, I find it pretty irritating to go back to Python (my long-ago favorite) but many people are in the exact opposite frame of mind. I'm curious: what kinds of errors do you classify as a type-based error? I think that varies from person to person.

For example, null references. A C programmer would say dereferencing a null is not a type-based error, because it's not feasible to encode non-nullable pointers in the C type system. A Haskell programmer would say it is a type-based error because Haskell makes it difficult not to encode this in the type system, you really have to go out of your way to create a runtime null dereference error.

A C# or TypeScript programmer could answer differently depending on who you ask, because both of those languages make it possible to leverage the typechecker to prevent null-deref at compile time, but neither one makes it required (you can turn those checks off or make them warnings if you like), so it depends on the programmer's build settings and how much typechecking they personally have chosen to use.
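
To illustrate the "encoded in the type system" end of that spectrum, here is a small F# sketch. The `Tool` record, the `tools` map, and `describeSlot` are invented for the example. The lookup returns a `Tool option`, so the "nothing there" case is part of the type and the caller has to deal with it before touching the value:

```fsharp
type Tool = { Name : string; DiameterMm : float }

// Hypothetical tool table keyed by slot number.
let tools : Map<int, Tool> =
    Map.ofList [ (1, { Name = "quarter-inch square end mill"; DiameterMm = 6.35 }) ]

/// Map.tryFind returns `Tool option`, never null, so a missing slot
/// shows up as the None case rather than as a runtime null dereference.
let describeSlot (slot : int) : string =
    match Map.tryFind slot tools with
    | Some tool -> sprintf "slot %d: %s" slot tool.Name
    | None -> sprintf "slot %d: empty" slot
```

Someone used to this style would count "forgot to handle the missing tool" as a type error, because the compiler is what catches it.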


> I find it pretty irritating to go back to Python (my long-ago favorite) but many people are in the exact opposite frame of mind.

As someone who works exclusively in typed languages for formal methods, what is it you find lacking about modern Python + Pylance? IMO there's still a tiny verbosity issue, and there's no real replacement for fancier polymorphism or (G)ADTs, but I'm very satisfied with it for most things. In particular, null checks are trivial.


It has been about 10 years since Python was a daily driver for me and at that time I wrote it the old fashioned way with no type hints and no static checking, just like grandma used to make. The times I have needed to dig back into it have involved working on old code, so I haven't kept up with modern tooling.

However, in principle any dynamically typed language can be tolerable to me if it can be turned into a statically typed language ;)

But I think I'd still prefer the ergonomics of a language designed that way from the start vs having bolt-ons. My favorite language for the past several years has been F# and I think ML-family languages in general strike a great balance of being able to write terse code when you want to, and being able to model a domain really well with types when you want to.


Fascinating quote and good point.

It should also be remembered that while the industrial revolution netted humanity enormous wealth and eventually a higher average standard of living, it also kinda sucked for the generations of working class living through it, prior to labor reform. Millions of people lived entire lives where the industrial revolution was nothing but bad for them and never saw the upside. So anybody opposing a new industrial revolution is not necessarily acting out of irrationality.


Kind of. I think the history is interesting here and complex.

One of the hard things to grasp is that the industrial revolution was preceded by an environmental collapse. Part of the reason there was a switch to coal (despite being seen as inferior to wood at the time) was massive depletion of wood in England and the high cost of importing not just timber but even just firewood.

Add to this the enormously expensive wars England was fighting all through this period, which stressed everything from labor to food supplies (and also triggered demand for steel, copper, and brass). The industrial revolution happened against a backdrop of national crisis, so it's hard to know what was being caused by the revolution and what the revolution was helping paper over.

And on top of this, when Engels and Marx wrote about the squalor and desperation of their time (which was very real), nearly a hundred years had passed and something much different was happening. Massive amounts of peasantry were being dispossessed of lands and forced into urban slums. Cities grew something like 10x in a single generation. This wasn't really the fault of the industrial revolution but because of really bad policy.

(BTW, this period in England when wages and quality of life backslid is now called "Engels' Pause" https://en.wikipedia.org/wiki/Engels%27_pause)


> whats wrong with struggling alcoholics having jobs via a program?

Finally, a job AI will never beat me at.


The contract behind open source was something like (GPL):

"If you copy my work, you should share your work too."

or at minimum (MIT):

"If you copy my work, you should credit me."

I think it is no longer under dispute that the legal contract is satisfied by LLMs. The AI companies won and will continue to win.

But we are talking about a social contract, which is not quite the same thing. The social contract is what leads some devs who previously enjoyed publishing their work openly to no longer feel the same way. What did the authors mean by "copy"? Did they mean literally CTRL+C, CTRL+V or something broader?

This is a matter of opinion which only each individual creator can answer. For me, copying meant something like:

"To reproduce the function of my work, dependent on my having published it, without effort nor understanding of your own"

Ten years ago this basically required doing a CTRL+C, CTRL+V so there was no need to be more specific. Anybody who did enough work to, say, rewrite in another language (with that language's idioms), met the bar of clause 3. Now AI enables a form of "copying" that matches my definition, without the user even being aware of whose works they are copying. It perfectly launders the origins of its output. It can write an FFmpeg clone in Rust for you that would appear to be a novel work.

Of course, I cannot say that my own little bits and pieces of open source code would make a scratch in AI's capability, were it removed.

But I do strongly believe that if all the code that was published by authors with the same mindset was unavailable, Claude would be a far weaker developer.


> But we are talking about a social contract, which is not quite the same thing. The social contract is what leads some devs who previously enjoyed publishing their work openly to no longer feel the same way.

Perhaps this illustrates a fissure that was always lurking under the surface, then. The social contract that I've personally always attributed to FOSS communities was that attempting to restrict how people downstream of you use code is illegitimate, and that licenses like the GPL were meant to use copyright law to achieve something that resembles the state of affairs that might exist if copyright didn't exist in the first place. That's what the whole concept of "copyleft" always seemed to imply.

Now we have a new class of technologies that is admittedly fraught with a wide range of risks and pitfalls, but also a lot of promise to enable people to actually put the "four freedoms" into practice in ways they couldn't before, and we're seeing people who have normative opinions about AI derived from other, unrelated principles trying to circle the wagons and exclude those use cases. That is what seems like a breach of the social contract as I've always understood it.

> Did they mean literally CTRL+C, CTRL+V or something broader?

Given that FOSS licenses were always constructed to function within applicable copyright law, I don't see how they could mean anything else. "Literal CTRL+C, CTRL+V" is the only thing copyright has ever applied to, and the whole point of "copyleft" was to lessen the restrictions on even that.


> "Literal CTRL+C, CTRL+V" is the only thing copyright has ever applied to

This is extremely false. Copyright additionally grants you exclusive control over the production and distribution of derivative works.

A "derivative work" is a work based upon one or more preexisting works, such as a translation, musical arrangement, dramatization, fictionalization, motion picture version, sound recording, art reproduction, abridgment, condensation, or any other form in which a work may be recast, transformed, or adapted. A work consisting of editorial revisions, annotations, elaborations, or other modifications which, as a whole, represent an original work of authorship, is a "derivative work".

A training set is just an anthology, and the training process is condensation. That makes the weights a derivative work of every work in the training set.

Now, there's a separate discussion to be had about whether that derivative work meets the criteria for fair use, but that's its own tangent.


> This is extremely false. Copyright additionally grants you exclusive control over the production and distribution of derivative works.

A derivative work is a work that itself includes copyrighted content from the original work.

That is to say that for something to be a derivative work, some measure of its content must be "CTRL-C, CTRL-V" from the originating work.

Something that's merely inspired by another work, or draws underlying themes or factual knowledge from it, is not a derivative work.

> A training set is just an anthology,

Which might make the training set itself a derivative work, but works created by using the model trained on that anthology are a different matter.

> and the training process is condensation.

No, it isn't. It's the creation of a new work that represents patterns extrapolated or interpolated from the data set, without the resulting model actually including any of the copyrighted elements of the work.

The underlying ideas and facts in the original work were never protected by copyright. Only the specific fixed form of expression is copyrightable.

Someone who looks at a dozen code examples in public repos to learn how to do e.g. a quick sort, then upon understanding the logic flow of the quick sort algorithm, writes his own quick sort implementation is not creating a derivative work of the code in the repos he examined. And the way LLMs work is much more similar to that process than to the "compressed anthology" concept you're describing.


> A derivative work is a work that itself includes copyrighted content from the original work.

If you put a GPL C program through Emscripten to run in a browser the output doesn't include the original C code but it's surely a derivative work.

> Someone who looks at a dozen code examples in public repos to learn how to do e.g. a quick sort, then upon understanding the logic flow of the quick sort algorithm, writes his own quick sort implementation is not creating a derivative work of the code in the repos he examined. And the way LLMs work is much more similar to that process than to the "compressed anthology" concept you're describing.

This is undoubtedly the core of the disagreement. Humans can learn from what they have seen, appreciate it, understand it, and draw on that experience in what they create. They do this without being considered ripoff artists, so why not machines that simulate the "same" thing automatically?

To me the answer is simply that humans are special. Human thought and human effort makes it creativity when a human does it, copying when a machine does it. It's a double standard I am perfectly willing to accept. I am unabashedly biased in this regard.

That may seem remarkably unfair to the machines, or like a cop-out. I just carved out a hardcoded special case for humans, and my whole philosophical reasoning is "because I said so". But how fair do we want to be? After all, if you want to treat a machine exactly like a human who learns from prior art to create new art, then the ownership of the new art would also belong to the machine. Not to the person who prompts it.


> If you put a GPL C program through Emscripten to run in a browser the output doesn't include the original C code but it's surely a derivative work.

Because it does include content from the original work -- this is just a translation, and isn't comparable to how LLMs work.

> To me the answer is simply that humans are special.

I don't disagree, but I also view LLMs as tools that extend human capacities and not autonomous entities unto themselves. LLMs are still just software, and can't really be regarded as anything other than instruments that humans use to broaden their capacity to see, appreciate, understand, and draw on that experience in what they create.

> That may seem remarkably unfair to the machines, or like a cop-out.

No, it's unfair to the humans. The machines are just tools that they use. The "double standard" is really a set of inconsistent standards applied to the same underlying moral agents.

> After all, if you want to treat a machine exactly like a human who learns from prior art to create new art, then the ownership of the new art would also belong to the machine. Not to the person who prompts it.

No, it always belongs to the person who prompts it. The machine is not a conscious entity, bears no intentions, and has no capacity to act on its own initiative. The machine is always just a tool that extends human capacity, as all machines always have.

For a good comparison here, we've never not credited a photographer as the author of a photograph. But the photographer is in a sense merely prompting the camera by framing the shot, selecting the exposure, adjusting the lighting, etc. -- the hard work in actually creating the photograph is being done by the camera itself, with the photographer playing no role in directly constructing the final image, and with many of the qualities of the final image being determined by pre-existing features of the camera's functional design and components that the photographer also played no role in defining, apart from choosing which camera to use.

LLMs are like cameras in this way. And the fact that they rely on external data for model training no more disclaims the user as the author of the resulting work than looking things up in a dictionary or encyclopedia does the same for the author of an essay.


The camera analogy is a good one but I have never had a camera that had every great picture somebody else had taken, plus every work of art, baked into it. They only captured what they were aimed at directly by the user. Well, maybe next time I upgrade my phone that will not be the case since they now have built in AI "enhancement" of photos.

I agree with the framing of the AI as a tool not an autonomous entity. The thing is, to me, it is exactly that framing that makes it so the use of that tool means "copying" more than it means "learning and taking inspiration and creating new art", because who is doing the learning and being inspired? The person who types "make me a 3d arena FPS" certainly didn't do any learning from the Quake source code. The AI itself, being just a program, can't take credit.

I think of a trained AI like a lossy, highly compressed copy of its training data set. AI companies charge access to decompress targeted pieces of that copy and the lossiness makes that decompression interesting and "new". But normally I can't charge for access to other people's stuff even if the access is highly lossy, like a camcorder bootleg.


> The camera analogy is a good one but I have never had a camera that had every great picture somebody else had taken, plus every work of art, baked into it.

I've never had an LLM that had any of that baked into it either. LLMs just have token correlations trained on those works. Trying to get an LLM to output the data it was trained on verbatim is something I'd expect to be heading into monkeys-on-typewriters territory. "Write something in the style of Shakespeare" and "give me the original text of Hamlet" are two very different things.

> I agree with the framing of the AI as a tool not an autonomous entity. The thing is, to me, it is exactly that framing that makes it so the use of that tool means "copying" more than it means "learning and taking inspiration and creating new art", because who is doing the learning and being inspired?

It's not learning or taking inspiration, though. It's just making statistical inferences based on token correlations. Whether or not that's analogous to how humans learn is something I think is a metaphysical question that is of little practical relevance. The fact remains that LLMs are not human, have no intentions of their own, do not exercise any kind of agency despite how often people employ the misnomer "agentic", and are ultimately glorified statistical models.

The LLM is a tool that extends human capacities in the same way as any other mathematical framework or technological device.

> I think of a trained AI like a lossy, highly compressed copy of its training data set.

I've seen a few people in this thread make that argument, but I just can't agree with it. It's not compression, lossy or lossless, which aims to deterministically encode a representation of the specific input data. The training data is analogous to the sample set used in a regression analysis to generate a polynomial function -- it's not valid to treat the output from any application of that polynomial as a copy of the original sample data.


Perhaps the future will be less Idiocracy and more Futurama, with humans and robots living socially together.

> Perhaps this illustrates a fissure that was always lurking under the surface, then(...)

Yes, I do think there has always been such a fissure. People publish OSS code for many reasons, often a blend of multiple reasons. There are selfish reasons such as the desire for one's work to be recognized, or even the hope of getting better employment through showing one's skill or making something companies will pay for support on. There are social reasons like the desire to collaborate with others. There are altruistic benefit-of-all-mankind reasons, as Richard Stallman said: "...restrictions reduce the amount and the ways that the program can be used. This reduces the amount of wealth that humanity derives from the program."

It sounds like your view of things is limited mostly to that last version of FOSS, the copyleft style. But even adherents of that style, I think, are not too happy with AI consumption of their code. For one, it allows laundering of the copyleft license so their work goes into closed-source products that are never shared. And for two, if your idea of OSS is that we all put our contributions into the great shared river of human achievements to benefit the world, it is disappointing to see that river funneled into a giant waterwheel of profit for a half dozen trillion dollar companies charging rent for its bounty.

> Given that FOSS licenses were always constructed to function within applicable copyright law, I don't see how they could mean anything else.

I agree from a legal standpoint. I cannot enforce my personal definition of copying nor do I expect that to become possible. It was just conveniently aligned with the reality of how copying software worked in the past, and no longer is and never will be again. That doesn't mean I will be writing OSS software with a new made-up unenforceable license. It just means, like OP, I'll weigh differently whether I want to bother releasing stuff at all.


> It sounds like your view of things is limited mostly to that last version of FOSS, the copyleft style.

No, I'm well aware of the different motivations for and approaches to FOSS. I'm mostly focusing on the copyleft/GNU GPL side of the discussion here because that's the side of the house where most of the ideas of a social contract and desire to see a specific ecosystem develop have been located. People on the MIT/BSD side of things, which has always had a much more direct "do whatever you want" ethos, are not the ones I'd expect to be making these arguments in the first place.

> For one, it allows laundering of the copyleft license so their work goes into closed-source products that are never shared.

I'd agree that someone using an LLM to create a deterministic transcription of someone else's work is indeed violating the license. But I think the argument goes beyond that, into using LLMs in any way at all.

> That doesn't mean I will be writing OSS software with a new made-up unenforceable license. It just means, like OP, I'll weigh differently whether I want to bother releasing stuff at all.

That's a reasonable position, and from the perspective of examining whether the current LLM climate is sapping motivation to participate in FOSS, I can understand where you're coming from.

But to that point, I'd argue that if your motivation was to gain recognition, participate in a community, etc. then you're going to lose those things by keeping your code private anyway, whereas you won't necessarily lose those things just because an LLM was trained on your code. If you contribute to a popular project, people were almost certainly already using your work to do things you don't approve of -- if that didn't take away your motivation, why would LLMs do much worse?


> The social contract that I've personally always attributed to FOSS communities was that attempting to restrict how people downstream of you use code is illegitimate,

That's wrong. What on earth gave you that impression when the licenses specifically set constraints on what downstream can do (from "release derivatives as open" to "put me in the credits").

Which part of which open source licenses gave you the impression that there were no restrictions?


> That's wrong. What on earth gave you that impression when the licenses specifically set constraints on what downstream can do (from "release derivatives as open" to "put me in the credits").

These are restrictions on redistribution, not use. And they're there to make sure that derivative works can't themselves impose restrictions on use.


One correction: the point of copyleft was to exploit the restrictions in order to ensure that it would be possible for everyone to copy the software.

> "If you copy my work, you should share your work too."

Not exactly. The GPL way is that you should share my work under the same terms if you want to share it, even if modifying it.

You are not required to share anything if you don't actually share anything, and just run it yourself. That's where all the criticism towards cloud providers who freely use FLOSS is directed.

> But we are talking about a social contract, which is not quite the same thing. The social contract is what leads some devs who previously enjoyed publishing their work openly to no longer feel the same way.

There is clearly a misalignment in expectations from some FLOSS enthusiasts. The main FLOSS licenses focus exclusively on distribution, but their expectations somehow extend well beyond distribution. We hear those FLOSS enthusiasts criticize and attack companies for using software exactly according to their terms, and somehow that is framed as abuse if said users happen to be bigger than some arbitrary boundary.


> Static types are, in a real sense, a compensation for the gap I just described - the gap between the description and the running thing. When you can't easily inspect or reshape the live system, you want the compiler to tell you as much as possible before you cross that gap.

I think this underrates static typing. For me the biggest value add of a static type system is that while doing a big, breaking-change refactor, I can near-instantly see all the places I need to update callers. Getting the code to work in the place I was actually working on is easy, I was already focused there. Static types pay off by helping me know when my change broke other parts of the system I wasn't even thinking about.
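
A small F# sketch of what I mean, with a made-up `Operation` type and `describe` function standing in for a real codebase. The interesting part is the comment: adding a case to the union is the breaking change, and the compiler points at every `match` that hasn't caught up.

```fsharp
type Operation =
    | InstallTool of toolId : int
    | ZeroAt of x : float * y : float
    | RunGCode of fileName : string
    // Adding a new case here, e.g. `| RemoveWorkpiece of id : int`, makes the
    // compiler flag every `match` that does not handle it with warning FS0025
    // (incomplete pattern matches), including ones in files I wasn't looking at.

/// One of possibly many functions elsewhere in the codebase that match on Operation.
let describe (op : Operation) : string =
    match op with
    | InstallTool toolId -> sprintf "Install tool %d" toolId
    | ZeroAt (x, y) -> sprintf "Zero at (%.1f, %.1f)" x y
    | RunGCode fileName -> sprintf "Run %s" fileName
```

Treating that warning as an error in the build settings turns "list of places I forgot" into a to-do list I can't ignore.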


I'm not underrating static typing and I'm not saying static types have no value for refactoring - they absolutely do. But the "I can't refactor safely without types" argument tends to assume a codebase structured the way typed codebases are structured. Idiomatic Clojure has a different shape, and the refactor pain you're describing almost never materializes.

What the static typing camp can't acknowledge is that not all dynamically typed languages are equal. I agree with your sentiment e.g., on Python codebases - it's legit pain, but PLs like Clojure and Elixir are a different story.

Data-orientation flattens the call graph. Most Clojure functions take and return plain maps/vectors/seqs. A "breaking change" to a data shape doesn't ripple through type signatures across the codebase the way it does when every layer has its own nominal type. You change the shape at the edges (parsing, validation via Spec/Malli) and most intermediate code keeps working.

Fewer, more general functions. map, filter, reduce, get-in, update, etc. replace dozens of bespoke typed methods. There's just less surface area to break. But then again, I would say `(map)`, and someone familiar with Javascript's `map()` would have some wrong assumptions.

You really just can't evaluate ANY language by picking a single aspect of it without wholly understanding the holistic picture in practical, battleground scenarios. Yes, Clojure is dynamic, but that doesn't mean it's harder to maintain or more difficult to build with, or that you can't write robust software in it. It genuinely has qualities that shine in some domains.

My claim wasn't "static types are useless for refactoring" - it was that they're a compensation for the gap between description and running system. Your refactor scenario fits that exactly: you need the compiler to tell you about distant breakage because you can't cheaply ask the live system "who calls this, and does it still make sense?"

In a live image, that question is answerable directly. find-usages, instrument the function, exercise the path, watch what flows through. The "places I wasn't thinking about" announce themselves the moment they're touched, with real data in hand - not as a list of locations I now have to go read and reason about cold.

