LLMs are so good at coding because of the vast amount of training data available to them from Open Source code repositories on sites like GitHub, as well as Q&A sites like Stack Overflow. Stack Overflow is essentially dead, and Open Source might be next…
But O’Brien says, “When generative AI systems ingest thousands of FOSS projects and regurgitate fragments without any provenance, the cycle of reciprocity collapses. The generated snippet appears originless, stripped of its license, author, and context.” […] O’Brien sets the stage: “What makes this moment especially tragic is that the very infrastructure enabling generative AI was born from the commons it now consumes.”
