ether+nick

Apr 18, 2026

This is a really interesting question! TIL about CA vs. Altai and the abstraction-filtration-comparison test.

I'm not sure how automatable it is. Interesting to try though!

Apr 18, 2026

@evan That’s not enough code for copyright enforcement. People have been finding identical code in the output - you just need something “rare”. It’s similar for subjects with little text in the corpus - I’ve been seeing listings that *can only have one source* (retro datasheets by AMD, in my case).

Apr 18, 2026

@evan @cwebber @bkuhn @ossguy @richardfontana Another major concern is that works generated by AI are not copyrightable in the US: the Supreme Court declined to hear a dispute over copyrights for AI-generated material, which leaves the lower-court ruling in place. So code generated by an LLM cannot be licensed at all, open or closed. https://www.reuters.com/legal/government/us-supreme-court-declines-hear-dispute-over-copyrights-ai-generated-material-2026-03-02/

Apr 18, 2026

@cwebber it's sometimes a distinction that people blur!

@richardfontana @bkuhn @ossguy

Apr 18, 2026

But maybe that's wrong; I don't know. Maybe if I wrote a Person.setName() method that was in the training set, and the LLM generated an identical Person.setName() code snippet for someone else, I could claim that the code is a copyright violation, even if there were thousands of other identical and independent Person.setName() methods in the training set.

@cwebber @richardfontana @bkuhn @ossguy

Apr 18, 2026

@evan @richardfontana @bkuhn @ossguy Sorry, I missed a word when I edited the sentence; I meant "genAI output".

Apr 18, 2026

@cwebber the weights themselves?

@richardfontana @bkuhn @ossguy

Apr 18, 2026

I think the worst case scenario is that the inserted code matches exactly one snippet in the training data.

So you could try to go for zero matches, by using such idiosyncratic and unrecommended coding conventions that nobody else has code like yours.

Or you could try to go for lots of matches, by using bog standard coding conventions and software patterns.

@cwebber @richardfontana @bkuhn @ossguy
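
To make the "zero, one, or many matches" idea above concrete, here is a toy sketch. Everything in it is an assumption for illustration: the corpus is a hypothetical in-memory list standing in for training data, "match" means identical text after collapsing whitespace, and the three labels are just names for the cases in the post. It is not a legal test and says nothing about how any real model or court treats similarity.

```python
import re


def normalize(snippet: str) -> str:
    """Collapse runs of whitespace so formatting differences do not hide a match."""
    return re.sub(r"\s+", " ", snippet).strip()


def count_matches(snippet: str, corpus: list[str]) -> int:
    """Count corpus entries that equal the snippet after normalization."""
    target = normalize(snippet)
    return sum(1 for entry in corpus if normalize(entry) == target)


def classify(match_count: int) -> str:
    """Map a match count onto the three cases discussed in the thread."""
    if match_count == 0:
        return "no match"        # idiosyncratic code nobody else has
    if match_count == 1:
        return "single source"   # the worst case: exactly one origin
    return "common pattern"      # bog-standard code with many origins


# Hypothetical corpus: a setter that appears many times, plus one rare listing.
corpus = [
    "def set_name(self, name):\n    self.name = name",
    "def set_name(self, name):\n        self.name = name",
    "def set_name(self, name):  self.name = name",
    "def am2901_step(r, s, cin):\n    return (r + s + cin) & 0xF",
]

setter = "def set_name(self, name):\n    self.name = name"
rare = "def am2901_step(r, s, cin):\n    return (r + s + cin) & 0xF"

print(classify(count_matches(setter, corpus)))
print(classify(count_matches(rare, corpus)))
print(classify(count_matches("def unique(): pass", corpus)))
```

Exact matching is the crudest possible choice here; a real comparison would at least have to cope with renamed identifiers, which is part of why the automation question in this thread is hard.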

Apr 18, 2026

@evan @cwebber @richardfontana @bkuhn @ossguy or just... not at all

Apr 18, 2026

@evan @richardfontana @bkuhn @ossguy Yeah! I actually already said elsewhere in the thread I don't think we need to worry about using these tools for such scenarios from a *licensing* perspective, only when the genAI is explicitly checked into the codebase