ai goes to the supreme court (of public opinion)
Apr 24 2024
ai, media, movies
...in which I appoint myself as the judge, jury, and executioner for three different uses of generative AI.
I’ve recently noticed a large shift in AI discourse. We have passed the point where questions about AI ethics are purely theoretical, and we have reached a stage where individuals and corporations are regularly using AI in public-facing productions. Because of this, I believe there is a desperate need for a real discussion about the ethical boundaries of AI usage in everyday life — and the rest of this article will be dedicated to exactly that.
In keeping with my mission statement, instead of trying to outline some abstract theory of AI ethics up front, we’re going to explore the issue through three recent concrete examples of public AI usage to see if we can find a throughline that gives us a set of clearly-defined rules to follow.
Exhibit A: Matt Bruenig, NLRB Edge, and Legal Research
The most developed form of AI by far is the Large Language Model (LLM). This is unsurprising, as written language is much easier to express in computational terms than visual or audio stimuli. One of the most obviously useful applications of LLMs is legal research, a task that largely consists of analyzing and summarizing egregiously long (and boring) documents with the aim of finding very specific and obscure pieces of information. It is not a task that requires a great deal of creativity, just one that requires a lot of time and mental stamina — exactly the sort of task that programmers seek to automate.
Many companies, notably Westlaw and LexisNexis, have begun offering AI tools for legal research, but our subject today is labor lawyer and policy analyst Matt Bruenig. Cards on the table: I have followed Bruenig for years now, and I am generally a big fan of his work, so I am naturally inclined to go easy on him. With that said, I’ve done my best to be as impartial as I can on this issue in particular.
Bruenig has cobbled together his own bespoke solution to produce summaries of legal developments at the National Labor Relations Board (NLRB), the federal agency tasked with enforcing labor law. He uses his AI solution to help create his new publication NLRB Edge, where he gives near-daily play-by-plays of ongoing litigation at the Board, among other things. I know that Bruenig uses AI to assist with his legal summaries because he’s extremely transparent about doing so, and has enthusiastically endorsed the usage of LLMs for law.
In discussions of AI ethics, two main concerns come up. The first is the quality and veracity of content produced by LLMs. Evidence of AI “hallucination” (see: being completely wrong) has been covered in the press for almost as long as LLMs have been in the public consciousness, and it is common enough that ChatGPT itself has a disclaimer right below the input box:
ChatGPT can make mistakes. Consider checking important information.
Bruenig has addressed said concerns directly on his Twitter and explained his process in its entirety:
There was only one NLRB file to summarize today, so I thought it might be interesting to show how I do what I do, especially in light of some recent research showing that LLMs are unreliable at summarizing documents.
First thing I do is run my update script, which pulls all of the new documents, adds them to my database, and then produces a csv of anything that is new. … From there, I have Claude Opus summarize the document.
I take that text and lightly edit it (today basically no editing) and post it as a summary. But I also want to hyperlink the cases that are cited. So that is where my database comes in. I can quickly pull up the cites as indicated and pull the URL to the decision.
This is, as Bruenig makes clear, not a mere copy-paste job. As he elaborates in his replies:
I double-check the documents I have it summarize, scanning them briefly. Reprompting, asking questions etc. This is partly where actually being a labor lawyer is key. I can almost always tell when it’s said something off or incomplete because I know enough already.
So it doesn’t fully substitute for the expertise, but it makes this particular task a lot faster even if it does have a non-zero error rate
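For readers who want a more concrete sense of the shape of this workflow, here is a minimal sketch of what a pull, summarize, and link pipeline could look like in Python. To be clear, this is my own reconstruction from the tweets above, not Bruenig’s actual code: the file names, database schema, prompt, and model string are all assumptions.

```python
# A rough sketch of a pull -> summarize -> link pipeline in the spirit of the
# workflow Bruenig describes. Every detail (file names, schema, model string,
# prompt) is my own guess, not his actual code.
import csv
import sqlite3

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def summarize(document_text: str) -> str:
    """Ask Claude for a first-draft summary; a human edits and verifies it."""
    message = client.messages.create(
        model="claude-3-opus-20240229",  # "Claude Opus," per Bruenig's thread
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": "Summarize this NLRB document for a labor law "
                       "newsletter. Cite case names exactly as written:\n\n"
                       + document_text,
        }],
    )
    return message.content[0].text


def link_citations(summary: str, db_path: str = "nlrb.db") -> str:
    """Hyperlink any case names found in a local database of past decisions."""
    conn = sqlite3.connect(db_path)
    for name, url in conn.execute("SELECT case_name, url FROM decisions"):
        if name in summary:
            summary = summary.replace(name, f'<a href="{url}">{name}</a>')
    conn.close()
    return summary


# new_documents.csv stands in for the "csv of anything that is new" that
# Bruenig's update script produces.
with open("new_documents.csv", newline="") as f:
    for row in csv.DictReader(f):
        with open(row["path"]) as doc:
            draft = link_citations(summarize(doc.read()))
        print(draft)  # the draft still gets read, edited, and checked by hand
```

The important part is what the sketch does not do: the output is a draft, and a human who already knows the subject does the reading, editing, and fact-checking.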
Bruenig is a practicing labor lawyer, so his status as a domain expert is not in question. I have seen no errors in NLRB Edge’s legal summaries thus far, and since Bruenig is a noted controversial personality, I am confident that any such errors would be publicly seized upon if they existed.
Now for the second concern that comes up in AI ethics: is Bruenig’s usage of AI in this way anti-labor?
Well, I’m inclined to say no, for several reasons:
- NLRB Edge is a unique publication in scope and subject matter — it is not clear to me that there is any market that Bruenig is directly disrupting by using AI to streamline his work process.
- Bruenig is a solo practitioner providing a free publication to the public — it is unlikely that NLRB Edge would be able to exist in any capacity if he did not use LLMs to vastly increase the efficiency of his operation.
- Most importantly: NLRB Edge is quite literally a tool of the labor movement — the publication’s purpose is to document, distribute, and educate the general public about labor rights and legal developments regarding the exercise of these rights. Employers do not have qualms about using AI to increase their leverage, so it seems unwise for the labor movement to not be willing to use AI to do the same.
Biased as I am, I genuinely think Bruenig’s case represents the most defensible use of AI in a production setting, and in explicitly using it for the benefit of the labor movement he has become one of the first high-profile individuals to use LLMs for a socially productive purpose. As he puts it:
So at the end of the day, contrary to other nutty hypes like crypto, it’s hard to see how LLMs especially are not useful tools! If you use them as universal knowledge chatbots or try to make them mess up, you’ll have a bad time. But try to use them effectively, and they are cool!
Verdict: Not guilty.
Exhibit B: Shudder, Late Night With The Devil, and AI Images
The next example we’re going to be talking about is Late Night With The Devil. Full disclosure: I have not watched this film — I’m not much of a horror guy — but my roommate is obsessed with its lead actor (David Dastmalchian) and as such I heard an overwhelming amount of detail about the film before and after its release. It is because of this that I heard about the controversy surrounding the film’s use of AI generated images.
The Variety article about the issue has a statement from co-directors Cameron and Colin Cairnes:
“In conjunction with our amazing graphics and production design team, all of whom worked tirelessly to give this film the 70s aesthetic we had always imagined, we experimented with AI for three still images which we edited further and ultimately appear as very brief interstitials in the film. We feel incredibly fortunate to have had such a talented and passionate cast, crew and producing team go above and beyond to help bring this film to life. We can’t wait for everyone to see it for themselves this weekend.”
I find this statement oddly hilarious. There’s clearly a tinge of exhausted entitlement in it — like they’re saying “god, we whole-assed this entire project and you little babies are complaining about three pictures that appear for a few seconds. Just shut up and watch the movie.”
Unfortunately, I’m not going to shut up and watch the movie, and neither are many other people, it seems. The statement raises the question: if the graphics and production design team were indeed “amazing,” why exactly did you need to use AI images for the interstitials?
Let’s discuss this with the criteria we established in Exhibit A, the first of which is quality. Were the images good? The only remotely clear image that’s accessible to me is this one, and it’s, well…
Unflattering.
To be fair, the fidelity of the image is being reduced by a filter because the film is a 70s period piece, but even still, the telltale signs of AI are all over this. The weird eyes and hands, the misshapen pelvis, and the ugly pumpkins all give the game away. It’s no wonder that this is the image that set people off; it’s the one specifically referenced in the Letterboxd review that kicked off this discussion.
Furthermore, if you’re making a horror movie about a late-night talk show, it would seem to me that the television bumpers for said talk show are prime opportunities for world building, symbolism, and Easter eggs, and those are all opportunities you forfeit the moment you decide you’d rather slap a prompt into Midjourney than spend the time thinking critically about how to flesh out your film’s setting. The result is inarguably inferior to what a real artist with knowledge of the film’s plot and themes would have created — which brings us to the labor issue.
Without question, this is an ill portent for filmmaking. During the writers’ and actors’ strikes, one of the primary demands from both groups was harsher restrictions and sometimes outright prohibitions on AI generated content in films and television. This was because they correctly understood that production companies would jump at the chance to use AI to undermine their labor if left unchecked. In effect, the hypothesis of the “experiment” that the Cairnes brothers embarked upon was this: if you slipped AI generated slop into an otherwise normal film, would people care? Some people do, but check the Reddit thread on r/Shudder and you will see many takes like this one:
I love people that stand by their convictions, but OP isn’t it. What is it with putting “artists” on a pedestal? Boycott, by all means. But if you’ve ever used a self-checkout at the store, or driven a car, or hell, even played around with ChatGPT, I think you should think it through a little more before making yourself a martyr.
It seems like the experiment was at least a partial success, and production companies getting it in their head that audiences are fine with AI generated diegetic material (and getting mad at people who complain about it) can’t be good news for the “amazing graphics and production design teams” that the Cairnes brothers spoke so highly of.
Verdict: Guilty. 1000 years dungeon.
Exhibit C: Neocities, Penelope, and AI Code
Our last exhibit (and the reason this piece is a day late) is fresh off the presses — and it’s by far the most difficult ruling so far. For those who don’t know, Neocities is a free web hosting service intended for those trying to replicate the Web 1.0 DIY ethos — browse the list of available sites, and you will frequently feel as if you fell into a time portal to 2003. As their April Fools gag, Neocities released a sarcastic AI assistant called Daria that would roast your horrible web design skills. From what I can tell, it was generally well received and fully understood to be a ridiculous idea that made for a good bit — which is why it was strange when on April 21st (two days ago at the time of writing) the AI coding assistant “Penelope” appeared in the dashboard.
Penelope has since disappeared, so the only documentation of its existence I could find was this Cohost thread I stumbled upon while doing research for this piece. To find out more about Penelope and what it actually did, I reached out to Nicky Flowers, a blogger and Neocities user who saw Penelope before it was taken offline. They informed me that Penelope was an AI chatbot integrated directly into the Neocities HTML editor that let you ask questions about HTML — but, importantly, it also allowed users to generate content. Flowers described their experience trying it out:
basically, “Penelope” was a chatbot tacked on the right-hand side of the built-in HTML editor. there was a button to close the window but it was opened by default when launching the editor. … basically it said it was there to help you learn HTML and work on your page and that you could ask it any questions about it that you want.
Interfacing with Neocities requires basic HTML and CSS knowledge, and a chatbot that can answer basic programming questions and give examples of implementation doesn’t sound particularly offensive. However, Penelope’s abilities went beyond mere education. Flowers again:
asking it basic questions about HTML (i.e. “How do I include a link in the middle of a paragraph?”) seemed generally fine. I didn’t put the LLM through its paces or anything but it didn’t seem prone to “hallucinating” bad HTML. … anyway, then i thought to try asking the chatbot to generate me “a blog post about AI”. it dutifully complied, generating both the HTML and the content of the blog post.
It’s here where we run into a problem. On its home page, Neocities markets itself explicitly as a project that aims to “[bring] back the lost individual creativity of the web,” yet Penelope provided an easy shortcut for the mass generation of deeply un-individualized slop. Worse, unlike the situation in Exhibit A, the tool was being marketed to people who are explicitly not domain experts (i.e., HTML/CSS newbies), people who do not yet have the expertise to understand the code the AI is generating or to correct its mistakes when they occur. For that reason, I highly doubt it would be useful as a learning aid. (Neocities has an actual learning aid on the dedicated learn page.)
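For what it’s worth, the underlying capability here is trivially available. A few lines against any general-purpose chat API reproduce what Flowers describes; the client library, model name, and prompt below are my own stand-ins, since Neocities never documented what was actually running behind Penelope.

```python
# A hypothetical recreation of the "blog post about AI" request. Neocities'
# actual backend for Penelope is unknown to me; the OpenAI Python client is
# used here purely as a stand-in for a generic LLM API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-turbo",  # placeholder model; whatever Penelope used is unknown
    messages=[
        {"role": "system",
         "content": "You help Neocities users learn HTML and build their pages."},
        {"role": "user",
         "content": "Generate me a blog post about AI."},
    ],
)

# The reply is a complete draft (with this system prompt, typically HTML plus
# the prose), produced with no human input beyond the one-line request.
print(response.choices[0].message.content)
```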
A devil’s advocate could argue that ChatGPT could perform the same function as Penelope, but I must emphasize that this was a tool being marketed to newcomers. According to Flowers, the Penelope window was opened by default in the integrated Neocities editor, meaning it would literally be the first thing a new user would see. Even putting that aside, Flowers — who pays for Neocities’s supporter tier — makes an excellent point regarding the question of funding:
sure, people could just open ChatGPT in a separate window, ask it to generate blog posts, and slap it on their Neocities site already. but i’m not paying for their tokens. it seemed likely that the 5 dollars a month i’m paying to Neocities would be going towards “Penelope”. i pay to support the project and get increased storage and various other perks, not to pay for an AI slop machine.
Thankfully, as mentioned above, Penelope has since been removed by Neocities founder Kyle Drake (no doubt because he was quaking in his boots over receiving a guilty verdict from a blog he’s never heard of). In correspondence between Flowers and Drake that was shared with me, he seems to be open-minded, attentive to the desires of his user base, and at least willing to entertain the idea that Penelope was a bad idea in theory as well as in practice. Without question, the near-immediate shuttering of a deeply unpopular project is a gesture of goodwill that is seldom seen from online platform holders these days. (Additionally, since this was a tool for hobbyists, there aren’t really any concerns about whether the tool was undermining labor.)
With that said, the reason I discovered the Penelope incident in the first place was because I am currently working on a desktop application to help people more easily publish personal blogs on Neocities, in an attempt to advance the cause of CJ the X’s Web 1.5 manifesto that I wrote about a few weeks ago. As you can imagine, learning that the platform I was spending hours building infrastructure around (and planning to funnel people toward) was stricken with AI fever gave me a great deal of anxiety. I’m not going to hand out a guilty verdict outright because Drake seems pretty well-intentioned and reasonable, but I would be lying if I said there weren’t some new doubts in my mind about whether Neocities is a good bet for the ideal Web 1.5 future.
Verdict: Declined to prosecute.
Closing Arguments
The first thing looking at these cases has shown me is that questions about AI need to be broken down into their component parts before a serious discussion can be had. “Does this usage of AI reduce the quality or veracity of the end product?” and “does this usage of AI undermine workers?” are two different questions that need to be answered separately, and in doing research for this piece I’ve seen both AI proponents and detractors frequently conflate them.
Secondly, intent matters. Neocities wanted to use AI as an aid for online hobbyists using a free service, but Shudder wanted to use it to pad space in a theatrical film they were charging money for. From the beginning, I’m going to be inclined to be harsher to the latter because of the motivations involved, and I generally endorse a softer posture towards AI usage in non-profit and hobbyist projects. Conversely, I wholeheartedly endorse a merciless posture towards AI usage in for-profit consumer-facing projects.
Finally, and probably most controversially, I do not think an absolutist case can be made against AI usage across the board. In artistic endeavors I’m comfortable issuing a blanket condemnation of AI generated content, but Bruenig’s case makes it clear that there are productive uses for machine learning — and LLMs in particular — when combined with the supervision of experienced and ethically conscious domain experts. The labor movement should move immediately to establish formal rules in every industry (especially creative industries) around the acceptable usage of AI technology, but until then, here’s the rubric I’m going to use in my court of personal opinion:
1. AI generated content is clearly indefensible in any project with an artistic mission of any kind.
We’ve already talked about the Late Night With The Devil situation, but A24’s Civil War was also in hot water recently over its use of AI images for its posters. In both cases, the production companies are essentially gesturing vaguely at the audience and saying “you get the idea” instead of actually implementing their idea. That is a level of creative bankruptcy that should simply never be tolerated, particularly in creative productions that are grossing millions of dollars.
2. Usage of AI for the purposes of replacing domain experts is both impossible and inexcusable.
Bruenig has demonstrated how he verifies his AI generated text for accuracy — but we have already seen other lawyers who do not. Even before discussing the labor question, it is at minimum irresponsible and at maximum extraordinarily dangerous to attempt to use AI to replace a domain expert in virtually any professional context.
3. Any usage of AI that — even indirectly — undermines the labor movement should be opposed at all costs.
This one goes without saying, but I do want to emphasize that while it’s worth criticizing the use of AI content on the grounds of artistic integrity or the potential for misinformation, the labor question is the one I care about most. I was extraordinarily demoralized by the total complacency in the r/Shudder posts I read for Exhibit B, and I fear that production companies will begin to (correctly) bet that mainstream audiences will largely accept the use of AI generated content in films and television.
The writers and actors bought us some time, but the fact that they had to fight for AI guardrails in the first place proves that there is absolutely no question about what the intended endgame is for production companies. If you’re watching anything with AI generated content, it is your civic duty to bitch and whine at the parties responsible as loudly and obnoxiously as you possibly can. Don’t let them have an inch — because they’ve made it more than clear that they want to take a mile.