
Sora Will Be a Game Changer

I would love to be wrong, but filmed entertainment seems to be facing its own equivalent of the robotic assembly line.

A little-reported but hugely significant white flag of surrender went up a few weeks ago when the producer and actor Tyler Perry suddenly canceled a planned expansion of his Atlanta studios. A dozen new sound stages had been projected, but that was before he saw what he considered a “mind-blowing” demonstration.

Perry changed his mind after viewing a collection of short videos produced by Sora, an A.I. video generator from OpenAI. From simple text prompts, fabricated scenes emerged as instant “videos” that were difficult to distinguish from sequences a Hollywood production company might take days to set up. The crane shots in some of these fake videos are stunning. The characters look as though they have been groomed for their parts. Shadows are mostly authentic. And the live action from people and animals looks mostly “real.” As the Washington Post noted in an excellent must-see article, the images and actions are “shockingly realistic.” The article and its examples are best seen on a computer screen. Here’s a sample of one of the videos with the text prompt cited by the Post.

[Verbal Prompt: A cat waking up its sleeping owner demanding breakfast. The owner tries to ignore the cat, but the cat tries new tactics and finally the owner pulls out a secret stash of treats from under the pillow to hold the cat off a little longer. (OpenAI)]

We expect most institutions to evolve incrementally: slowly enough to allow for adjustments to new realities. That may not be the case here. Every trade in the film and video industry must be asking how it will fit into a world of narrative storytelling where anyone without experience in computer-generated imagery can “create” stunning video effects.

To be sure, things aren’t perfect in this early generation of Sora. Look at a sample of an invented scene from a 1930s movie, also cited by the Post. It looks great, but Sora doesn’t know how to light a cigarette:

[Verbal Prompt: A person in a 1930s Hollywood movie sits at a desk. They pick up a cigarette case, remove a cigarette and light it with a lighter. The person takes a long drag from the cigarette and sits back in their chair. Golden age of Hollywood, black and white film style. (OpenAI)]

Hollywood is not alone in confronting technological advancement, but the ease of use of this technology makes it an existential threat to the film world as we know it. Producers and various content providers will love this tool. But it can only be a blow to the artists and trades that make traditional film and video projects. No wonder actors were so concerned about securing a new contract prohibiting the use of their likenesses without their permission. I would love to be wrong, but the future of “filmed” entertainment seems to be facing its own equivalent of the robot revolution in automobile production.

A colleague who knows about these things notes that crews have been dealing with computer-generated sets and effects for years. An actor can now appear to be walking down a street in Prague while passing in front of a green screen in Burbank. And many crews are working these days. There is also the example of recent films like Poor Things (2023), with actual Victorian sets on sound stages and the inventive use of the crafts that go with a period piece. My colleague also wonders whether many A.I. scenes aren’t essentially rip-offs of other location videos, slightly modified to seem more original than they are. Newer generations of this software should help clarify the charge of “mere copying.”

To be sure, the future appears bright, at least for copyright lawyers. Then, too, actors in dense, dialogue-driven roles construct their screen personas carefully. Performances come from assumed motivations and hard-to-fake nuances. Can a fully integrated performance like Emma Stone’s in Poor Things really be put together from verbal directions alone? Even so, an upheaval is bound to happen as seemingly recognizable persons are placed in novel settings and given words they never uttered.

A.I. appears to be a new and fearsome thing facing the film industry, but it is even more of a threat to the culture as a whole if journalists and public figures face an endless tangle of anger and confusion over real and fabricated words and images.


A.I. and the Mastery of Spoken Language

The question isn’t just whether we are capable of making simulations of human speech but, rather, whether bots can replicate the singular mind that gives form to all speech.

In Steven Spielberg’s dystopian film A.I. Artificial Intelligence, a software designer played by William Hurt explains to a group of younger colleagues that it may be possible to make a robot that can love. He imagines a machine that can learn and use the language of “feelings.” The full design would create a “mecha,” a mechanized robot nearly indistinguishable from a person. His goal in the short term was to build a test case: a young boy who could be a replacement for a couple grieving over their own child’s extended coma.

The film throws out a lot to consider. There are the stunning Spielberg effects of New York City drowning in ice and water several decades into the future. But the core focus of the film is the experiment of creating a lifelike robot that could be something more than a “supertoy.” As the story unfolds, it touches on the familiar subject of the Turing Test: the long-standing challenge to make language-based artificial intelligence good enough to be indistinguishable from the real thing.

Should we become attached to a machine packaged as one of us? Even without any intent to deceive, can spoken language be refined with algorithms to leap over the usual trip wires of learning a complex grammar, syntax and vocabulary?  It takes humans years to master their own language.

The long first act of the film lets us see an 11-year-old Haley Joel Osment as “David,” effectively ingratiating himself with the Swinton family. In my classes pondering the effects of A.I., this first segment was enough reason to stop the film and ask members what seemed plausible and what looked like wild science fiction. I always hoped to encourage the view that no “bot” could converse in ordinary language with the ease and fluency of a normal kid. That was my bias, but time has proven me wrong. If anything, David’s reactions were a bit too stiff to reflect the loquacious chatterbots around today. Using Siri, Alexa, or IBM’s Watson as simple reference points, it is clear that we now have computer-generated language that has mostly mastered the challenges of formulating everyday speech. There is no question that current examples of these synthetic varieties are remarkable.

Here’s an example you can try. I routinely have these short essays “read” back to me by Microsoft Word’s “Read Aloud” bot, which comes in the form of a younger male or female voice that can be activated from the “Review” section in the top ribbon. Since I have no editor, it helps to hear what I’ve written, often catching garbled prose that my eyes have missed. I recall that the first version of this addition to Word was pretty choppy: words piled on words without much attention to their intonation, or to how they might fit within the arc of a complete sentence. Now the application reads with pauses and inflections that mostly sound right, especially within the narrower realm of formal rather than idiomatic English. Here is the second paragraph of this piece as read back to me via this Word function:

Of course, language “means” only when it is received and interpreted by a person. An individual has what artificial intelligence does not: a personality, likes and dislikes, and a biography tied to a life cycle. Personality develops over time and shapes our intentions. It creates chapters of detail revealing our social and chronological histories as biological creatures. So the key question isn’t just whether we are capable of making simulations of human speech; the bigger question is whether bots can replicate the unique mind within each of us that gives form to it.

Even tied to advanced machine-learning software, chatterbots easily use similarity to falsely suggest authenticity. And there’s the rub. Generating speech that implies preferences, complex feelings or emotions makes sense only when there is an implied “I.” For lack of a better word, with Siri or Watson there is no kindred soul at home. The language of a bot is a simulacrum: a copy of a natural artifact, but not a natural artifact itself.

Even so, we should celebrate what we have: machines that can verbalize fluently and, with complex algorithms, might even speak to our own unique interests.