Tools to realize and render a better future

 “And when I opened the curtains they were taking the set away and packing up for the day, the cameras and lights turned off.

The darkness replaced with strip lights and the grey skies, the blind whirring of machinery.

 I’d like to write a beautiful story about love.”   Stanley Donwood

 

Storytelling is a fundamental human need, an art that is a moral imperative. The mediums through which we express this need have changed, from spoken word passed down through generations to AI and other technologies that are emerging now.

The future of storytelling lies in the tales we tell and the tools we use to create and tell them. Stories shape our world and imagination.

New tools can break barriers and democratize storytelling, allowing diverse voices to inspire hope, foster empathy, and guide positive change. These tools have the potential to be as revolutionary as the alphabet or the camera, allowing for new narrative exploration and, more importantly, giving us the ability to impact our collective imagination. Text to screen is a powerful advancement in our ability to tell stories.

Tools and technology have always impacted storytelling, from silent films to interactive narratives, augmented reality, and AI-assisted story generation. These advances have blurred the line between passive viewer and active participant, and the economics of content creation have made it easy for people to share multi-model narratives.

TikTok exemplifies the ongoing shift from passive to active engagement between creators and viewers, eliminating the need for traditional infrastructure, tools, and gatekeeping mechanisms.

We become the stories we tell.  

Stories, especially science fiction, influence technological innovation. Works like “Star Trek” and “Neuromancer” helped inspire inventions like the flip phone and shaped early visions of the internet. Stories reflect our hopes and fears about technology, guiding its development. They inspire us to push boundaries and navigate the ethical implications of our creations.

Science fiction has often predicted technological advancements that later became reality: the Metaverse from Neal Stephenson’s “Snow Crash,” AI assistants like Siri and Alexa inspired by Isaac Asimov’s “I, Robot,” the concept of cyberspace from William Gibson’s “Neuromancer,” and tablet computers and video calls from Arthur C. Clarke’s “2001: A Space Odyssey.”

Authors like Wells, Asimov, Clarke, Dick, Robinson, and Liu have predicted technologies and grappled with their ethical implications. Their stories act as a laboratory for exploring ideas and their consequences, shaping our expectations, innovations, and the future. In a sense, these authors are futurists in disguise.

Our narrative preferences mirror our cultural and personal contexts. Stories reflect societal values;  for example, climate fiction reflects environmental concerns.

The goal is to create a virtuous cycle: better tools that lead to stronger, more impactful stories, which in turn inspire more beneficial real-world innovations, feeding back into even better stories. This approach aligns storytelling tools directly with the aim of influencing the development of positive technology and, ideally, societal progress. Storytelling tools should enable the creative process, not disrupt it.

Make tools that realize and render a stronger future.  

Tools focused on promoting stronger, future-shaping narratives. 

New storytelling tools, such as generative AI, democratize storytelling by lowering barriers to entry and making it accessible to diverse voices. These tools enable rapid integration of new technologies, generative or otherwise, take advantage of immediate global distribution, and support trans-media storytelling across various platforms and screens. They enhance interactivity, create immersive experiences, and foster collaborative creation and feedback. The opposite of legacy broadcast and film models.

By creating tools that facilitate stronger narratives that everyone can use, we can potentially alter the course of our collective future, explore solutions to global challenges, foster empathy, and inspire innovations. Everyone should be able to create using cutting-edge storytelling tools. Maybe we can use the power of stories and how they impact us to craft a more hopeful, inclusive, and innovative future. The stories we tell and the tools we use are blueprints for the world we wish to build. We all need tools that realize and render a better future.

Suno + Udio Lawsuits

What happens to music, happens to everyone.

Suno: https://s3.documentcloud.org/documents/24776034/1.pdf

Udio: https://s3.documentcloud.org/documents/24776030/1.pdf

“Accompanying this Complaint and designated as Exhibit C is a thumb drive that contains all the Udio outputs referenced herein and in Exhibit B. In the event Udio seeks to remove this evidence of its infringing conduct from public view, the examples cited herein are preserved on this medium.”  (at last this language -ed)

Show the LLM weights, LLM data,  open code, or delete the models.

It’s not that the technology is bad, it’s that rights holders did not see this coming and did not lay down the legal frameworks and industry best practices. The game started without them (again). These are conversations that should have happened years ago, but rights holders were likely distracted by some metaverse or quantum something, something.

The recent declaration from one major label setting boundaries, as well as this lawsuit, even if it was a day late and a model short, is exactly what I am talking about. Do it.

“Prove it or remove it” should be the remit from all major rights holders in unison. Then set the deal terms on a flat playing field, adding to the bill the value of the IP that was stolen and used to gain market share. Remember when Google search launched its news area and the news industry did not negotiate a rev split? This is that again.

Do new deals with rights intact, with auditable, deeply embedded rights trackers in latent space for downstream revenues.

Mike Kelly

Mediaeater Digest Vol 30, No. 173

Seeing Like A Network- Dark Forests, Dense Networks  ( strangeloopcanon.com ) culture is composed of the communication patterns, behaviours, and symbols that are shared amongst a group. We can think of culture as the common interconnected web that underlay the beliefs that we all hold, which constantly changes and evolves as our beliefs spread.

The Future of Streaming (According to Roberts, Malone and Diller)  (nyt)  Netflix is highly profitable, with operating margins of 28 percent. In the first quarter of 2024, Netflix reported revenue of $9.4 billion, and $2.3 billion in net income. No one else comes close.

Apple Introduces the iStick    (spyglass) The EU puked up the carrots, so Apple uses their Intelligence…

OpenAI’s Mira Murati: “some creative jobs maybe will go away, but maybe they shouldn’t have been there in the first place”

AI Doesn’t Kill Jobs? Tell That to Freelancers (wsj) Since the rollout of ChatGPT in November 2022, high-value tasks like IT & Networking have seen pay increases of up to 8%, while low-value tasks such as Admin Support and Writing have experienced significant pay decreases of up to 17% and 18%. “When I see something that looks like it was written by AI, I just switch off,” she adds. “The internet has just gotten so much duller.”

I Will Fucking Piledrive You If You Mention AI Again (mataroa) fixed this link  “Look at us, resplendent in our pauper’s robes, stitched from corpulent greed and breathless credulity, spending half of the planet’s engineering efforts to add chatbot support to every application under the sun when half of the industry hasn’t worked out how to test database backups regularly.”

Hackers ‘jailbreak’ powerful AI models in global effort to highlight flaws (ft) Machine learning security start-ups raised $213mn across 23 deals in 2023, up from $70mn the previous year

AI is exhausting the power grid. Tech firms are seeking a miracle solution. (wapo) A ChatGPT-powered search on Google, according to the International Energy Agency, consumes almost 10 times the amount of electricity as a traditional search

Calculating Empires: A Genealogy of Technology and Power since 1500 (calculating empires) large-scale interactive visualization exploring how technical and social structures co-evolved over five centuries.

Introducing the next generation of Claude   (anthropic) “outperforms its peers on most of the common evaluation benchmarks for AI systems, including undergraduate level expert knowledge”.   

AI’s $600B Question (sequoiacap) The AI bubble is reaching a tipping point. Navigating what comes next will be essential.

schelling.ai. open-source, decentralized AI   

CDK cyber outage hits US auto dealers for second day in a row  (reuters) 15,000 car dealers are now owned and operated by crackers.

We are here:  dub a dub a deeeee dub a dub a deeeee dub a dub a deeeee dubbbbla dooo.  (Who wants to remix this with Crystal Waters lada di lada do with me?)

This Week’s Model

Audioseal (meta) Audio watermarking for detecting AI-generated speech.
Florence (msft) Microsoft’s vision foundation model for tasks like captioning, detection, and segmentation.
Open-Sora v1.2 (github) Community-driven open-source video generation model.
Video to sound effects (elevenlabs) AI-driven tool for converting video to audio effects.
Generating audio for video  (googledeepmind) DeepMind’s AI for creating audio content for videos.
Introducing Gen-3 Alpha (runway) Runway’s latest AI model for creative generation.
Veo: Our most capable generative video model  (aitestkitchen)  Generating videos.
JASCO (meta) Meta’s temporally controlled text-to-music generation model.

 

I have this image framed, a massive 6×6, in an orange 60s lucite frame with a brown marble border. I took it off the walls of 51 W 52nd Street (CBS/Blackrock) during a refresh. I still have it, and its meaning continues to grow.

 

Licensing Deals for Dataset Creation

It’s encouraging to see that the market is moving towards licensing deals for dataset creation, but the current model, where tech companies make one-time payments and never compensate creators again, fails. It is clearly unsustainable because it does not account for the ongoing value generated from the datasets.

Compensation mechanisms cannot be put in place with the current models because they were initially trained on unlicensed IP. It should be clear that no amount of new deals can rectify that fundamental compensation issue, or come close in depth and volume.

Content creators should be actively involved in the dataset creation process, establishing transparent licensing agreements that include provisions for ongoing royalties rather than one-time payments. Involving creators from the start, allowing them to review and approve the use of their work, and maintaining open communication builds trust through fairness and transparency. Music licensing is a good analog.

Summer Reading

Reading list so far this summer (first half of summer). See part 2 at the end of summer.

All Fours – Miranda July
Co-Intelligence – Ethan Mollick
The Eye Of The Master – Matteo Pasquinelli
Biography of X – Catherine Lacey
Perspective Agents – Perry
Critical Mass – Daniel Suarez
Supercommunicators – Charles Duhigg
Blue Ruin – Hari Kunzru
Essays On Art And Science – Kandel
The Freaks Came Out To Write – Tricia Romano
The Twenty Days Of Turin – Giorgio De Maria
Prophet Song – Paul Lynch
Let Us Descend – Jesmyn Ward
The Boy, The Mole, The Fox And The Horse – Charlie Mackesy
Candy Darling – Cynthia Carr
Martyr – Kaveh Akbar
The Fraud – Zadie Smith

 

Mediaeater Digest Vol 30, No. 68

NYT workplace advice columnist brings it –  Goodbye Work Friends….”Still, in my heart of hearts, I always wanted to tell you to quit your job. Negotiate for the salary you deserve. Stand up for yourself. Challenge authority. Tell your rude co-worker to shut up. Report your boss to everyone and anyone who will listen. Consult a lawyer. Did I mention quit your job?” …..

When To Write a Simulator “Any problem involving probabilities over time should humble you. Walk away and quietly go write a simulation.”  (sage advice -ed)  

SecondPage – Google Searches without media conglomerates  (tool) this extension blocks most of the top 1000 English sites from appearing in google search. 

Privacy and harm have long been an area of inquiry for me, and last week was a bit of a mind blower. Apple put in place secure scaffolding around AI, understanding (claiming) your real-time context window with privacy-forward AI utility.

OPSEC win for us all.  Expect many news cycles of Apple needing to put in a back door.  They won’t and never should. Encryption matters and privacy is a fundamental human right. 

Amazon-Powered AI Cameras Used to Detect Emotions of Unwitting UK Train Passengers This falls under the worst possible application of ML.  You do not get to guess (infer intent) what is in people’s minds and make that actionable!  Full stop.  When did the concept that observation implies supervision, and therefore control, get lost on society?  We really are getting the governance we deserve.

While Apple reshaped and recontextualized AI for the marketplace by providing utility around personal privacy, NSA + OpenAI have decided the exact opposite optics are needed.

Treat your data accordingly and expect information governance that spells out which data will and won’t be trained on, and how your data will and will not be used.

After Luma dropped their video offering, with the Sora halo holding less power, everyone else is in a constant battle to try to make celluloid sense out of latent space. Here is Google’s paper on generating audio for video. Every day the range of tools changes and improves.

Apple / Huggingface repo –  mobile-appropriate vision transformer, CLIP, and image segmentation models 

A.Jafa

Mediaeater Digest Vol 30, No. 163

States Take Up A.I. Regulation Amid Federal Standstill (nyt) State lawmakers across the country have proposed nearly 400 new laws on A.I. in recent months, according to the lobbying group TechNet. California leads the states with a total of 50 bills proposed, although that number has narrowed as the legislative session proceeds.  (ed note- don’t worry – they will get it done like ad-tech, privacy, social media and other key regulations – dripping sarcasm)

Meta says European data is essential for culturally relevant AI (stackdiary) Meta says it wants to “push to develop AI that understands and reflects European cultures, languages, and humor,” which sounds promising on the surface. However, the approach

Deception abilities emerged in large language models (pnas) This study unravels a concerning capability in Large Language Models (LLMs): the ability to understand and induce deception strategies. As LLMs like GPT-4 intertwine with human communication, aligning them with human values becomes paramount. The paper demonstrates LLMs’ potential to create false beliefs in other agents within deception scenarios, highlighting a critical need for ethical considerations in the ongoing development and deployment of such advanced AI systems. (seems impt -ed)

Private Cloud Compute: A new frontier for AI privacy in the cloud (apple.com) Apple Intelligence is the personal intelligence system that brings powerful generative models to iPhone, iPad, and Mac. For advanced features that need to reason over complex data with larger foundation models, we created Private Cloud Compute (PCC), a groundbreaking cloud intelligence system designed specifically for private AI processing. For the first time ever, Private Cloud Compute extends the industry-leading security and privacy of Apple devices into the cloud, making sure that personal user data sent to PCC isn’t accessible to anyone other than the user — not even to Apple. 

Apple’s On-Device and Server Foundation Models (machinelearning.apple.com) We protect our users’ privacy with powerful on-device processing and groundbreaking infrastructure like Private Cloud Compute. We do not use our users’ private personal data or user interactions when training our foundation models.

MacOS Sequoia to Allow iCloud Logins in Virtual Machines on ARM Macs (developer.apple.com) If someone moves a VM to a different Mac host and restarts it, the Virtualization framework automatically creates a new identity for the VM using the information from the Secure Enclave of the new Mac host. This identity change requires the person using the VM to reauthenticate to allow iCloud to restart syncing data to the VM.

What Apple’s AI Tells Us: Experimental Models While Apple is building narrow AI systems that can accurately answer questions about your personal data (“tell me when my mother is landing”), OpenAI wants to build autonomous agents that would complete complex tasks for you (“You know those emails about the new business I want to start, could you figure out what I should do to register it so that it is best for my taxes and do that.”). The first is, as Apple demonstrated, science fact, while the second is science fiction, at least for now.

 

(this digest in progress)

Clapper AI

Generative AI Video: Embracing Interoperability and Open-Source Tools (Clapper AI)

Pointing out an important AI video opportunity: the open-source script-to-screen AI video editor Clapper AI. This project and format are singular in the market. Studios, media companies, producers and talent all stand to benefit by supporting development while it is in the nascent stage.

It would set a baseline standard that’s interoperable and open. It sits atop existing workflows and pipelines. This is a framework that scales for everyone if done correctly, with the added benefit of avoiding costly strategic dependencies in a future timeline.

The ability to switch between different vendors and models seamlessly is a game-changer. This flexibility drives competition like Sora against Veo, StabilityAI against ElevenLabs, and Replicate against Fal.ai, allowing users to harness the best features from each. Traditional open-source tools such as Blender and GIMP typically compete against a single commercial product; this tool’s interoperability gives it a significant competitive edge.

As the media industry is forced to reckon with generative AI multi-model tools, early-stage, long-term strategic thinking is critical. Be smart: local integration of compute and training into natural production workflows will fundamentally change how content is created and consumed.

For studios, an open-source script-to-screen AI video editor offers efficiencies + customization and strategic independence. Producers would benefit from interoperability + streamlined workflows, allowing the focus to be on storytelling.

For talent, the barrier is removed so they can prototype their own narratives, better understand set and setting and scene context, develop new storytelling skills and act as a catalyst for creative acceleration. Incremental, not supplemental.

Interoperable standards can ensure compatibility between different tools and models, enabling seamless workflows and maximizing efficiency, as it should be (mp3 and audio codecs, anyone?). They will also foster innovation and adaptation but, perhaps more importantly, avoiding dependencies on specific vendors allows for strategic flexibility. Consider this an open call to action for dev.

 

 

Links:  – still in dev…

Code  https://github.com/jbilcke-hf/clapper
App   https://clapper.app

 

 

 

Ed note: We are a long, long, long way from this replacing filmmaking.  This is how we don’t leave that behind.  The list of AI generated multi-model media that’s meaningful can be counted on zero fingers right now. Be a part of that change.  Fin