Mediaeater Digest Vol 30, No. 173

 

The Future of Streaming (According to Roberts, Malone and Diller)  (nyt)  Netflix is highly profitable, with operating margins of 28 percent. In the first quarter of 2024, Netflix reported revenue of $9.4 billion, and $2.3 billion in net income. No one else comes close.

OpenAI’s Mira Murati: “some creative jobs maybe will go away, but maybe they shouldn’t have been there in the first place”

AI Doesn’t Kill Jobs? Tell That to Freelancers (wsj) Since the rollout of ChatGPT in November 2022, high-value tasks like IT & Networking have seen pay increases of up to 8%, while low-value tasks such as Admin Support and Writing have experienced significant pay decreases of up to 17% and 18%. “When I see something that looks like it was written by AI, I just switch off,” she adds. “The internet has just gotten so much duller.”

I Will Fucking Piledrive You If You Mention AI Again (mataroa) “Look at us, resplendent in our pauper’s robes, stitched from corpulent greed and breathless credulity, spending half of the planet’s engineering efforts to add chatbot support to every application under the sun when half of the industry hasn’t worked out how to test database backups regularly.”

Hackers ‘jailbreak’ powerful AI models in global effort to highlight flaws (ft) Machine learning security start-ups raised $213mn across 23 deals in 2023, up from $70mn the previous year

AI is exhausting the power grid. Tech firms are seeking a miracle solution. (wapo) A ChatGPT-powered search on Google, according to the International Energy Agency, consumes almost 10 times the amount of electricity as a traditional search

Calculating Empires: A Genealogy of Technology and Power since 1500 (calculating empires) large-scale interactive visualization exploring how technical and social structures co-evolved over five centuries.

Introducing the next generation of Claude   (anthropic) “outperforms its peers on most of the common evaluation benchmarks for AI systems, including undergraduate level expert knowledge”.   

AI’s $600B Question (sequoiacap) The AI bubble is reaching a tipping point. Navigating what comes next will be essential.

schelling.ai. open-source, decentralized AI   

CDK cyber outage hits US auto dealers for second day in a row  (reuters) 15,000 car dealers are now owned and operated by crackers.

We are here:  dub a dub a deeeee dub a dub a deeeee dub a dub a deeeee dubbbbla dooo.  (Who wants to remix this with Crystal Waters lada di lada do with me?)

This Week’s Model

AudioSeal (meta) Audio watermarking tool for detecting AI-generated speech.
Florence (msft) Microsoft’s vision foundation model.
Open-Sora v1.2 (github) Open-source video generation model.
Video to sound effects (elevenlabs) AI-driven tool for converting video to audio effects.
Generating audio for video  (googledeepmind) DeepMind’s AI for creating audio content for videos.
Introducing Gen-3 Alpha (runway) Runway’s latest AI model for creative generation.
Veo: Our most capable generative video model  (aitestkitchen)  Generating videos.
JASCO (meta) Meta’s text-to-music generation model with audio and symbolic conditioning.

 

I have this image framed, a massive 6×6, in an orange 60s lucite frame with a brown marble border. I took it off the walls of 51 W 52nd Street (CBS/Black Rock) during a refresh.  I still have it, and its meaning continues to grow.

 

Licensing Deals for Dataset Creation

It’s encouraging to see that the market is moving towards licensing deals for dataset creation, but the current model, where tech companies make one-time payments and never compensate creators again, fails. It is clearly unsustainable because it does not account for the ongoing value generated from the datasets.

Compensation mechanisms cannot be put in place with the current models because they were initially trained on unlicensed IP. And it should be clear that no amount of new deals can rectify the fundamental compensation issue, or come close to that unlicensed material in depth and volume.

Content creators should be actively involved in the dataset creation process, establishing transparent licensing agreements that include provisions for ongoing royalties rather than one-time payments. Involving creators from the start, allowing them to review and approve the use of their work, and maintaining open communication builds trust through fairness and transparency. Music licensing is a good analog.

Mediaeater Digest Vol 30, No. 68

NYT workplace advice columnist brings it –  Goodbye Work Friends… “Still, in my heart of hearts, I always wanted to tell you to quit your job. Negotiate for the salary you deserve. Stand up for yourself. Challenge authority. Tell your rude co-worker to shut up. Report your boss to everyone and anyone who will listen. Consult a lawyer. Did I mention quit your job?” …

When To Write a Simulator “Any problem involving probabilities over time should humble you. Walk away and quietly go write a simulation.”  (sage advice -ed)  

SecondPage – Google Searches without media conglomerates  (tool) This extension blocks most of the top 1000 English sites from appearing in Google search.

Privacy and harm have long been an area of inquiry for me, and last week was a bit of a mind blower. Apple put secure scaffolding in place around AI, understanding (claiming) your real-time context window with privacy-forward AI utility.

OPSEC win for us all.  Expect many news cycles of Apple needing to put in a back door.  They won’t and never should. Encryption matters and privacy is a fundamental human right. 

Amazon-Powered AI Cameras Used to Detect Emotions of Unwitting UK Train Passengers. This falls under the worst possible application of ML.  You do not get to guess (infer intent) what is in people’s minds and make that actionable!  Full stop.  When did the concept that observation implies supervision, and therefore control, get lost on society?  We really are getting the governance we deserve.

While Apple reshaped and recontextualized AI for the marketplace by providing utility around personal privacy, NSA + OpenAI have decided the exact opposite optics are needed.

Treat your data accordingly and expect information governance: negotiating which data will and won’t be trained on, and how your data will and will not be used.

With the Sora halo holding less power after Luma dropped their video offering, everyone else is in a constant battle to try to make celluloid sense out of latent space.  Here is Google’s paper on generating audio for video. Every day the range of tools changes and improves.

Apple / Huggingface repo –  mobile-appropriate vision transformer, CLIP, and image segmentation models 

A.Jafa

Mediaeater Digest Vol.30, No. 163

States Take Up A.I. Regulation Amid Federal Standstill (nyt) State lawmakers across the country have proposed nearly 400 new laws on A.I. in recent months, according to the lobbying group TechNet. California leads the states with a total of 50 bills proposed, although that number has narrowed as the legislative session proceeds.  (ed note- don’t worry – they will get it done like ad-tech, privacy, social media and other key regulations – dripping sarcasm)

Meta says European data is essential for culturally relevant AI (stackdiary) Meta says it wants to “push to develop AI that understands and reflects European cultures, languages, and humor,” which sounds promising on the surface. However, the approach

Deception abilities emerged in large language models (pnas) This study unravels a concerning capability in Large Language Models (LLMs): the ability to understand and induce deception strategies. As LLMs like GPT-4 intertwine with human communication, aligning them with human values becomes paramount. The paper demonstrates LLMs’ potential to create false beliefs in other agents within deception scenarios, highlighting a critical need for ethical considerations in the ongoing development and deployment of such advanced AI systems. (seems impt -ed)

Private Cloud Compute: A new frontier for AI privacy in the cloud (apple.com) Apple Intelligence is the personal intelligence system that brings powerful generative models to iPhone, iPad, and Mac. For advanced features that need to reason over complex data with larger foundation models, we created Private Cloud Compute (PCC), a groundbreaking cloud intelligence system designed specifically for private AI processing. For the first time ever, Private Cloud Compute extends the industry-leading security and privacy of Apple devices into the cloud, making sure that personal user data sent to PCC isn’t accessible to anyone other than the user — not even to Apple. 

Apple’s On-Device and Server Foundation Models (machinelearning.apple.com) We protect our users’ privacy with powerful on-device processing and groundbreaking infrastructure like Private Cloud Compute. We do not use our users’ private personal data or user interactions when training our foundation models.

MacOS Sequoia to Allow iCloud Logins in Virtual Machines on ARM Macs (developer.apple.com) If someone moves a VM to a different Mac host and restarts it, the Virtualization framework automatically creates a new identity for the VM using the information from the Secure Enclave of the new Mac host. This identity change requires the person using the VM to reauthenticate to allow iCloud to restart syncing data to the VM.

What Apple’s AI Tells Us: Experimental Models While Apple is building narrow AI systems that can accurately answer questions about your personal data (“tell me when my mother is landing”), OpenAI wants to build autonomous agents that would complete complex tasks for you (“You know those emails about the new business I want to start, could you figure out what I should do to register it so that it is best for my taxes and do that.”). The first is, as Apple demonstrated, science fact, while the second is science fiction, at least for now.

 

(this digest in progress)

Clapper AI

Generative AI Video: Embracing Interoperability and Open-Source Tools (Clapper AI)

Pointing out an important AI video opportunity: the open-source script-to-screen AI video editor Clapper AI. This project and format are singular in the market. Studios, media companies, producers and talent all stand to benefit by supporting development while it is in the nascent stage.

It would set a baseline standard that’s interoperable and open. It sits atop existing workflows and pipelines. This is a framework that scales for everyone if done correctly, with the added benefit of avoiding costly strategic dependencies in a future timeline.

The ability to switch between different vendors and models seamlessly is a game-changer. This flexibility drives competition like Sora against Veo, StabilityAI against ElevenLabs, and Replicate against Fal.ai, allowing users to harness the best features from each. Unlike traditional open-source tools such as Blender and GIMP, which typically compete against a single commercial product, this tool’s interoperability is a significant competitive edge.
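
For the devs reading, here is a minimal sketch of what that vendor switching could look like in code. It is an illustration under stated assumptions, not Clapper’s actual API: `VideoBackend`, `StubBackend`, `render_scene`, and the vendor names are all hypothetical, and the stub returns a string where a real backend would call a vendor service.

```python
from dataclasses import dataclass
from typing import Protocol


class VideoBackend(Protocol):
    """Hypothetical common interface; NOT Clapper's real API."""

    name: str

    def render(self, prompt: str) -> str:
        """Return a reference to a generated clip for the prompt."""
        ...


@dataclass
class StubBackend:
    """Stand-in for a real vendor client (assumption: no network calls here)."""

    name: str

    def render(self, prompt: str) -> str:
        # A real backend would call the vendor's service; we only label the output.
        return f"[{self.name}] clip for: {prompt}"


def render_scene(backend: VideoBackend, prompt: str) -> str:
    """The editor depends only on the interface, never on a specific vendor."""
    return backend.render(prompt)


# Registry of interchangeable backends, keyed by (hypothetical) vendor name.
backends = {b.name: b for b in (StubBackend("vendor-a"), StubBackend("vendor-b"))}

if __name__ == "__main__":
    prompt = "Traveling by train, viewing landscapes through the window."
    for name in backends:
        # Swapping vendors is a dictionary lookup, not a pipeline rewrite.
        print(render_scene(backends[name], prompt))
```

The design choice is the point: the workflow talks to one small interface, so trying a new model means registering one more backend rather than rebuilding the pipeline.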

As the media industry is forced to reckon with generative AI multi-model tools, early-stage, long-term strategic thinking is critical. Be smart: local integration of compute and training into natural production workflows will fundamentally change how content is created and consumed.

For studios, an open-source script-to-screen AI video editor offers efficiencies + customization and strategic independence. Producers would benefit from interoperability + streamlined workflows, allowing the focus to be on storytelling.

The barrier is removed for talent, so they can prototype their own narratives, better understand set and setting and scene context, and develop new storytelling skills. It acts as a catalyst for creative acceleration. Incremental, not supplemental.

Interoperable standards can ensure compatibility between different tools and models, enabling seamless workflows and maximizing efficiency, as it should be. (mp3 and audio codecs, anyone?) They will also foster innovation and adaptation but, perhaps more importantly, avoiding dependencies on specific vendors allows for strategic flexibility. Consider this an open call to action for devs.

 

 

Links:  – still in dev…

Code  https://github.com/jbilcke-hf/clapper
App   https://clapper.app

 

 

 

Ed note: We are a long, long, long way from this replacing filmmaking.  This is how we don’t leave that behind.  The list of AI-generated multi-model media that’s meaningful can be counted on zero fingers right now. Be a part of that change.  Fin

Mediaeater Digest Vol.30, No. 159

YouTube Copyright Transparency Report (youtube) YouTube has paid out over $70 billion to creators, artists, and media companies in recent years, and it boasts over 100 million Premium and Music subscribers. Its Content ID system, which allows rightsholders to monetize user-generated content, has paid out over $9 billion in ad revenue.

Adobe roofies all of their customers (youtube) After they went to the cloud I stopped using their products. -ed

How to Lead an Army of Digital Sleuths in the Age of AI (nyt) Eliot Higgins and his 28,000 forensic foot soldiers at Bellingcat have kept a miraculous nose for truth—and a sharp sense of its limits—in Gaza, Ukraine, and everywhere else atrocities hide online.

Apple to Debut Passwords App in Challenge to 1Password, LastPass  (bloomberg)  Apple Inc. will introduce a new homegrown app next week called Passwords, aiming to make it easier for customers to log in to websites and software, according to people with knowledge of the matter.

Andrew Ng (Twitter) The effort to protect innovation and open source continues. I believe we’re all better off if anyone can carry out basic AI research and share their innovations. Right now, I’m deeply concerned about California’s proposed law SB-1047. It’s a long, complex bill with many parts that require safety assessments, shutdown capability for models, and so on.

FTC Opens Antitrust Probe of Microsoft AI Deal (WSJ) Commission has sent subpoenas to tech giant and startup, asking whether their partnership evaded required government review

Heymusic.ai – Coherence-Oriented Contrastive Learning of Musical Audio Representations

Secret hand gestures in paintings  (pdf) – hand signs in visual art may provide clues about the underlying iconographical symbols. This paper will examine the eventual hidden meanings behind a peculiar hand gesture that has been widely used by several painters

KLING – New Chinese DiT video generation model, open access. Generates 120-second videos at 30 fps and 1080p, understands physics better, and models complex motion accurately. Prompt: “Traveling by train, viewing all sorts of landscapes through the window.”

Marker: Converts PDFs to Markdown using deep learning, supports all languages.

Intro to LLMs – Slides from talk by Andrej Karpathy

Recommended reading

Introducing Stable Audio Open – An Open Source Model for Audio Samples and Sound Design (stabilityai)
Stable Audio Open is an open source text-to-audio model for generating up to 47 seconds of samples and sound effects. Users can create drum beats, instrument riffs, ambient sounds, foley and production elements. The model enables audio variations and style transfer of audio samples.

Mistral has released a software development kit (SDK), Mistral-Finetune, for fine-tuning its models on workstations, servers and small datacenter nodes.

The Impossibility of Fair LLMs We show that each framework either does not logically extend to LLMs or presents a notion of fairness that is intractable for LLMs, primarily due to the multitudes of populations affected, sensitive attributes, and use cases.

Recommended reading:

Mind as Machine by Margaret Boden (2006)
Weapons of Math Destruction by Cathy O’Neil (2016)
Algorithms of Oppression by Safiya Noble (2018)
Race after Technology by Ruha Benjamin (2019)
Discriminating Data by Wendy Chun (2021)
Artificial Knowing by Alison Adam (1998)
Computation and Human Experience by Philip Agre (1997)
The Closed World by Paul Edwards (1996)
Computer Power and Human Reason by Joseph Weizenbaum (1976)
Alchemy and Artificial Intelligence by Hubert Dreyfus (1965)
The Machinery Question by Maxine Berg (1980)