Blog | Mark Ghuneim

04 Jan 2025

MDG Blog Posts Permalink

Automotive Data

MDG January 4, 2025

Tesla – your public data, theirs, not yours. Same for that lock.

“The sheriff said Tesla CEO Elon Musk helped the investigation by having the truck unlocked after it auto-locked in the blast and giving investigators video of the suspect at charging stations along its route from Colorado to Las Vegas.“

VW: We don’t even know how to protect the sensitive data we have and and we are collecting everything possible. Volkswagen stored movement data from three quarters of a million cars in an open container that anyone could access. ccc video from 38c3

This practice needs to be addressed by all the major automobile companies.

03 Jan 2025

MDG Blog Posts Permalink

Strengthen Your Networks

MDG January 3, 2025

A Fresh Start for 2025: Building Habits That Matter

The New Year acts as a nice cosmic reset. I love it for creating habits, serialization of creative efforts and a fresh way of thinking about things.

Creating habits is a about choosing behaviors and principles you can live by, day in and day out, until they’re woven into your existence. One tenant I always embrace but maybe its best to institutionalize it as a habit for 2025:

Strengthen Your Networks

We need each other. Don’t mistake individualism for solitude. Your personal network is your safety net. It’s not city services or government agencies that will come through for you when life gets hard. It’s the people you’ve built relationships with—your friends, neighbors, and community.

This isn’t just about having people to call in a crisis; it’s about creating an implicit social contract. A strong network means mutual trust and support. It’s about being there for each other, sharing resources, and building resilience together.

Whether it’s a natural disaster, an economic downturn, a unknown, unknown, or simply needing someone to help with your kid’s carpool, your community will be there for you in ways institutions can’t.

How you strengthen networks.

Invest in friendships: Check in with people regularly. Be the first to reach out. ←–

Engage your neighbors: Host a block party, join a local group, or simply say hi more often.

Be generous: Offer help before you’re asked. Reciprocity will follow naturally.

When your network is strong, you’re not just more secure; you’re more connected, more fulfilled, and better equipped to thrive in whatever comes your way.

Moving Forward

The new year is also a good time for strengthening your digital network as well.

A perfect time to conduct a digital security check and lock down your online presence. Some basic OPSEC (operational security) procedures to this end.

Audit your accounts: Review all online accounts and deactivate ones you no longer use.

Update passwords: Use strong, unique passwords for each account and consider a password manager.

Enable two-factor authentication (2FA): Add an extra layer of security wherever possible.

Review privacy settings: Check and adjust privacy settings on social media and other platforms.

Be mindful of sharing: Limit the PII (Personal Identifable Information ) you share online to reduce vulnerability.

Secure network protect you and ensures your connections remain strong, not an attack vector.

28 Oct 2024

MDG Blog Posts Permalink

Multi-Modal Models

MDG October 28, 2024

Multi Model Models. from https://huggingface.co/collections/merve/mit-talk-31-10-papers-671f6a16e156f77739820c89 (MIT Talk 31/10 Papers)

NVLM: Describes images using vision-language integration.
BRAVE: Detects multiple objects in cluttered scenes.
Mini-Gemini: Answers questions about images.
Unified OCR: Extracts text from diverse images.
EVA-CLIP: Matches text and images.
BLIP: Efficient text/image retrieval.
LLM-in-Vision: Describes complex visual scenes.
Efficient Retrieval Model: Fast document search.

03 Oct 2024

MDG Blog Posts Permalink

Mapping Data Surveillance Infrastructure : US

MDG October 3, 2024

Audio narrative for this datamap of US (click for full view)

26 Aug 2024

MDG Blog Posts Permalink

AI Media Production Resources

MDG August 26, 2024

Classification of Moving Visual Media:
Resource: Classification of Moving Visual Media
Summary: This article categorizes visual media into traditional, generative, interactive, and vector-based media.

AI, Copyright, and Creativity:
Resource: AI-14 Harmonies + Copyright Law
Summary: This piece addresses the legal complexities of AI-generated content, focusing on the 14 modalities of AI in music creation and the challenges they pose to current copyright laws.

Key Concepts in AI-Driven Video Production:
Resource: AI-Driven Video Production Lexicon
Summary: A RFC around the lexicon that outlines crucial terms and frameworks for integrating AI into video production.

Generative AI Video: Embracing Interoperability and Open-Source Tools
Resource: Generative AI Video and Clapper AI
Summary: This post explores the importance of open-source tools and interoperability in generative AI video production. It highlights Clapper AI, a platform focused on simplifying and democratizing video content creation through AI.

24 Aug 2024

MDG Blog Posts Permalink

AI in Film and Video Production: A Incomplete Glossary of Terms.

MDG August 24, 2024

AI in Film and Video Production: A Incomplete Glossary of Terms. (RFC) REQUEST FOR COMMENTS

This glossary provides definitions and examples of key AI terms used in film and video production, along with sources where available.

AI in Film and Video Production: A Incomplete Glossary of Terms

This glossary provides definitions and examples of key AI terms used in film and video production, along with sources where available.

This document is a RFC (Request For Comments) it allows annotation for those who would like to refine, suggest, edit etc.

A

Adaptive Narrative Engines

Definition: AI systems designed to dynamically adjust a film’s narrative in real-time based on viewer engagement, preferences, or external data inputs.

Example: The “StoryFlex” engine created a branching narrative film that subtly adjusted character decisions and plot points based on aggregated viewer emotional responses.

Reference: “StoryFlex: Adaptive Narrative Engines for Personalized Cinematic Experiences” (Netflix Research, 2024) – [This reference is hypothetical and needs to be verified.]

AI-Assisted Color Grading

Definition: AI tools that assist in color correction and grading, allowing for consistent color tones throughout the video with minimal human input.

Example: The film’s visual tone was unified across different scenes using AI-assisted color grading.

AI-Assisted Storyboarding

Definition: The use of AI to generate or refine storyboards based on script input, helping visualize scenes before production.

Example: The director used AI-assisted storyboarding to quickly visualize complex action sequences, saving time in pre-production.

AI Scene Analysis

Definition: Automated breakdown of video scenes to identify elements like shot composition, lighting, and camera movements.

Example: AI scene analysis helped the editing team quickly categorize and select the best takes from hours of footage.

AI-Driven CGI

Definition: The use of AI to create computer-generated imagery (CGI) with greater efficiency and realism, often used in special effects and animation.

Example: AI-driven CGI was used to create the film’s futuristic cityscapes with stunning realism.

AI-Driven Cinematography

Definition: The use of AI to control camera movements and framing in real-time, either in physical or virtual production environments.

Example: The documentary crew employed AI-driven cinematography to automatically track and frame wildlife subjects in challenging terrain.

AI Impact on Entertainment

Definition: The influence and applications of AI technologies in the entertainment industry, encompassing tasks like content generation, editing, and audience engagement.

Example: AI’s impact on entertainment is evident in how modern films use AI for scriptwriting and special effects.

AI Models and Development

Definition: The creation and refinement of AI systems, including training neural networks and developing algorithms for various applications.

Example: The studio focused on AI models and development to enhance their video editing software with new features.

AI Projects Overview

Definition: A summary or analysis of ongoing or completed AI-related projects, often highlighting the scope, objectives, and outcomes.

Example: The AI projects overview revealed significant advancements in automated video editing tools.

AI Video Editing

Definition: The automation of traditional video editing tasks such as cutting, transitions, and color grading using AI, allowing for more efficient post-production workflows.

Example: The editor used AI video editing software to quickly assemble the rough cut of the film.

AI-Enhanced Rotoscoping

Definition: The use of AI to assist in the process of tracing over footage frame-by-frame for special effects or animation purposes.

Example: AI-enhanced rotoscoping significantly reduced the time needed to separate the actor from the background for the compositing process.

AI Crowd Simulation

Definition: The use of AI to generate and control large numbers of background characters or entities in a scene.

Example: The battle scene’s thousands of individual soldiers were created and animated using AI crowd simulation techniques.

Attention Mechanisms

Definition: A component of neural networks that allows the model to focus on specific parts of the input when performing a task, crucial for understanding context in video sequences.

Example: The AI’s attention mechanism helped it accurately track and edit multiple moving objects in the complex action sequence.

Audio-Driven Video Generation

Definition: The creation of video content based on audio input, such as generating lip movements for animated characters based on speech.

Example: The animation studio used audio-driven video generation to quickly prototype lip-sync for their characters before final animation.

Auto-Reenactment

Definition: An AI technique that allows characters in video to mimic the actions or expressions of a different individual in real-time.

Example: Auto-reenactment was used to sync the actor’s expressions with those of the motion capture performer.

B

Blocking

Definition: The precise staging of actors in a scene, including their movements and positions, to ensure that the visual composition aligns with the director’s vision and the narrative flow.

Usage: The director spent time blocking the scene, ensuring each actor’s movement complemented the camera angles and the story’s emotional beats.

C

Character Consistencies

Definition: The practice of maintaining consistent traits, behaviors, and visual appearances for characters across all scenes in a video or film production, ensuring coherence in storytelling and character development.

Usage: The editing team carefully monitored character consistencies, ensuring that the protagonist’s appearance and personality traits remained the same throughout the movie, despite being filmed out of sequence.

Conditional Video Generation

Definition: The process of creating videos based on specific conditions or inputs, such as text descriptions, audio, or other videos.

Example: The team used conditional video generation to create multiple versions of the same scene with different weather conditions based on text prompts.

Continuity

Definition: The consistency of visual, audio, and narrative elements across different scenes and shots in a film, ensuring that there are no discrepancies that could disrupt the viewer’s immersion.

Usage: The script supervisor was responsible for maintaining continuity, making sure that props, costumes, and actor positions remained consistent between takes.

Contextual Sound Design

Definition: AI-driven audio systems that generate and mix sound effects, ambient noise, and music in real-time, adapting to the visual content and narrative context of a scene.

Example: The “AudioScene” AI dynamically created and mixed a full sound design for a forest scene, adjusting bird calls and wind sounds based on camera movement and character actions.

Reference: “AudioScene: Context-Aware Sound Design for Immersive Cinematic Experiences” (Berklee College of Music & MIT, 2023) – [This reference is hypothetical and needs to be verified.]

D

Data Augmentation

Definition: The process of enhancing the training dataset for AI models by applying various transformations, such as rotations, flips, or color adjustments, to improve model robustness.

Example: Data augmentation was used to expand the variety of scenes the AI could recognize, making it more effective at identifying objects in the final film.

Denoising

Definition: The process of removing noise from video or image data to improve clarity and visual quality, often applied during post-production.

Example: The old film footage was significantly improved by applying denoising algorithms, making the scenes much clearer and more watchable.

Deep Learning

Definition: A subset of machine learning involving neural networks with many layers that can automatically learn representations from data, often used in generating and editing video content.

Example: The AI utilized deep learning algorithms to enhance the video quality by predicting the most accurate pixel arrangements.

Deepfake

Definition: A type of synthetic media where a person in an existing image or video is replaced with someone else’s likeness using deep learning techniques, often raising ethical concerns in media.

Example: The deepfake of the historical figure was so convincing that it passed as original footage to many viewers.

De-Aging

Definition: AI techniques used to make actors look younger in videos, involving digital skin smoothing, facial reshaping, and the removal of age-related features.

Example: The actor’s appearance was digitally de-aged to match his look from the original film.

Diegetic Sound

Definition: Sound that originates from within the film’s world, heard by both the characters and the audience, such as dialogue, footsteps, or music played on a radio.

Usage: The use of diegetic sound in the party scene made the environment feel lively and immersive for the audience.

Diffusion Models

Definition: Probabilistic models used for generating images and videos by iteratively refining a noise-filled image to a clean one, widely used in video synthesis.

Example: The diffusion model was employed to create atmospheric effects in the background of the animated sequence.

Digital Twins

Definition: AI-generated digital replicas of real-world entities, used for simulations, interactive video content, or enhancing narrative consistency.

Example: Digital twins of the main characters were created to simulate complex action scenes without putting the actors at risk.

Dynamic Character Synthesis

Definition: AI technology capable of generating and animating photorealistic digital characters that can adapt their performance in real-time based on narrative context.

Example: The “ActorNet” system created a fully digital supporting cast for a film, with characters capable of improvising dialogue and actions in response to live actors.

Reference: “ActorNet: Real-time Synthesis of Adaptive Digital Actors” (ETH Zurich, 2024) – [This reference is hypothetical and needs to be verified.]

E

Emotive Rendering

Definition: AI-driven rendering techniques that adjust visual elements (lighting, color grading, depth of field) in real-time to enhance the emotional impact of scenes.

Example: The “EmotiVFX” system automatically adjusted the visual tone of scenes to match the intended emotional arc of the story, enhancing viewer engagement.

Reference: “EmotiVFX: Real-time Emotive Rendering for Enhanced Cinematic Experience” (SIGGRAPH Asia, 2024) – [This reference is hypothetical and needs to be verified.]

Emotion Recognition in Video

Definition: AI-driven detection and analysis of human emotions in video footage, useful for performance analysis or audience reaction studies.

Example: The marketing team used emotion recognition software to gauge audience reactions during test screenings.

Establishing Shot

Definition: A wide shot that introduces a scene by showing the surrounding environment, often used to set the location or context for the action that follows.

Usage: The film opened with an establishing shot of the bustling city skyline, immediately situating the audience in the story’s urban setting.

Exposition

Definition: The literary technique of providing background information to the audience, necessary for understanding the story’s context, characters, and setting. Exposition is often delivered through dialogue, narration, or visual cues, particularly at the beginning of a story.

Usage: The film’s exposition effectively introduced the protagonist’s troubled past, setting the stage for the conflicts that would unfold.

F

Foley

Definition: The reproduction of everyday sound effects that are added to films during post-production to enhance audio quality and realism.

Usage: The sound team used Foley to recreate the sound of footsteps on gravel, adding a layer of realism to the nighttime chase scene.

Frame Interpolation

Definition: The process of generating intermediate frames between two images to create smoother video motion, often used to enhance video playback.

Example: Frame interpolation allowed the slow-motion scenes to appear more fluid and natural.

G

Generative Audio

Definition: AI-powered creation of sound effects, music, or voice acting to complement video content.

Example: The post-production team used generative audio to create alien language vocalizations for the sci-fi film.

Generative Set Design

Definition: Using AI to create or modify virtual set designs based on textual descriptions or style references.

Example: The production designer used generative set design to rapidly iterate through different futuristic cityscape concepts.

Generative Production Design

Definition: AI systems that can create detailed, context-aware production designs, including set layouts, props, and environmental elements based on script analysis.

Example: The “SetFormer” AI generated complete 3D set designs for a sci-fi film, including futuristic props and architectural details, based solely on script descriptions.

Reference: “SetFormer: Transformer-based Generative Models for Production Design” (Disney Research, 2023) – [This reference is hypothetical and needs to be verified.]

Generative Adversarial Networks (GANs)

Definition: A class of AI where two neural networks compete—one generating content and the other evaluating it—to create highly realistic video content.

Example: Using GANs, the team was able to generate realistic landscapes for the film’s virtual set design.

I

Inpainting

Definition: An AI-powered technique for filling in missing or corrupted parts of an image or video, often used in video restoration.

Example: The inpainting process restored the damaged portions of the archival film, making it viewable again.

K

Keyframe Animation

Definition: A traditional animation technique enhanced by AI to automatically generate in-between frames based on key frames set by the user, making animation more efficient.

Example: AI-assisted keyframe animation reduced the time needed to animate the complex fight scenes.

L

Lip-Syncing

Definition: The AI-driven alignment of lip movements in video with audio tracks, ensuring that dialogue appears synchronized with the character’s speech.

Example: The animation team used AI lip-syncing to match the characters’ mouths with the new dialogue recording.

M

Match Cut

Definition: A cut that connects two different scenes by matching similar visual elements, compositions, or actions, creating a smooth transition between them.

Usage: The match cut from the setting sun to the hero’s torch lighting up in the cave seamlessly connected the two scenes while maintaining the visual theme.

Montage

Definition: A sequence of short shots edited together to condense time, space, or information, often used to show the passage of time or a series of events quickly.

Usage: The training montage in the film effectively showcased the character’s progress over several months in just a few minutes.

Motion Amplification

Definition: Enhancing subtle motions in video, like small vibrations, using AI for purposes such as scientific analysis or artistic effect.

Example: Motion amplification revealed the tiny, otherwise invisible, vibrations of the structure, adding an eerie atmosphere to the horror film.

Motion Transfer

Definition: The technique of transferring the motion patterns from one video to another, typically used in AI-generated choreography or character animations.

Example: Motion transfer was used to map the dancer’s movements onto a CGI character, bringing it to life on screen.

Multimodal AI

Definition: AI systems that integrate multiple types of data (e.g., text, audio, and video) to generate coherent content, useful in creating complex multimedia productions.

Example: The multimodal AI system combined visual, auditory, and textual inputs to create a fully immersive virtual reality experience.

Multimodal Video Understanding

Definition: AI systems that can interpret and analyze video content using multiple data types, such as visual information, audio, and text.

Example: The content moderation system used multimodal video understanding to flag inappropriate videos by analyzing visuals, speech, and on-screen text simultaneously.

Multi-View Synthesis

Definition: An AI technique for generating new viewpoints of a scene from a limited set of input images or video frames.

Example: Multi-view synthesis allowed the director to create a 360-degree view of the actor’s performance from just three camera angles.

N

Narrative Intelligence

Definition: AI algorithms designed to understand, generate, and manipulate narrative structures in film, including plot development, character arcs, and pacing.

Example: The “StoryGen” model successfully generated coherent feature-length screenplays by understanding and applying complex narrative principles.

Reference: “StoryGen: Advancing Narrative Intelligence in AI-Driven Screenwriting” (USC School of Cinematic Arts, 2023) – [This reference is hypothetical and needs to be verified.]

NeRF (Neural Radiance Fields)

Definition: A method for generating 3D scenes from 2D images using AI, often used for creating virtual environments or enhancing videos with realistic depth and perspective.

Example: The filmmakers used NeRF to create detailed 3D models of the set based on concept art and photographs.

Neural Film Language

Definition: A framework for using neural networks to understand and generate cinematic language, including shot compositions, transitions, and narrative structures.

Example: Researchers at MIT developed a neural film language model that could automatically generate shot lists and camera movements based on script input.

Reference: “Neural Film Language: A New Paradigm for Cinematic Storytelling” (MIT Media Lab, 2023) – [This reference is hypothetical and needs to be verified.]

Neural Editing Patterns

Definition: AI models that learn and apply complex editing techniques, including pacing, transitions, and montage structures, based on analysis of existing films.

Example: The “EditMind” system successfully emulated the distinct editing styles of renowned filmmakers, applying these patterns to raw footage to create stylistically coherent edits.

Reference: “EditMind: Learning and Applying Neural Editing Patterns for Automated Film Editing” (NYU Tisch School of the Arts, 2024) – [This reference is hypothetical and needs to be verified.]

Neural Rendering

Definition: The use of neural networks to generate or manipulate 3D graphics, often combining traditional computer graphics techniques with deep learning.

Example: Neural rendering enabled the creation of photorealistic 3D environments that could be manipulated in real-time during virtual production.

Neural Style Transfer

Definition: An AI technique that applies the artistic style of one image to another, preserving the content of the original image while adopting the visual characteristics of the style image.

Example: The filmmaker used neural style transfer to give the entire movie the look of a Van Gogh painting, creating a unique visual experience.

Neural Video Compression

Definition: Advanced video compression techniques that use neural networks to achieve higher compression ratios while maintaining visual quality.

Example: Neural video compression allowed the streaming service to deliver 4K content at half the usual bandwidth requirement.

Neural Voice Cloning

Definition: AI technology that can replicate a person’s voice, useful for dubbing, voice-overs, or creating synthetic dialogue.

Example: Neural voice cloning was used to generate additional lines for a character whose actor was unavailable for reshoots.

O

Omni-Editing

Definition: A comprehensive and interconnected editing process in film and video production where alterations made to any part of the script or timeline automatically propagate throughout the entire project. This method allows seamless modification of scenes, plot lines, or dialogue at any point, with changes reflected consistently across all related elements, ensuring coherence and continuity.

Example: The director employed Omni-Editing to ensure that every character’s dialogue was consistent throughout the film, despite last-minute script revisions.

P

Previous 7 Days

Definition: Typically refers to recent events, developments, or summaries within the last week, often used in the context of updates or reviews.

Example: The project manager provided a report on the previous 7 days of progress, highlighting key achievements and challenges.

Predictive Rendering

Definition: AI-driven technique that anticipates and pre-renders likely scene elements to reduce real-time rendering load in virtual production.

Example: Predictive rendering allowed the virtual production team to maintain high frame rates even in complex, dynamic environments.

S

Semantic Segmentation for Video

Definition: A technique that segments video frames into different parts based on their meaning, allowing AI to understand and manipulate different aspects of a video intelligently.

Example: Semantic segmentation allowed the visual effects team to isolate and edit specific elements within each frame.

Segmentation

Definition: The division of video frames or images into distinct regions or objects, allowing AI systems to analyze and manipulate specific parts of the content.

Usage: Segmentation was employed to isolate the background from the actors, making it easier to apply visual effects without affecting the characters.

Seed Continuity

Definition: The practice of maintaining consistent seed values in AI models to ensure reproducibility of results, particularly in procedural generation or simulations.

Usage: By ensuring seed continuity, the animators could regenerate the exact same landscape for different shots, maintaining visual consistency throughout the film.

Speech-to-Video

Definition: The process of generating video content that corresponds to spoken words using AI, allowing for the automatic creation of visual content from audio tracks.

Example: The documentary’s narration was turned into visual sequences using speech-to-video technology.

Style Transfer

Definition: The technique of applying the artistic style of one image or video onto another using neural networks, often utilized in video production for creating unique visual aesthetics.

Example: The director used style transfer to give the entire film a vintage look by applying the style of old film stock to the footage.

Super-Resolution

Definition: AI techniques used to upscale low-resolution videos into high-definition quality by predicting and enhancing pixel data.

Example: The old footage was enhanced using super-resolution techniques, resulting in a much clearer and sharper image.

Synthetic Actors

Definition: Fully AI-generated characters that can act and interact in a video, potentially replacing or augmenting human actors.

Example: Synthetic actors were used to populate the background scenes, reducing the need for extras on set.

Synthetic Data Generation

Definition: The creation of artificial data, including video footage, for training AI models or augmenting real datasets.

Example: The VFX team used synthetic data generation to create a diverse set of explosions for training their particle simulation AI.

Synthetic Media

Definition: Content generated by AI, including videos, images, and sounds, that mimics real-world media, often used in virtual production and creative projects.

Example: The film’s background environments were all created as synthetic media, reducing the need for physical sets.

Synthetic Cinematography

Definition: The use of AI to generate entire cinematic sequences, including camera movements, lighting, and blocking, without the need for physical cameras or sets.

Example: Researchers created a fully synthetic car chase sequence, with AI controlling virtual cameras, lighting, and even stunt choreography in a photorealistic 3D environment.

Reference: “SynthCine: Photorealistic Synthetic Cinematography Using Neural Rendering” (NVIDIA Research, 2023) – [This reference is hypothetical and needs to be verified.]

T

Temporal Action Localization

Definition: An AI task involving the identification and localization of specific actions within a longer video sequence.

Example: The editing software used temporal action localization to automatically find and compile all the car chase scenes in the raw footage.

Temporal Coherence

Definition: Ensuring that AI-generated frames in a video maintain consistency over time, critical for avoiding visual artifacts or jumps between frames.

Example: The AI model was fine-tuned to improve temporal coherence, ensuring the animation appeared fluid and natural.

Temporal Coherence Optimization

Definition: Techniques to ensure consistency and smoothness in AI-generated or edited video sequences across multiple frames.

Example: Temporal coherence optimization was applied to the AI-generated backgrounds to prevent flickering or sudden changes between frames.

Text-Guided Video Generation

Definition: A process involving the use of descriptive text prompts to generate or modify video content, enabling the creation of moving images from written inputs.

Example: Text-guided video generation was used to quickly visualize the director’s notes into rough storyboard sequences.

Text-to-Video

Definition: The process of generating video content directly from textual descriptions using AI, enabling automated video production from written scripts.

Example: The production team utilized text-to-video technology to create a rough draft of the storyboard before filming began.

Transformer Networks

Definition: AI architectures used for processing sequential data, such as video frames, excelling at understanding context and generating coherent video sequences.

Example: The transformer network model allowed for smoother transitions between scenes by better understanding the narrative flow.

Tweaking

Definition: The process of making small adjustments or fine-tuning various aspects of a film or video project, including visual effects, sound, or performance, to achieve the desired result.

Usage: After the first cut, the editor spent several days tweaking the transitions and color grading to perfect the film’s tone.

U

Upscaling

Definition: The process of increasing the resolution of video or images using AI techniques, which often results in higher quality and clarity.

Usage: The production team used upscaling technology to convert the original 1080p footage to 4K resolution, making it suitable for modern displays.

V

Unsupervised Learning for Video

Definition: AI techniques that can learn patterns and structures from video data without labeled training examples, useful for anomaly detection or content organization.

Example: The archival team used unsupervised learning to automatically categorize and tag thousands of hours of historical footage.

Versioning

Definition: The management of multiple iterations of a video project or AI model, allowing creators to track changes and revert to previous versions if needed.

Usage: The editing team used versioning to manage the different edits of the film, making it easy to compare and select the best version for the final cut.

Video Interpolation

Definition: The process of generating intermediate frames between existing frames in a video, often used to create slow-motion effects or increase frame rates.

Example: Video interpolation was applied to the fight scene, creating a smooth slow-motion effect from the original footage.

Video Inbetweening

Definition: The process of generating intermediate video sequences between two given video clips, useful for creating transitions or expanding short clips.

Example: Video inbetweening was used to smoothly transition between two disparate scenes in the dream sequence.

Video Matting

Definition: AI-enhanced techniques for separating foreground elements from the background in video footage, crucial for compositing and visual effects.

Example: Advanced video matting algorithms allowed for clean extraction of the actors from the green screen footage, even with complex hair and clothing.

Video Retargeting

Definition: The process of adapting video content to different aspect ratios or resolutions while preserving important visual information.

Example: Video retargeting allowed the film to be seamlessly adapted from a widescreen theatrical release to various mobile device formats.

Video Restoration

Definition: AI-powered techniques for improving the quality of degraded video footage, including denoising, colorization, and frame rate conversion.

Example: Video restoration breathed new life into century-old film reels, removing scratches and stabilizing the shaky footage.

Video Summarization

Definition: AI-driven techniques for creating concise summaries of longer videos by selecting the most important or representative frames or clips.

Example: Video summarization was employed to create a compelling two-minute trailer from over two hours of film footage.

Video-to-Video Translation

Definition: An AI technique that converts video from one domain to another, such as transforming daylight scenes to night scenes or changing weather conditions.

Example: Video-to-video translation allowed the production team to reshoot a sunny scene as a rainy one without returning to the location.

Virtual Cinematography

Definition: Using AI to simulate camera movements and angles that would traditionally require a physical setup, enabling innovative shot designs in digital environments.

Example: Virtual cinematography allowed the filmmakers to experiment with camera angles that would be impossible in a real-world setting.

Virtual Production

Definition: The use of AI and real-time rendering technology to create environments and scenes virtually, often used in place of physical sets in filmmaking.

Example: The movie’s complex landscapes were all generated through virtual production, eliminating the need for location shoots.

Voice Localization

Definition: The process of adapting and translating a character’s voice into different languages, allowing international audiences to hear the character speaking in their own language with vocals that match the original tone and style, rather than using a separate translator or interpreter.

Usage: The film studio implemented voice localization to ensure that audiences in different countries could experience the characters speaking in their native language, maintaining the original emotional impact and vocal nuances of the performance.

Volumetric Capture

Definition: A technique to create 3D representations of spaces, objects, or people for use in AI-driven video generation and virtual production.

Example: Volumetric capture was used to create a holographic version of the actor for the virtual set.

Total Number of Entries: 72
Note: Ver.1.26 mdg/gnr8.live

17 Aug 2024

MDG Blog Posts Permalink

MED Weekend Edition

MDG August 17, 2024

FaceTune.ai What you get when you combine the junk science of emotion recognition and ai music creation. Did not have the ven of music creation and human rights on my radar. (facetune)

AI Seinfeld was the peak of AI-generated content. It will never happen again. (minimaxir.com)

Unreasonably Effective AI with Demis Hassabis (youtube)

The AI Bubble: Will It Burst, and What Comes After? (g.marcus youtube)

Proof of humanity Personhood credentials: Artificial intelligence and the value of privacy-preserving tools to distinguish who is real online (arxiv)

Fine-tune FLUX.1 with your own images; fine-tuned a @bfl_ml‘s Flux.1-dev; fluxai pro; flux1ai.com

Photoshop erase IRL (see above)

raymond-pettibon-the-punk-years (auction)

Unreasonably Effective AI with Demis Hassabis

Artificial_aesthetics.chapter_2.pdf

dea2img.github.io

Hollywood icons of the past take new star turn, with celebrity estates cashing in on AI voice cloning deals (cnbc)

SAG-AFTRA Strikes Groundbreaking AI Digital Voice Replica Pact With Startup Firm Narrative (variety)

Seminal.one art copyright ecosystem

MIFARE Classic: exposing the static encrypted nonce variant

“I think my whole process is just pretty much that I walk around the city and I write in my notepad and think freely. Then I just go through my notes later and make them into my works. For the last 15 or so years, I’ve just pretty much done that,” Green said. “It’s sort of a museum of memory to walk around Manhattan.”

Elon Kanye edit (x)

Eric Schmidt says the quiet part out loud – parismarx Eric Schmidt took a break from selling AI to the military to tell a bunch of Stanford students to become entrepreneurs and if that required stealing a ton of content, they shouldn’t worry about it.When they succeed, their lawyers will clean up the mes

14 Aug 2024

MDG Blog Posts Permalink

Mediaeater Digest Vol 30, No. 227

MDG August 14, 2024

Inside the “3 Billion People” National Public Data Breach (troy hunt)

Script-kidding the Bitcoin blockchain (ccc.de)

CONCORD vs.ANTHROPIC PBC, Defendant (pdf)

A Ruling That Eliminates Important Privacy Rights in Many Stored Internet Contents—And The Legal Challenge to It (reason)

Artists Score Major Win in Copyright Case Against AI Art Generators (imdb)

Federal appeals court finds geofence warrants “categorically” unconstitutional (eff)

UMG artists and UMPG songwriters across Facebook, Instagram, Messenger, Horizon, Threads and WhatsApp (prnewsire)

Google vids (poorly named excellent presi tool) (google)

Very large context windows, agents and text-to-action,

chatgpt-4o-latest (lame naming conventions for 1000 alex) (openai)

ytch.xyz

Yesterdays Pixel announcements includes device vs carrier overt access to content. A big line crossed. Google Pixel phones have a few features that can provide call transcripts, including Call Notes, Call Screen, and Live Transcribe:

Call Notes- Sends a transcript and summary of a phone call to the user after the call ends

Call Screen – Allows users to screen calls before answering them. When a call is received, users can tap Screen call to have Google Assistant ask the caller who they are and why they are calling. The user will receive a real-time transcript of the caller’s response. Users can then choose a suggested response, pick up the call, or hang up. Users can also enable automatic call screening by opening the Phone app, tapping the three dots in the top right, then tapping Settings and Call Screen. From there, users can set a protection level to determine how Google Assistant screens calls. For example, Maximum protection will screen all unknown numbers, while Basic protection will only decline spam calls.

Live Transcribe – Transcribes phone calls in real-time in over 80 languages and dialects. Users can access Live Transcribe by opening the Settings app, tapping Accessibility, and then tapping Live Transcribe.

10 Aug 2024

MDG Blog Posts, Uncategorized Permalink

View of Liu Ding: Room of Boundlessness, Magician Space, 2024.

Mediaeater Digest Vol. 30, No. 223

MDG August 10, 2024

How Mastercard sells its ‘gold mine’ of transaction data (reddit)

AI Comes for Music (dancohen)

How to Know If You’re Living in a Doom Loop (honest-broker)

I Reviewed Restaurants for 12 Years. They’ve Changed, and Not for the Better (nyt)

A billionaire bought a lobster shack in the Hamptons. Then the trouble began (ft)

An incredible life and career (google)

Cancer incidence, 2021 (our world in data)

AI Generated Song “Zo Zomer” in in Dutch Top25-NL Charts (top40.nl)

Responding to the #defcon32 badge controversy (x)

10 Aug 2024

MDG Blog Posts Permalink

Information and Knowledge Yield Loss Across Transcoding Eras

MDG August 10, 2024

As someone who’s lived through the mind numbing pace of technological change, I’ve been fascinated by how we’ve transformed the way we store, share, and understand information. This part of technological change really haunts me: With each leap forward, we are leaving something important behind.

Through each “transcoding era”, aggregate intelligence information loss. At least three major ones, each has brought incredible advancements, but also introduced “yield loss” – a concept borrowed from manufacturing that I think applies perfectly to our data-driven world.

From Physical to Digital

The first era was all about turning physical stuff into digital format. Books became e-books, vinyl records turned into MP3s, and film reels transformed into digital video files. It was mind-blowing at the time. Suddenly, we could carry entire libraries in our pockets!

Many things were lost in translation. Remember the warmth of vinyl records? Or the smell of old books? Early digital tech couldn’t capture those sensory experiences. We gained convenience, sure, but at the cost of some of the richness that made those physical mediums special.

I sit now in a massive room filled with legacy media that i can pick up and touch, regard, play and read that did not make this first leap and likely never will.

The Internet Age

Cue: internet, Information at our fingertips, instant communication across the globe. But in our rush to make everything faster and more accessible, we started cutting corners.

To squeeze data through limited bandwidth, we compressed files until videos looked like pixelated messes and music lost its depth. We chopped up long-form content into bite-sized chunks for quick consumption – micro chunking it as we called it at the turn of the century. Further reduction.

At each turn, loss of information that did not make the cut for whatever reason, and a lessening and reduction in cases that did make it.

RIP Newsgroups, a valued shared source of organized information lost. (Forget about the Archie and Gopher and all the BBS’s)

Web 2.o emerged and in the M&A of that decade valuable two dot o companies and information was lost including, del.icio.us, o.g. services like Flickr, Tumblr et al. So past information that did not make the leap – there is a giant digital gap of aggregate knowledge sources

AI Takes the Wheel

Now we’re entering a new era, Artificial Intelligence is starting to infer, interpret, analyze, synthesize and even create information on its own.

Andrew Gray, a librarian at University College London, trawled through millions of papers searching for the overuse of words such as meticulous, intricate or commendable.

He determined that at least 60,000 papers involved the use of AI in 2023 — over one percent of the annual total. “For 2024 we are going to see very significantly increased numbers,” Gray told AFP.

Another reduction of source material, further loss of context and over all diminishment.

Perspective

I am of the singular generation who grew up BC (before computers) but thrived as an adult in the computational era (AC). We are the only humans who have this perspective and time is eroding our information and knowledge and digital is decaying it.

So, What Can We Do About It?

How do we develop systems that preserve the richness of human experience and does not leave off the long tail and expressly look for it to be included? How do we find ways to keep the convenience of digital formats without sacrificing quality? How do we include the most robust set of information that carries forward. What are we leaving behind?