Info-Sec: Adversarial Machine Learning

NIST Identifies Types of Cyberattacks That Manipulate Behavior of AI Systems

Adversarial Machine Learning

An AI system can malfunction if an adversary finds a way to confuse its decision making. In this example, errant markings on the road mislead a driverless car, potentially making it veer into oncoming traffic. This “evasion” attack is one of numerous adversarial tactics described in a new NIST publication intended to help outline the types of attacks we might expect along with approaches to mitigate them.

Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations (NIST.AI.100-2) is a 106-page PDF, part of NIST's Trustworthy AI initiative.

The report considers four major types of attacks: evasion, poisoning, privacy, and abuse. It also classifies them according to multiple criteria, such as the attacker's goals and objectives, capabilities, and knowledge.

Evasion attacks, which occur after an AI system is deployed, attempt to alter an input to change how the system responds to it. Examples would include adding markings to stop signs to make an autonomous vehicle misinterpret them as speed limit signs or creating confusing lane markings to make the vehicle veer off the road.
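To make the mechanics concrete, here is a minimal sketch of one classic gradient-based evasion technique, the fast gradient sign method (FGSM), in PyTorch. The `model`, input tensor `x`, and `true_label` are assumed placeholders rather than anything from the NIST report, and real-world attacks are considerably more elaborate.

```python
import torch
import torch.nn.functional as F

def fgsm_evasion(model, x, true_label, epsilon=0.03):
    """Nudge input x just enough that the classifier misreads it."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), true_label)
    loss.backward()
    # Step in the direction that increases the loss, bounded by epsilon
    # so the perturbation stays small (think: subtle markings on a sign).
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0, 1).detach()
```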

Poisoning attacks occur in the training phase by introducing corrupted data. An example would be slipping numerous instances of inappropriate language into conversation records, so that a chatbot interprets these instances as common enough parlance to use in its own customer interactions.
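A minimal sketch of that chatbot-poisoning idea, assuming an attacker who can write to the training corpus; the data shapes, the "acceptable" label, and the 5% injection rate are illustrative assumptions, not details from the report.

```python
import random

def poison_corpus(clean_pairs, toxic_phrases, rate=0.05):
    """Return a training set where a small fraction of examples teach the
    model that toxic phrases are ordinary, acceptable replies.

    clean_pairs: list of (utterance, label) tuples from real conversations.
    toxic_phrases: strings the attacker wants the model to normalize.
    """
    poisoned = list(clean_pairs)
    for _ in range(int(len(clean_pairs) * rate)):
        poisoned.append((random.choice(toxic_phrases), "acceptable"))
    random.shuffle(poisoned)  # hide the injections among real data
    return poisoned
```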

Privacy attacks, which occur during deployment, are attempts to learn sensitive information about the AI or the data it was trained on in order to misuse it. An adversary can ask a chatbot numerous legitimate questions, and then use the answers to reverse engineer the model so as to find its weak spots — or guess at its sources. Adding undesired examples to those online sources could make the AI behave inappropriately, and making the AI unlearn those specific undesired examples after the fact can be difficult.
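One flavor of this is model extraction, sketched minimally below: hammer a public prediction endpoint with queries, then fit a local surrogate to its answers that can be probed offline for weak spots. `query_victim` is a hypothetical stand-in for any deployed prediction API, and the sketch assumes its answers span at least two classes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def extract_surrogate(query_victim, n_queries=10_000, n_features=20):
    """Fit a local copy of a remote model from its query answers alone."""
    X = np.random.uniform(-1, 1, size=(n_queries, n_features))
    y = np.array([query_victim(x) for x in X])  # the victim's labels
    return LogisticRegression(max_iter=1000).fit(X, y)
```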

Abuse attacks involve the insertion of incorrect information into a source, such as a webpage or online document, that an AI then absorbs. Unlike the aforementioned poisoning attacks, abuse attacks attempt to give the AI incorrect pieces of information from a legitimate but compromised source to repurpose the AI system’s intended use.

 

Media Authenticity: Photography Secure Capture Technology, Synthetic Media Watermarking

A quick roundup of authentication and watermarking initiatives.

The quest for the real image: Leica camera sales are up 10x over the last decade, and the company has launched secure capture technology in its flagship M11-P, the world's first production camera to certify the source of images through the Content Credentials standard. There are both consumer-driven trends toward authenticity and standards-driven initiatives moving forward.

Thomson Reuters, Canon, and Starling Lab, an academic research lab based at Stanford and USC, announced over the summer the completion of a pilot program demonstrating how news organizations could certify the authenticity of an image and provide legitimacy.

Nikkei Asia reports that digital signatures will contain information such as the date, time, location, and photographer of an image and will be resistant to tampering.

The digital signatures now share a global standard used by Nikon, Sony, and Canon; Japanese companies control around 90% of the global camera market.
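For a sense of what such a signature involves, here is a toy sketch of in-camera signing, loosely in the spirit of the Content Credentials approach. The key handling, manifest fields, and choice of Ed25519 are simplifying assumptions; the real C2PA manifest format is far richer.

```python
import json, hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

camera_key = Ed25519PrivateKey.generate()  # provisioned inside the camera in practice

def sign_capture(image_bytes, metadata):
    """Bind image pixels and capture metadata together under one signature."""
    manifest = json.dumps({
        "image_sha256": hashlib.sha256(image_bytes).hexdigest(),
        **metadata,  # e.g. date, time, location, photographer
    }, sort_keys=True).encode()
    return manifest, camera_key.sign(manifest)

manifest, sig = sign_capture(b"...raw sensor data...",
                             {"photographer": "Jane Doe", "date": "2024-01-03"})
# Anyone holding the camera's public key can detect tampering:
camera_key.public_key().verify(sig, manifest)  # raises InvalidSignature if altered
```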

– Canon is releasing an image management app that can tell whether images were taken by humans
– Google has released SynthID, a tool for identifying AI-generated images

NIST’s Responsibilities Under the October 30, 2023 Executive Order include:

– Authenticating content and tracking its provenance
– Labeling synthetic content (e.g., watermarking; a toy sketch follows this list)
– Detecting synthetic content
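On the watermarking point, here is a deliberately naive illustration of the idea: hide a bit pattern in the least-significant bits of pixel values, then check for it later. Production schemes such as Google's SynthID are vastly more robust to cropping and re-encoding; this sketch and its bit pattern are purely hypothetical.

```python
import numpy as np

MARK = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)  # hypothetical tag

def embed(pixels: np.ndarray) -> np.ndarray:
    """Write the tag into the least-significant bits of the first pixels."""
    flat = pixels.flatten()  # flatten() copies, so the original is untouched
    flat[:MARK.size] = (flat[:MARK.size] & ~np.uint8(1)) | MARK
    return flat.reshape(pixels.shape)

def detect(pixels: np.ndarray) -> bool:
    return np.array_equal(pixels.flatten()[:MARK.size] & 1, MARK)

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
assert detect(embed(img))
```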

 

Unpacking the AI Shift: Transformative vs Evolutionary

AI: Transformative vs Evolutionary

In the world of artificial intelligence (AI), including machine learning (ML), large language models like GPT, and image generation models such as SDXL and DALL-E, we've witnessed a remarkable evolution. What was once the stuff of science fiction has become an integral part of our daily lives, affecting various aspects of society.

The question that arises is whether AI represents a truly transformative force capable of reshaping entire industries and societal structures, or if it’s primarily an evolutionary process, gradually enhancing existing systems. Perhaps it’s a blend of both.

Given this perspective, the practical application of generative AI at the enterprise level still needs to be proven. Issues like the propensity to provide inaccurate information and the ability to generate plausible but completely false answers are problematic in an enterprise environment. You can't have the key mechanisms driving economies, productivity, and profit tripping out when they hit a bug.

My take is that enterprise-level AI will not happen at scale in 2024, and I don't expect to see real bottom-line impact until Q1 2025, as it gradually overcomes current practical challenges to provide real utility at enterprise scale.

Make no mistake, even with my “it’s going to take-a-second” lens, valuable use cases will emerge and make a big impact, disrupting industries and creating unforeseen opportunities.

Right now, examples span call centers, protein-folding insights, load balancing in production, energy, traffic, and streaming, marketing campaign optimization, and medical differentials. These are steps toward a transformative shift, and true application is the likely outcome.

Feature or Glitch

Inaccuracies in language models are fine when you are creating cool visuals, not so much in the spreadsheets that bottom lines are based on. The glitch is the feature (again).

A great example of this phenomenon is the snowy static that appeared on televisions when there was no clear signal or when VHS tapes ended; it went on to become a key effect across movies and video programming for the next two decades.

With that thinking in hand, let's look at the media landscape and the multi-modal future. These media-to-media models weave vector graphics, video, audio, and text into a tapestry of creative possibilities. The "glitch" becomes a creative brushstroke, allowing for the crafting of immersive experiences, artistic expressions, and content frankly unimagined at this point.

The barrier between producer and viewer or consumer disappeared with TikTok and social media. The tooling to create these 'environments', which allows creators to shape (not edit!) media, is a key emergent capability. The string of companies tackling this has grown from a few to a dozen in the last six months.

Media: Scale and Value

On the same timeline, scale becomes the enemy of traditional media; having more legacy assets downstream from M&A becomes a secular problem. The linear-network math of scale is dead: the value of cable networks over the past 30 years has followed a trajectory of rapid growth, peak, and then gradual decline.

Ad revenue from network and local affiliates and amassing a large channel lineup do not increase value as they once did, although it's still a strong business.

There is a larger tale to be told here about why in the world companies that were producers of content thought it was a good idea to become platforms and services, inheriting all that goes with it: a big bag of hurt and expenses to attract viewers, maintain the platform, and on and on. My head hurts. All delivered through a wire when they already had cable plus OTA. The sirens' call to the rocks and shiny-object syndrome come to mind here.

There does not need to be the current (checks notebook) somewhere between 150 and 200 solely video-focused SVOD services in the US. The economic burden of owning both parts of the vertical is significant and not tenable for most big players. The number is likely to end up under five, and more likely under three.

Disney may be the only player that has made a successful transition from legacy to digital. TL;DR: discounted M&A ahead, smaller production budgets, fewer shows, and fewer chances for hit franchises or for sticking with shows until they become hits.

So the familiar broadcast-and-cable scale has collapsed with the rise of digital media, streaming services, and personalized content, aka TikTok and YouTube. The relationship between channels and viewers has become more complex, and the value of each legacy channel has diminished.

At the same time, the gap between producer and viewer has also collapsed, as evidenced by Insta + TikTok creator economies. These producers are about to get some new tools.

The Year Ahead

I fully expect one AI breakout of 2024 will be a visual-forward AI offering. A program, show, channel, a persona, a step towards the Max Headroom media future we all need. The scale and costs to do this will help define the space.

The economics will not be legacy; the audience, not weaned on legacy media but social media, makes this transition feel natural to them. The scale happens when display is solved for in a creative manner, providing an organizing principle that can move the whole space further. That will take a keen curatorial eye and a lot less financial resources to advance.

While there are a dozen players making tooling, the best of them, a small handful, are still far from meaningful utility, but that will come. AI right now is the worst it will ever be, and it continues to evolve. At the moment I can't even look at synthetic output, which has settled into a style that's not satisfying. That will change, and with that, so will everything.

Public Domain Day 2024

Public Domain Day 2024

Films that enter the public domain today include:

Steamboat Willie and Plane Crazy (the silent version) (directed by Walt Disney and Ub Iwerks), now available as a dataset for training
The Cameraman (directed by Edward Sedgwick and Buster Keaton)
Lights of New York (directed by Bryan Foy; billed as “the first ‘all-talking’ picture”)
The Circus (directed by Charlie Chaplin)
The Passion of Joan of Arc (directed by Carl Theodor Dreyer)
The Singing Fool (directed by Lloyd Bacon; follow-up to The Jazz Singer)
Speedy (directed by Ted Wilde; Harold Lloyd’s last silent theatrical release)
In Old Arizona (“100% all talking” film featuring singing cowboys)
The Man Who Laughs (directed by Paul Leni; features a character who inspired the appearance of the Joker from Batman)
Should Married Men Go Home? (directed by Leo McCarey and James Parrott; the first Laurel and Hardy film to bill them as a team)
The Wind (directed by Victor Sjöström)
The Wedding March (directed by Erich von Stroheim)
The Crowd (directed by King Vidor) – NYC scene (colorized and upscaled)
The Last Command (directed by Josef von Sternberg; Emil Jannings won the first Academy Award for Best Actor)
Street Angel (directed by Frank Borzage; Janet Gaynor won the first Academy Award for Best Actress)

Duke Law's Center for the Study of the Public Domain has the full list of works.

 

Unpacking the AI Shift: Copyright, Disinformation, and the Future of News

The subversion of trust creates significant risk

2024 promises to unpack some implications of the AI shift, with realizations and understandings about its impact suddenly coming to light. (That just happened!)

As we begin to understand the composition of ‘training data’ and its downstream effects, issues arise.

The New York Times' copyright infringement claim over AI training and AI-generated content, and the ensuing discussions, signal a growing awareness of intellectual property (IP) concerns. As more instances of AI-generated IP infringement come to light, legal consequences for other users are likely.

The New York Times is the largest proprietary source of content in Common Crawl, a dataset used to train GPT.

The sheer volume of data required for AI training presents a challenge. While readily available data like social media posts was easily ingested (GIGO), access to high-quality, curated data, like that of The New York Times, is crucial for producing reliable and ethical AI outputs.

The recent demonstration of 'substantial similarity' between New York Times articles and ChatGPT's outputs underscores the issue of AI-generated copyright infringement, moving it beyond theoretical concern.
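For the curious, surfacing that kind of verbatim overlap is mechanically simple; below is a minimal sketch using Python's difflib. The two strings are placeholders, not actual NYT text or model output.

```python
from difflib import SequenceMatcher

def longest_shared_run(article: str, model_output: str) -> str:
    """Return the longest verbatim passage the two texts share."""
    m = SequenceMatcher(None, article, model_output, autojunk=False)
    match = m.find_longest_match(0, len(article), 0, len(model_output))
    return article[match.a : match.a + match.size]

run = longest_shared_run("placeholder original article text goes here",
                         "output that repeats the original article text verbatim")
print(f"Longest verbatim run: {len(run)} characters: {run!r}")
```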

On social media, the loudest voices against this lawsuit's implications are the people who stand to benefit the most as the technology moves unchecked into mass adoption.

What is clear to me is that AI systems like DALL-E and ChatGPT, trained on undisclosed materials including copyrighted works, create content without indicating potential infringement.

The complexity and unpredictability of the tool's output shift greater responsibility to the company.

Beyond copyright concerns, AI’s ability to generate false information attributed to real sources poses a significant threat. The ease with which AI can create fabricated articles, mimicking the style and tone of trusted publications like The New York Times, raises serious concerns about disinformation (at a large scale). The ability to leverage copyrighted materials for disinformation generation further amplifies the potential harm.

The year ahead of unpacking LLMs requires us to confront these challenges. By scrutinizing these complexities, we can lay the groundwork for new norms of responsible data use, prioritizing transparency, ethics, and user accountability, to ensure that trustworthy AI-generated content shapes the future, not fabricated narratives woven from stolen fabric.

Not your model, not your property.

Privacy: EU Artificial Intelligence Act

Artificial Intelligence Act :: Banned applications

 

Recognizing the potential threat to citizens’ rights and democracy posed by certain applications of AI, the co-legislators agreed to prohibit:

  • Biometric categorization systems that use sensitive characteristics (e.g. political, religious, philosophical beliefs, sexual orientation, race);
  • Untargeted scraping of facial images from the internet or CCTV footage to create facial recognition databases;
  • Emotion recognition in the workplace and educational institutions;
  • Social scoring based on social behavior or personal characteristics;
  • AI systems that manipulate human behavior to circumvent their free will;
  • AI used to exploit the vulnerabilities of people (due to their age, disability, social or economic situation).

Synthetic Media: AI SAG-AFTRA Deal Points

From this document: /AIFAQs.pdf

Big Picture Questions

AI Ban on Projects: AI not completely banned; focus on setting guardrails around its use.
Training AI on Work: Challenges in outright banning AI training on actors’ work due to legal complexities.
Consent for Digital Replicas: Producers must obtain explicit, informed consent for creating and using digital replicas.
Specificity in Consent: Legal requirements for clarity and specificity in consent provisions.

Consent

Refusal to Hire Without Consent: Producers can refuse to hire if consent for digital replica creation is not given.
Clear and Conspicuous Consent:  Legal standards for clear and conspicuous consent, including separate signing.
New Consent for Each Use: Requirement for separate consent for each specific use of digital replicas.
Informed Consent After Death: Provisions for post-mortem use of digital replicas and union’s role in consent process.

Principal Performers

Ownership of Digital Replicas: Legal ownership by employers with consent required for use.
Digital Replicas for Voice Actors: Inclusive provisions for voice actors in digital replica terms.
Difference Between Replica Types: Distinction between Employment-Based and Independently Created Digital Replicas.
Protections Against Unconsented Use: New contractual provisions protecting against unconsented use of digital replicas.

Background Actors

Ownership and Use of Background Actor Digital Replicas: Similar ownership and consent requirements as principal performers.
Compensation for Scanning and Use: Guidelines on compensation for scanning and use in projects.

Generative AI and Synthetic Assets

Use of Generative AI and Synthetic Performers: Contract language requiring notification and bargaining for the use of synthetic performers.

Protections Against Replacement by AI: Measures to prevent wholesale replacement of actors with AI-generated performers.

Miscellaneous

Handling Bankruptcy and Asset Transfer: Procedures for handling digital replicas in case of company bankruptcy.
Tracking and Enforcement of Digital Replica Terms: Strategies for monitoring and enforcing terms related to digital replicas.
Impact on Future Contract Negotiations: Implications of these terms on negotiations for other contracts.
Use of Digital Replicas During Strikes: Guidelines on the use of digital replicas during work stoppages or strikes.
Tracking Technology for Digital Replicas: Exploration of tracking technologies for monitoring the use of digital replicas.

Navigating the Digital Maze: Dark Patterns, Algorithm Ethics, Facial Recognition and the FTC

This digital age of ours has consumers facing a deluge of potential harms, ranging from privacy invasions to deceptive practices. Being online is like a dance of a thousand cuts.

While Congress has been paralyzed crafting legislation to address these rapidly evolving harms, the Federal Trade Commission (FTC) has emerged as a key player in enforcing accountability.

As one of the rare mechanisms capable of responding swiftly to digital malpractice, the FTC's enforcement actions against dark patterns, unethical use of algorithms, and privacy breaches have become crucial in safeguarding consumer rights. The absence of comprehensive congressional intervention is at this point criminal and has allowed harms at both a national and a personal level. I took the time to extract some key points from recent enforcement actions.

FTC’s Crackdown on Dark Patterns

Definition of Dark Patterns: Dark patterns are deceptive design tactics used in websites and apps that trick or manipulate users into making unintended decisions, often resulting in unwanted subscriptions, purchases, or loss of privacy. These patterns can take various forms, such as confusing navigation, hidden costs, misleading wording, or bait-and-switch techniques.

Dark Pattern Characteristics
– Misleading Navigation: Design elements that intentionally confuse or mislead users.
– Hidden Costs: Concealing extra charges or subscriptions.
– Bait-and-Switch: Promising one thing but delivering another.
– Privacy Intrusion: Coercing users to surrender more personal data than necessary.

List of FTC Enforcement Cases on Dark Patterns
– Vonage: A $100 million settlement for consumers misled by dark patterns into unwanted service commitments.
– Credit Karma: Action taken for using dark patterns to mislead consumers about credit card pre-approvals.
– WW International: Demanded deletion of algorithmic systems developed from unlawfully obtained data.
– Everalbum, Inc.: Required the deletion of a facial recognition algorithm developed through deceptive practices.

The Everalbum case prompted a deeper dive into the FTC's guidelines on the ethical use of facial recognition technology (FRT). The FTC recommends that companies using FRT prioritize consumer privacy, develop secure data practices, and ensure consumer awareness and consent. The FTC insists on explicit consumer consent before using consumer images or biometric data in ways not initially represented, and before identifying anonymous images of a consumer.

Navigating the Digital Maze: The Weaponization of Audio and Video

The world of audio and visual (A/V) media has experienced a profound transformation, evolving from a realm of entertainment and delight into one weaponized for manipulation and fear.

This evolution has given rise to what I term the "Dark A/V Era," characterized by the growing exploitation of these mediums to disseminate misinformation, incite violence, and exploit vulnerabilities: a loss of trust in what we hear and see.

The Rise of Audio Manipulation

Audio technology, once celebrated for its power to connect and entertain, has taken a turn to the dark side. Synthetic voices, eerily accurate in mimicking human speech, have opened a Pandora's box of deceptive dark arts.

Advanced algorithms are being exploited in "voice scams," where unsuspecting individuals receive calls from voices indistinguishable from those of their loved ones ("Hi Mom, it's me, I'm in jail and need you to send $10k") or trusted authorities, undermining the integrity of communication and posing a significant threat to personal trust and safety. As these technologies become more sophisticated, they represent a disturbing evolution in the landscape of digital fraud, challenging our ability to discern the reality of what we are hearing.

Video’s Manipulative Grip

Likewise, video, once a source of amusement and delight, is now distorted into a tool for manipulation. Videos are meticulously crafted to change hearts and minds across political landscapes, deceiving viewers unable to discern manipulation or unpracticed in new habits of visual consumption. Deepfake technology, enabling the realistic manipulation of video footage, has exacerbated this trend, blurring the line between fact and fiction. These tools are not being used for joy; they are being leveraged to evoke fear, anger, and hatred, often with the intention of inciting violence or promoting extremism.

Watermarks: An Insufficient Shield

While watermarks are intended to safeguard copyright and ownership, they fall short in addressing the root cause of this issue. The capacity to create and manipulate A/V content resides not solely in the tools themselves but in the intentions of those who wield them. The addition of a watermark alone does not deter malicious actors from exploiting these technologies for nefarious purposes. More importantly, it does not prevent viewing, which happens at lightning speed globally. Watermarks are mostly ignored, or noticed after the fact by the few who care. The social networks have made it clear they are not going to be the solution or the voice of reason, allowing manipulated media to exist and be promoted on their networks, often gaining reach through the networks' own ranking algorithms.

Navigating the Dark A/V Era

To navigate the Dark A/V Era effectively, a holistic approach is imperative. This involves addressing the underlying motivations behind the creation and dissemination of harmful content, fostering digital literacy and critical thinking skills, and formulating ethical guidelines for the use of A/V technologies. 

A Collective Effort

Addressing the weaponization of audio and video necessitates a collective effort involving individuals, governments, and technology companies. None of which will happen in our lifetimes. That makes it all the more important for individuals to acquire the skills to identify and resist manipulation, because the hope that governments will enact policies regulating technology usage, and that technology companies will develop and adhere to ethical guidelines for their platforms, is unlikely to be realized.