NAIROBI, Kenya — Cyber safety firm Gen has unveiled an early preview of a new on-device deepfake detection system, developed in partnership with Intel, as fresh data points to a worrying trend: scam deepfakes are thriving in long-form online videos.
The prototype runs directly on users’ devices and analyses both audio and visual elements in real time, allowing it to detect manipulated voices and altered images without sending data to the cloud.
Gen says the approach significantly cuts detection time, improves privacy, and offers a more practical way to curb fraud driven by generative AI.
How the deepfake detection system works
Unlike traditional moderation tools that rely on centralised servers, Gen’s system operates locally on PCs and other consumer devices, scanning video content as it plays.
The tool separates analysis into two layers:
- Audio checks to detect cloned or synthetic voices
- Visual checks to identify manipulated or composited video frames
By running both checks simultaneously, the system can flag suspicious content mid-playback, even when it appears inside normal recommendation-driven video feeds.
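To make the two-layer design concrete, here is a minimal sketch of how such a pipeline might be structured. Gen and Intel have not published code or an API, so every name, the scoring scale, and the flagging threshold below are illustrative assumptions, not the companies' implementation.

```python
# Hypothetical sketch of a two-layer, on-device scan; none of these
# names or values come from Gen or Intel.
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple

@dataclass
class Flag:
    timestamp_s: float  # seconds into playback when the flag fired
    layer: str          # which layer fired: "audio" or "visual"
    score: float        # 0.0 = looks authentic, 1.0 = looks synthetic

def audio_score(chunk: bytes) -> float:
    """Stand-in for an on-device voice-cloning classifier."""
    return 0.0  # a real model would score the audio chunk here

def visual_score(frame: bytes) -> float:
    """Stand-in for an on-device frame-manipulation classifier."""
    return 0.0  # a real model would score the video frame here

THRESHOLD = 0.8  # assumed cutoff; actual tuning is not public

def scan(stream: Iterable[Tuple[float, bytes, bytes]]) -> Iterator[Flag]:
    """Score each (timestamp, video frame, audio chunk) as it plays.

    Everything runs locally, mirroring the no-cloud design described
    above, and flags can surface mid-playback.
    """
    for t, frame, chunk in stream:
        for layer, score in (("audio", audio_score(chunk)),
                             ("visual", visual_score(frame))):
            if score >= THRESHOLD:
                yield Flag(t, layer, score)
```

Because both checks run per frame-and-chunk pair rather than on the whole file, a verdict can arrive while the video is still playing, which is what would let such a system interrupt a scam segment that only appears minutes in.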
Gen and Intel say the technology is designed to function as core infrastructure, sitting beneath consumer security software and media platforms to provide continuous protection rather than one-off scans.
Alongside the technology preview, Gen released new data from its consumer security products showing that deepfake-enabled scams are far more common in extended viewing sessions than in short clips.
According to the company, longer videos give scammers time to build trust, gradually shifting from harmless content into persuasion and fraud.
Gen found that most intercepted deepfake scams appeared on platforms that support long-form video, led by YouTube, followed by Facebook and X.
These platforms often reach users through TVs and personal computers, where content feels more immersive and credible.
Crucially, the scams didn’t arrive as suspicious links or downloads. Instead, they were embedded within ordinary video streams, blending in until fraudulent segments surfaced minutes into playback.
Audio manipulation leads the deception
Gen’s analysis shows that synthetic or cloned voices were the dominant tactic, often paired with largely authentic video footage.
In many cases, real interviews or broadcasts were reused, with only the audio track replaced, creating an effect similar to dubbed foreign-language video. Visually, little appeared out of place, making the scam harder to detect.
The risk escalated sharply when manipulated videos introduced:
- Promises of unusually high financial returns
- Urgent countdowns or “limited-time” offers
- Instructions to move conversations or payments to unregulated channels
These elements typically appeared after a period of normal, trust-building content.
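As an illustration of how escalation cues like these could be surfaced, the toy heuristic below scans a transcript for the three signal types listed above. It is purely hypothetical: Gen's detection is model-based and on-device, and nothing here reflects its real logic, keyword lists, or thresholds.

```python
# Toy keyword heuristic for the escalation cues listed above;
# the patterns are invented examples, not Gen's detection rules.
import re

SIGNALS = {
    "high_returns": re.compile(r"guaranteed|double your|risk[- ]free", re.I),
    "urgency": re.compile(r"limited[- ]time|act now|last chance", re.I),
    "off_platform": re.compile(r"telegram|whatsapp|wallet address", re.I),
}

def risky_segments(transcript):
    """Return (timestamp, signal) pairs for lines matching a scam cue.

    `transcript` is a list of (timestamp_s, text) pairs, e.g. taken
    from subtitles. Late-playback hits match the pattern the article
    describes: trust-building first, persuasion after.
    """
    hits = []
    for t, text in transcript:
        for name, pattern in SIGNALS.items():
            if pattern.search(text):
                hits.append((t, name))
    return hits

# Example: cues surfacing ten minutes into an otherwise normal video.
demo = [(30.0, "Welcome back to the channel."),
        (610.0, "This limited-time offer has guaranteed returns."),
        (640.0, "Message me on Telegram to get started.")]
print(risky_segments(demo))
# [(610.0, 'high_returns'), (610.0, 'urgency'), (640.0, 'off_platform')]
```

A production system would of course pair signals like these with the audio and visual checks described earlier, since the wording alone proves nothing about whether the speaker is real.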
Scam tactics are evolving
Gen says the data highlights a shift away from crude, short deepfake clips toward longer, more polished narratives.
Fraudsters are now embedding synthetic speech into familiar formats such as investment explainers, tutorials, or talk-show-style interviews.
By mimicking trusted creators and public figures, scammers can exploit recommendation algorithms and reach viewers organically—without relying on obvious spam tactics.
Gen and Intel say development of the on-device system is ongoing, with plans to refine detection models and collaborate with hardware makers and major consumer platforms to scale protection against AI-driven fraud.