cross-posted from: https://lemmy.ca/post/37011397

!opensource@programming.dev

The popular open-source VLC video player was demonstrated on the floor of CES 2025 with automatic AI subtitling and translation, generated locally and offline in real time. Parent organization VideoLAN shared a video on Tuesday in which president Jean-Baptiste Kempf shows off the new feature, which uses open-source AI models to generate subtitles for videos in several languages.
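
VideoLAN hasn’t published implementation details in this post, but the general technique can be prototyped with the open-source Whisper model family. A minimal sketch in Python, assuming the openai-whisper package; this illustrates the idea, not VLC’s actual code:

```python
# pip install openai-whisper  (ffmpeg must be installed for decoding)
import whisper

def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def generate_srt(media_path: str, out_path: str) -> None:
    """Transcribe a local media file and write SRT subtitles, fully offline."""
    model = whisper.load_model("base")      # small multilingual model
    result = model.transcribe(media_path)   # language is auto-detected
    with open(out_path, "w", encoding="utf-8") as f:
        for i, seg in enumerate(result["segments"], start=1):
            f.write(f"{i}\n{srt_timestamp(seg['start'])} --> "
                    f"{srt_timestamp(seg['end'])}\n{seg['text'].strip()}\n\n")

generate_srt("movie.mkv", "movie.srt")  # hypothetical file names
```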

    • shyguyblue@lemmy.world · 1 year ago

      I was just thinking, this is exactly what AI should be used for. Pattern recognition, full stop.

      • snooggums@lemmy.world · 1 year ago

        Yup, and if it isn’t perfect, that’s OK as long as it’s close enough.

        Like getting name spellings wrong or mixing homophones is fine because it isn’t trying to be factually accurate.

        • vvv@programming.dev · 1 year ago

          I’d like to see this fix the most annoying part about subtitles: timing. Find a transcript or any existing subs on the Internet and have the AI align them with the audio properly.
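
          A rough sketch of that idea, assuming the open-source openai-whisper package plus Python’s difflib; a dedicated forced aligner would be more robust, but this shows the shape of it:

          ```python
          # pip install openai-whisper  (ffmpeg required for decoding)
          # Naive re-timing: match each transcript line to the closest
          # recognized segment and borrow that segment's timestamps.
          import difflib
          import whisper

          def align_transcript(media_path, transcript_lines):
              model = whisper.load_model("base")
              result = model.transcribe(media_path)
              recognized = [seg["text"].strip() for seg in result["segments"]]
              timed = []
              for line in transcript_lines:
                  best = difflib.get_close_matches(line, recognized,
                                                   n=1, cutoff=0.0)[0]
                  seg = result["segments"][recognized.index(best)]
                  timed.append((seg["start"], seg["end"], line))
              return timed
          ```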

        • TJA!@sh.itjust.works · 1 year ago

          Problem is that now people will say they don’t have to create accurate subtitles because VLC is doing the job for them.

          Accessibility might suffer from that, because all subtitles are now just “good enough”.

          • Railcar8095@lemm.ee · 1 year ago

            Or they can get OK ones with this tool and fix the errors. Might save a lot of time.

          • snooggums@lemmy.world · 1 year ago

            Regular old live-broadcast closed captioning is pretty much ‘good enough’, and that is the standard I’m comparing against.

            Actual subtitles created ahead of time should be perfect, because the creators have time to double-check.

          • TachyonTele@lemm.ee · 1 year ago

            I have a feeling that if you care enough about subtitles, you’re going to look for good ones instead of using “OK” AI subs.

          • shyguyblue@lemmy.world · edited · 1 year ago

            I imagine it would be not-exactly-simple-but-not-complicated to add a “threshold” feature: if the AI is less than X% certain, it can request human clarification (a rough sketch follows below).

            Edit: Derp. I forgot about the “real time” part. Still, as others have said, even a single botched word would still work well enough with context.
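
            A rough offline sketch of that threshold idea, using the per-segment confidence that the open-source openai-whisper package reports; the 80% cutoff is arbitrary:

            ```python
            # pip install openai-whisper
            import math
            import whisper

            CUTOFF = 0.80  # the "X%" above; purely illustrative

            def flag_uncertain_segments(media_path):
                model = whisper.load_model("base")
                result = model.transcribe(media_path)
                for seg in result["segments"]:
                    # avg_logprob is the mean token log-probability;
                    # exp() gives a rough per-token probability.
                    confidence = math.exp(seg["avg_logprob"])
                    if confidence < CUTOFF:
                        print(f"[review] {seg['start']:.1f}s:"
                              f" {seg['text'].strip()} (~{confidence:.0%})")
            ```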

  • m8052@lemmy.world · 1 year ago

    What’s important is that this is running on your machine locally, offline, without any cloud services. It runs directly inside the executable.

    YES, thank you JB

  • renzev@lemmy.world · 1 year ago

    This sounds like a great thing for deaf people and just in general, but I don’t think AI will ever replace anime fansub makers who have no problem throwing a wall of text on screen for a split second just to explain an obscure untranslatable pun.

  • Phoenixz@lemmy.ca · edited · 1 year ago

    As VLC is open source, can we expect this technology to also be available for, say, Jellyfin, so that I can once and for all have subtitles done right?

    Edit: I think it’s great that VLC has this, but this sounds like something many other apps could benefit from.

  • m-p{3}@lemmy.ca · 1 year ago

    Now I want some AR glasses that display subtitles above someone’s head when they talk, à la Cyberpunk, and that also auto-translate. Of course, it has to be done entirely locally.

    • Obi@sopuli.xyz · 1 year ago

      I guess we have most of the ingredients to make this happen. Software-wise we’re there; hardware-wise I’m still waiting for AR glasses I can replace my normal glasses with (which I wear 24/7 except for sleep). I’d accept having to carry a spare in a charging case and swap them out once a day or so, but other than that I want them to be close enough in weight and comfort to my regular glasses and just give me AR: overlaid GPS, notifications, etc. Instant translation with subtitles would be a function that I could see having a massive impact on civilization, tbh.

      • m-p{3}@lemmy.ca · 1 year ago

        I believe you can put prescription lenses in most AR glasses out there, but I suppose the battery is a concern…

        I’m in the same boat, I gotta wear my glasses 24/7.

      • vvv@programming.dev · 1 year ago

        I think we’re closer with hardware than software. The Xreal/Rokid category of HMDs are comfortable enough to wear all day, and I don’t mind a cable running from behind my ear, under a layer of clothes, to a phone or mini PC in my pocket. Unfortunately you still need to bring your own cameras to get the overlays appearing at the correct points in space, but cameras are cheap; I suspect these glasses will grow some cameras in the next couple of iterations.

  • TheRealKuni@lemmy.world · 1 year ago

    And yet they turned down having thumbnails for seeking because it would be too resource-intensive. 😐

  • VerPoilu@sopuli.xyz · 1 year ago

    I hope Mozilla can benefit from a good local translation engine that could come out of this as well.
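
    Fully local translation engines do exist already; Firefox’s built-in translation is based on the Bergamot engine, which runs on-device. As a minimal sketch of local translation, assuming the open-source argostranslate package:

    ```python
    # pip install argostranslate
    import argostranslate.package
    import argostranslate.translate

    # One-time model download; translation itself then runs fully offline.
    argostranslate.package.update_package_index()
    pkg = next(p for p in argostranslate.package.get_available_packages()
               if p.from_code == "fr" and p.to_code == "en")
    argostranslate.package.install_from_path(pkg.download())

    print(argostranslate.translate.translate("Bonjour tout le monde", "fr", "en"))
    ```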

  • SuperCub@sh.itjust.works · 1 year ago

    Haven’t watched the video yet, but it makes a lot of sense that you could train an AI using already-subtitled movies and their audio. There are times when official subtitles paraphrase the speech to make it easier to read quickly, so I wonder how that would work. There’s also just a lot of voice recognition everywhere nowadays, so maybe that’s all they need?
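
    For illustration, one way that kind of training data could be assembled is to parse an .srt file and cut out the matching audio spans. A minimal sketch, assuming the pydub package and hypothetical file names:

    ```python
    # pip install pydub  (needs ffmpeg for most audio formats)
    import re
    from pydub import AudioSegment

    TS = r"(\d+):(\d+):(\d+),(\d+)"                     # HH:MM:SS,mmm
    CUE = re.compile(TS + r" --> " + TS + r"\n(.*?)(?:\n\n|\Z)", re.S)

    def to_ms(h, m, s, ms):
        return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)

    def training_pairs(srt_path, audio_path):
        """Yield (audio_clip, subtitle_text) pairs, one per subtitle cue."""
        audio = AudioSegment.from_file(audio_path)
        srt = open(srt_path, encoding="utf-8").read()
        for m in CUE.finditer(srt):
            start, end = to_ms(*m.groups()[:4]), to_ms(*m.groups()[4:8])
            yield audio[start:end], m.group(9).replace("\n", " ").strip()

    pairs = list(training_pairs("movie.srt", "movie.wav"))
    ```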

  • EVERGREEN@lemmy.one · edited · 1 year ago

    Wonderful! Now we need descriptive audio for the visually impaired!