FaceDeer

FaceDeer@fedia.io · 8 hours ago

I recall reading once upon a time that the original idea for this exemption was that it was for literal scholars: a few hundred priestly, intellectual sorts who were serious, full-time professional Torah students. But the exemption didn’t list any specific criteria for what that meant, so the ultra-orthodox all wound up saying “yeah, I study the Torah all day too, so I qualify.”

FaceDeer@fedia.io · 13 hours ago

Especially because seeing the same information in different contexts helps mapping the links between the different contexts and helps dispel incorrect assumptions.

Yes, but this is exactly the point of deduplication: you don’t want identical inputs, you want variety. If you want the AI to understand the concept of cats, you don’t keep showing it the same picture of a cat over and over; all that tells it is that you want exactly that picture. You show it a whole bunch of different pictures whose only commonality is that there’s a cat in each of them, and then the AI can figure out what “cat” means.
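To make the deduplication idea concrete, here is a minimal sketch in Python. It only drops exact repeats after light normalization, using short text captions as stand-ins for training samples. The `normalize` and `deduplicate` helpers and the sample captions are made up for illustration; real training pipelines typically also do near-duplicate detection, which this does not attempt.

```python
import hashlib


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants hash the same."""
    return " ".join(text.lower().split())


def deduplicate(samples):
    """Return samples with exact (post-normalization) repeats removed.

    Keeps the first occurrence of each sample and preserves input order.
    """
    seen = set()
    unique = []
    for sample in samples:
        digest = hashlib.sha256(normalize(sample).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(sample)
    return unique


# Hypothetical captions: the same sofa cat shows up three times.
captions = [
    "A cat sitting on a sofa",
    "a cat  sitting on a sofa",
    "A tabby cat chasing a laser pointer",
    "A cat sitting on a sofa",
    "A kitten asleep in a cardboard box",
]

unique_captions = deduplicate(captions)
print(len(captions), "->", len(unique_captions))
```

What survives is three different cats rather than one cat three times, which is the variety the training set is supposed to have.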

They need to fundamentally change big parts of how learning happens and how the algorithm learns to fix this conflict.

Why do you think this?

FaceDeer@fedia.io · 1 day ago

There actually isn’t a downside to de-duplicating data sets; overfitting is simply a flaw. Generative models aren’t supposed to “memorize” stuff. If you really want a copy of an existing picture, there are far easier and more reliable ways to accomplish that than giant GPU server farms. These models don’t derive any benefit from drilling on the same subset of data over and over. It makes them less creative.

I want to normalize the notion that copyright isn’t an all-powerful fundamental law of physics like so many people seem to assume these days, and if I can get big companies like Meta to throw their resources behind me in that argument then all the better.

FaceDeer@fedia.io · edit-2 1 day ago

Remember when piracy communities thought that the media companies were wrong to sue switch manufacturers because of that?

It baffles me that there’s such an anti-AI sentiment going around that it would cause even folks here to go “you know, maybe those litigious copyright cartels had the right idea after all.”

We should be cheering that we’ve got Meta on the side of fair use for once.

look up sample recover attacks.

Look up “overfitting.” It’s a flaw in generative AI training that modern AI trainers have done a great deal to resolve, and even in the cases of overfitting it’s not all of the training data that gets “memorized.” Only the stuff that got hammered into the AI thousands of times in error.

FaceDeer@fedia.io · 2 days ago

You get out ahead of the locomotive knowing that most of the directions you go aren’t going to pan out. The point is that the guy who happens to pick correctly will win big by getting out there first. Nothing wrong with making the attempt and getting it wrong, as long as you factored that risk in (as McDonald’s seems to have done, given that this hasn’t harmed them).

FaceDeer@fedia.io · edit-2 2 days ago

Training an AI does not involve copying anything, so why would you think that fair use is even a factor here? It’s outside of copyright altogether. You can’t copyright concepts.

Downloading pirated books to your computer does involve copyright violation, sure, but it’s a violation by the uploader. And look at what community we’re in, are we going to get all high and mighty about that?

FaceDeer@fedia.io · 3 days ago

Training an AI on something doesn’t involve copying it.

FaceDeer@fedia.io · 3 days ago

And under copyleft licensing, they’re allowed to do that. Both to GitHub repositories and Wikipedia.

FaceDeer@fedia.io · 3 days ago

Why would that matter? You can fork such projects too.

FaceDeer@fedia.io · 4 days ago

If you want to argue that Lemmy doesn’t represent users at large, or that the people complaining about AI are a loud minority, go for it.

Yes, that’s exactly what I’m doing. Though specifically this community, not Lemmy as a whole (I’m not a Lemmy user myself for that matter).

FaceDeer@fedia.io · 4 days ago

Of course it is! We are simultaneously facing a labor shortage and mass unemployment. The important thing is to keep being angry and frightened; the specific subject you’re angry about at any given time is flexible.

FaceDeer@fedia.io · 4 days ago

You made an assertion about what end users want. I’m an end user and my desires are not the same as your desires.

But if the sentiment is that common, maybe there’s something to it.

Or maybe it’s just a common fallacy. Like argumentum ad populum.

FaceDeer@fedia.io · 4 days ago

FTX was a cryptocurrency exchange. How is that remotely similar to NVIDIA?

FaceDeer@fedia.io · 4 days ago

Can you remind me how those technologies are related, other than the mere accusation of them being “buzzwords”?

Cryptocurrency is actually doing fine, BTW. Just because you don’t find it useful doesn’t mean it’s not useful to other people.

FaceDeer@fedia.io · 4 days ago

I am an end user and I find it quite handy for a number of applications.

The reasoning “I don’t find it useful and therefore nobody finds it useful” is common in these sorts of threads.

FaceDeer@fedia.io · 4 days ago

How long does AI need to be used, and how much demand needs to be sustained, for it to stop being called a “buzzword”? I’m a little dubious that NVIDIA became literally the most highly-valued company on Earth off the back of a mere “buzzword.”

FaceDeer@fedia.io · 4 days ago

Why not both? A large project like this needs to fix bugs and also continue to refine its features for long term relevance.

FaceDeer@fedia.io · 4 days ago

A car window is a lot easier to shatter than a fighter jet canopy.

FaceDeer@fedia.io · 4 days ago

Or, climb into the front seat and open the front door.