chebra

joined 6 years ago
[–] [email protected] 4 points 1 day ago (2 children)

@some_guy Or they know very well, but can't get a cut from it.

[–] [email protected] 20 points 3 days ago

@SurpriZe It most definitely does show something in Vietnam. I know, because I added them to the map. Btw Grab is contributing to OpenStreetMap and you can too. What did you find that was missing on the map? Your local cafe? Just put it there.

[–] [email protected] -2 points 5 days ago

@Untold1707 As opposed to the hardware requirements of Windows, which forces you to buy a new computer for every new Windows version just because?

[–] [email protected] 0 points 6 days ago (1 children)

@sag Jeeeeeesus now I'm scared to click it, what if it's really in TypeScript?

[–] [email protected] 2 points 2 weeks ago (1 children)

@ReakDuck Yup, and that's a much better avenue to fight against the AI companies, because fundamentally this is almost impossible to avoid in ML models. We should stop complaining about how they scraped copyrighted content; that complaint won't succeed until that legal loophole is removed. But when they reproduce copyrighted content, that could be fatal. And this also applies to Copilot reproducing GPL code samples, for example.

[–] [email protected] -1 points 2 weeks ago (1 children)

@dandi8 The license of Adobe Photoshop is not open-source because it specifically restricts reverse-engineering and modifications, and a lot of other things. The license of Mistral Nemo IS open-source, because it's Apache 2.0: you are free to use it, study it, redistribute it, ... Open-source doesn't say anything about giving you all the tools to re-create it, because that would mean they would need to give you the GPU time. "Open-source" simply means something different from what you think.

[–] [email protected] -1 points 2 weeks ago (3 children)

@dandi8 @marvelous_coyote

> E.g, Mistral Nemo can't be considered open source, because there is no Mistral Nemo without the training data set.

Right here - that's your logical conflict. By downloading the model file, you can run it, thereby you can "have Mistral Nemo" even without having the training data, contradicting your statement -> your statement is invalid.

[–] [email protected] -1 points 2 weeks ago* (last edited 2 weeks ago) (1 children)

@dandi8 I'm not changing the definition of open-source. And I'm not saying models are magic. Please take your strawmen back. You are the one saying that the dataset is source code, and you have no backing for this argument. I agree that the dataset is the "source for training", but that doesn't make it "source code" as per the open-source licenses. And the tools are not the compiler. Just because something was created from something else, that doesn't turn it into "source code".

[–] [email protected] -1 points 2 weeks ago (3 children)

@dandi8 Surprise surprise, LLMs are not classic compiled software, in case you haven't noticed yet. You can't just transfer the same notions between these two. That's like wondering why quantum physics doesn't work the same as agriculture.

Think of it as a database. If you have an open-source social network, all tools and code are published, free to use, but the value of the network is in the posts, the accounts, the people who keep coming back. The data in the database is not the source code.

[–] [email protected] -1 points 2 weeks ago (5 children)

@dandi8 But the proof is in your quote. Open source is a license which allows people to study the source code. The source code of a model is a bunch of float numbers, and you can study it as much as you want in Mixtral and others. Clearly a model can be published without the dataset (Mixtral), and also a model can be closed, hosted, unavailable for study (OpenAI). I think you need to find some argument showing how "source code" of a model = the dataset. It just isn't so.

[–] [email protected] -1 points 2 weeks ago (7 children)

@dandi8

> The training data set is a vital part of the source code because without it, the rest of it is useless.

This is simply false. The dataset is not the "source code" of a model. You need to delete this notion from your brain. A model is not the same as a compiled binary.

[–] [email protected] 2 points 2 weeks ago (1 children)

@flamingmongoose @cmnybo

> copyright free datasets like Wikipedia

🤦‍♂️

 

The federation between Mastodon and Lemmy is strange. If an M account wants to follow an L community, they need to follow an automated M account which represents the L community. But if any M post mentions that L community, the post will get boosted by the community's M account, so everybody who follows will get a notification. And I'm not sure if this can be moderated from the L side, because it seems like it never goes through L. For example: do you see this, @opensource? Does an L mod see this?
