this post was submitted on 14 Dec 2024

Technology

[–] [email protected] 0 points 1 week ago (13 children)

It’s not open source, but it is open weights, documented, and relatively permissively licensed, and all the inference/finetuning libraries for it are open source.

[–] MCasq_qsaCJ_234 2 points 1 week ago (12 children)

I understand, but Meta has the rights to Llama and at any time they can change that license to make it less open just to make more money.

Currently it is open weight to attract customers; once there are no competitors left, they will start to squeeze them.

[–] [email protected] 3 points 1 week ago (5 children)

Also, competition is stiff. Alibaba is currently handing their butts to them with Qwen 2.5. DeepSeek (a Chinese startup), Tencent and Mistral (French) are giving them a run for their money too, and there are even some that “continue train” their old weights.

[–] MCasq_qsaCJ_234 1 points 1 week ago (1 children)

And what are some examples of those who continue training old weights?

[–] [email protected] 1 points 1 week ago (1 children)

A small startup called Arcee AI actually “distilled” logits from several other models (Llama, Mistral) and used the data to continue training Qwen 2.5 14B (which itself is Apache 2.0). It’s called SuperNova-Medius, and it’s quite incredible for a 14B model… SOTA as far as I know, even with their meager GPU resources.
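
For anyone unfamiliar with the term, here is a rough sketch of what distilling from logits generally means: the student model is trained to match the teacher's output distribution over the vocabulary instead of only the single correct token. This is the generic textbook loss, not Arcee's actual pipeline (which isn't described here). The function names, the temperature value and the toy logits are all made up for illustration, and it assumes teacher and student share a vocabulary, which is not true out of the box for Llama and Qwen.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 flattens the distribution so small probabilities still carry signal.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL(teacher || student) for one token position (hypothetical helper).
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    # The T^2 factor is the usual convention to keep gradient scale comparable across temperatures.
    return kl * temperature ** 2

# Toy 3-token vocabulary: made-up numbers, not real model outputs.
teacher = [2.0, 1.0, 0.1]
close_student = [1.9, 1.1, 0.2]
far_student = [0.1, 1.0, 2.0]

print(distillation_loss(close_student, teacher))  # small loss
print(distillation_loss(far_student, teacher))    # noticeably larger loss
```

In real training this loss would be averaged over every token position in a batch and backpropagated through the student only; the teacher's logits are just precomputed data.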

A company called Upstage “expands” models to larger parameter counts by continuing to train them. Look up the SOLAR series.

And quite notably, Nvidia continued training Llama 3.1 70B and published the weights as Nemotron 70B. It was the best 70B model for a while, and may still be in some areas.

And some companies like Cohere continuously train the same model slowly and offer it over an API, but occasionally publish the weights as promotion.

[–] MCasq_qsaCJ_234 1 points 1 week ago (1 children)

The fact that there is AI with open source licenses is already a good thing, as is the competition. In my opinion, though, it is not enough, because oligopolies in this sector can still consolidate further.

Trying to prevent OpenAI from becoming a for-profit company seems to me to be a questionable tactic. It's as if Mozilla wanted to become a for-profit company in order to make Firefox more competitive with Chrome, and Google opposed the move.

[–] [email protected] 1 points 1 week ago (1 children)

Well, for one, I directly disagree with Altman’s fundamental proposition: they don’t need to “scale” AI so dramatically to make it better.

See: Qwen 2.5 from Alibaba, a fraction of the size, made with a tiny fraction of the H100 GPUs, and highly competitive (and mostly Apache licensed). And frankly, OpenAI is pointedly ignoring all sorts of open research that could make their models notably better or more power-efficient, even with the vast resources and prestige they have… they seem most interested in anticompetitive efforts to regulate competitors that would make them look bad, using the spectre of actual AGI (which has nothing to do with transformer LLMs) to scare people.

Even if they were doing it for the wrong reasons, I feel like Google would be right to oppose Mozilla axing the nonprofit division if they were somehow in a similar position to OpenAI. Claiming a mission of producing a better, safer browser would basically be lying through their teeth.

[–] MCasq_qsaCJ_234 1 points 1 week ago

OpenAI has different priorities: they want to achieve AGI, so they seek to explore the capabilities of AI rather than look at what the competition is doing in those directions in order to replicate and/or improve on it. They only optimize it to make their services faster and less resource-intensive.

Also, becoming a for-profit organization doesn't mean you eliminate your nonprofit division. The two parts separate and become independent, although the nonprofit ends up getting considerable funds from the funding offer received by the other part.

This is the case with Mastercard, whose nonprofit organization is one of the richest in the world. In that scenario, Mozilla would split into two entities: one would focus on making a profit and making Firefox more competitive, while the other would focus on what Mozilla currently does.
