this post was submitted on 26 Aug 2024
228 points (99.1% liked)

The super privacy-focused third-party ROM, GrapheneOS now officially supports the Google Pixel 9, 9 Pro, and 9 Pro XL.

[–] possiblylinux127 0 points 1 month ago (1 children)

Ollama isn't a model. It is software that lets you run LLMs locally and query them at layer 7 (over an HTTP API).
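To make the "query at layer 7" point concrete, here is a minimal sketch of talking to a locally running Ollama server over HTTP. It assumes the default local address (`localhost:11434`) and the `/api/generate` endpoint; the model name `llama3` and the helper names are placeholders for illustration, not anything from this thread.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # "llama3" is a placeholder; use whichever model you have pulled.
    print(generate("llama3", "What is GrapheneOS?"))
```

The model itself is whatever you have pulled; Ollama is only the server sitting in front of it.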

[–] [email protected] -1 points 1 month ago (1 children)

I didn't say it was a model. I said it doesn't even do what Gemini Nano does. Gemini Nano is not an LLM.

[–] [email protected] 0 points 1 month ago (1 children)

Our most efficient model for on-device tasks

Gemini Nano is an LLM.

[–] [email protected] 0 points 1 month ago (2 children)

Did you even look at that? Your own link disproves your claim. It's just a general AI model that powers a variety of tasks, and is integrated into apps.

[–] possiblylinux127 1 points 1 month ago (1 children)

What else would it be except an LLM? What do you think "model" means?

[–] [email protected] 0 points 1 month ago (1 children)

...what do you think LLM means?

[–] possiblylinux127 1 points 1 month ago (1 children)
[–] [email protected] 0 points 1 month ago (1 children)

Large language model.

You are aware AI is used for more than just reading and generating text?

[–] [email protected] 0 points 1 month ago* (last edited 1 month ago) (1 children)

You are aware that those are often called LMMs (Large Multimodal Models)? One of the modes that makes them multimodal is language. All LMMs are or contain an LLM.

[–] [email protected] 0 points 1 month ago* (last edited 1 month ago) (1 children)

LLMs are not called LMMs, they're called LLMs LOL

But thank you for moving the goalposts and making it clear you don't know what you're talking about and have no interest in an honest discussion. Goodbye.

[–] possiblylinux127 1 points 1 month ago (1 children)

https://github.com/haotian-liu/LLaVA

I don't think Google actually uses LLaVA, but the concept is the same. The data gets converted into text for the model to process.

[–] [email protected] 1 points 1 month ago (1 children)

How do you convert text to images?

[–] possiblylinux127 1 points 1 month ago* (last edited 1 month ago)

It's complicated and far over my head mathematically.

https://arxiv.org/abs/2304.08485

Instruction tuning large language models (LLMs) using machine-generated instruction-following data has improved zero-shot capabilities on new tasks, but the idea is less explored in the multimodal field. In this paper, we present the first attempt to use language-only GPT-4 to generate multimodal language-image instruction-following data. By instruction tuning on such generated data, we introduce LLaVA: Large Language and Vision Assistant, an end-to-end trained large multimodal model that connects a vision encoder and LLM for general-purpose visual and language understanding. Our early experiments show that LLaVA demonstrates impressive multimodel chat abilities, sometimes exhibiting the behaviors of multimodal GPT-4 on unseen images/instructions, and yields a 85.1% relative score compared with GPT-4 on a synthetic multimodal instruction-following dataset. When fine-tuned on Science QA, the synergy of LLaVA and GPT-4 achieves a new state-of-the-art accuracy of 92.53%. We make GPT-4 generated visual instruction tuning data, our model and code base publicly available.

This paper is over a year old, but it covers the basics. The newer LLaVA is based on open models.
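As a rough illustration of what "connects a vision encoder and LLM" means in the abstract: in the original LLaVA design, image features from the vision encoder are passed through a trainable linear projection into the LLM's word-embedding space (rather than being turned into literal text), and the resulting "visual tokens" are placed in the same sequence as the text token embeddings. The toy sketch below uses made-up dimensions and weights, and the function names are hypothetical; it is not the LLaVA code.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, x: Vector) -> Vector:
    """Multiply matrix W (rows = output dims) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def project_patches(patch_features: List[Vector], W: Matrix) -> List[Vector]:
    """Map each image-patch feature into the LLM embedding space."""
    return [matvec(W, f) for f in patch_features]


def build_llm_input(image_embeds: List[Vector], text_embeds: List[Vector]) -> List[Vector]:
    """Concatenate visual tokens and text tokens into one input sequence."""
    return image_embeds + text_embeds


# Toy sizes (assumptions): vision features have 3 dims, LLM embeddings have 2.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, -1.0]]
patches = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]    # stand-in vision encoder output
text = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]     # stand-in text token embeddings

sequence = build_llm_input(project_patches(patches, W), text)
print(len(sequence))  # 2 visual tokens + 3 text tokens
```

From the language model's point of view, the projected patches are just extra positions in its input sequence, which is why the language model remains the core of the system.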

[–] [email protected] 0 points 1 month ago (1 children)

When are you going to admit you have no idea what you are talking about?

An LLM literally is a "general AI model that powers a variety of tasks".

[–] [email protected] 0 points 1 month ago (1 children)

When are you going to admit you have no idea what you are talking about?

An LLM literally is not a "general AI model", it's a Large Language Model, as in it processes language.

[–] [email protected] 1 points 1 month ago* (last edited 1 month ago) (1 children)

I'm going to be honest, I actually know a lot more than I can say on this matter. But believe me, Gemini Nano is a multimodal LLM.

I spoke to Google engineers about this a few months ago:

[–] [email protected] 0 points 1 month ago

I wasn't calling you a liar. Just misinformed. I also watched Google I/O on YouTube.