Technology

OpenAI’s new AI image generator pushes the limits in detail and prompt fidelity (arstechnica.com)

submitted 1 year ago by [email protected] to c/[email protected]

30 comments

On Wednesday, OpenAI announced DALL-E 3, the latest version of its AI image synthesis model that features full integration with ChatGPT. DALL-E 3 renders images by closely following complex descriptions and handling in-image text generation (such as labels and signs), which challenged earlier models. Currently in research preview, it will be available to ChatGPT Plus and Enterprise customers in early October.

Like its predecessor, DALL-E 3 is a text-to-image generator that creates novel images based on written descriptions called prompts. Although OpenAI released no technical details about DALL-E 3, the AI model at the heart of previous versions of DALL-E was trained on millions of images created by human artists and photographers, some of them licensed from stock websites like Shutterstock. It's likely DALL-E 3 follows this same formula, but with new training techniques and more computational training time.

Judging by the samples provided by OpenAI on its promotional blog, DALL-E 3 appears to be a radically more capable image synthesis model than anything else available in terms of following prompts. While OpenAI's examples have been cherry-picked for their effectiveness, they appear to follow the prompt instructions faithfully and convincingly render objects with minimal deformations. Compared to DALL-E 2, OpenAI says that DALL-E 3 refines small details like hands more effectively, creating engaging images by default with "no hacks or prompt engineering required."

[–] [email protected] 1 points 1 year ago (1 children)

Nah, that Guardians of the Galaxy art is exactly what I'm talking about. It makes basic mistakes even a child could point out and looks more like a knockoff. And refining it is just rolling the dice to get a better result, whereas with an artist you can actually give feedback they can understand.

The game assets look a little better, but if you look carefully you'll notice that they don't tile correctly. It's 90% there, but the last 10% is the hardest part, and it's important, especially for large projects and not just single static images. Not to mention they look generic as fuck. You're not going to get the next Hollow Knight or Darkest Dungeon with an amazing original style from AI, you're only going to get existing styles mashed together. The more specific the vision for the art style, the harder it will be to generate it.

Also the idea of a Tiktok feed of AI generated content is exactly why I hate AI art. Sure, go ahead and use it to help existing artists generate cheap assets that would otherwise be random brush strokes. But replacing them? The idea that AI generated slop will have anything close to the quality and meaning of even cheap art is ridiculous. Why would anyone want that when they could have actual art made by real people, more of which exists today than anyone could go through in their entire life?

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

The idea that AI generated slop will have anything close to the quality and meaning of even cheap art is ridiculous.

You are missing the bigger picture: This took SECONDS, no effort on my part and it was a first try, using technology that was a little less than three years ago at this stage. I can generate new images on any topic I want, instantly. This stuff is already incredible today and is getting better rapidly.

Meanwhile here are examples of glorious human art:

Human art is full of mistakes. The best of the best human art has "quality and meaning", the average not really. Stuff like "Somehow, Palpatine returned" was written by humans. There is a lot of garbage that slips through, even in projects that have so much money that there is really no excuse. I'll take a few additional AI generated fingers, which are trivial to fix, over that trash.

Here is some of the box art recreated with AI, again zero effort, first try: https://imgur.com/a/kHcwv4j

And you can remix it at will: https://i.imgur.com/y38UPX6.jpg

Also the idea of a Tiktok feed of AI generated content is exactly why I hate AI art.

Netflix is already running personalized thumbnails, not with AI, but that's exactly the kind of stuff I expect AI to be used for real soon, if it isn't already in some capacity.

Why would anyone want that when they could have actual art made by real people, more of which exists today than anyone could go through in their entire life?

Nobody cares about who makes the art outside of some art historians. Every movie, TV show or game has dozens or even hundreds of people involved, you have no idea who was responsible for what or what was going on behind the scenes. All you see is the result and you either like it or you don't. "The Death of the Author" and all that.