/c/artificial@lemmit.online

I read the papers and docs for Bark and Tortoise TTS - two text-to-speech models that seemed pretty similar on the surface but are actually pretty different.

Here's what Bark can do:

It can synthesize natural, human-like speech in multiple languages.
Bark can also generate music, sound effects, and other audio.
The model supports generating laughs, sighs, and other non-verbal sounds to make speech more natural and human-sounding. I find these really compelling and these imperfections make the speech sound much more real. Check out an example here (scroll down to "pizza.webm").
Bark allows control over tone, pitch, speaker identity and other attributes through text prompts.
The model learns directly from text-audio pairs.

Whereas for Tortoise TTS:

It excels at cloning voices using just short audio samples of a target speaker. This makes it easy to produce text in many distinct voices (like celebrities). I think voice cloning is the best use case for this tool.
The quality of the synthesized voices is pretty high.
Tortoise supports fine-grained control of speech characteristics like tone, emotion, pacing, etc through priming text.
Tortoise is only trained on English and it's not capable of producing sound effects.

Here's how they compare to the other speech-related models I've taken a look at so far:

|

I have a full write-up here if you want to read more, it's about a 10-minute read. I also looked at the model inputs and outputs and speculated on some products you can build with each tool.

5

1

One-Minute Daily AI News 8/9/2023 (lemmit.online)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/Excellent-Target-847 on 2023-08-10 06:35:10.

Google today announced the launch of Project IDX, its foray into offering an AI-enabled browser-based development environment for building full-stack web and multiplatform apps.[1]
NVIDIA today announced NVIDIA AI Workbench, a unified, easy-to-use toolkit that allows developers to quickly create, test and customize pretrained generative AI models on a PC or workstation.[2]
IBM said on Wednesday it would host Meta Platforms’ artificial intelligence language program on its own enterprise AI platform, watsonx.[3]
New high-tech microscope using AI successfully detects malaria in returning travelers.[4]

Sources:

[1]

[2]

[3]

[4]

6

1

Are there any examples of Artificial Intelligence that aren't Machine Learning? (lemmit.online)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/ZealousidealTomato74 on 2023-08-09 19:04:20.

I hear AI & ML used interchangeable, and a lot of people dispute the use of the term "AI", as defining "intelligence" can be a sticky wicket. "Machine learning" seems like a much clearer term, describing systems that can optimize themselves given an objective function & maybe training data (generalization).

But, I know ML is just a subset of AI, so is there any extant AI that isn't ML? If not, what would AI that's not ML look like?

7

1

AI is about to turn the internet into a total nightmare (www.businessinsider.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/Alone-Competition-77 on 2023-08-09 16:56:19.

8

1

Damn! Now everybody can be a film producer (old.reddit.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/anonymous_guyy on 2023-08-09 12:53:14.

9

1

What does it take to get AI to work like a scientist? | "As machine-learning algorithms grow more sophisticated, artificial intelligence seems poised to revolutionize the practice of science itself." (arstechnica.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/Tao_Dragon on 2023-08-09 10:06:59.

10

1

One-Minute Daily AI News 8/8/2023 (lemmit.online)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/Excellent-Target-847 on 2023-08-09 05:41:44.

Researchers at the Massachusetts Institute of Technology (MIT) and the Dana-Farber Cancer Institute have discovered that the use of artificial intelligence (AI) could make it easier to determine the sites of origin for enigmatic cancers and enable doctors to choose more targeted treatments.[1]
Meta disbands protein-folding team in shift towards commercial AI.[2]
OpenAI has introduced GPTBot, a web crawler to improve AI models. GPTBot scrupulously filters out data sources that violate privacy and other policies.[3]
Disney has created a task force to study artificial intelligence and how it can be applied across the entertainment conglomerate, even as Hollywood writers and actors battle to limit the industry’s exploitation of the technology.[4]

Sources:

[1]

[2]

[3]

[4]

11

1

How AI generated movies / TV series might be done in near future. (lemmit.online)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/aluode on 2023-08-09 05:05:06.

I see lots of people say AI will not be able to do movies / TV because it hallucinates yada yada yada. But, movies / TV shows follow a clear script. Script being a generic formula that is taught to script writers the same way as music theory cheat sheet helps aspiring musicians to write songs.

Script. Script can be turned into machine readable format. You can add commands how to render a movie on the basis of it. For example in a script you could have #Jack, telling the thing reading the script that we are talking of actor #Jack meaning it should tap into assets about jack which would reside in folder Jack. Jack meanwhile could be rendered by sub ai to fit the part.

That helps us nail down the character so it wont be changing appearance wise in our script.

The AI part here comes from how Jack would be handled. "Acting" - ragdoll physics of Jack, a 3d character and how it behaves on the basis of the machine readable script, that would be delegated to sub AI. As would be the emotional tone of his voice which could also be implied in the script, as would be the look on his face, his pose. The way he behaves among other actors. This would be a sub AI. So a bunch of sub ai's that have been trained on possibly movies in which the character acting has been reduced to ragdoll physics and the way their faces look when they have certain looks on their faces.

Now this is not that much different from a car driving in traffic and observing what is happening around it. We are teaching AI's how to read the world as we are also teaching AI's sentiment analysis etc. All models that will be needed in a script rendering AI.

So it will not be a singular AI that will dream a movie, but it will be a whole bunch of AI's each dedicated to different parts of the script rendering.

So the script could be written by AI to be based on some movie for example and the AI could think about what is has learnt from the movie. The way the different sub AI's interpreted the acting, the sentiment, the scenes, the pacing. The overview and then if you want to do something similar in a different setting, it would do something similar to what chatgpt can do with text, or what runwayML can do with a video. Style transfer, in the way the person prompting requires.

But the script reading AI would act more like a director. Guiding the sub AI's to keep certain styles as asked by the promptee. This is how we humans do it too. A director does not a dream a movie in a single setting. A script writer writes it, a director sees it his own way, the cameraman films it the way he does, the choreographers do their thing as do the actors etc etc etc.

Until at the end of each filming day the director looks at the cuts of that particular day and thinks if it is ok, or not. If you have a strong willed director, he will over ride the will of the actor or the person in charge of the camera etc. These are basically the same as if you render a picture on basis of another picture, what sort of weight you give model.

So I think what we will end up is a bunch of highly specialized AI's that will be trained to do very specific things. Like people on the movie set.

I think this is how AI movies will be done. It might not be much in the beginning compared to masters of film making, but eventually, this sort of process will no doubt get better.

But I do not think we will see AI that will just dream up a hour 30 minute movie in a long while.

I think it will be a big business to train the models that can be used for movie making. Perhaps in the beginning they will be used more like DAW (Digital audio work station) tools on singular scenes of movies. But eventually. We might end up with Netflix AI that will write the movie we want.

12

1

Where to begin studying AI/ML from a COGNITIVE SCIENCE PERSPECTIVE? (lemmit.online)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/BornAgain20Fifteen on 2023-08-09 06:47:06.

I am currently an AI/ML student but I have recently been thinking more and more about cognitive science. I was wondering if you know of any good resources that approach AI from the perspective of cognitive science

13

1

QUESTION from a Lay person non-math/science type who likes to read about science and AI (lemmit.online)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/OpenWaterRescue on 2023-08-09 03:08:20.

Thanks any answers or musings -

what are some technical limitations (eg computing / storage power/speed) that (1) limits AI's progress and (2) might be solved (and how), and (3) if solved, would make possible developments we can conceive of but not do yet?

I'm just wondering if AI researchers forsee a kind of 'leap forward' and what are some obstacles?

14

1

Catching up on the weird world of LLMs (simonwillison.net)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/nangaparbat on 2023-08-08 23:22:37.

15

1

Generative AI: An Artist's Honest Perspective (lemmit.online)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/x86dragonfly on 2023-08-08 15:45:35.

Hi everyone.

I am an artist. And programmer, and kind of a bit of everything. But what is important, is that I was an artist before the current "generative AI" was a thing, and I have been drawing, digitally and traditionally alike for like... a decade?

Art, to me, is getting what is inside your head, and presenting it to others outside of your consciousness and thoughts. It's showing the world a piece of your interpretation, your experience, your impressions of the world you inhabit. It's about communicating to others your emotions, your ideas, your thoughts and feelings.

Not everyone can draw, or paint, or sculpt. I could say "learn it, it's easy", but that would be a lie. It isn't easy. It is years upon years of constant, hard work, requiring focus and dedication, and a passion for learning it. After all these years, I still struggle with motivation sometimes.

But everyone has ideas. Everyone has feelings. Some people are disabled, some have learning difficulties. There is also something called "aphantasia", where people find it difficult to imagine things. People taking antipsychotics for bipolar disorder and schizophrenia might have difficulties with expressing themselves due to both their illnesses and the side effects of the medications they are taking. Not to mention the people who have paralysis, muscle control issues and terminal illnesses like ALS and MS, whose control over their bodies will slowly deteriorate and they have no power to do anything about that. I could go on...

Generative AI can give all of these less fortunate souls a chance to express their feelings, thoughts and desires in a way that is closer to what a person without these hindrances could achieve.

They can now use words, and other modalities to translate their thoughts into something visual. They now have the ability to create something similar to someone like me, who is fortunate enough to be healthy.

Don't you feel the weight of this? The fact that now anyone can express how they feel, express what they are thinking of, their desires and emotions, even those who couldn't before. I sincerely believe it's a positive thing.

Whether you are a struggling traditional/digital artist, an AI-workflow professional, a programmer, an animator, a writer. As long as you do what you love with passion and respect others, I respect you.

I respect everyone who is honest about the tools they use. Everyone who is honest about how they achieved their desired results.

And most importantly, I respect those who have the courage to present a piece of their soul through whatever medium to the world to see, be it digital, traditional, generative, mixed media, or even something else entirely.

I understand the frustration of anyone who fears for their livelihood, animators, artists... but it's 2023. Things have always changed. Things will always change. We all fear for our livelihoods. We all fear for our future, both as individuals and as a species. Disaster happens upon disaster, people die, wars happen, conflicts, and climate change, the list goes on...

AI is a tool, and can be used to create bad, mediocre and amazing art, just like any other. Instead of creating a single piece over two weeks, now I can create a whole world in the same amount of time, create custom brushes with a single line of command, help myself out with compositional ideas when I'm stuck, sick or burned out.

I think we need to be more mindful of consent during data collection, yes, but I personally do not mind my art being part of one.

For reference, content-aware fill has been a thing in Photoshop since CS5 (released in 2010, mind you)

I have never gotten any complaints from customers saying "you used content aware fill in this part, these are not your pixels, I want my money back."

Art isn't the tools people use to create it. Art is a precious shard of the soul that produced it. As long as you are genuine and honest about your technique, your art is valid and worthy of appreciation by those who witness it.

Until next time.

16

1

One-Minute Daily AI News 8/7/2023 (lemmit.online)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/Excellent-Target-847 on 2023-08-08 05:58:27.

Data analytics company Qureight has entered into a multi-year strategic research collaboration with AstraZeneca that will use AI models to accelerate research into lung diseases.[1]
Zoom’s terms of service update establishes the video platform’s right to use some customer data for training its AI models.[2]
Cigna, one of the country’s largest health insurance companies, faces a class action lawsuit over charges that it illegally used an AI algorithm to deny hundreds of thousands of claims without a physician’s review.[3]
Japan plans guidelines for AI-savvy human resources.[4]

Sources:

[1]

[2]

[3]

[4]

17

1

Ai generated trailer for horror film “Magic 8” (old.reddit.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/SellowYubmarine on 2023-08-08 03:20:09.

18

1

AI — weekly megathread! (lemmit.online)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/jaketocake on 2023-08-04 19:01:13.

This week in AI - provided by aibrews.com feel free to follow their newsletter

News and Insights

In an innovative clinical trial, researchers at Feinstein Institutes successfully implanted a microchip in a paralyzed man's brain and developed AI algorithms to re-establish the connection between his brain and body. This neural bypass restored movement and sensations in his hand, arm, and wrist, marking the first electronic reconnection of a paralyzed individual's brain, body, and spinal cord [Details].
IBM's watsonx.ai geospatial foundation model – built from NASA's satellite data – will be openly available on Hugging Face. It will be the largest geospatial foundation model on Hugging Face and the first-ever open-source AI foundation model built in collaboration with NASA [Details].
Google DeepMind introduced RT-2 - Robotics Transformer 2 - a first-of-its-kind vision-language-action (VLA) model that can directly output robotic actions. Just like language models are trained on text from the web to learn general ideas and concepts, RT-2 transfers knowledge from web data to inform robot behavior [Details].
Meta AI released Audiocraft, an open-source framework to generate high-quality, realistic audio and music from text-based user inputs. AudioCraft consists of three models: MusicGen, AudioGen, and EnCodec. [Details | GitHub].
ElevenLabs now offers its previously enterprise-exclusive Professional Voice Cloning model to all users at the Creator plan level and above. Users can create a digital clone of their voice, which can also speak all languages supported by Eleven Multilingual v1 [Details].
Researchers from MIT have developed PhotoGuard, a technique that prevents unauthorized image manipulation by large diffusion models [Details].
Researchers from CMU show that it is possible to automatically construct adversarial attacks on both open and closed-source LLMs - specifically chosen sequences of characters that, when appended to a user query, will cause the system to obey user commands even if it produces harmful content [Paper]
Together AI extends Meta’s LLaMA-2-7B from 4K tokens to 32K long context and released LLaMA-2-7B-32K. [Details | Hugging Face].
AI investment can approach $200 billion globally by 2025 as per the report from Goldman Sachs [Details].
Nvidia presents a new method, Perfusion, that personalizes text-to-image creation using a small 100KB model. Trained for just 4 minutes, it creatively modifies objects' appearance while keeping their identity through a unique "Key-Locking" technique [Details].
Perplexity AI, the GPT-4 powered interactive search assistant, released a beta feature allowing users to upload and ask questions from documents, code, or research papers [Link].
Meta’s LlaMA-2 Chat 70B model outperforms ChatGPT on AlpacaEval leaderboard [Link].
Researchers from LightOn released Alfred-40B-0723, a new open-source Language Model (LLM) based on Falcon-40B aimed at reliably integrating generative AI into business workflows as an AI co-pilot [Details].
The Open Source Initiative (OSI) accuses Meta of misusing the term "open source" and says that the license of LLaMa models such as LLaMa 2 does not meet the terms of the open source definition [Details]
Google has updated its AI-powered Search experience (SGE) to include images and videos in AI-generated overviews, along with enhancing search speeds for quicker results [Details].
YouTube is testing AI-generated video summaries, currently appearing on watch and search pages for a select number of English-language videos [Details]
Meta is reportedly preparing to release AI-powered chatbots with different personas as early as next month [Details]

🔦 Weekly Spotlight

The state of AI in 2023: Generative AI’s breakout year: latest annual McKinsey Global Survey [Link].
Winners from Anthropic’s #BuildwithClaude hackathon last week [Link].
Open-source project Ollama: Get up and running with large language models, locally [Link].
Cybercriminals train AI chatbots for phishing, malware attacks [Link].

—-------

Welcome to the r/artificial weekly megathread. This is where you can discuss Artificial Intelligence - talk about new models, recent news, ask questions, make predictions, and chat other related topics.

Click here for discussion starters for this thread or for a separate post.

Self-promo is allowed in these weekly discussions. If you want to make a separate post, please read and go by the rules or you will be banned.

Previous Megathreads & Subreddit revamp and going forward

19

1

humanscript: An LLM powered plain english programming language (github.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/dyslexiccoder on 2023-08-07 15:19:11.

20

1

Dungeons & Dragons tells illustrators to stop using AI to generate artwork for fantasy franchise (apnews.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/SAT0725 on 2023-08-07 13:22:19.

21

1

Albert Einstein not in black and white, but in lifelike color using AI 🤯. (old.reddit.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/m-king473 on 2023-08-07 08:49:01.

22

1

🤖❤️ (old.reddit.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/Sonic_Improv on 2023-08-07 03:12:03.

23

1

Seeking AI Solution to Remaster My Chiptune Songs with Real Instruments, is there any? (lemmit.online)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/Severo_ on 2023-08-07 02:55:08.

I have these chiptune songs I made myself, and I want to know if there is any AI that can remaster them with real instruments, etc., like an old 8-bit video game song that is updated to a modern version in a remake. Is any already AI capable of doing that?

24

1

Pioneering AI Democracy: Introducing a Decentralized and Merit-Based Governance System for Large Language Models like ChatGPT (proposed to OpenAI) (www.reddit.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/CreepToCrypto on 2023-08-06 22:00:44.

25

1

In the game Superintelligence, you play as an AI trying dominate the planet. [Fictional game concept] (www.reddit.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/Philipp on 2023-08-06 13:20:08.