In the hidden layer, the activation function will decide what is being determined by the neural network, is it possible for an AI to generate activation function for itself so it can improve upon itself?

34

1

FOSAI Nexus (v0.0.1)! (lemmy.world)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

Welcome to the FOSAI Nexus!

(v0.0.1 - Summer 2023 Edition)

The goal of this knowledge nexus is to act as a link hub for software, applications, tools, and projects that are all FOSS (free open-source software) designed for AI (FOSAI).

If you haven't already, I recommend bookmarking this page (the native one on lemmy.world). It is designed to be periodically updated in new versions I release throughout the year. This is due to the rapid rate in which this field is advancing. Breakthroughs are happening weekly. I will try to keep up through the seasons while including links to each sequential nexus post - but it's best to bookmark this since it will be the start of the content series, giving you access to all future nexus posts as I release them.

If you see something here missing that should be added, let me know. I don't have visibility over everything. I would love your help making this nexus better. Like I said in my welcome message, I am no expert in this field, but I teach myself what I can to distill it in ways I find interesting to share with others.

I hope this helps you unblock your workflow or project and empowers you to explore the wonders of emerging artificial intelligence.

Consider subscribing to /c/FOSAI if you found any of this interesting. I do my best to make sure you stay in the know with the most important updates to all things free open-source AI.

Find Us On Lemmy!

[email protected]

Fediverse Resources

Lemmy

Large Language Model Hub

Download Models

oobabooga

text-generation-webui - a big community favorite gradio web UI by oobabooga designed for running almost any free open-source and large language models downloaded off of HuggingFace which can be (but not limited to) models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and many others. Its goal is to become the AUTOMATIC1111/stable-diffusion-webui of text generation. It is highly compatible with many formats.

Exllama

A standalone Python/C++/CUDA implementation of Llama for use with 4-bit GPTQ weights, designed to be fast and memory-efficient on modern GPUs.

gpt4all

Open-source assistant-style large language models that run locally on your CPU. GPT4All is an ecosystem to train and deploy powerful and customized large language models that run locally on consumer-grade processors.

TavernAI

The original branch of software SillyTavern was forked from. This chat interface offers very similar functionalities but has less cross-client compatibilities with other chat and API interfaces (compared to SillyTavern).

SillyTavern

Developer-friendly, Multi-API (KoboldAI/CPP, Horde, NovelAI, Ooba, OpenAI+proxies, Poe, WindowAI(Claude!)), Horde SD, System TTS, WorldInfo (lorebooks), customizable UI, auto-translate, and more prompt options than you'd ever want or need. Optional Extras server for more SD/TTS options + ChromaDB/Summarize. Based on a fork of TavernAI 1.2.8

Koboldcpp

A self-contained distributable from Concedo that exposes llama.cpp function bindings, allowing it to be used via a simulated Kobold API endpoint. What does it mean? You get llama.cpp with a fancy UI, persistent stories, editing tools, save formats, memory, world info, author's note, characters, scenarios, and everything Kobold and Kobold Lite have to offer. In a tiny package around 20 MB in size, excluding model weights.

KoboldAI-Client

This is a browser-based front-end for AI-assisted writing with multiple local & remote AI models. It offers the standard array of tools, including Memory, Author's Note, World Info, Save & Load, adjustable AI settings, formatting options, and the ability to import existing AI Dungeon adventures. You can also turn on Adventure mode and play the game like AI Dungeon Unleashed.

h2oGPT

h2oGPT is a large language model (LLM) fine-tuning framework and chatbot UI with document(s) question-answer capabilities. Documents help to ground LLMs against hallucinations by providing them context relevant to the instruction. h2oGPT is fully permissive Apache V2 open-source project for 100% private and secure use of LLMs and document embeddings for document question-answer.

Image Diffusion Hub

Download Models

StableDiffusion

Stable Diffusion is a text-to-image diffusion model capable of generating photo-realistic and stylized images. This is the free alternative to MidJourney. It is rumored that MidJourney originates from a version of Stable Diffusion that is highly modified, tuned, then made proprietary.

SDXL (Stable Diffusion XL)

With Stable Diffusion XL, you can create descriptive images with shorter prompts and generate words within images. The model is a significant advancement in image generation capabilities, offering enhanced image composition and face generation that results in stunning visuals and realistic aesthetics.

ComfyUI

A powerful and modular stable diffusion GUI and backend. This new and powerful UI will let you design and execute advanced stable diffusion pipelines using a graph/nodes/flowchart-based interface.

ControlNet

ControlNet is a neural network structure to control diffusion models by adding extra conditions. This is a very popular and powerful extension to add to AUTOMATIC111's stable-diffusion-webui.

TemporalKit

An all-in-one solution for adding Temporal Stability to a Stable Diffusion Render via an automatic1111 extension. You must install FFMPEG to path before running this.

EbSynth

Bring your paintings to animated life. This software can be used in conjunction with StableDiffusion + ControlNet + TemporalKit workflows.

WarpFusion

A TemporalKit alternative to produce video effects and animation styling.

Training & Education

LLMs

Diffusers

Bonus Recommendations

AI Business Startup Kit

LLM Learning Material from the Developer of SuperHOT (kaiokendev):

Here are some resources to help with learning LLMs:

Andrej Karpathy’s GPT from scratch

Huggingface’s NLP Course

And for training specifically:

Alpaca LoRA

Vicuna

Community training guide

Of course for papers, I recommend reading anything on arXiv’s CS - Computation & Language that looks interesting to you: https://arxiv.org/list/cs.CL/recent.

Support Developers!

Please consider donating, subscribing to, or buying a coffee for any of the major community developers advancing Free Open-Source Artificial Intelligence.

If you're a developer in this space and would like to have your information added here (or changed), please don't hesitate to message me!

TheBloke

https://www.patreon.com/TheBlokeAI

Oobabooga

https://ko-fi.com/oobabooga

Eric Hartford

https://erichartford.com/

kaiokendev

https://kaiokendev.github.io/

Major FOSAI News & Breakthroughs

(June 2023) MPT-30B: Raising the bar for open-source foundation models
(May 2023) Google "We Have No Moat, And Neither Does OpenAI"
(May 2023) Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs
(March 2023) OpenAI Releases Chat-GPT 4
(November 2022) OpenAI Releases Chat-GPT 3
(December 2017) Attention Is All You Need

35

3

Training AI on other AI causes models to collapse (original title : The AI is eating itself) (www.platformer.news)

submitted 1 year ago by [email protected] to c/[email protected]

1 comments fedilink

Hi lemmings, what do you think about this and do you see a parallel with the human mind ?

... "A second, more worrisome study comes from researchers at the University of Oxford, University of Cambridge, University of Toronto, and Imperial College London. It found that training AI systems on data generated by other AI systems — synthetic data, to use the industry’s term — causes models to degrade and ultimately collapse" ...

36

5

New ROCm™ 5.6 Release Brings Enhancements and Optimizations for AI and HPC Workloads (community.amd.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

cross-posted from: https://lemmy.world/post/811496

Huge news for AMD fans and those who are hoping to see a real* open alternative to CUDA that isn't OpenCL!

*: Intel doesn't count, they still have to get their shit together in rendering things correctly with their GPUs.

We plan to expand ROCm support from the currently supported AMD RDNA 2 workstation GPUs: the Radeon Pro v620 and w6800 to select AMD RDNA 3 workstation and consumer GPUs. Formal support for RDNA 3-based GPUs on Linux is planned to begin rolling out this fall, starting with the 48GB Radeon PRO W7900 and the 24GB Radeon RX 7900 XTX, with additional cards and expanded capabilities to be released over time.

37

1

Full DragGAN source code is now released: Interactive Point-Based Manipulation of Images (github.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

38

3

MPT-30B: Raising the bar for open-source foundation models (www.mosaicml.com)

submitted 1 year ago by [email protected] to c/[email protected]

1 comments fedilink

and another commercially viable open-source LLM!

39

2

MIT researchers make language models scalable self-learners (news.mit.edu)

submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]

0 comments fedilink

TLDR Summary:

MIT researchers developed a 350-million-parameter self-training entailment model to enhance smaller language models' capabilities, outperforming larger models with 137 to 175 billion parameters without human-generated labels.
The researchers enhanced the model's performance using 'self-training,' where it learns from its own predictions, reducing human supervision and outperforming models like Google's LaMDA, FLAN, and GPT models.
They developed an algorithm called 'SimPLE' to review and correct noisy or incorrect labels generated during self-training, improving the quality of self-generated labels and model robustness.
This approach addresses inefficiency and privacy issues of larger AI models while retaining high performance. They used 'textual entailment' to train these models, improving their adaptability to different tasks without additional training.
By reformulating natural language understanding tasks like sentiment analysis and news classification as entailment tasks, the model's applications were expanded.
While the model showed limitations in multi-class classification tasks, the research still presents an efficient method for training large language models, potentially reshaping AI and machine learning.

40

6

Accelerating Drug Discovery With the AI Behind ChatGPT – Screening 100 Million Compounds a Day (scitechdaily.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

TLDR summary:

Researchers at MIT and Tufts University have developed an AI model called ConPLex that can screen over 100 million drug compounds in a day to predict their interactions with target proteins. This is much faster than existing computational methods and could significantly speed up the drug discovery process.
Most existing computational drug screening methods calculate the 3D structures of proteins and drug molecules, which is very time-consuming. The new ConPLex model uses a language model to analyze amino acid sequences and drug compounds and predict their interactions without needing to calculate 3D structures.
The ConPLex model was trained on a database of over 20,000 proteins to learn associations between amino acid sequences and structures. It represents proteins and drug molecules as numerical representations that capture their important features. It can then determine if a drug molecule will bind to a protein based on these numerical representations alone.
The researchers enhanced the model using a technique called contrastive learning, in which they trained the model to distinguish real drug-protein interactions from decoys that look similar but do not actually interact. This makes the model less likely to predict false interactions.
The researchers tested the model by screening 4,700 drug candidates against 51 protein kinases. Experiments confirmed that 12 of the 19 top hits had strong binding, including 4 with extremely strong binding. The model could be useful for screening drug toxicity and other applications.
The new model could significantly reduce drug failure rates and the cost of drug development. It represents a breakthrough in predicting drug-target interactions and could be further improved by incorporating more data and molecular generation methods.
The model and data used in this research have been made publicly available for other scientists to use.

41

8

AI Translates 5000-Year-Old Cuneiform (www-timesofisrael-com.cdn.ampproject.org)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

A team from Israel has developed an AI model that translates Cuneiform, a 5000-year-old writing system, into English within seconds. This model, developed at Tel Aviv University, uses Neural Machine Translation (NMT) and has fairly good accuracy. Despite the complexity of the language and age, the AI was successfully trained and can now help to uncover the mysteries of the past. You can try an early demo of this model on The Babylon Engine and its source code is available on GitHub on Akkademia and the Colaboratory.

42

5

7 AI Companies That Could Become Trillion-Dollar Companies (markets.businessinsider.com)

submitted 1 year ago by [email protected] to c/[email protected]

1 comments fedilink

43

6

Meta AI Reveals Game-Changing I-JEPA: A Leap Forward in Self-Supervised Learning Mimicking Human Perception and Reasoning (www.marktechpost.com)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

Meta AI has revealed their first AI model, I-JEPA, which learns by comparing abstract representations of images, not the pixels. This self-supervised learning model fills in knowledge gaps in a way that mirrors human perception. I-JEPA is adaptable and efficient, offering robust performance even with a less complex model. Excitingly, the code for this pioneering technology is open-source. Check it out on GitHub!

44

3

13b parameter Orca LLM is redefining what small model LLM's are capable of. (docs.kanaries.net)

submitted 1 year ago by [email protected] to c/[email protected]

0 comments fedilink

Machine Learning | Artificial Intelligence

Welcome to the FOSAI Nexus!

Fediverse Resources

Large Language Model Hub

Image Diffusion Hub

Training & Education

LLMs

Diffusers

Bonus Recommendations

Support Developers!