
Machine Learning | Artificial Intelligence


In a hidden layer, the activation function shapes what the neural network computes. Is it possible for an AI to generate an activation function for itself, so it can improve upon itself?

[–] [email protected] 3 points 1 year ago

Based on my research, there is an emerging interest in the field of meta-learning, or "learning to learn." Some researchers are exploring the concept of allowing neural networks to learn their own hyperparameters, which could include parameters of activation functions. However, it's my understanding that this approach could lead to more complex training processes and risks such as unstable gradients, and it might not always result in significantly better performance.
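To make that concrete, here's a minimal PyTorch sketch of the idea. The class name `LearnableSwish` and its single-parameter design are my own illustration, not a standard API: the activation's shape parameter is registered as a regular model parameter, so backprop tunes it along with the network's weights.

```python
import torch
import torch.nn as nn

class LearnableSwish(nn.Module):
    """Illustrative Swish-style activation f(x) = x * sigmoid(beta * x),
    where the shape parameter beta is trained by backprop like any
    other weight."""
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.ones(1))  # starts as plain Swish (beta = 1)

    def forward(self, x):
        return x * torch.sigmoid(self.beta * x)

# Drop it into a network like any other layer; beta now shows up in
# model.parameters() and gets updated by the optimizer.
model = nn.Sequential(nn.Linear(8, 8), LearnableSwish(), nn.Linear(8, 1))
```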

While activation functions with learnable parameters aren't commonly used in practice, there is ongoing research exploring them. One example is the Parametric ReLU (PReLU), a variant of the ReLU activation function whose negative slope is learned during training rather than fixed as a predetermined hyperparameter.
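For a concrete example, PyTorch ships this as `torch.nn.PReLU`. A minimal sketch (the layer sizes here are arbitrary):

```python
import torch
import torch.nn as nn

# PReLU's negative slope is a learnable parameter (initialized to 0.25
# by default) and receives gradients like any other weight.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.PReLU(),
    nn.Linear(32, 1),
)

x = torch.randn(4, 16)
loss = model(x).sum()
loss.backward()
print(model[1].weight.grad)  # gradient flowing into the learned slope
```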

In my opinion, if you're new to this field, it's essential to grasp the basics of neural networks first, including how common activation functions like ReLU, sigmoid, and tanh operate. These advanced concepts are undoubtedly fascinating and might offer incremental improvements, but most of today's state-of-the-art models still rely primarily on these "standard" activation functions. So, starting with a solid foundation is key.
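If it helps, here's a quick sketch of how those standard functions behave, using PyTorch's built-ins:

```python
import torch

x = torch.linspace(-3.0, 3.0, 7)
print(torch.relu(x))     # max(0, x): zero for negatives, identity for positives
print(torch.sigmoid(x))  # 1 / (1 + exp(-x)): squashes inputs to (0, 1)
print(torch.tanh(x))     # squashes inputs to (-1, 1), zero-centered
```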