Vision Transformers Need Registers. And Gated MLPs. And +20M params. Tiny modality gap ensues!
Updated Mar 12, 2025 - Python
The Activation Functions project repository contains implementations of various activation functions commonly used in neural networks.
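As a quick illustration of the kind of function such a repository covers, here is a minimal sketch of GELU, the activation this topic page is about, in both its exact erf form and the widely used tanh approximation. The function names are illustrative, not taken from any of the listed repositories.

```python
import math

def gelu_exact(x: float) -> float:
    # Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    # Common tanh approximation used in many Transformer implementations
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))
```

For typical inputs the two forms agree to about three decimal places, which is why the cheaper tanh variant is often used in practice.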
Annotated vanilla implementation in PyTorch of the Transformer model introduced in 'Attention Is All You Need'.
This repository provides a PyTorch training regime for approximating solutions of the heat equation, a partial differential equation, using the Deep Kolmogorov Method of Beck et al.
This repository contains the code and the report for the coursework of INFR11031 Advanced Vision, a postgraduate course offered at The University of Edinburgh. The task was to train on limited data and improve the accuracy of the ResNet-50 classifier on a small subset of the ImageNet dataset containing 50K training images and 50K test images. Achieve…
PyTorch implementation of normalization-free LLMs investigating entropic behavior to find desirable activation functions