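37 stars written in C++ (starred by mgp87)
Each entry gives the repository description, then language, star count, fork count, and last update.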

LLM inference in C/C++

C++ 77,363 11,249 Updated Mar 30, 2025

Port of OpenAI's Whisper model in C/C++

C++ 38,837 4,056 Updated Mar 30, 2025

Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)

C++ 32,790 1,492 Updated Mar 30, 2025

ncnn is a high-performance neural network inference framework optimized for the mobile platform

C++ 21,204 4,229 Updated Mar 29, 2025

MLX: An array framework for Apple silicon

C++ 19,980 1,148 Updated Mar 29, 2025

ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator

C++ 16,121 3,121 Updated Mar 30, 2025

NoSQL data store using the Seastar framework, compatible with Apache Cassandra and Amazon DynamoDB

C++ 14,273 1,350 Updated Mar 30, 2025

Tensor library for machine learning

C++ 12,208 1,191 Updated Mar 29, 2025

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 11,393 2,172 Updated Mar 11, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,391 812 Updated Mar 1, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 10,040 1,270 Updated Mar 29, 2025

WasmEdge is a lightweight, high-performance, and extensible WebAssembly runtime for cloud native, edge, and decentralized applications. It powers serverless apps, embedded functions, microservices,…

C++ 9,092 806 Updated Mar 29, 2025

OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

C++ 8,062 2,509 Updated Mar 30, 2025

Lightweight, standalone C++ inference engine for Google's Gemma models.

C++ 6,322 533 Updated Mar 28, 2025

A flexible, high-performance serving system for machine learning models

C++ 6,255 2,200 Updated Mar 19, 2025

Transformer-related optimization, including BERT and GPT

C++ 6,097 900 Updated Mar 27, 2024

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

C++ 5,336 630 Updated Mar 25, 2025

MACE is a deep learning inference framework optimized for mobile heterogeneous computing platforms.

C++ 5,006 824 Updated Jun 17, 2024

Tengine is a lightweight, high-performance, modular inference engine for embedded devices

C++ 4,453 971 Updated Mar 6, 2025

Stable Diffusion and Flux in pure C/C++

C++ 3,957 359 Updated Mar 9, 2025

Fast inference engine for Transformer models

C++ 3,711 342 Updated Mar 28, 2025

ONNX-TensorRT: TensorRT backend for ONNX

C++ 3,051 546 Updated Mar 7, 2025

The Lobster Programming Language

C++ 2,398 126 Updated Mar 25, 2025

An Open Source Machine Learning Framework for Everyone

C++ 1,117 166 Updated Sep 25, 2024

Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure

C++ 833 337 Updated Mar 27, 2025

ONNX Model Exporter for PaddlePaddle

C++ 787 175 Updated Mar 24, 2025

llama.cpp Go bindings

C++ 753 87 Updated Mar 21, 2025

ONNX Optimizer

C++ 687 92 Updated Mar 15, 2025

Vendor-independent TinyML deep learning library, compiler, and inference framework for microcomputers and microcontrollers

C++ 588 88 Updated Oct 29, 2022

Source code for 'Design Patterns in Modern C++' by Dmitri Nesteruk

C++ 552 195 Updated Feb 5, 2023