dissorial / doc-chatbot

Document chatbot — multiple files, topics, chat windows and chat history. Powered by GPT.

chat typescript reactjs mongoose nextjs chatbot openai vectorization pinecone document-embedding tailwindcss pdf-processing gpt-3 openai-api gpt-4 langchain

Updated Jul 21, 2023
TypeScript

allenai / papermage

library supporting NLP and CV research on scientific papers

python machine-learning natural-language-processing computer-vision scientific-papers multimodal pdf-processing

Updated Nov 8, 2024
Python

ahmedkhemiri95 / PDFs-TextExtract

Text extraction from multiple and large PDF documents.

python pdf parser data-science pdf-document text-analytics pdfs pypdf2 extract-text pdfminer pdf-processing pdfs-textextract

Updated Feb 2, 2024
Python

aws-samples / document-processing-pipeline-for-regulated-industries

A boilerplate solution for processing image and PDF documents for regulated industries, with lineage and pipeline operations metadata services.

Updated Oct 25, 2021
Python

Govind-S-B / pdf-to-text-chroma-search

Python scripts that convert PDF files to text, split them into chunks, and store their vector representations using GPT4All embeddings in a Chroma DB. A script is also provided to query the Chroma DB for similarity search based on user input.

text-extraction similarity-search pdf-processing vector-embeddings chromadb

Updated Oct 23, 2023
Python
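
The chunk-and-search flow described for this repository can be sketched roughly as follows. This is a minimal illustration only, not the repository's code: it assumes the PDF text has already been extracted, and it swaps in a plain bag-of-words vector and an in-memory list for the GPT4All embeddings and the Chroma DB. The names `split_into_chunks`, `embed`, `cosine`, and `search` are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=40, overlap=10):
    """Split text into chunks of `chunk_size` words that overlap by `overlap` words."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Stand-in embedding (assumption): a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(index, query, k=2):
    """Return the k chunks whose vectors are most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


# Build a tiny in-memory "vector store" as a list of (chunk, vector) pairs.
chunks = ["invoices list totals and tax", "cats sleep on warm sofas"]
index = [(chunk, embed(chunk)) for chunk in chunks]
top = search(index, "tax totals", k=1)
```

A real pipeline would replace `embed` with a learned embedding model and the list with a persistent vector database; the overlap between chunks is there so that a sentence cut at a chunk boundary still appears whole in one of its neighbors.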

ManasMadan / pdf-actions

An NPM package built on top of pdf-lib that provides functionalities like merge, rotate, split, download PDF to disk, and many more.

react javascript pdf npm reactjs react-component pdf-merge pdf-split pdf-rotate pdf-merger pdf-downloader pdf-lib pdf-splitter pdf-processing pdf-download pdf-free pdf-online

Updated Oct 31, 2023
JavaScript

ManasMadan / PDFActions

Built with the pdf-actions NPM package.

react pdf reactjs react-component react-components pdf-merge pdf-split pdf-rotate pdf-merger pdf-downloader pdf-lib pdf-splitter pdf-processing pdf-download

Updated May 27, 2024
JavaScript

ranguy9304 / LangGraphRAG

LangGraphRAG: A terminal-based Retrieval-Augmented Generation system using LangGraph. Features include message history caching, query transformation, and vector database retrieval. Ideal for NLP researchers and developers working on advanced conversational AI and information retrieval systems.

python natural-language-processing information-retrieval chatbot web-scraping nlp-machine-learning rag terminal-application pdf-processing vector-database openai-api langgraph

Updated Jul 13, 2024
Python

Inc44 / MaTools

An all-in-one GUI management toolkit built with PyQt6, offering a suite of tools for file synchronization, media organization, PDF merging, code formatting, and more.

python rust productivity application gui qt ocr image-processing video-processing speech-recognition youtube-downloader file-management audio-processing pdf-processing code-formatting

Updated Nov 16, 2024
Python

Yardenrsk / PsychometryReceiverCV

A side project for easily collecting and annotating questions and answers for the PsychometryBot project DB using computer vision and PDF parsing.

pandas opencv-python pdf-processing

Updated Sep 18, 2022
Python

Aleptonic / PdfSnipper

PdfSnipper is a lightweight and efficient Python package designed to simplify the management of PDF files, pages, and their conversions during various NLP, Computer Vision (CV), or other data processing tasks. The package eliminates the need for repetitive code by providing intuitive, ready-to-use functions for common PDF-related operations.

utilities pdf-processing nlp-tools

Updated Feb 3, 2025
Python

thinhuos0913 / python_useful_mini_projects

A collection of useful mini projects that I worked on while teaching myself Python programming.

python opencv ocr image-processing pdf-processing

Updated May 20, 2024
Python

Al-shwaib / Book-Preparation-for-Printing

A web application for preparing books and magazines for offset printing. Automatically arranges PDF pages for commercial A3 printing, supporting both Arabic (RTL) and English (LTR) books.

flask-application pymupdf pdf-processing rtl-support offset-printing book-preparation arabic-books commercial-printing a3-printing order-to-print

Updated Jan 6, 2025
Python
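
To illustrate what "arranging pages for printing" can involve, here is a minimal sketch of saddle-stitch booklet imposition. It is an assumption about the general technique, not this repository's code: it places two pages on each side of a sheet, pads the page count to a multiple of four with blanks (`None`), and mirrors each pair for right-to-left books. The function name `booklet_order` and its parameters are hypothetical.

```python
def booklet_order(page_count, rtl=False):
    """Return one (front, back) tuple per sheet for a saddle-stitched booklet.

    Each side is a (left, right) pair of 1-based page numbers; None marks a
    blank page added as padding. Assumes two pages per sheet side.
    """
    padded = -(-page_count // 4) * 4  # round up to a multiple of 4

    def page(n):
        return n if n <= page_count else None

    sheets = []
    for i in range(padded // 4):
        # Outermost sheet first: last page sits next to the first page.
        front = (page(padded - 2 * i), page(2 * i + 1))
        back = (page(2 * i + 2), page(padded - 2 * i - 1))
        if rtl:
            # Right-to-left books bind on the right, so mirror each side.
            front, back = front[::-1], back[::-1]
        sheets.append((front, back))
    return sheets


ltr = booklet_order(8)
rtl = booklet_order(8, rtl=True)
```

For an 8-page left-to-right book this yields `[((8, 1), (2, 7)), ((6, 3), (4, 5))]`. Commercial layouts on large sheets may place more pages per side, so treat this as the simplest case only.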

arsath-eng / RAG1-NVIDIA-GENAI

A powerful Retrieval Augmented Generation (RAG) application built with NVIDIA AI endpoints and Streamlit. This solution enables intelligent document analysis and question-answering using state-of-the-art language models, featuring multi-PDF processing, FAISS vector store integration, and advanced prompt engineering.

embeddings question-answering document-analysis faiss rag pdf-processing streamlit llm langchain vector-store nvidia-ai-faundry llama-models

Updated Oct 31, 2024
Python

dsckiet / covid-tracker-android-app

A statistical data display and notifier app for the COVID-19 pandemic.

statistics mvvm dagger2 pdf-processing

Updated May 15, 2022
Kotlin

ydvrahul19 / Invoice-Manager

A modern, intelligent invoice processing system with advanced multi-format data extraction capabilities. Process invoices from PDFs, Excel files, and images with smart data recognition.

react firebase material-ui data-extraction invoice-management pdf-processing framer-motion redux-toolkit invoice-processing

Updated Nov 23, 2024
JavaScript

Farhaj499 / RAG_with_Weaviate_DB

This project implements a Retrieval Augmented Generation (RAG) system that answers questions based on a PDF document. It uses Weaviate as a vector database for efficient retrieval of relevant information and Gemini to generate natural language responses.

python embeddings semantic-search rag weaviate pdf-processing vector-database huggingface-transformers langchain retrieval-augmented-generation agentic-ai

Updated Jan 12, 2025
Jupyter Notebook

9-5 / Chromium-Intelligence

A powerful Chromium extension that leverages multiple AI APIs to assist with various text operations, image analysis, and PDF processing.

chrome-extension productivity natural-language-processing text-summarization text-processing image-analysis browser-automation content-analysis proofreading gemini-api browser-tools pdf-processing ai-assistant manifest-v3 custom-prompts tone-adjustment

Updated Jan 11, 2025
JavaScript

akshatpunia26 / berrylit_pdf_chat

Berrylit is a simple chatbot interface that allows users to upload a PDF file and ask a question related to its contents. The chatbot uses the Berri API for processing.

python api natural-language-processing chatbot pdf-processing streamlit

Updated Jun 26, 2023
Python

mohamedelareeg / ImageAutomaticCroppingWatcher

Image Automatic Cropping Watcher: A tool that automatically detects PDF files, converts them to images, corrects perspective distortion, and compiles them back into PDFs.

pdf opencv json ai itextsharp pdf-generation pdf-processing autoskew

Updated Feb 29, 2024
C#

pdf-processing

Here are 58 public repositories matching this topic...
