Projects
Browse curated open source AI tools

Unstract is an open-source, no-code LLM platform that helps data engineers and developers automate data extraction from unstructured documents. It transforms complex files such as PDFs, emails, and images into structured formats ready for downstream analysis.
Enables automated ETL pipelines that ingest unstructured sources and output structured JSON or database records.
Offers a visual workflow builder for designing LLM-based extraction logic without writing boilerplate code.
Supports native integration with providers like OpenAI and Anthropic, plus local models via Ollama.
Provides pre-built connectors for S3, Google Drive, and SQL databases to streamline data ingestion.
Includes evaluation tools to measure extraction accuracy and optimize prompt performance.
Built on a Python-based architecture utilizing LangChain for orchestration and Pydantic for schema validation.
Deployable as a self-hosted Docker container or via a managed cloud environment.
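To make the structured JSON output and schema validation features above more concrete, here is a minimal sketch of checking one extracted record against a target schema. It is not Unstract's API: the InvoiceRecord schema and the validate_record helper are hypothetical names chosen for illustration, and the sketch uses the standard library's dataclasses in place of Pydantic so that it runs without extra dependencies.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class InvoiceRecord:
    """Hypothetical target schema for one extracted document."""
    vendor: str
    invoice_number: str
    total: float


def validate_record(raw_json: str) -> InvoiceRecord:
    """Parse extractor output and check it against the target schema.

    Hypothetical helper: a stand-in for what a schema library such as
    Pydantic would do, not part of Unstract itself.
    """
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    expected = {f.name: f.type for f in fields(InvoiceRecord)}
    missing = expected.keys() - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")

    clean = {}
    for name, typ in expected.items():
        value = data[name]
        # Accept JSON integers where a float is expected (1200 -> 1200.0),
        # but do not let booleans pass as numbers.
        if typ is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, typ):
            raise ValueError(f"field {name!r} should be {typ.__name__}")
        clean[name] = value
    # Keys outside the schema are ignored rather than rejected.
    return InvoiceRecord(**clean)


if __name__ == "__main__":
    raw = '{"vendor": "Acme", "invoice_number": "INV-1", "total": 1200}'
    print(validate_record(raw))
```

Rejecting malformed records at this point keeps bad rows out of the destination database; a real pipeline would likely log or retry the failed document instead of raising.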
Typical use cases include automating financial report analysis to extract key indicators for investment databases and processing insurance claims to standardize data for automated risk assessment.
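Measuring extraction accuracy, listed among the features, can be pictured with the financial-report use case. The sketch below compares extracted fields against hand-labeled reference values and reports the share that match exactly. The field_accuracy function, the field names, and the sample values are all hypothetical; this is one simple metric under those assumptions, not Unstract's own evaluation tooling.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def field_accuracy(predictions: List[Record], references: List[Record]) -> float:
    """Fraction of reference fields whose extracted value matches exactly.

    Hypothetical metric for illustration; documents are paired by position.
    """
    total = 0
    correct = 0
    for pred, ref in zip(predictions, references):
        for name, expected in ref.items():
            total += 1
            # A missing field counts as a miss.
            if pred.get(name) == expected:
                correct += 1
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Hand-labeled values for two made-up reports.
    references = [
        {"revenue": 1200.0, "net_income": 150.0},
        {"revenue": 980.0, "net_income": 75.0},
    ]
    # Extractor output: one field in the second report is wrong.
    predictions = [
        {"revenue": 1200.0, "net_income": 150.0},
        {"revenue": 980.0, "net_income": 57.0},
    ]
    print(field_accuracy(predictions, references))
```

Running the same comparison before and after a prompt change gives a rough signal of whether the change helped. Exact matching is strict, so numeric tolerances or text normalization may be needed for real documents.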
Deploy Unstract via Docker Compose to start building extraction workflows locally.