# PDF2Text

I've converted 8 GB of PDFs to text in one afternoon on a decade-old x270 using this script. Performant enough, imho. Try getting 8 GB into your LLM and getting it to actually use it. That's the challenge.

## Convert all PDFs to text

This is a script for converting a batch of PDFs to text for machine learning. It has only two dependencies:

- python3
- pdfminer (Python package, specified in `requirements.txt`)

## Installation

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

## Usage

Activate your virtual environment, then run the script on a directory:

```bash
source .venv/bin/activate
./pdf2text [source/destination dir]
```

You read that correctly: the source directory is also the destination directory. The text files are written to the same directory that holds the PDFs.
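For readers curious what such an in-place conversion can look like, here is a minimal sketch. It is not the actual `pdf2text` script. It assumes the pdfminer.six package and its `pdfminer.high_level.extract_text` function. The name `convert_dir`, the recursive search, the skipping of PDFs that already have a `.txt` file, and the skip-on-error handling are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable, List, Optional


def convert_dir(directory, extract: Optional[Callable[[str], str]] = None) -> List[Path]:
    """Write a .txt file next to every PDF found under `directory`.

    Hypothetical helper: the real pdf2text script may behave differently.
    """
    if extract is None:
        # Imported lazily so a stand-in extractor can be passed in instead.
        # Assumes pdfminer.six is installed.
        from pdfminer.high_level import extract_text
        extract = extract_text

    written = []
    # Assumption: search subdirectories too and match the extension
    # case-insensitively.
    for pdf in sorted(Path(directory).rglob("*")):
        if not pdf.is_file() or pdf.suffix.lower() != ".pdf":
            continue
        target = pdf.with_suffix(".txt")  # same directory, same base name
        if target.exists():
            continue  # assumption: do not redo work from an earlier run
        try:
            text = extract(str(pdf))
        except Exception as exc:
            # Assumption: one broken PDF should not stop the whole batch.
            print(f"skipping {pdf}: {exc}")
            continue
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written
```

Calling `convert_dir("path/to/pdfs")` would leave `report.txt` next to `report.pdf`, which matches the "source is also the destination" behaviour described above. Skipping existing `.txt` files makes an interrupted run cheap to resume.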