These notes work as of 07/11/2023 using Xubuntu 22.04 - your mileage may vary.
PrivateGPT
PrivateGPT is a production-ready AI project that allows you to ask questions about your documents using the power of Large Language Models (LLMs), even in scenarios without an Internet connection. 100% private, no data leaves your execution environment at any point.
Repo
https://github.com/imartinez/privateGPT
Docs
https://docs.privategpt.dev
Install
https://docs.privategpt.dev/#section/Installation-and-Settings
Install git
sudo apt install git
Install python
sudo apt install python3
Install pip
sudo apt install python3-pip
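Optionally, confirm the toolchain is on your PATH (the version numbers you see will vary):
git --version
python3 --version
pip3 --version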
Install pyenv
cd ~
curl https://pyenv.run | bash
Add the commands to ~/.bashrc by running the following in your terminal:
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init -)"' >> ~/.bashrc
If you have ~/.profile, ~/.bash_profile or ~/.bash_login, add the commands there as well. If you have none of these, add them to ~/.profile:
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.profile
echo 'command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.profile
echo 'eval "$(pyenv init -)"' >> ~/.profile
Restart your shell for the changes to take effect.
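Alternatively, reload the current shell in place:
exec "$SHELL"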
Install Python 3.11
pyenv install 3.11
pyenv local 3.11
If you see these errors and warnings, install the required dependencies:
ModuleNotFoundError: No module named '_bz2'
WARNING: The Python bz2 extension was not compiled. Missing the bzip2 lib?
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/adrian/.pyenv/versions/3.11.6/lib/python3.11/curses/__init__.py", line 13, in <module>
    from _curses import *
ModuleNotFoundError: No module named '_curses'
WARNING: The Python curses extension was not compiled. Missing the ncurses lib?
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/adrian/.pyenv/versions/3.11.6/lib/python3.11/ctypes/__init__.py", line 8, in <module>
    from _ctypes import Union, Structure, Array
ModuleNotFoundError: No module named '_ctypes'
WARNING: The Python ctypes extension was not compiled. Missing the libffi lib?
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'readline'
WARNING: The Python readline extension was not compiled. Missing the GNU readline lib?
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/adrian/.pyenv/versions/3.11.6/lib/python3.11/ssl.py", line 100, in <module>
    import _ssl             # if we can't import it, let the error propagate
    ^^^^^^^^^^^
ModuleNotFoundError: No module named '_ssl'
ERROR: The Python ssl extension was not compiled. Missing the OpenSSL lib?
ModuleNotFoundError: No module named '_sqlite3'
WARNING: The Python sqlite3 extension was not compiled. Missing the SQLite3 lib?
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/adrian/.pyenv/versions/3.11.6/lib/python3.11/tkinter/__init__.py", line 38, in <module>
    import _tkinter # If this fails your Python may not be configured for Tk
    ^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named '_tkinter'
WARNING: The Python tkinter extension was not compiled and GUI subsystem has been detected. Missing the Tk toolkit?
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/adrian/.pyenv/versions/3.11.6/lib/python3.11/lzma.py", line 27, in <module>
    from _lzma import *
ModuleNotFoundError: No module named '_lzma'
WARNING: The Python lzma extension was not compiled. Missing the lzma lib?
Install dependencies:
sudo apt update
sudo apt install libbz2-dev
sudo apt install libncurses-dev
sudo apt install libffi-dev
sudo apt install libreadline-dev
sudo apt install libssl-dev
sudo apt install libsqlite3-dev
sudo apt install tk-dev
sudo apt install liblzma-dev
Try installing Python 3.11 again:
pyenv install 3.11
pyenv local 3.11
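Confirm that pyenv is supplying Python 3.11 in this directory (the exact patch release will vary):
pyenv version
python --version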
Install pipx
python3 -m pip install --user pipx
python3 -m pipx ensurepath
Restart your shell for the changes to take effect.
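Or reload in place, then confirm pipx is on your PATH:
exec "$SHELL"
pipx --version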
Install poetry
pipx install poetry
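Confirm the install:
poetry --version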
Clone the privateGPT repo
cd ~
git clone https://github.com/imartinez/privateGPT
cd privateGPT
Install dependencies
poetry install --with ui,local
Download Embedding and LLM models
poetry run python scripts/setup
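On my install the script saved the embedding and LLM files under the models directory inside the repo; a quick check that the downloads landed (path from my setup, yours may differ):
ls ~/privateGPT/models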
Run the local server
PGPT_PROFILES=local make run
Navigate to the UI
http://localhost:8001/
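If the page doesn't load, you can first check that the server is answering at all (assuming the default port of 8001):
curl -I http://localhost:8001/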
Shutdown
ctrl-c
GPU Acceleration
Verify the machine has a CUDA-Capable GPU
lspci | grep -i nvidia
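If the machine has one, you'll get a line naming the device. On my laptop it looked something like:
01:00.0 3D controller: NVIDIA Corporation GA107GLM [RTX A1000 Laptop GPU] (rev a1)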
Install the NVIDIA CUDA Toolkit
sudo apt update
sudo apt upgrade
sudo apt install nvidia-cuda-toolkit
Verify installation
nvcc --version
nvidia-smi
Install llama.cpp with GPU support
Find your version of llama_cpp_python:
poetry run pip list | grep llama_cpp_python
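The output will look something like this (your version may differ; mine was 0.2.13):
llama_cpp_python          0.2.13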
Substitute your version in the next command:
cd ~/privateGPT
CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==0.2.13
If you see an error like this, try specifying the location of nvcc:
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [35 lines of output]
      *** scikit-build-core 0.6.0 using CMake 3.27.7 (wheel)
      *** Configuring CMake...
      loading initial cache file /tmp/tmp591ifmq4/build/CMakeInit.txt
      -- The C compiler identification is GNU 11.4.0
      -- The CXX compiler identification is GNU 11.4.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /usr/bin/cc - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /usr/bin/c++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
      -- Found Threads: TRUE
      -- Found CUDAToolkit: /usr/local/cuda/include (found version "12.3.52")
      -- cuBLAS found
      -- The CUDA compiler identification is unknown
      CMake Error at /tmp/pip-build-env-h3vy91ne/normal/lib/python3.11/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCUDACompiler.cmake:603 (message):
        Failed to detect a default CUDA architecture.

        Compiler output:

      Call Stack (most recent call first):
        vendor/llama.cpp/CMakeLists.txt:258 (enable_language)

      -- Configuring incomplete, errors occurred!

      *** CMake configuration failed
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
Build with the location of nvcc:
CUDACXX=/usr/local/cuda-12/bin/nvcc CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==0.2.13
Start the server
cd ~/privateGPT
pyenv local 3.11
PGPT_PROFILES=local make run
If you see this error, configure the number of layers offloaded to VRAM:
CUDA error 2 at /tmp/pip-install-pqg0kmzj/llama-cpp-python_a94e4e69cdce4224adec44b01749f74a/vendor/llama.cpp/ggml-cuda.cu:7636: out of memory
current device: 0
make: *** [Makefile:36: run] Error 1
Configure the number of layers offloaded to VRAM:
cp ~/privateGPT/private_gpt/components/llm/llm_component.py ~/privateGPT/private_gpt/components/llm/llm_component.py.backup
vim ~/privateGPT/private_gpt/components/llm/llm_component.py
change:
model_kwargs={"n_gpu_layers": -1},
to:
model_kwargs={"n_gpu_layers": 10},
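For orientation, n_gpu_layers is an argument to the LlamaCPP constructor in that file. The surrounding block looked roughly like this in the revision I cloned (argument names approximate, other lines elided):

self.llm = LlamaCPP(
    model_path=str(models_path / settings.local.llm_hf_model_file),
    temperature=0.1,
    max_new_tokens=settings.llm.max_new_tokens,
    context_window=3900,
    # -1 tries to offload every layer to the GPU;
    # lower it until the model fits in your VRAM
    model_kwargs={"n_gpu_layers": 10},
    ...
)

The change takes effect the next time the server starts.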
Try to start the server again:
cd ~/privateGPT
pyenv local 3.11
PGPT_PROFILES=local make run
If the server is using the GPU you will see something like this in the output:
...
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA RTX A1000 Laptop GPU, compute capability 8.6
...
llm_load_tensors: ggml ctx size =    0.11 MB
llm_load_tensors: using CUDA for GPU acceleration
llm_load_tensors: mem required  = 2902.35 MB
llm_load_tensors: offloading 10 repeating layers to GPU
llm_load_tensors: offloaded 10/35 layers to GPU
llm_load_tensors: VRAM used: 1263.12 MB
...............................................................................................
llama_new_context_with_model: n_ctx      = 3900
llama_new_context_with_model: freq_base  = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_new_context_with_model: kv self size  =  487.50 MB
llama_build_graph: non-view tensors processed: 740/740
llama_new_context_with_model: compute buffer total size = 282.00 MB
llama_new_context_with_model: VRAM scratch buffer: 275.37 MB
llama_new_context_with_model: total VRAM used: 1538.50 MB (model: 1263.12 MB, context: 275.37 MB)
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
...
Ingest
For example, to download and ingest an HTML copy of A Little Riak Book:
cd ~/privateGPT
mkdir ${PWD}/ingest
wget -P ${PWD}/ingest https://raw.githubusercontent.com/basho-labs/little_riak_book/master/rendered/riaklil-en.html
poetry run python scripts/ingest_folder.py ${PWD}/ingest
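If you want to confirm the ingest actually wrote something, the index files ended up under the local_data folder in my install (your layout may differ):
ls ~/privateGPT/local_data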
Configure Temperature
cp ~/privateGPT/private_gpt/components/llm/llm_component.py ~/privateGPT/private_gpt/components/llm/llm_component.py.backup
vim ~/privateGPT/private_gpt/components/llm/llm_component.py
change:
temperature=0.1
to:
temperature=0.2
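A lower temperature keeps answers closer to the retrieved text, while a higher one produces more varied wording. This is an argument to the same LlamaCPP constructor shown earlier, so the server has to be restarted for the change to take effect.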
Restart the server
ctrl+c
cd ~/privateGPT
pyenv local 3.11
PGPT_PROFILES=local make run