Sunday, 5 November 2023

PrivateGPT Installation Notes

These notes work as of 7 November 2023 on Xubuntu 22.04; your mileage may vary.

PrivateGPT

PrivateGPT is a production-ready AI project that allows you to ask questions about your documents using the power of Large Language Models (LLMs), even in scenarios without an Internet connection. 100% private, no data leaves your execution environment at any point.

Repo

https://github.com/imartinez/privateGPT

Docs

https://docs.privategpt.dev

Install

https://docs.privategpt.dev/#section/Installation-and-Settings

Install git

sudo apt install git

Install python

sudo apt install python3

Install pip

sudo apt install python3-pip

Install pyenv

cd ~
curl https://pyenv.run | bash

Add the commands to ~/.bashrc by running the following in your terminal:

echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init -)"' >> ~/.bashrc

If you have ~/.profile, ~/.bash_profile or ~/.bash_login, add the commands there as well. If you have none of these, add them to ~/.profile:

echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.profile
echo 'command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.profile
echo 'eval "$(pyenv init -)"' >> ~/.profile

Restart your shell for the changes to take effect.
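Alternatively, reload the shell in place and check pyenv is on the PATH:

exec "$SHELL"
pyenv --version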

Install Python 3.11

pyenv install 3.11
pyenv local 3.11

If you see these errors and warnings, install the required dependencies:

ModuleNotFoundError: No module named '_bz2'
WARNING: The Python bz2 extension was not compiled. Missing the bzip2 lib?

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/adrian/.pyenv/versions/3.11.6/lib/python3.11/curses/__init__.py", line 13, in <module>
    from _curses import *
ModuleNotFoundError: No module named '_curses'
WARNING: The Python curses extension was not compiled. Missing the ncurses lib?

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/adrian/.pyenv/versions/3.11.6/lib/python3.11/ctypes/__init__.py", line 8, in <module>
    from _ctypes import Union, Structure, Array
ModuleNotFoundError: No module named '_ctypes'
WARNING: The Python ctypes extension was not compiled. Missing the libffi lib?

Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'readline'
WARNING: The Python readline extension was not compiled. Missing the GNU readline lib?

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/adrian/.pyenv/versions/3.11.6/lib/python3.11/ssl.py", line 100, in <module>
    import _ssl             # if we can't import it, let the error propagate
    ^^^^^^^^^^^
ModuleNotFoundError: No module named '_ssl'
ERROR: The Python ssl extension was not compiled. Missing the OpenSSL lib?

ModuleNotFoundError: No module named '_sqlite3'
WARNING: The Python sqlite3 extension was not compiled. Missing the SQLite3 lib?

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/adrian/.pyenv/versions/3.11.6/lib/python3.11/tkinter/__init__.py", line 38, in <module>
    import _tkinter # If this fails your Python may not be configured for Tk
    ^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named '_tkinter'
WARNING: The Python tkinter extension was not compiled and GUI subsystem has been detected. Missing the Tk toolkit?

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/adrian/.pyenv/versions/3.11.6/lib/python3.11/lzma.py", line 27, in <module>
    from _lzma import *
ModuleNotFoundError: No module named '_lzma'
WARNING: The Python lzma extension was not compiled. Missing the lzma lib?

Install dependencies:

sudo apt update
sudo apt install libbz2-dev
sudo apt install libncurses-dev
sudo apt install libffi-dev
sudo apt install libreadline-dev
sudo apt install libssl-dev
sudo apt install libsqlite3-dev
sudo apt install tk-dev
sudo apt install liblzma-dev

Try installing Python 3.11 again:

pyenv install 3.11
pyenv local 3.11
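To confirm the optional extensions compiled this time, a quick sanity check is to import them all from the freshly built interpreter (run this in the directory where you just ran pyenv local):

python -c 'import bz2, curses, ctypes, readline, ssl, sqlite3, tkinter, lzma; print("all extensions OK")'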

Install pipx

python3 -m pip install --user pipx
python3 -m pipx ensurepath

Restart your shell for the changes to take effect.

Install poetry

pipx install poetry
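Both tools should now be on the PATH:

pipx --version
poetry --version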

Clone the privateGPT repo

cd ~
git clone https://github.com/imartinez/privateGPT
cd privateGPT

Install dependencies

poetry install --with ui,local

Download Embedding and LLM models

poetry run python scripts/setup

Run the local server

PGPT_PROFILES=local make run
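At the time of writing, the Makefile's run target is just a thin wrapper around the Python module, so if make isn't available this should be equivalent:

PGPT_PROFILES=local poetry run python -m private_gpt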

Navigate to the UI

http://localhost:8001/

Shutdown

ctrl-c

GPU Acceleration

Verify the machine has a CUDA-Capable GPU

lspci | grep -i nvidia

Install the NVIDIA CUDA Toolkit

sudo apt update
sudo apt upgrade
sudo apt install nvidia-cuda-toolkit

Verify installation

nvcc --version
nvidia-smi

Install llama.cpp with GPU support

Find your version of llama_cpp_python:

poetry run pip list | grep llama_cpp_python
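The output will be something like this (your version will likely differ):

llama_cpp_python              0.2.13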

Substitute your version in the next command:

cd ~/privateGPT
CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==0.2.13

If you see an error like this, try specifying the location of nvcc:

Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [35 lines of output]
      *** scikit-build-core 0.6.0 using CMake 3.27.7 (wheel)
      *** Configuring CMake...
      loading initial cache file /tmp/tmp591ifmq4/build/CMakeInit.txt
      -- The C compiler identification is GNU 11.4.0
      -- The CXX compiler identification is GNU 11.4.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /usr/bin/cc - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /usr/bin/c++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
      -- Found Threads: TRUE
      -- Found CUDAToolkit: /usr/local/cuda/include (found version "12.3.52")
      -- cuBLAS found
      -- The CUDA compiler identification is unknown
      CMake Error at /tmp/pip-build-env-h3vy91ne/normal/lib/python3.11/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCUDACompiler.cmake:603 (message):
        Failed to detect a default CUDA architecture.
      
      
      
        Compiler output:
      
      Call Stack (most recent call first):
        vendor/llama.cpp/CMakeLists.txt:258 (enable_language)
      
      
      -- Configuring incomplete, errors occurred!
      
      *** CMake configuration failed
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

Build with the location of nvcc:

CUDACXX=/usr/local/cuda-12/bin/nvcc CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==0.2.13

Start the server

cd ~/privateGPT
pyenv local 3.11
PGPT_PROFILES=local make run

If you see this error, configure the number of layers offloaded to VRAM:

CUDA error 2 at /tmp/pip-install-pqg0kmzj/llama-cpp-python_a94e4e69cdce4224adec44b01749f74a/vendor/llama.cpp/ggml-cuda.cu:7636: out of memory
current device: 0
make: *** [Makefile:36: run] Error 1

Configure the number of layers offloaded to VRAM:

cp ~/privateGPT/private_gpt/components/llm/llm_component.py ~/privateGPT/private_gpt/components/llm/llm_component.py.backup
vim ~/privateGPT/private_gpt/components/llm/llm_component.py

change:

model_kwargs={"n_gpu_layers": -1},

to:

model_kwargs={"n_gpu_layers": 10},

Try to start the server again:

cd ~/privateGPT
pyenv local 3.11
PGPT_PROFILES=local make run

If the server is using the GPU you will see something like this in the output:

...
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA RTX A1000 Laptop GPU, compute capability 8.6
...
llm_load_tensors: ggml ctx size =    0.11 MB
llm_load_tensors: using CUDA for GPU acceleration
llm_load_tensors: mem required  = 2902.35 MB
llm_load_tensors: offloading 10 repeating layers to GPU
llm_load_tensors: offloaded 10/35 layers to GPU
llm_load_tensors: VRAM used: 1263.12 MB
...............................................................................................
llama_new_context_with_model: n_ctx      = 3900
llama_new_context_with_model: freq_base  = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_new_context_with_model: kv self size  =  487.50 MB
llama_build_graph: non-view tensors processed: 740/740
llama_new_context_with_model: compute buffer total size = 282.00 MB
llama_new_context_with_model: VRAM scratch buffer: 275.37 MB
llama_new_context_with_model: total VRAM used: 1538.50 MB (model: 1263.12 MB, context: 275.37 MB)
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 
...
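A rough way to size n_gpu_layers from that output: 10 of 35 layers took 1263.12 MB, i.e. about 126 MB per layer for this model, plus a 275.37 MB VRAM scratch buffer for the context. So, for example, a card with 4 GB free could in theory take around (4096 - 275) / 126 ≈ 30 layers; start lower and work up until you hit the out-of-memory error again.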

Ingest

For example, to download and ingest an HTML copy of A Little Riak Book:

cd ~/privateGPT
mkdir ${PWD}/ingest
wget -P ${PWD}/ingest https://raw.githubusercontent.com/basho-labs/little_riak_book/master/rendered/riaklil-en.html
poetry run python scripts/ingest_folder.py ${PWD}/ingest

Configure Temperature

cp ~/privateGPT/private_gpt/components/llm/llm_component.py ~/privateGPT/private_gpt/components/llm/llm_component.py.backup
vim ~/privateGPT/private_gpt/components/llm/llm_component.py

change:

temperature=0.1

to:

temperature=0.2

Restart the server

ctrl-c
cd ~/privateGPT
pyenv local 3.11
PGPT_PROFILES=local make run

Sunday, 10 September 2023

Riak-like secondary index queries for S3

This is an idea for how to provide secondary index queries, similar to Riak 2i, on top of Amazon S3, using nothing but S3, boto3 and some Python.
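The layout is simple: each record is a normal S3 object carrying its index entries as user metadata, and every index entry is mirrored as an empty object whose key packs the index name, the indexed value and the record key together. Because S3 lists keys in lexicographic order, a range query becomes a bounded listing under the index prefix. Using one of the records from the tests below:

KEY0001                                      <- the record itself; index entries in its metadata
indexes/idx-gender-dob/2|19700101/KEY0001    <- empty index object; value sorts before the key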

This code hasn't been anywhere near a production environment, has never been benchmarked, has only processed trivial amounts of data, and has been tested only against localstack. It's barely commented. As such, it should not be used by anybody for any reason - ever.

If you do give it a try, let me know how it went.

s32i.py

from concurrent.futures.thread import ThreadPoolExecutor
import os
import re

from botocore.exceptions import ClientError


class S32iDatastore:

    # One shared pool for the background index writes; leave a core for the caller.
    __EXECUTOR = ThreadPoolExecutor(max_workers=max(1, os.cpu_count() - 1))

    INDEXES_FOLDER = 'indexes'
    LIST_OBJECTS = 'list_objects_v2'

    def __init__(self, s3_resource, bucket_name):

        self.s3_resource = s3_resource
        self.bucket_name = bucket_name

    def __run_in_thread(self, fn, *args):

        return self.__EXECUTOR.submit(fn, *args)

    def get(self, key):

        record = self.s3_resource.Object(self.bucket_name, key).get()
        indexes = record['Metadata']
        data = record['Body'].read()

        return data, indexes

    def head(self, key):

        record = self.s3_resource.meta.client.head_object(Bucket=self.bucket_name, Key=key)
        return record['Metadata']

    def exists(self, key):

        try:
            self.head(key)
            return True
        except ClientError:
            return False

    def put(self, key, data='', indexes=None):

        indexes = indexes or {}

        # Write the index objects in the background; the record itself is
        # written synchronously below.
        self.__run_in_thread(self.create_secondary_indexes, key, indexes)

        return self.s3_resource.Object(self.bucket_name, key).put(
            Body=data,
            Metadata=indexes)

    def delete(self, key):

        self.__run_in_thread(self.remove_secondary_indexes, key, self.head(key))

        return self.s3_resource.Object(self.bucket_name, key).delete()

    def create_secondary_indexes(self, key, indexes):

        # One empty object per index entry: indexes/<index>/<value>/<key>.
        # Comma-separated values give a record multiple entries in one index.
        for index, values in indexes.items():
            for value in values.split(','):
                self.put(f'{self.INDEXES_FOLDER}/{index}/{value}/{key}')

    def remove_secondary_indexes(self, key, indexes):

        for index, values in indexes.items():
            for value in values.split(','):
                self.s3_resource.Object(self.bucket_name, f'{self.INDEXES_FOLDER}/{index}/{value}/{key}').delete()

    def secondary_index_range_query(self,
                                    index,
                                    start, end=None,
                                    page_size=1000, max_results=10000,
                                    term_regex=None, return_terms=False):

        if end is None:
            end = start

        if term_regex:
            pattern = re.compile(f'^{self.INDEXES_FOLDER}/{index}/{term_regex}$')

        start_key = f'{self.INDEXES_FOLDER}/{index}/{start}'
        end_key = f'{self.INDEXES_FOLDER}/{index}/{end}'

        paginator = self.s3_resource.meta.client.get_paginator(self.LIST_OBJECTS)
        pages = paginator.paginate(
            Bucket=self.bucket_name,
            StartAfter=start_key,
            PaginationConfig={
                'MaxItems': max_results,
                'PageSize': page_size})

        for page in pages:
            for result in page.get('Contents', []):

                result_key = result['Key']

                # Keys come back in lexicographic order, so stop as soon as
                # the current key sorts past the end of the range.
                if result_key[0:len(end_key)] > end_key:
                    return

                if term_regex and not pattern.match(result_key):
                    continue

                parts = result_key.split('/')

                if return_terms:
                    yield (parts[-1], parts[-2])
                else:
                    yield parts[-1]
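A minimal usage sketch, assuming a bucket you can already write to (the bucket name is hypothetical, and endpoint_url is only needed when pointing at localstack):

import boto3

from s32i import S32iDatastore

s3 = boto3.resource('s3', endpoint_url='http://localhost.localstack.cloud:4566')
store = S32iDatastore(s3, 'my-test-bucket')  # hypothetical, pre-existing bucket

# One record with a composite gender|dob index term.
store.put('KEY0001', '{"name": "Alice"}', {'idx-gender-dob': '2|19700101'})

# Every key whose idx-gender-dob term starts with '2|19'.
for key in store.secondary_index_range_query('idx-gender-dob', '2|19'):
    print(key)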

s32i_test.py

import json
import unittest

import boto3

from s32i import S32iDatastore


class S32iDatastoreTest(unittest.TestCase):

    LOCALSTACK_ENDPOINT_URL = "http://localhost.localstack.cloud:4566"
    TEST_BUCKET = 's32idatastore-test-bucket'

    @classmethod
    def setUpClass(cls):

        cls.s3_resource = cls.create_s3_resource()
        cls.bucket = cls.create_bucket(cls.TEST_BUCKET)
        cls.datastore = S32iDatastore(cls.s3_resource, cls.TEST_BUCKET)
        cls.create_test_data()

    @classmethod
    def tearDownClass(cls):

        cls.delete_bucket()

    @classmethod
    def create_s3_resource(cls, endpoint_url=LOCALSTACK_ENDPOINT_URL):

        return boto3.resource(
            's3',
            endpoint_url=endpoint_url)

    @classmethod
    def create_bucket(cls, bucket_name):

        return cls.s3_resource.create_bucket(Bucket=bucket_name)

    @classmethod
    def delete_bucket(cls):

        cls.bucket.objects.all().delete()

    @classmethod
    def create_test_data(cls):

        cls.datastore.put(
            'KEY0001',
            json.dumps({'name': 'Alice', 'dob': '19700101', 'gender': '2'}),
            {'idx-gender-dob': '2|19700101'})

        cls.datastore.put(
            'KEY0002',
            json.dumps({'name': 'Bob', 'dob': '19800101', 'gender': '1'}),
            {'idx-gender-dob': '1|19800101'})

        cls.datastore.put(
            'KEY0003',
            json.dumps({'name': 'Carol', 'dob': '19900101', 'gender': '2'}),
            {'idx-gender-dob': '2|19900101'})

        cls.datastore.put(
            'KEY0004',
            json.dumps({'name': 'Dan', 'dob': '20000101', 'gender': '1'}),
            {'idx-gender-dob': '1|20000101'})

        cls.datastore.put(
            'KEY0005',
            json.dumps({'name': 'Eve', 'dob': '20100101', 'gender': '2'}),
            {'idx-gender-dob': '2|20100101'})

        cls.datastore.put(
            'KEY0006',
            json.dumps({'name': ['Faythe', 'Grace'], 'dob': '20200101', 'gender': '2'}),
            {'idx-gender-dob': '2|20200101', 'idx-name': 'Faythe,Grace'})

        cls.datastore.put('KEY0007', indexes={'idx-same': 'same'})
        cls.datastore.put('KEY0008', indexes={'idx-same': 'same'})
        cls.datastore.put('KEY0009', indexes={'idx-same': 'same'})

        cls.datastore.put(
            'KEY9999',
            json.dumps({'name': 'DELETE ME', 'dob': '99999999', 'gender': '9'}),
            {'idx-gender-dob': '9|99999999'})

    def test_get_record(self):

        data, indexes = self.datastore.get('KEY0001')

        self.assertDictEqual({'name': 'Alice', 'dob': '19700101', 'gender': '2'}, json.loads(data))
        self.assertDictEqual({'idx-gender-dob': '2|19700101'}, indexes)

    def test_head_record(self):

        indexes = self.datastore.head('KEY0002')

        self.assertDictEqual({'idx-gender-dob': '1|19800101'}, indexes)

    def test_2i_no_results(self):

        keys = self.datastore.secondary_index_range_query('idx-gender-dob', '3|30100101')

        self.assertListEqual([], list(keys))

    def test_2i_index_does_not_exist(self):

        keys = self.datastore.secondary_index_range_query('idx-does-not-exist', '3|30100101')

        self.assertListEqual([], list(keys))

    def test_2i_exact_value(self):

        keys = self.datastore.secondary_index_range_query('idx-gender-dob', '2|20100101')

        self.assertListEqual(['KEY0005'], list(keys))

    def test_2i_gender_2(self):

        keys = self.datastore.secondary_index_range_query('idx-gender-dob', '2|')

        self.assertListEqual(['KEY0001', 'KEY0003', 'KEY0005', 'KEY0006'], sorted(list(keys)))

    def test_2i_gender_2_max_results_2(self):

        keys = self.datastore.secondary_index_range_query('idx-gender-dob', '2|', max_results=2)

        self.assertListEqual(['KEY0001', 'KEY0003'], sorted(list(keys)))

    def test_2i_gender_1_dob_19(self):

        keys = self.datastore.secondary_index_range_query('idx-gender-dob', '1|19')

        self.assertListEqual(['KEY0002'], list(keys))

    def test_2i_gender_2_dob_19(self):

        keys = self.datastore.secondary_index_range_query('idx-gender-dob', '2|19')

        self.assertListEqual(['KEY0001', 'KEY0003'], sorted(list(keys)))

    def test_2i_gender_2_dob_1990_2000(self):

        keys = self.datastore.secondary_index_range_query('idx-gender-dob', '2|1990', '2|2000')

        self.assertListEqual(['KEY0003'], list(keys))

    def test_2i_term_regex(self):

        keys = self.datastore.secondary_index_range_query('idx-gender-dob', '1|', '2|', term_regex=r'[12]\|20[12]0.*')

        self.assertListEqual(['KEY0005', 'KEY0006'], list(keys))

    def test_2i_return_terms(self):

        key_terms = self.datastore.secondary_index_range_query(
            'idx-gender-dob', '1|', '2|',
            return_terms=True)

        self.assertListEqual([
            ('KEY0001', '2|19700101'),
            ('KEY0002', '1|19800101'),
            ('KEY0003', '2|19900101'),
            ('KEY0004', '1|20000101'),
            ('KEY0005', '2|20100101'),
            ('KEY0006', '2|20200101')],
            sorted(list(key_terms)))

    def test_2i_term_regex_return_terms(self):

        key_terms = self.datastore.secondary_index_range_query(
            'idx-gender-dob', '1|', '2|',
            term_regex=r'[12]\|20[12]0.*',
            return_terms=True)

        self.assertListEqual([('KEY0005', '2|20100101'), ('KEY0006', '2|20200101')], list(key_terms))

    def test_exists(self):

        self.assertTrue(self.datastore.exists('KEY0001'))
        self.assertFalse(self.datastore.exists('1000YEK'))

    def test_multiple_index_values(self):

        indexes = self.datastore.head('KEY0006')
        self.assertDictEqual({'idx-gender-dob': '2|20200101', 'idx-name': 'Faythe,Grace'}, indexes)

        keys = self.datastore.secondary_index_range_query('idx-name', 'Faythe')
        self.assertListEqual(['KEY0006'], list(keys))

        keys = self.datastore.secondary_index_range_query('idx-name', 'Grace')
        self.assertListEqual(['KEY0006'], list(keys))

    def test_multiple_keys_same_index(self):

        keys = self.datastore.secondary_index_range_query('idx-same', 'same')
        self.assertListEqual(['KEY0007', 'KEY0008', 'KEY0009'], sorted(list(keys)))

    def test_delete(self):

        self.assertTrue(self.datastore.exists('KEY9999'))

        keys = self.datastore.secondary_index_range_query('idx-gender-dob', '9|99999999')
        self.assertListEqual(['KEY9999'], list(keys))

        self.datastore.delete('KEY9999')

        self.assertFalse(self.datastore.exists('KEY9999'))

        keys = self.datastore.secondary_index_range_query('idx-gender-dob', '9|99999999')
        self.assertListEqual([], list(keys))
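To run the tests, start localstack first (the suite targets the default localstack endpoint), then point unittest at the module - assuming the localstack CLI is installed:

localstack start -d
python -m unittest s32i_test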