Building on poprox-recommender#

Warning

This guide is still under development. It is possible that essential steps have been omitted. Use at your own risk!

Prerequisites#

The POPROX recommender code has several software dependencies, as well as requirements for data storage, deployment, and hardware.

Software Dependencies#

To begin working with the code, you need the following:

  • git

  • Pixi, our dependency manager

All other dependencies (including DVC) are specified in our Pixi dependency file.

Once you have Pixi installed and the repository cloned, you can start a shell with access to the development dependencies:

pixi shell -e dev

Data Dependencies#

To build and evaluate a recommender, you need a repository to store the data (training data, checkpoints, and output files) and share them with your team. We ship the code with read-only access to a repository we provide with MIND outputs, but you will need a repository for your own outputs (unless you will only be working on a single machine and not deploying).

Any DVC remote type can be used for this repository; the easiest is an S3 bucket. Once you have created your bucket, you can add it as the default DVC remote (run these commands from within a Pixi shell, as noted above):

dvc remote add cloud s3://my-bucket-name
dvc config core.remote cloud

If you have the AWS CLI installed and log in with aws sso login, DVC will use that authentication session automatically. If you want to use access keys instead, you can store your AWS credentials in ~/.aws/credentials or environment variables, or put them in the file .dvc/config.local.

.dvc/config.local example

.dvc/config.local is a local (not shared via Git) configuration file for DVC. You can use it to store credentials (shared access key):

['remote "cloud"']
access_key_id = "<ACCESS KEY ID>"
secret_access_key = "<SECRET ACCESS KEY>"
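You can also supply the same credentials through the standard AWS environment variables instead of a config file. The values below are placeholders; substitute your real keys:

```shell
# Standard AWS credential environment variables; replace the placeholder
# values with your real keys (and never commit them to Git).
export AWS_ACCESS_KEY_ID="<ACCESS KEY ID>"
export AWS_SECRET_ACCESS_KEY="<SECRET ACCESS KEY>"
```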

Deployment Requirements#

To deploy your recommender using our template, you will need AWS credentials capable of creating lambdas, containers, and CloudFormation deployments, and accessing S3 (if that is where you have your repository stored); see Deploying the Recommender below.

Deployment is handled via Serverless v3, and is automated by the deploy.sh script and the .github/workflows/deploy.yml GitHub Actions workflow.

Hardware Requirements#

For working on the code and evaluating recommender outputs, the hardware requirements are relatively modest: a reasonable laptop with sufficient CPU, memory, etc. for software development and basic Python analytics computing. The software dependencies and data take 2–5 GB (5–10 GB on Linux, due to the CUDA-based components).

For batch-generating recommendations over the test data, a GPU is very helpful; both A40 and A4000 GPUs significantly accelerate this process.

Getting Started#

  1. Fork the poprox-recommender repository into your personal or organizational account. If you do not want to make your customizations public yet, fork it as a private repository.

  2. Install the Software Dependencies above.

  3. Clone your fork of the repository with git clone (or gh repo clone).

  4. Start a Pixi shell:

    pixi shell -e dev
    

    Note

    pixi shell starts a new shell with the specified environment active and on its $PATH. This is a good way to use the repository and its dependencies for development and testing. You can also run individual commands within dev (or any other environment) with pixi run:

    pixi run -e dev dvc pull
    
  5. Obtain a copy of the MIND data set, specifically the Validation and Test sets. Save these files in the data directory of the repository.

  6. Obtain our public data, model checkpoints, and evaluation results (in a pixi shell):

    dvc pull -r public
    

Repository Layout#

The recommender repo is organized into several directories for ease of navigation and modification:

src/

Contains the source code for the recommender pipeline components, evaluation logic, and other supporting code.

data/

Data used to evaluate (or train, if training is integrated into the repository) the recommender pipeline and components.

models/

Model checkpoints; this includes both pre-trained third-party models from sources like HuggingFace, and checkpoints for custom models. Training for those checkpoints can be integrated into the poprox-recommender repo and automated with DVC, or it can be done in a separate project or repository and the checkpoint files copied to this directory.
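If you do integrate training, a DVC stage can produce the checkpoint files so that dvc repro keeps them up to date. A minimal dvc.yaml sketch, assuming a hypothetical scripts/train.py and example paths (adjust to your project):

```yaml
stages:
  train:
    cmd: python scripts/train.py        # hypothetical training script
    deps:
      - data/train
    outs:
      - models/my-model/                # checkpoint directory tracked by DVC
```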

tests/

Test suite for the code in src/.

outputs/

Evaluation outputs.

Running the Evaluation#

You can re-run our evaluations with DVC:

dvc repro

This will ensure that the entire chain of generating recommendations and measuring them against the test data is up-to-date with the current files and code. As you add new configurations to test, you can connect them into the evaluation pipeline to test them reproducibly.

Tip

If you are using a CUDA-enabled Linux system, you can use the eval-cuda or dev-cuda Pixi environment, and set the environment variable POPROX_REC_DEVICE=cuda to use your GPU for batch inference.

Writing Components#

Todo

Document how to write new components.

The pipeline documentation describes how the POPROX recommendation pipelines are configured. To write new recommendation logic for POPROX, you will create or modify components to fit into these pipelines.
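The exact component interface is defined by the pipeline documentation and the code in src/. As a purely hypothetical sketch (the Article class and rank_by_recency function below are invented for illustration and are not the real POPROX API), a component is essentially a callable that turns candidate articles into a ranked recommendation list:

```python
# Hypothetical sketch of a ranking component; names are illustrative only.
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Article:
    article_id: str
    published: datetime


def rank_by_recency(candidates: list[Article], k: int = 10) -> list[Article]:
    """Toy component: recommend the k most recently published articles."""
    return sorted(candidates, key=lambda a: a.published, reverse=True)[:k]
```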

Adding Dependencies#

Todo

Document how to add dependencies.

Seeing Results#

For offline evaluation, you can view the current set of metrics with dvc:

dvc metrics show

The outputs are in outputs/:

  • mind-val-metrics.csv contains the summary metrics (the same values dvc metrics show displays), one value per algorithm

  • mind-val-user-metrics.csv.gz contains user-level metrics for more in-depth analysis (e.g. variance or statistical inference)

  • mind-val-recommendations.parquet contains all of the recommendation lists produced, from multiple pipeline stages (e.g. both top-K and final reranked or sampled recommendations).

dvc repro reproduces or updates these files.
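As an illustration of analyzing the user-level file, the sketch below aggregates a tiny stand-in DataFrame with pandas. The column names recommender and NDCG@10 are assumptions for the example; check the real output files for the actual schema:

```python
# Illustrative only: a tiny stand-in for mind-val-user-metrics.csv.gz.
import pandas as pd

user_metrics = pd.DataFrame({
    "recommender": ["nrms", "nrms", "random", "random"],
    "NDCG@10": [0.32, 0.28, 0.11, 0.09],
})

# Aggregate user-level metrics to one row per algorithm (as in the
# summary file), plus a variance estimate for statistical inference.
summary = user_metrics.groupby("recommender")["NDCG@10"].agg(["mean", "var"])
print(summary)
```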

Testing Code#

The POPROX recommender code includes a range of unit and integration tests to ensure that the code is functional and deployable. Most of these tests are run with pytest.

Some of the integration tests depend on serverless, which is installed via npm:

npm ci

You can run the tests with pytest:

pytest tests

Note

Currently, the integration tests only fully work on macOS and Linux. Some tests will be skipped on Windows.

We strongly encourage you to write tests for your own components.
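Component tests follow the usual pytest conventions. A minimal hypothetical example (top_k here is a stand-in for whatever component you write in src/, not a real POPROX function):

```python
# Example pytest-style unit test for a hypothetical ranking helper.
def top_k(scores: dict[str, float], k: int) -> list[str]:
    """Return the ids of the k highest-scoring items."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def test_top_k_orders_by_score():
    scores = {"a": 0.1, "b": 0.9, "c": 0.5}
    assert top_k(scores, 2) == ["b", "c"]
```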

Continuous Integration Testing#

Our repository is configured with GitHub Actions to test the code:

  • Run the unit tests.

  • Run the integration tests (with serverless).

  • Run an integration test for deployment with the Docker image. This creates a Docker image with the recommender code and checkpoints, as it would be deployed to AWS Lambda, and tests that this image runs and correctly returns recommendation results.

This continuous integration depends on access to the DVC repository.

Todo

Document how to set up these credentials.

Deploying the Recommender#

To deploy manually, log in to the AWS CLI (aws sso login) with a user who has access to read from your DVC repo and to create container images, lambda functions, and CloudFormation deployments. Then run:

./deploy.sh

Automatic deployment from GitHub Actions is also possible.

Todo

Document automatic deployment.