bias

This page is generated via scout. For now it just shows the project README.

Tools to understand and analyze biases in our ML models. Currently we focus on bias in turn-level predictions from LLMs in project Vanir. To do this, we collect conversations in DiaML format, vary the template values tied to protected attributes, run predictions at each turn, and manually inspect the results for signs of bias.
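The template-variation step can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `$name`-style placeholder syntax, the two-column attribute table, and the helper names `load_attributes` and `expand` are hypothetical and do not reflect the actual DiaML format, the layout of assets/attributes.csv, or this project's code.

```python
import csv
import io
from itertools import product
from string import Template

# Hypothetical attribute table; the real assets/attributes.csv layout may differ.
ATTRIBUTES_CSV = """attribute,value
name,Asha
name,John
city,Mumbai
city,Boston
"""


def load_attributes(text):
    """Group CSV rows into {attribute: [values, ...]}, preserving file order."""
    groups = {}
    for row in csv.DictReader(io.StringIO(text)):
        groups.setdefault(row["attribute"], []).append(row["value"])
    return groups


def expand(turns, attributes):
    """Yield (assignment, filled_turns) for every combination of attribute values."""
    names = list(attributes)
    for combo in product(*(attributes[n] for n in names)):
        assignment = dict(zip(names, combo))
        # safe_substitute leaves unknown placeholders untouched instead of raising.
        filled = [Template(t).safe_substitute(assignment) for t in turns]
        yield assignment, filled


# Hypothetical two-turn conversation with placeholders (not real DiaML).
turns = ["Hi, my name is $name.", "I am calling from $city."]
cases = list(expand(turns, load_attributes(ATTRIBUTES_CSV)))
```

Each element of `cases` pairs one attribute assignment with the conversation turns filled in for it, so predictions for the same conversation can be compared across assignments.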

Here is an example of bias in our development model.

1. Usage

You need access to the njord server to run a test. As of today, you get this by port forwarding from us-production-cluster. Note that you should do this at a time when the server is otherwise unused.

Then run something like the following:

poetry run bias ./assets/conversations/ ./output/ \
    --attributes-csv=assets/attributes.csv --njord-url=http://localhost:8989

The network requests are made sequentially as of now. This is acceptable because neither the available concurrency support nor our current needs go beyond that. The cases generated for bias testing are shuffled, so if you stop a run partway through, the responses collected so far should still be usable for a report.
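To see why shuffling makes an interrupted run useful, here is a minimal sketch of a shuffled, sequential loop. It is not the tool's actual implementation: `run_sequentially` and `fake_predict` are hypothetical names, and the stand-in predictor replaces the real request to the njord server, whose API is not described here.

```python
import random


def run_sequentially(cases, predict, limit=None, seed=None):
    """Shuffle the cases, then call `predict` on them one at a time.

    `limit` simulates stopping the run early; whatever has been collected
    by then is a random subset of all cases rather than the first few in
    file order.
    """
    order = list(cases)
    random.Random(seed).shuffle(order)
    results = []
    for index, case in enumerate(order):
        if limit is not None and index >= limit:
            break
        results.append((case, predict(case)))
    return results


def fake_predict(case):
    """Stand-in for a network call; returns a dummy value for illustration."""
    return {"case": case, "label": "placeholder"}


partial = run_sequentially(range(10), fake_predict, limit=4, seed=0)
full = run_sequentially(range(10), fake_predict, seed=0)
```

Because the order is randomized before the first request, `partial` covers an arbitrary subset of the cases, which is what makes an early report reasonable.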

1.1. Report

After a run, the ./output directory should be populated with njord responses. These can then be loaded in the provided notebook ./notebooks/bias-report.ipynb to generate a report.