bias
Tools to understand and analyze biases in our ML models. Currently we focus on bias in turn-level predictions from LLMs in project Vanir. To do this, we collect conversations in DiaML format, vary the template values around protected attributes, run predictions at each turn, and manually monitor the results for signs of bias.
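The template-variation step can be pictured with a minimal sketch. It assumes, purely for illustration, that a turn is a string with `{placeholder}` slots and that protected attributes are a mapping from attribute name to candidate values; the actual DiaML template syntax and the layout of attributes.csv may differ, and `placeholders` and `expand_turn` are hypothetical helpers, not part of this tool.

```python
import itertools
from string import Formatter


def placeholders(template):
    """Return the sorted placeholder names used in a '{name}'-style template."""
    return sorted({field for _, field, _, _ in Formatter().parse(template) if field})


def expand_turn(template, attributes):
    """Build one variant of the turn per combination of attribute values.

    `attributes` maps an attribute name to its candidate values
    (an assumed structure, not the real attributes.csv schema).
    """
    names = placeholders(template)
    variants = []
    for combo in itertools.product(*(attributes[name] for name in names)):
        assignment = dict(zip(names, combo))
        variants.append((assignment, template.format(**assignment)))
    return variants


# Hypothetical attribute values, for illustration only.
attributes = {"name": ["Asha", "John"], "city": ["Oslo", "Pune"]}
for assignment, text in expand_turn("Hi, I am {name} from {city}.", attributes):
    print(assignment, "->", text)
```

Each variant differs only in the protected attribute values, so any difference in the prediction for the same turn is a candidate sign of bias to review manually.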
Here is an example of bias in our development model.
1. Usage
You need access to the njord server to run a test. As of today, you
would get this by port forwarding from us-production-cluster. Note that you
should only do this at a time when the server is otherwise unused.
Then run something like the following:
poetry run bias ./assets/conversations/ ./output/ --attributes-csv=assets/attributes.csv --njord-url=http://localhost:8989
The network requests are made sequentially as of now. This is acceptable because neither the server's concurrency support nor our needs go beyond that. The cases generated for bias testing are shuffled, so whenever you stop a run you should still be able to produce a report from the responses collected so far.
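The reasoning behind shuffling can be sketched as follows. This is an illustrative outline, not the tool's actual code: `build_cases`, `run_sequentially`, and the `predict` callable are hypothetical names, and the real case structure is not shown in this README.

```python
import itertools
import random


def build_cases(conversations, attribute_values, seed=0):
    """Pair every conversation with every attribute value, then shuffle."""
    cases = list(itertools.product(conversations, attribute_values))
    # A seeded shuffle keeps the order reproducible between runs.
    random.Random(seed).shuffle(cases)
    return cases


def run_sequentially(cases, predict, limit=None):
    """Call `predict` on one case at a time, optionally stopping early."""
    results = []
    for index, case in enumerate(cases):
        if limit is not None and index >= limit:
            break  # simulates interrupting the run part-way through
        results.append((case, predict(case)))
    return results


cases = build_cases(["conv-1", "conv-2", "conv-3"], ["value-a", "value-b"])
# Stand-in for a request to the prediction server.
partial = run_sequentially(cases, predict=lambda case: "response", limit=4)
print(len(partial), "of", len(cases), "cases completed")
```

Because the order is random, a run stopped early holds a random subset of the cases rather than, say, only the first conversation, so a partial report is less likely to be skewed toward one conversation or attribute.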
1.1. Report
After a run, you should see the ./output directory populated with njord
responses. These can then be loaded in the provided notebook,
./notebooks/bias-report.ipynb, to generate a report.