Pattern

Avoid Ambiguous Use Cases

As a dataset producer, if you are able to take a strong stance on how the dataset should or should not be used in your Data Card, your dataset will feel more controllable.

This, in turn, increases Data Card readers’ confidence in their ability to use your dataset. But use cases are rarely simply suitable or unsuitable. There may be several acceptable use cases of your dataset that come with caveats, and these caveats need to be explained in your Data Card. Give readers ways to navigate ambiguous use cases that may be only conditionally acceptable, so that your Data Card offers the most current and relevant information on your dataset.

Start with use cases to help readers understand the most prevalent risks.

Identify the most important and likely risks across multiple use cases and investigate those first. Provide readers with ways to spot check for undesirable outcomes and conduct more in-depth analyses of performance failures where there is an understanding of how these can negatively impact downstream users and society.

For example, you might report the expected and anomalous behaviors of a benchmark model that reflects the intended uses of your dataset when trained or tested on your dataset.

The More Inclusive Annotated People Data Card contains labels for perceived age and gender, which can be problematic when used incorrectly. The Data Card clearly describes which use cases are unsafe and explains the rationale, so that potential users of the dataset can make responsible decisions about how they use it.
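
As an illustration of the spot checks mentioned above, a Data Card could point readers to a small script along these lines. This is only a minimal sketch: the record fields (`label`, `prediction`, `region`), the function names, and the 0.10 tolerance are hypothetical choices for the example and do not come from any particular dataset or Data Card.

```python
from collections import defaultdict


def error_rate_by_slice(records, slice_key):
    """Return {slice value: error rate} for a list of prediction records."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for record in records:
        group = record[slice_key]
        totals[group] += 1
        if record["prediction"] != record["label"]:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


def flag_slices(rates, tolerance=0.10):
    """Return slices whose error rate exceeds the best slice by more than tolerance."""
    best = min(rates.values())
    return sorted(group for group, rate in rates.items() if rate - best > tolerance)


# Hypothetical benchmark-model outputs on a tiny evaluation sample.
records = [
    {"region": "A", "label": 1, "prediction": 1},
    {"region": "A", "label": 0, "prediction": 0},
    {"region": "B", "label": 1, "prediction": 0},
    {"region": "B", "label": 0, "prediction": 0},
]

rates = error_rate_by_slice(records, "region")
print(rates)               # per-slice error rates
print(flag_slices(rates))  # slices that warrant a closer look
```

A flagged slice is a prompt for the more in-depth analysis described above, not a verdict on its own; the tolerance would need to be chosen for the use case in question.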

Expect dataset users to color outside the lines.

Be aware that readers might use your dataset in ways you haven’t thought of, which could introduce unintended implications for your dataset.

In addition to preventing inappropriate use or misuse of your dataset by clearly stating the intended use, give readers a strategy to institute mitigations (e.g., identify and set failure modes) and set up monitoring systems so that risks and failures can be caught in time.

The WikiDialog Data Card summarizes the expected behaviors of the dataset on a benchmark model (T5-Base DE), compares it to the model that it was designed for (T5-Large DE), and describes the setup. It also describes the known caveats and points readers to a paper that contains extended evaluation results.