Missing Information?

The absence of information is still information, and it can be factored into decisions about the completeness and use of the dataset.

Creating a Data Card can reveal several gaps in knowledge that can be filled over time – so take note of any new questions, intuitions that need to be validated, or uncertainties that need to be explored.

Acknowledge what you don't know about your dataset.

Sometimes a natural response to not being able to reach agreement on an analysis or the framing of a response is to omit it – but this can often raise more questions, and reduce the generalizability of your Data Card.

Instead, clearly let readers know if the answer to a question in the Data Card is unavailable for public consumption or if it’s an unknown-unknown. Refer to the uncertainty table to find alternative ways of framing your responses.

Authors of the Conversational Weather Data Card in the GEM Benchmark dataset collection are unsure if there are any documented social biases in their dataset. This is further unpacked in their response about the language representation of the data producers.

Don't hit the delete button too quickly.

New information may become available as you complete your Data Card, and aligning templates post-hoc can result in fragmented Data Cards. Don’t delete empty sections or unanswered questions from your Data Card until your Data Card is entirely ready for publication.

Instead, when you make your Data Card available to readers, include a note on which sections were left unfilled or deleted, and why. Where applicable, use these notes to inform future work.

The Turku Hockey Data2Text Data Card in GEM does not provide any information about known technical limitations. However, an empty "block" in their Data Card indicates as much. This signals to potential users of the dataset that they may need to run additional analyses to gauge the technical suitability of this dataset for their purposes.
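
If a Data Card is kept in a structured form, the note on unfilled sections can be generated rather than written by hand. The sketch below is only an illustration under stated assumptions: the dictionary layout, the function name unfilled_sections_note, and the section names and reasons are all hypothetical, and none of it is part of the Data Cards templates or of GEM.

```python
from datetime import date


def unfilled_sections_note(card, reasons, as_of):
    """Build a reader-facing note listing sections left empty, and why.

    card    -- hypothetical mapping of section title to answer text (or None)
    reasons -- hypothetical mapping of section title to the reason it is empty
    as_of   -- date the Data Card is being made available
    """
    lines = []
    for section, answer in card.items():
        # Treat None and whitespace-only answers as "not filled".
        if answer is None or not answer.strip():
            reason = reasons.get(section, "reason not recorded")
            lines.append(f"- {section}: not filled ({reason})")
    if not lines:
        return f"As of {as_of.isoformat()}, all sections are filled."
    header = f"Sections not filled as of {as_of.isoformat()}:"
    return "\n".join([header, *lines])


# Illustrative data only; these answers are invented for the example.
card = {
    "Known technical limitations": "",
    "Social biases": "Unknown: no analysis has been run yet.",
}
reasons = {"Known technical limitations": "analysis pending"}

note = unfilled_sections_note(card, reasons, date(2024, 1, 15))
print(note)
```

An answer that explicitly states an unknown, as in the second section here, counts as filled: the point is to flag silence, not acknowledged uncertainty.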