Iterate with Experts

The best Data Cards are created over iterations, and with the input of diverse voices. It helps to engage subject matter experts, specialized job roles, and plan at least two rounds of review or feedback to get your Data Card right.

Data Cards are transparency artifacts that emphasize information and context that shape the data, but cannot be inferred from the dataset directly. Framing explanations and rationales about essential transparency topics (such as provenance, representation, usage and fairness) are harder to answer than technical documentation because they can be interpretive and have specific socio-cultural contexts.

Ask for documents.

For historical datasets or large projects which involve multiple stakeholders, answers may be distributed across documents and in the working memory of collaborators – much of which would never have made it to documentation.

Ask your collaborators for answers, and rely on multiple sources of information for the truth.

The Relative Movie Attributes Data Card aggregates content from at least three distinct sources related to the dataset. In this case, the documents have been linked so interested readers can verify the accuracy of content and find out more information as needed.

Don't underestimate the time needed to fill out a Data Card.

Filling out technical fields and summarizing disparate documents into the Data Card can take a few hours. The work of framing, validating, and conducting additional analyses can take much longer.

Identify and involve experts and stakeholders early, and account for a few rounds of feedback.

When working with reviewers, the authors of the More Inclusive Annotated People (MIAP) Data Card found a significant number of unknown attributes, prompting additional analysis. While this was determined to be fairly typical of similar computer vision datasets, authors felt it was important to disclose their learnings. They worked with visualization experts and content writers who helped them translate a large table into a meaningful summary visualization, adding 2+ weeks to their timeline.