Fortifying Internal Regulation: Data Collection, Retention, & Usability

Scaling AI Marie Merveilleux du Vignaux

Many individuals work with data each day, but do they spend enough time thinking about the source and function of this data? During a recent Egg On Air Episode, Nicole Alexander, Global Web Strategy Lead at Facebook and Professor of Marketing and Technology at NYU, shared some questions that data users should continuously ask themselves to ensure explainability in their organization.


3 Pillars of Leveraging Data

Let’s start by quickly outlining what it takes to ensure a successful data strategy.

Pillar #1: Internal Regulation — Do you have internal policies about how you acquire, retain, and use data?

Pillar #2: External Regulation — How do you adhere to and oversee federal and state regulation when it comes to data protection laws and privacy?

Pillar #3: Academia — How do you move internal policy and frameworks forward?

Before moving on, it's important to note that you need all of these pillars. They are all interconnected, and organizations should look at them as such. This includes reflecting on whether all the necessary safeguards have been put in place from a regulatory perspective and whether there is enough diversity to really hold a mirror up to the work that is being done.

Looking at one or two pillars doesn't give you the full picture and it doesn't allow you to safeguard the way that you're leveraging data. You could have some of the most amazing internal policies in place, but if you ignore academia and fall behind on new advances, your value to shareholders will decrease. So while reading this blog, remember to think about how these three pillars work together.

A Deep Dive Into Internal Regulation

Every organization should think about having an internal policy about how data is acquired, retained, and used. This can come in the form of some sort of framework on the use of data. It could be a fairness and equitability policy about what data can be used or what third-party data needs to be pulled in to mitigate any bias in the way that you're leveraging your data. It could also include ensuring that there are multiple stakeholders with different backgrounds that are looking at the way data is being leveraged.

In other words, it should not be just the data science team, but also C-level execs, sales people, product development people, etc. It’s crucial to have different perspectives and different backgrounds in the room, because it's not just about people that understand data. It's also about people who understand the customer, the shareholder, and all the other nuances about how that product or service is being developed.

Now let's take a look at what organizations should think about when creating their internal policies.

1. Collection: How are you bringing in data and where is it coming from?

Sometimes, companies receive primary data from their consumers or audience, and other times they get their data from third parties. They can also aggregate data from multiple sources to generate more complete profiles. But the most important point here is to think about what data you actually need to collect in order to reach the outcome you are targeting. The goal is not to collect as much data as possible on the assumption that you might be able to use it in the future.


“My constant recommendation to organizations is to only collect and house the data that you need in order to operate your business.”
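As a rough illustration of collecting only what you need, the sketch below filters an incoming record down to an allowlist of fields before it is stored. The field names and the `minimize_record` helper are hypothetical and chosen only for this example; a real allowlist would come from your own internal policy.

```python
# Hypothetical allowlist: the fields this example business actually needs.
REQUIRED_FIELDS = {"email", "country", "signup_date"}


def minimize_record(record: dict, required_fields: set = REQUIRED_FIELDS) -> dict:
    """Return a copy of the record containing only the allowlisted fields."""
    return {key: value for key, value in record.items() if key in required_fields}


raw = {
    "email": "pat@example.com",
    "country": "FR",
    "signup_date": "2021-03-01",
    "birth_date": "1990-07-14",  # not needed for the targeted outcome
    "device_id": "abc-123",      # not needed for the targeted outcome
}

# Only the allowlisted fields are kept for storage.
stored = minimize_record(raw)
print(stored)
```

Filtering at the point of collection, rather than after storage, means the extra fields never reach your systems in the first place.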

2. Retention: How are you retaining the data? How long are you keeping this information for? And do you need to retain all of the data that you originally collected?

Retention of data is about how you house that data. It’s not just about how long you hold on to it, but also about how you share that data even across your own properties. Are you crossing international lines with data? If so, you want to ensure you are staying on top of external regulation about allowing data to be shared across borders.

“Have that transparency given to your audience and consumers. It's really important that people understand. If they're open and willing to share their information, they should also know how long you're going to use that data for.”
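One way to make a retention policy concrete is to write the retention period for each kind of data down in one place and check stored records against it. The sketch below assumes hypothetical data categories and periods (`marketing_contact`, `support_ticket`) and a hypothetical record layout; the actual durations are a policy and regulatory decision, not something this code can determine.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods per data category (illustrative values only).
RETENTION_PERIODS = {
    "marketing_contact": timedelta(days=365),
    "support_ticket": timedelta(days=730),
}


def is_past_retention(collected_at: datetime, category: str, now: datetime) -> bool:
    """True if the record has been held longer than its category allows."""
    return now - collected_at > RETENTION_PERIODS[category]


def records_to_delete(records: list, now: datetime) -> list:
    """Return the records whose retention period has elapsed."""
    return [
        record
        for record in records
        if is_past_retention(record["collected_at"], record["category"], now)
    ]


# A fixed "now" keeps the example reproducible.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
records = [
    {"id": "a1", "category": "marketing_contact",
     "collected_at": datetime(2022, 6, 1, tzinfo=timezone.utc)},
    {"id": "b2", "category": "support_ticket",
     "collected_at": datetime(2023, 1, 1, tzinfo=timezone.utc)},
]

expired_ids = [record["id"] for record in records_to_delete(records, now)]
print(expired_ids)  # ['a1']
```

Keeping the periods in a single table also makes it easier to tell your audience, in plain terms, how long each kind of data is held.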

3. Usability: How are you using that data? Do you have the right to use the data in all the ways that you want to use it, or are you stretching the confines by using the data in a way different from what it was collected for?

Usability is the meat of how organizations operate. It’s about how you activate the data that you've collected and retained and understanding how you implement this data into the framework of your organization to develop the outputs from a product or service perspective. You have to ensure you are using your data in a responsible way, under an ethical framework.

“Usability is extremely important, because it can make or break an organization.”
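The question of whether you have the right to use data in a given way can be sketched as a simple check of the requested use against the purposes recorded when the data was collected. The `consented_purposes` field, the purpose names, and the `is_use_permitted` helper are assumptions made for this example; they are not taken from any specific regulation or tool.

```python
def is_use_permitted(record: dict, requested_purpose: str) -> bool:
    """True only if the requested purpose was recorded at collection time."""
    return requested_purpose in record.get("consented_purposes", set())


# Hypothetical record: the purposes were captured when the data was collected.
customer = {
    "email": "pat@example.com",
    "consented_purposes": {"order_fulfilment", "service_emails"},
}

print(is_use_permitted(customer, "service_emails"))  # True
print(is_use_permitted(customer, "ad_targeting"))    # False
```

A record with no recorded purposes is treated as not permitted for any use, which is the more cautious default.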

This framework should be consistent within an organization. Writing everything out can help make sure everyone is on the same page and working toward a common goal. This is where explainable AI comes in. Explainable AI benefits collaboration and enables colleagues to pick up where others left off, for example, through clear written explanations, markups, or other sorts of communication ingrained in the process. This way, anyone would know exactly how to move forward with a project, with no black box around how the model works.

Coming Full Circle With Accountability and Transparency

There is a common thread across these three pillars: accountability. Accountability is a measurement that everyone in an organization should constantly be considering.

  • Are we accountable when it comes to the way that we said we will collect data?
  • Are we accountable with the duration that we're going to hold on to the data?
  • Are we accountable with how we're using that data on a day-to-day basis?

Accountability is in your data center and sits with each individual leader within your organization. Right next to it comes transparency. The best, most efficient way to express accountability and transparency is to let stakeholders and consumers know and understand what you want to do with their data.
