1.4 Structure data science team

A vast amount of data has become available and readily accessible for analysis in many companies across different business sectors during the past decade. The size, complexity, and speed of increment of data suddenly beyond the traditional scope of statistical analysis or business intelligence (i.e., BI) reporting. To leverage the big data collected, do you need an internal data science team to be a core competency, or can you outsource it? The answer depends on the problems you want to solve using data. If they are critical to the business, you can’t afford to outsource it. Also, each company has its business context, and it needs new kinds of data as the business grows and uses the results in novel ways. Being a data-driven organization requires cross-organization commitments to identify what data each department needs to collect, establish the infrastructure and process for collecting and maintaining that data, and standardize how to deliver analytical results. Unfortunately, it is unlikely that an off-the-shelf solution will be flexible enough to adapt to the specific business context. In general, most of the companies establish their data science team.

Where should the data science team fit? In general, the data science team is organized in three ways.

A standalone team

Data science is an autonomous unit parallel to the other organizations (such as engineering, product, etc.). The head of data science reports directly to senior leadership, ideally to the CEO or at least someone who understands data strategy and is willing to invest and give it what it needs. The advantages of this type of data organization are:

The data science team has autonomy and is well-positioned to tackle whatever problems it deems important to the company.
It is advantageous for people in the data science team to share knowledge and grow professionally.
It provides a clear career path for data science professionals and shows the company treats data as a first-class asset. So, it tends to attract and retain top talent people.

The biggest concern of this type of organization is the risk of marginalization. Data science only has value if data drives action, which requires collaboration among data scientists, engineers, product managers, and other business stakeholders across the organization. Suppose you have a standalone data science team. It is critical to choose a data science leader who is knowledgeable about the applications of data science in different areas and has strong inter-discipline communication skills. The head of data science needs to build a strong collaboration with other departments.

As companies grow, each department prefers to be self-sufficient and tries to hire its data personal under different titles even when they can get support from the standalone data science team. That is why it is unlikely for an already mature company to have a standalone data science team. If you start your data science team in the early stage as a startup, it is important that the CEO sets a clear vision from the beginning and sends out a strong message to the whole company about accessing data support.

An embedded model

There is still a head of data science, but his/her role is mostly a hiring manager and coach, and he/she may report to a senior manager in the engineering department. The data science team brings in talented people and farms them out to the rest of the company. In other words, it gives up autonomy to ensure utility. The advantages are:

Data science is closer to its applications.
There is still a data science group, so it is easy to share knowledge.
It has high flexibility to allocate data science resources across the company.

However, there are also concerns.

It brings difficulty to the management since the designated team’s lead is not responsible for data science professionals’ growth and happiness. In contrast, data science managers are not directly vested in their work.
Data scientists are second-class citizens everywhere, and it is hard to attract and retain top talent.

Integrated team

There is no data science team. Each team hires its data science people. For example, a marketing analytics group consists of a data engineer, data analyst, and data scientist. The team leader is a marketing manager who has an analytical mind and in-depth business knowledge. The advantages are apparent.

Data science resource aligns with the organization very well.
Data science professionals are first-class members and valued in their team. The manager is responsible for data science professionals’ growth and happiness.
The insights from the data are quickly put into action.

It works well in the short term for both the company and the data science hires. However, there are also many concerns.

It sacrifices data science hires’ professional growth since they work in silos and specialize in a specific application. It is also difficult to share knowledge across different applied areas.
It is harder to move people around since they are highly associated with a specific organization’s specific function.
There is no career path for data science people, and it is difficult to retain talent.

There is no universal answer to the best way to organize the data science team. It depends on the answer to many other questions. How important do you think the data science team is for your company? What is the stage of your company when you start to build a data science team? Are you a startup or a relatively mature company? How valuable it is to use data to tell the truth, how dangerous it is to use data to affirm existing opinions.

Data science has its skillset, workflow, tooling, integration process, and culture. If it is critical to your organization, it is best not to bury it under any part of the organization. Otherwise, data science will only serve the need for a specific branch. No matter which way you choose, be aware of both sides of the coin. If you are looking for a data science position, it is crucial to know where the data science team fits.