Building The Analytics Team At Wish Part 3— Scaling Data Analysis
The data infrastructure is only as useful as the people using it. In this section, we’ll talk about how to build an equally strong analyst team.
Setting a Vision for the Team
In the beginning, all we had time for on the data analyst side was pulling data and building reports. We didn’t have time to understand the full context of every request, and it was difficult to focus our work. In spite of all this, we substantially helped the company grow using data and generated lots of value.
But this didn’t feel right. Nobody becomes a data analyst to tackle endless streams of reporting & data pulls. There was no ownership over the end result — the actual decisions being made. We were just APIs for providing summarized information.
Analysts can provide the most value when they have ownership over the decisions being made from data. This means having a deep understanding of the system and a close working relationship with stakeholders, so that they know enough to provide recommendations and proposals for changes and actions. Analysts should have a strong stake in the success and failure of the teams they support.
Three Analyst Skills
Analysts specialize into the areas and teams they support. The skills required for each specialization are different — what makes a marketing analyst successful is different from what makes a logistics analyst successful. But there are three skills that everyone on the data team needs in order to be effective.
Understanding The Goal
First, analysts should understand the true underlying goals behind their tasks, and optimize for impact.
To give an example, a project we worked on last year was improving the shipping estimates for our 7 day Wish Express shipping option. The estimate was usually seven days from the current UTC date. But this meant orders placed near the start of the UTC day were given more time to ship and reach their destination. We were tasked with evening this out, giving a smarter estimate that would reduce the percentage of orders delivered late.
We then set out and did our analysis, and a week later we gave our recommendation: adding a bit more logic to the estimator would reduce the late rate by 3%. This turned out to be too small to be worth implementing.
Was the task successful? We explored one angle on the problem, and found that it wasn’t worth pursuing. But in the end, we made no impact.
What we should have done is question the original goal. Was the goal to test a hypothesis for improving the estimates? Yes. But the main goal was to reduce the Wish Express late rate. With that understanding from the beginning, we would have quickly estimated the most upside that could come from tweaking the estimates, and realized it wasn’t worth it. We would have then spent our week looking at the system for other ways to reduce the late rate. That would have been a more fruitful use of our time, and could potentially have made significant contributions to a strategic program for the company.
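In hindsight, the sizing step we skipped is cheap. A back-of-envelope sketch like the one below (all numbers are hypothetical placeholders, not Wish’s actual figures) would have bounded the upside before we committed a week:

```python
def max_upside(late_rate, share_affected, best_case_fix):
    """Upper bound on the absolute late-rate reduction from tweaking the estimator.

    late_rate:       current overall late rate (e.g. 0.12 means 12%)
    share_affected:  fraction of late orders whose lateness could plausibly
                     be blamed on estimate slack near the UTC day boundary
    best_case_fix:   fraction of those orders a smarter estimate could save (<= 1.0)
    """
    return late_rate * share_affected * best_case_fix

# Hypothetical figures: even a *perfect* fix on the affected slice only
# moves the overall late rate by a few percentage points at most.
ceiling = max_upside(late_rate=0.12, share_affected=0.25, best_case_fix=1.0)
realistic = max_upside(late_rate=0.12, share_affected=0.25, best_case_fix=0.5)
```

If the ceiling is already below the bar for implementation, the week of detailed analysis can be redirected before it starts.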
Understanding The System & Data
Second, analysts should have a strong understanding of the systems they work on, and how those systems are reflected in the data. This repeats what I said about data quality earlier, but it’s important enough to say again here. Taking the data for granted is the biggest bad habit to watch out for when analysts join the team. It leads to incomplete business logic and bad data leaking into analysis. Shipping out bad data adds negative value to the company, and very quickly destroys an analyst’s credibility.
Closing The Loop
Third, analysts have to be diligent about closing the loop. As things get busy, it’s very easy to lose sight of the underlying goal of the task at hand. The end result should not be a deck summarizing findings; it should be recommendations or action items that lead to real impact. Closing the loop means pushing action items through, and following up after implementation to measure real impact.
One of the major initiatives we worked on last year was a merchant policy making last mile tracking mandatory for orders above a certain price shipped to a set list of countries. This was a massive project involving analysis for each country, done over months. We released the policy in stages, starting with the countries where higher quality shipping could make the most impact by lowering refund rates.
Issues came up after we released the policy to the first country, Canada. Merchants raised prices more than we expected in response to the policy, and user engagement decreased more than expected.
If we hadn’t followed up on the Canada release, we could have negatively impacted the company with the full policy release. And we almost didn’t: the extensive follow-up on Canada nearly slipped due to time constraints.
Strong data teams have the discipline to close the loop on their projects.
Reducing Risk In Growth
As companies grow, so does the number of moving, interconnected pieces across their people and systems. This creates risk. The more complex a system becomes, the greater the risk of major issues slipping through unnoticed. If this risk isn’t mitigated, it’s only a matter of time before these issues start regularly impacting growth.
We can deal with this in two ways.
The first is by having every major KPI in the company owned by an analyst and team. Each week, every KPI owner updates a deck which outlines major changes to their metrics, and annotates major projects and changes to the system that have impact to their KPIs. This is then sent up the chain to leadership for review.
Together, these decks form a top-level view of the company and its activities. They reduce the chance of major growth-impacting issues going unnoticed.
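Parts of this weekly KPI review can even be partially automated. The sketch below (the metric names and the 5% threshold are hypothetical, chosen for illustration) flags week-over-week moves worth annotating in the deck:

```python
def flag_weekly_moves(kpi_history, threshold=0.05):
    """Flag KPIs whose latest weekly value moved more than `threshold`
    (relative change) versus the prior week.

    kpi_history maps a KPI name to a list of weekly values, oldest first.
    Returns {kpi_name: relative_change} for the KPIs that moved.
    """
    flags = {}
    for name, values in kpi_history.items():
        if len(values) < 2 or values[-2] == 0:
            continue  # not enough history, or undefined relative change
        change = (values[-1] - values[-2]) / values[-2]
        if abs(change) >= threshold:
            flags[name] = round(change, 4)
    return flags

# Hypothetical weekly history: orders dipped ~4% (below threshold),
# while refund rate jumped ~17.5% and gets flagged for the deck.
history = {
    "orders": [1000, 1010, 970],
    "refund_rate": [0.041, 0.040, 0.047],
}
flags = flag_weekly_moves(history)
```

This doesn’t replace the owner’s judgment about *why* a metric moved; it just makes it harder for a large move to go unannotated.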
Alongside top level KPI monitoring, the risks & downside potential of each team needs to be audited. Figuring out what can go wrong, and then building reporting & monitoring around these systems, further helps mitigate downside risk.
Both of these initiatives should be owned by the data analyst team. Analysts should be one of the first lines of defense against catastrophic system issues.
Supporting Business Intelligence
The holy grail of business intelligence tooling is democratizing data — giving everyone in the company the ability to pull their own data and use it to make decisions and arguments.
This was our mission when we first rolled out Looker — give everyone accounts and make data access self serve through a drag-and-drop UI.
But the issue was that not every team had the skills to run their own analysis without help. Bad data and analysis started being reported, reaching top leadership.
Giving teams self serve access also resulted in a flood of new requests. Instead of asking for data, teams started asking for new measures and dimensions, and for help debugging LookML issues — things that were far removed from the end result and hard to prioritize.
We rolled out self serve data tooling when we weren’t ready. The result was that we had to remove access.
When we hired more analysts, we tried self serve again, this time with enough analysts to help stakeholders understand the tooling and provide support. And this time, with much more success.
Building The Team
Unless you’ve built out an incredible talent brand and have godlike analysts asking to join every week, you have to be strategic with the skills required for each role. I find analyst skill sets fall along three criteria: business, technical, and statistical competency. I’ll talk briefly about the archetypes.
The Deck Builder
These are analysts with a record of driving impactful changes at previous companies using data. They typically come from consulting, finance, and operations backgrounds.
People that can build decks have the ability to organize quantitative arguments that make sense from a business perspective. They’re useful in roles where they need to be the main drivers of proposals for change.
At Wish, KPI owners are all deck builders.
They should know SQL, and are not expected to know Python (but are encouraged to learn!).
If the data analyst is in a role where their stakeholder can handle deck building if needed, or there’s already a deck builder supporting, then we can drop that requirement and boost the requirement of technical skills.
The Technical Analyst
With strong Python and SQL skills, an analyst can be very efficient at querying for data. If they need new tables, they can build their own ETL pipelines. They can also automate recurring analysis and build reusable scripts to speed up future work.
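As a toy illustration of the “reusable script” idea, here is a sketch using SQLite in place of a real warehouse. The table name, columns, and numbers are all invented for the example:

```python
import sqlite3

def late_rate_by_country(conn, since, limit=5):
    """Reusable query: countries ranked by late delivery rate since a date.

    Parameterizing the date and limit means the same script serves the
    weekly report, ad hoc questions, and post-release follow-ups.
    """
    sql = """
        SELECT country, AVG(is_late) AS late_rate
        FROM orders
        WHERE order_date >= ?
        GROUP BY country
        ORDER BY late_rate DESC
        LIMIT ?
    """
    return conn.execute(sql, (since, limit)).fetchall()

# Usage against an in-memory database standing in for the warehouse:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (country TEXT, order_date TEXT, is_late INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("CA", "2020-01-05", 1), ("CA", "2020-01-06", 0),
     ("US", "2020-01-05", 0), ("US", "2020-01-07", 0)],
)
rows = late_rate_by_country(conn, "2020-01-01")
```

The payoff is compounding: every recurring question answered by a parameterized script is one less one-off data pull next quarter.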
The Statistician
Data analysts don’t need to be statisticians. But having no statistical intuition on the team can be detrimental. Using summary metrics without confidence bounds, or taking A/B test results at face value with limited data, can generate a lot of negative value through bad analysis.
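The kind of sanity check meant here can be a few lines. Below is a minimal sketch using a normal-approximation confidence interval for a rate (the sample sizes and rates are hypothetical):

```python
import math

def proportion_ci(successes, trials, z=1.96):
    """Normal-approximation 95% confidence interval for a rate."""
    p = successes / trials
    se = math.sqrt(p * (1 - p) / trials)
    return (p - z * se, p + z * se)

# Two A/B test arms with limited data: the point estimates differ
# (5.0% vs 6.0% conversion), but the intervals overlap heavily,
# so the data can't actually distinguish the arms yet.
a_low, a_high = proportion_ci(50, 1000)
b_low, b_high = proportion_ci(60, 1000)
```

Reporting “B is 20% better” from numbers like these is exactly the naive read the team needs someone to catch.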
If A/B testing is hard to do for a problem, statisticians can build inferential models from existing behavior and data.