How to building an ad dashboard to make publisher’s work easier?

Karen Hsieh
14 min readApr 17, 2022

This article was written on 2020/10/3 to note my learning as a publisher. It’s actually the source for Self-service product analytics.

Ad is the most traditional business modal for the internet

Here are 3 ways to monetize in the internet industry: Ad, EC, Paid/SaaS. Let’s check GAFAM. Google and Facebook are Ads. Amazon is EC and also starts an ads business. Microsoft and Apple are Paid/SaaS. Apple has ads and they are expanding them. Oh, Microsoft has bing ads. 😂

Most people feel that the internet is free 🆓, that’s because Ad is chosen as the business modal. People use the internet to read, search, store, check email…etc. for free because they are creating the inventory for the publishers so the advertisers can bid.

😞 I can’t discuss the issue caused by ads here though I truly worry about it. I don’t have solutions. Yet. I hope I can have one someday. However, encourage you to watch the social dilemma so you can start to think this is an issue.

So, as a publisher, the first thing is to create inventory. The inventory comes from usage. Therefore, all PMs are working on building good products that people want to use. Whether that’s a website with great content or a tool that solve people’s problem. That’s quite difficult. Building a product is so hard. I don’t discuss this part here, either.

Let’s assume a good product is built. Then what? Do you see the money? 🤑

Well, Ad is another difficult part. ⛰

The Ad industry is so crowded and competitive. The ad tech ecosystem is so large and attracted so many talents. The picture below is from a 2010’s article in AdExchanger. It’s way more crowded now in 2020.

From AdExchanger 2010

On the rightest side, the publishers, how to monetize through ads?

The advertisers on the left side are the buyers. They own the money. All companies on the right as publishers are counting on them. You can imagine the money passing from the left to the right. (The funny part is. The audience is the real rightest one who pays for the advertisers.)

Here is the evolution of digital display advertising from IAB.

The evolution of digital display advertising from IAB.

According to how the publisher gets in contact with the advertisers, here are different methods for monetization. Roughly there are:

  • Direct sales: the publishers get ads directly from the advertisers or the agencies.
  • Programmatic ads: the publishers get ads through the ad networks or the ad exchanges.

Each one requires lots of know-how to learn, different resources, experiences, and solutions to pursue better outcomes. (Or incomes 💰). I worked on programmatic ads for 2 years. So I’ll explain the difference a little bit but focus on the programmatic more.

What’s the difference between direct sales and programmatic ads?

I worked in the internet industry in 2004 at Yahoo! but never really work on Ads. I remembered there are lots of ad ops dealing with the ad campaigns that got by the sales. One time, I was putting a few HTML codes, just text, and links, on https://tw.yahoo.com. This travel module is still there now, in 2020. And plus images.

The travel module on Yahoo!Taiwan’s front page.

There was a sales I didn’t know him/her using yahoo messenger who texted me and asked me to be very careful about these codes because each link is worth > $10M NTD etc. ( I cannot remember the precise numbers. That’s a lot of money to me.) I didn’t understand that Yahoo! is an ad company then. Though there were lots of ad ops and salespeople, I didn’t know how complicated ad is.

That’s a direct sale.

The travel agencies, the advertisers, pay directly to Yahoo!, the publisher, to put their travel content on yahoo’s front page. Because yahoo has a great inventory that can reach lots of people, the audience. The deal is large enough that may be closed by yahoo’s sales.

However, the travel agencies not only buy inventory on Yahoo!. They also want to reach people who are reading travel blogs and searching for backpacks or attractions for their travel. These agencies cannot contact the travel blogs, and the publishers, one by one; Nor contact Google directly to discuss each search keyword. What can they do?

That’s why advertising revenue is the majority of Google’s revenue. Ad revenue hits $37.93 billion USD in Q4 2019. Google resolves these travel agencies’ problems. They can reach the audience through Google Ads. No matter the audience is reading the travel blogs, AdSense/AdX, or searching, AdWords, or using travel apps, AdMob, or watching travel videos, YouTube Ads. Google has ad solutions.

From another direction, the travel blogs also want to get travel ads, more relevant to their content and may be useful for their readers. These blogs cannot contact the travel agencies one by one, but they can join GDN. Through integration with AdSense/AdX/AdMob, these travel blogs can sell their inventory to travel agencies.

That’s programmatic ads.

Travel agencies, the advertisers, upload their campaigns to Google Ads. They set how much they pay, what time they want, and maybe select some criteria for displaying. Travel blogs, the publishers, integrate with AdSense / AdMob to monetize their inventory. OR use Google Ad Manager / AdMob mediation to mediate more other Ad Networks / AdX except for Google.

It’s not building relations between the travel agencies and the travel blogs. Travel agencies can buy inventory from any website/app in GDN, not limited to travel blogs. Actually, Google is so good at audience targeting, e.g. Find someone who is planning travel but reading finance news. Google shows the travel ads on that finance news. The finance news blog will happy to accept the travel ads as long as they pay more than the finance ads. So as to travel blogs, can accept any kind of ads.

More, there is an exception to Google when people are reading a Facebook post, FB Ads, or reading Instagram photos, IG Ads, or checking Twitter, Twitters Ads. Then the travel website pays for other companies. Facebook’s advertising revenue rose 25% to a record $20.7 billion USD in Q4 2019. Twitter reported ad revenue growth of 12% for a total of $885 million USD in Q4 2019. The list goes on like Pinterest, Tiktok…. etc. The travel agencies are so busy paying money. 💸

There is MarTech to support the advertisers. e.g.

  • How to make sure the ads are real? e.g. People really see the ads. The ads really show up in the right inventories.
  • How to make sure the ads are effective? e.g. Shown to the people who will travel soon.

Here is a more crowded infographic for the MarTech landscape. Really beyond my knowledge. 😳

From Marketing Technology Media 2019

There is also AdTech to support the publishers. And that’s what I’m working with.

It’s all about performance, as I try to make it simple and easy to understand.

The travel agencies, the advertisers, care about their spending. How to make sure the money they spend really earns more?

The travel blogs, the publishers, care about their earnings. How to make sure the money they earn really the most?

Keeping an eye on the ad performance is a must. But How?

When it’s all about performance, I need to check how it performs as I just started to be a publisher in 2018 Q1. My company integrated with AdSense, AdMob, and several ad networks. I try to use spreadsheets to check the performance.

The work steps were:

  1. Download each raw report from the ad networks.
  2. Transfer data from .xlsx, .txt or .json to .csv , removing error and clean format for uploading.
  3. Save raw data in each raw tab so I can append data afterward.
  4. Map the raw data to the reporting data format. e.g. There are clicks and impressions from raw data, so I calculated CTR for reporting data.
  5. Then do an analysis of the reporting data. I used a lot of pivot tables to analyze and compare the performance.

After 6 months, the spreadsheet has 27 tabs with 10 ad networks.😫

There is a super useful website https://www.benlcollins.com/ helps me a lot with spreadsheets. I also learned some AppScript and wondering if this can resolve my problem. I tried to best use the spreadsheets but still think that’s maybe not the right tool. Though the numbers are under the limit of Google Spreadsheet, that spreadsheet ends up taking about >10 mins to open the whole spreadsheet. I can drink a cup of coffee if I did some calculations or added a pivot table. ⏳

IT MUST HAVE A BETTER WAY. 😭

SQL may be the solution

I cannot remember how, but I learned a new keyword “SQL” in about Q3 2018. It looks like something can resolve my problem. Thanks to my boss. We use GCP so he grand permissions for me to use BigQuery. I guess I just google a lot of articles and YouTube videos to learn and then try on BigQuery. I started to build the first version of ad data and visualize the report on Data Studio in Dec 2018.

This is the first version of ad data. It’s very simple and still required manual work to upload, formate…etc.

Version 1

The benefit is I can finally see the overview at the all_report. E.g. If the performance goes down, I could filter by an ad network or a page to find the problem. The report and basic analysis were all there, I didn’t need to produce a spreadsheet anymore if someone wants to check viewability, CTR, or compare different pages, and ad networks. They can do it by themselves on that dashboard.

I think SQL is the right solution. ✌️

Bored is why I improve

The solution worked but there were many details. Not every ad network provides downloadable data. Some ad networks even don’t have a backstage 😫. They just send out emails or attach their monthly XLS.

The steps above took me about 1 hour. In that 1 hour, I just moved my finger to click download, find email attachments, transfer format.. and then upload and then….etc. And repeat 10 times. My brain is not thinking so I was a robot 🦾 (and yes, only hand) in that 1 hour.

IT MUST HAVE A BETTER WAY. 😭

Thanks for Qwiklabs. Google provides lots of training so I complete this one BigQuery for Data Analysis to learn more. Also thanks to my boss again. He asked me to learn Python at the NTU training course. Hired an intern and assigned a data engineer to support me.

The intern is awesome! He is a junior undergrad student majoring in business. Not only did he cover my routine robot job but also self-learn Python and SQL to help me upgrade to version 2. The data engineer is my coach! (He has a fitness coach license 🏋️‍♀️. I consult him about training as well.) He reviewed my SQL/Python code and taught me to use Cloud Function to trigger and get data by API daily. My python code is similar to the writing of a kindergarten student 😳. It took me about 1 month to complete my first .py to get data from an API. To speed up, he did most of the API work actually.

Version 2 of ad data

So.. here comes version 2. I tried to reduce the manual work. I create a BQ dataset called “gam”. Then use the spreadsheet All_Adunits_v2 to create the tables in BQ.

Version 2

All_Adunits_v2

  • This one is unavoidable and maintained manually. Only we humans know we cooperate with which ad networks on what ad units.
  • This is a spreadsheet but connected to BigQuery tables. A spreadsheet is better for us to edit; Connected to BigQuery so we can join and reference by SQL later.
  • There are 4 major tabs in this spreadsheet. Maintain the relationships and provide the dimensions for later reporting.

gam_adunits: Record all ad units we define on our webpage or app.

  • There are the dimensions for analysis. e.g. Which platform, Web, iOS, Android? Which web, MB, AMP, PC?
  • Record mapping with gam_adunits. We use GAM for mediation but some adunits may be integrated directly, e.g. recommend module.

adnetwork_adunits: Record all ad networks integration.

  • One master_adunits may be integrated with many ad networks. E.g. The top ad unit on the webpage may serve ads from AdX, OpenX, Appier, ..etc.
  • The names of ad units are different for each ad network. The naming may be named by us or that ad network. E.g. The top ad unit on our webpage
  • master_adunit is named by us: Website_Homepage_Top
  • adnetwork_adunit in AdX is named by us then we like to use the same name: Website_Homepage_Top
  • adnetwork_adunit in Adnetwork B is named by them: 6203
  • adnetwork_adunit in Adnetwork C is named by them: Web — TW — CDB — PB — 300x250
  • Since I want to compare which ad network performs better in Website_Homepage_Top, I need a mapping.

ad networks: Record all ad networks and map to the ad companies.

adcompany: Record the contact and payment information. The purpose is for requesting payment. Some ad networks request us to send out monthly invoices.

raw_XX

  • These are tables for raw data get from each ad network. The data appends afterward.
  • The data may be get by API, crawler, or manually.
  • The data varies. E.g. AdMob provides details like measurable impression, impression. Some provide daily revenue only.

report_adnetwork

  • Map the data from raw_XX into a unified table.
  • It’s important to understand what data is provided and what’s the meaning of that data. E.g. The “impression” may have different definitions in different ad networks. According to the Google Ad Manager report metric, Ad impression is a subset of ad requests returning at least one ad to the site that are counted as ad impressions. Facebook treats impression as the number of times ads were on screen. Therefore, I put google’s ad impression into the “impression” column, but put FB’s impression into the “viewable impression” column.
  • Sometimes there is no help page explaining what the metrics are. Then I’ll ask the contact person of that ad network and make sure he/she provides a clear explanation and then check if the data looks like what he/she said.

all_USDTWD

  • Thanks the data engineer. He creates a daily auto-update for USD to TWD exchange rate.
  • CPM, Revenue is important in comparison, so I need to make sure the currency is the same.

report_adrevenue

  • Map the data from report_adnetwork so I can compare the performance between different ad networks. And this is the final reporting.

Automate the data process

The process starts when raw data is uploaded. The second step is to transfer data to report_adnetwork and then to report_adrevenue. Therefore, if the raw data can be updated automatically, the process can be triggered automatically.

I create a .py file for each ad network. In the py files:

  1. Use API to requests raw data.
  2. Upload to BQ raw_XX tables. Using pandas_gbq makes this process easier.
  3. Use SQL to transfer data from raw_XX then append to report_adnetwork.
  4. Use SQL to transfer data from report_adnetwork, all_USDTWD then append to report_adrevenue.

Then put this py file into Cloud Function. Create a job in Cloud Scheduler for a topic in Cloud Pub/Sub. Set the py file in Cloud Function triggered by that Pub/Sub topic. Finally, the process is automated. 🤖 It saves my time!

Some follow-up improvements.

  • In order not to duplicate data, I add some checking before data is appended. The checking can also identify some human typos, such as forgetting to add net master_adunit, forgetting to add a new ad network…etc. The checking is in that py file. If the checking fails, then the uploading or transferring will be stopped.
  • Add retry in API requests since some APIs fail often.
  • The data engineer creates an automatic notification: If the raw data requests fail, a notification sends to Slack and mentions me. 😅
  • The .py files are stored on my local computer as well. So I can adjust the date and quick fix data if the API fails at automatical triggered.
  • The data engineer also creates a function to deploy code from my local to Cloud Function.

Easy to see the ad performance

The reason to start all these is to see the ad performance easily. Now I have the report_adrevenue data, I set this as a data source in Data Studio. I add different charts to visualize the performance.

Here are the pages on the ad dashboards.

  • Latest day overview: Show the latest revenue by ad networks.
  • Overview: Show the revenue in different dimensions.
  • By ad network: Compare CTR, viewability, and impressions by ad networks.
  • By ad unit: Comparing CTR, viewability, and impression by ad units.

The information is useful and just there. It’s easy to identify abnormal.

A glance at the dashboards. Sorry for hiding most numbers due to confidentiality. 😳

If I see revenue increases or decreases, I can view and filter the revenue by date, ad network, and ad unit to find the root cause. Or I can just check every ad network's performance daily in a few minutes by filtering the data on the dashboard. The best benefit is, that when someone asks questions about ads, most questions can be answered by using this dashboard. More, they can actually check the dashboard by themselves. 🤣. It really makes my work(life) easier.

The time I saved from preparing data should be used to monitor, and review the data. To find what can be improved, what’s the potential optimization, what’s may impact revenue..etc. That’s the correct way to spend time.

What I want to upgrade to version 3

Now I pass the ad job to another colleague who is good at ad buying and eager to learn about ad sales. He doesn’t need to learn SQL, and Python as I did. That’s the benefit to lower the job criteria. Learning about how to do publisher’s work is hard enough. He can still use the ad dashboard to monitor the performance without knowing the process.

But of course, he understands the process. He still needs to maintain the big spreadsheet. Here causes a simple bug 🐞 that if someone edits the spreadsheet with a typo then the SQL crashes or error….. e.g. an empty row is added or a row is duplicated. That’s the first thing I want to resolve.

The second thing is to reduce the maintenance. Each ad network has a py file. The most difference between these files is the raw data requesting part. Rest data transferring parts are similar but not exactly the same. I’m wondering how to make the similar part the same and separate from each py file. I start to use dbt this year for other data analysis. Maybe it can help me to modularize these codes as well. I don’t know how yet. 🤷‍♀️

The third thing is I’m still not good at python. 🤯 It takes me at least a week to write code requesting data from API. Not to mention writing crawlers. I need an upgrade of myself. 🤪

Keep learning

This is my journey about ads. I learned how the ad works, SQL, Data Studio, some GCP features, and a little bit of Python. This is an exciting journey that really opens my mind. There are lots of diligent and smart talents working here. My boss found the Taiwan Publisher Group encouraging publishers to share knowledge and experience. I appreciated joining such a cooperative environment and learned a lot from them.

All the tools or techniques aim to resolve the problem that people have. Just like I try to build the dashboard to monitor and compare ad performance easily. To get closer to the core of the problem, how to monetize as a publisher, I just unlock the first checkpoint.

As you can see the level of problems between my boss and me are different. I’m trying to resolve my problem while he is trying to resolve all publishers’ problems. That’s why I’m an employee. 😂

There are lots changing this year, esp. IDFA. It’s fascinating to me, to see the whole industry moving and people responding to these changes. I’m glad to be involved.

🤩 I’m happy to hear from you. Feel free to reach out to me on LinkedIn Karen Hsieh or Twitter @ijac_wei.

🙋🙋‍♀️ Welcome to Ask Me Anything.

--

--

Karen Hsieh
Karen Hsieh

Written by Karen Hsieh

Data📊 Empower 🙌 Product 💜

No responses yet