I love this scene from Jurassic Park
People always remember this scene for the could/should line, but I think that really minimizes Malcolm's holistically excellent speech. Specifically, this scene is an amazing analogy for Machine Learning/AI technology right now. I'm not going to dive too much into the ethics piece here, as Jamie Indigo has a couple of amazing pieces on that already, and established academics and authors like Dr. Safiya Noble and Ruha Benjamin handle the ethics teardown of search technology best.
I’m here to talk about how we here at LSG earn our knowledge and some of what that knowledge is.
“I’ll tell you the problem with the scientific power that you are using here; it didn’t require any discipline to attain it. You read what others had done and you took the next step.”
I feel like the scenario described in the screenshot (poorly written GPT-3 content that needs human intervention to fix) is a great example of the mindset described in the Jurassic Park quote. This mindset is rampant in the SEO industry at the moment. The proliferation of programmatic sheets, Colab notebooks, and code libraries that people can run without understanding them is evidence enough. Just a basic look at the SERPs will show a myriad of NLP and forecasting tools that are easy to access and use without any understanding of the underlying maths and methods. $SEMR just deployed their own keyword intent tool, totally flattening a complex process without their end-users having any understanding of what is going on (but more on that another day). Understanding these maths and methods is absolutely critical to deploying these technologies responsibly. Let's use NLP as a deep dive, as this is an area where I think we have earned our knowledge.
“You didn’t earn the knowledge for yourselves so you don’t take any responsibility for it.”
The responsibility here is not ethical, it's outcome oriented. If you are using ML/NLP, how can you be sure it's being used for client success? There is an old data munging adage, "Garbage In, Garbage Out," that illustrates how important your initial data is:
The stirring here just really makes this comic. It’s what a lot of people do when they don’t understand the maths and methods of their machine learning and call it “fitting the data.”
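To make the "garbage in" point concrete, here's a tiny, hypothetical illustration (the pages and numbers are made up): when one feature sits on a wildly different scale than another and you don't understand what the distance maths are doing, the noisy feature dominates everything.

```python
import math

# Two pages with nearly identical engagement (the signal we care about)
# but very different raw pageview counts, vs. a third page with totally
# different engagement. Unscaled, the pageview axis dominates the distance,
# so any distance-based clustering will group on pageviews, not engagement.
a = (0.80, 12000)   # (engagement rate, monthly pageviews)
b = (0.81, 500)     # behaves like a, but with far fewer pageviews
c = (0.20, 12100)   # behaves nothing like a, but similar pageviews

dist = lambda p, q: math.dist(p, q)

# a ends up "closer" to c than to b, purely because of the unscaled feature
print(dist(a, c) < dist(a, b))  # True
```

That's the stirring in code form: the algorithm runs, it produces clusters, and the clusters are garbage unless you understood the inputs first.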
This can also be extrapolated from data science to general logic, e.g. the premise of an argument. For instance, if you are using a forecasting model to predict a traffic increase, you might assume "the traffic went up, so our predictions are likely true," but you can't actually evaluate that claim without understanding exactly what the model is doing. If you don't know what the model is doing, you can't falsify it or engage in other methods of empirical proof/disproof.
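One minimal way to actually test such a claim, rather than assume it, is to compare the model's held-out error against a naive baseline. All the numbers below are made up for illustration; the point is the comparison, not the values.

```python
# "The traffic went up, so the model was right" isn't evidence by itself.
# If the model can't beat "last week's value carried forward," the
# prediction deserves no credit for the increase.
actual = [100, 104, 110, 118, 125, 131]     # observed weekly sessions (made up)
forecast = [101, 103, 112, 115, 128, 129]   # model's predictions (made up)
naive = [100] + actual[:-1]                 # baseline: repeat last week's value

def mae(pred, obs):
    """Mean absolute error between predictions and observations."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

model_err = mae(forecast, actual)
baseline_err = mae(naive, actual)
print(model_err, baseline_err, model_err < baseline_err)
```

If `model_err` isn't meaningfully better than `baseline_err`, the "predictions are likely true" premise fails, regardless of whether traffic went up.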
Exactly, so let's use an example. Recently, Rachel Anderson talked about how we went about trying to understand the content on a large number of pages, at scale, using various clustering algorithms. The initial goal of using the clustering algorithms was to scrape content off a page, gather all this similar content over the entire page type on a domain, and then do the same for competitors. Then we would cluster the content and see how it grouped, in order to better understand the important things people were talking about on the page. Now, this didn't work out at all.
We went through various methods of clustering to see if we could get the output we were looking for. Of course, we got them to execute, but they didn't work. We tried DBSCAN, NMF-LDA, Gaussian Mixture Modelling, and KMeans clustering. These all do functionally the same thing: cluster content. But the actual method of clustering is different.
We used the scikit-learn library for all our clustering experiments, and you can see in their knowledge base how different clustering algorithms group the same content in different ways. In fact, they even break down some potential use cases and scalability:
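You can see this for yourself in a few lines. The sketch below runs three of the algorithms we tried on the same toy dataset; the parameters are illustrative defaults, not our production settings, and the two-moons data is just a stand-in for page content.

```python
# Same points, three scikit-learn clusterers, three different groupings.
from sklearn.datasets import make_moons
from sklearn.cluster import KMeans, DBSCAN
from sklearn.mixture import GaussianMixture

X, _ = make_moons(n_samples=200, noise=0.05, random_state=42)

kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=42).fit_predict(X)
dbscan_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)
gmm_labels = GaussianMixture(n_components=2, random_state=42).fit_predict(X)

# DBSCAN follows the moon shapes via density; KMeans and the Gaussian
# mixture split the space with roughly convex boundaries, cutting each
# moon in half. Same data, different maths, different answers.
```

Which answer is "right" depends entirely on what your data looks like and what you need the clusters for, which is exactly why the maths matter.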
Not all of these ways are likely to lead to positive search outcomes, which is what "working" means when you do SEO. It turns out we weren't actually able to use these clustering methods to get what we wanted. We decided to move to BERT to solve some of these problems, and more or less this is what led to Jess Peck joining the team to own our ML stack so it could be developed in parallel with our other engineering projects.
But I digress. We built all these clustering methods, we knew what worked and didn’t work with them, was it all a waste?
Hell no, Dan!
One of the things I noticed in my testing was that KMeans clustering works incredibly well with lots of concise chunks of data. Well, in SEO we work with keywords, which are lots of concise chunks of data. So after some experiments applying the clustering method to keyword data sets, we realized we were on to something. I won't bore you with how we completely automated the KMeans clustering process we now use, but understanding how the various clustering maths and processes worked let us use earned knowledge to turn a failure into a success. The first success is allowing the rapid ad-hoc clustering/classification of keywords. It takes about an hour to cluster a few hundred thousand keywords, and smaller sets are lightning-fast.
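To show why KMeans suits keyword data so well, here's a minimal sketch (emphatically not our automated production pipeline; the keywords and parameters are made up): vectorize the keywords with TF-IDF, then let KMeans group them into rough topic buckets.

```python
# Minimal keyword-clustering sketch: concise chunks of text, vectorized
# and grouped. Real keyword sets are far larger, but the shape is the same.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

keywords = [
    "running shoes for women", "best running shoes", "trail running shoes",
    "cheap flights to paris", "flights to paris from nyc", "paris flight deals",
]

vectors = TfidfVectorizer().fit_transform(keywords)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for kw, label in zip(keywords, labels):
    print(label, kw)
```

Because each keyword is short and topically dense, the TF-IDF vectors are well separated and the convex-boundary behavior that failed us on long page content becomes a feature instead of a bug.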
Neither of these companies is a client; I just used them to test, but of course if either of you wants to see the data, just HMU 🙂
We recently redeveloped our own dashboarding system using GDS so that it can be based around our more complicated supervised keyword classification OR KMeans clustering in order to develop keyword categories. This gives us the ability to categorize clients' keywords even on a smaller budget. Here are Heckler and I testing out using our slackbot Jarvis to KMeans-cluster client data in BigQuery and then dump the output in a client-specific table.
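For a sense of what that flow looks like, here's a hypothetical sketch. The actual Jarvis bot and our BigQuery schema are internal, so every name below is made up; the point is the shape: cluster the keywords, then emit rows ready to load into a client-specific table.

```python
# Hypothetical shape of the cluster-then-load flow (not Jarvis itself).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def cluster_rows(keywords, n_clusters):
    """Cluster keywords and return table-ready rows (illustrative schema)."""
    vectors = TfidfVectorizer().fit_transform(keywords)
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=0).fit_predict(vectors)
    return [{"keyword": kw, "cluster_id": int(label)}
            for kw, label in zip(keywords, labels)]

rows = cluster_rows(
    ["blue widgets", "buy blue widgets", "widget repair near me"], 2
)
# In a real flow, rows like these would then be loaded into a
# client-specific BigQuery table (e.g. via the google-cloud-bigquery
# client library), where the dashboards can pick them up.
```

Keeping the clustering output in a plain rows-of-dicts shape is what makes it easy to point the same process at any client's table.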
This gives us an additional product that we can sell, and lets us offer more sophisticated methods of segmentation to businesses that wouldn't normally see the value in expensive big data projects. This is only possible through earning the knowledge, through understanding the ins and outs of specific methods and processes well enough to use them in the best possible way. This is why we have spent the last month or so with BERT, and are going to spend even more time with it. People may deploy things that hit BERT models, but for us, it's about a specific function of the maths and processes around BERT that makes it particularly appealing.
“How is this another responsibility of SEOs”
Thanks, random internet stranger, it's not. The problem is with any of this ever being an SEO's responsibility in the first place. Someone who writes code and builds tools to solve problems is called an engineer; someone who ranks websites is an SEO. The Discourse often forgets this key thing. This distinction is a core organizing principle that I baked into the cake here at LSG, and it's reminiscent of an ongoing debate I used to have with Hamlet Batista. It goes a little something like this:
“Should we be empowering SEOs to solve these problems with python and code etc? Is this a good use of their time, versus engineers who can do it quicker/better/cheaper?”
I think empowering SEOs is great! I don't think giving SEOs a myriad of responsibilities that are best handled by several different SMEs is very empowering, though. This is why we have a TechOps team that is 4 engineers strong in a 25-person company. I just fundamentally don't believe it's an SEO's responsibility to learn how to code, to figure out which clustering methods are better and why, or to learn how to deploy at scale and make it accessible. When it is, they get shit done (yay), but they do it standing on the shoulders of giants, using unearned knowledge they don't understand (boo). The rush to get things done the fastest while leveraging others' earned knowledge (standing on the shoulders of giants) leaves people behind. And SEOs take no responsibility for that either.
Leaving your Team Behind
A thing that often gets lost in this discussion is that when information gets siloed in particular individuals or teams then the benefit of said knowledge isn’t generally accessible.
Not going to call anyone out here, but before I built out our TechOps structure, I did a bunch of "get out of the building" research, talking to people at other orgs to see what did or did not work about their organizing principles. Basically, what I heard fell into one of two buckets:
- Specific SEOs learn how to develop advanced cross-disciplinary skills (coding, data analysis, etc.), and the utility of said knowledge isn't felt by most SEOs and clients.
- The information gets siloed off in a team, e.g. the Analytics or Dev/ENG team, and then gets sold as an add-on, which means said knowledge and utility aren't felt by most SEOs and clients.
That's it, that's how we get stuff done in our discipline. I thought this kinda sucked. Without getting too much into it here, we have a structure that is similar to a DevOps model. We have a team that builds tools and processes for the SMEs that execute on SEO, Web Intelligence, Content, and Links to leverage. The goal is specifically to make the knowledge and utility accessible to everyone, and all our clients. This is why I mentioned how KMeans and earned knowledge helped us continue to work towards this goal.
I'm not going to get into Jarvis stats (obviously we measure usage), but suffice to say it is a hard-working bot. A team is only as strong as its weakest link, so rather than burden SEOs with additional responsibility, orgs should focus on earning knowledge in a central place that can best drive positive outcomes for everyone.