DataRobot is home to a treasure trove of Data Scientists. We’ve started a series called “Data Scientist Spotlight” as a way to meet the people behind the technology and introduce you to our great team.
Introducing Belén Sánchez. Belén works on accelerating AI adoption in enterprises in the United States and in Latin America and contributes to the design and development of AI solutions for customers in the retail, education and healthcare industries.
1. What novel methods and techniques are you currently using?
I am using so many novel methods and techniques that have been incorporated into DataRobot that it is hard to give a simple answer. But let me share a few useful and exciting methods that I have been using lately:
- Time series – new series modelers: One of the challenges you can face when forecasting demand is how to handle products that are seasonal or newly introduced. With DataRobot, I can now predict new series that have no history and were not seen in training, thanks to several techniques that keep the predictions stable and free of wild outliers. These techniques include a single modeler with several estimators, and model collections.
- Bias and Fairness: This topic has been at the top of my mind since I started my career as a data scientist. However, for years I felt that many discussions on this topic failed to land on concrete guidance for dealing with it. Finally, with DataRobot I have a clear AI bias and fairness workflow that helps me recognize and fix bias in my models. This workflow includes identifying the protected features in your dataset, selecting an appropriate fairness metric, and generating insights to identify and understand the model's potential bias. And it looks like soon we will even have ways to mitigate bias uncovered in your data through the platform!
- MLOps: This topic has been relevant for years, but it certainly became even more relevant once the pandemic hit us. So it is pretty exciting to be working with state-of-the-art technology and practices that provide a scalable and governed means to deploy and manage ML applications in production environments. Monitoring accuracy over time, tracking data drift and using challenger models are certainly some of my favorite practices.
2. If 85% of models fail to make it to production – how do you deal with failure?
I think the best way to deal with failure is to recognize it, accept it and learn from it as fast as you can. As the question points out, a high percentage of machine learning models fail, and here are a few things I have learned from my own failed models:
- Ground your model on a real pain or business problem.
- When designing your solution, include the voices and perspectives of the stakeholders and users that will be affected by your model outcome.
- Iterate and make sure you can explain how your model works and makes predictions.
3. How do you see the role of the data scientist evolving in the future?
I was discussing this with one of my colleagues and we both agreed that in the coming years, we will see more companies and organizations taking advantage of workforce augmentation and therefore investing in the development of AI-based solutions across different areas. This means that there will be more opportunities to apply data science across businesses and industries, and domain or industry expertise will be very valuable. At the same time, data scientists will have more opportunities to contribute to R&D or product development from an ML engineer perspective.
4. Does DataRobot make a data scientist’s job easier? How?
It certainly eases some parts of your work as a data scientist, but I think the benefits of DataRobot go beyond that. DataRobot makes you more productive, speeds up building a good model, exposes you to novel and robust methods and algorithms, and lets you tackle more strategic problems. It also provides solutions that cover the whole life cycle and management of a model and, last but not least, facilitates collaboration with people in roles different from yours.
5. Do you have any passion projects?
Time series projects have captured my interest and passion during the last year. I enjoy working on time series projects with my clients. I think that having the capability of forecasting business metrics such as sales, turnover, website traffic, etc. brings a lot of value to a company. Yet, it is also one of the areas that needs more expertise.
I also have a true passion for helping to reduce the gender gap in the AI industry. This year I was able to lead a DataRobot University program that provided scholarships to 60 Latin American women living across 11 countries. They have been learning about AI and applied data science in a seven-week training program taught in Spanish. This has certainly become one of the most satisfying accomplishments in my data science career so far.