Building Beautiful Algorithms for Walmart
My journey with Engage3 led to a new role with Walmart
Earlier this month, I was offered an exciting opportunity to work as an advanced analytics specialist at Walmart eCommerce in San Francisco. Fresh from the Master of Science in Business Analytics (MSBA) program, I’ll now have the opportunity to offer strategic recommendations to the world’s largest company with more than $500 billion in annual revenue.
It’s been a whirlwind of activity lately with graduation last month, our team’s final practicum project presentation and classes wrapping up this summer. It’s also extremely gratifying to know my skills as a business analyst will be put to the test to define the future state of retailing.
I’m thrilled to help Walmart define customer expectations, analyze trends and model successful supply chain logistics. But I didn’t land this role overnight.
Putting Our Practicum to Work
Our practicum project this past year for Engage3 proved to me that data science isn’t always as easy as you expect. Engage3 helps retailers and manufacturers manage their price image through accurate competitive data, data science and Artificial Intelligence-powered software solutions.
I learned a valuable lesson working on this project—there may be times you believe an algorithm can be adjusted quickly to solve your problem, only to discover you have to re-work it entirely. I encountered that firsthand.
One of the biggest challenges from our partnership with Engage3 came from what I believed was a simple solution that turned out to be more complex. We had previously used sales data from one of Engage3’s clients and finished our first version of the machine learning algorithms. Before we could celebrate our achievement and catch our breath, another test came up — Engage3 had a new client with a tight deadline.
We thought it would be a simple modification to our algorithm, adjusting for this client’s new data. I knew our algorithms were well-structured, so I figured we would just need to change our inputs and rerun the code to solve our new client’s problem. However, it turned out to be far more complicated.
Innovating on the Fly
Our team needed to adjust for real-time inputs. That was more challenging than I had expected.
- First, we needed to change static inputs into real-time ones. When designing the algorithms, we had extracted the data into multiple CSV files and built our algorithms against that dataset. This time, we needed to connect to the cloud database directly and extract the new client’s data in real time. We spent a lot of time finding and applying the right Python APIs to connect our notebooks to the cloud database, and writing queries to select all the data the algorithms required.
- Second, our new client was based in another country, so many of the packages we had used in the project needed to be adjusted. For example, our algorithms explored sales during holidays and folded this seasonal pattern into our sales predictions; the holiday packages had to be updated to reflect the holidays of the new country. Moreover, our notebook visualized store locations on a map, so we also needed to adjust those packages and functions to read the updated zip codes.
- Lastly, after all the packages and functions were updated—here came the most complicated part—we had to modularize all the notebooks and validate them with the new data. We had three notebooks serving three different functions, but would they work together on the new dataset? We changed all of the computations in the three notebooks into callable functions, passing column names as input arguments rather than the whole data frame. In this way, our functions became more flexible and could be reused for future clients. We validated the algorithm with one last notebook using sales data extracted directly from the cloud database. Finally, we had algorithms that updated continuously.
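The first step above—replacing static CSV extracts with live database queries—can be sketched roughly like this. This is a minimal sketch only: it uses Python’s built-in sqlite3 as a stand-in for the actual cloud warehouse, and the table, column and client names are hypothetical.

```python
import sqlite3

def load_sales(conn, client_id):
    """Pull the latest sales rows for a client straight from the
    database, instead of reading a static CSV export."""
    cur = conn.execute(
        "SELECT store_id, sale_date, revenue FROM sales WHERE client_id = ?",
        (client_id,),
    )
    return cur.fetchall()

# Demo: an in-memory database standing in for the cloud warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (client_id TEXT, store_id TEXT, sale_date TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [("acme", "s1", "2019-05-01", 120.0), ("acme", "s2", "2019-05-01", 80.0)],
)
rows = load_sales(conn, "acme")
print(len(rows))  # each rerun reflects whatever is currently in the table
```

Because the query runs at call time, rerunning the notebook always picks up the current state of the data rather than a snapshot frozen in a CSV.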
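The holiday adjustment in the second step can be illustrated with a toy per-country calendar. The project itself swapped in a maintained holiday package for the new country; the countries and dates below are examples only, not the real client’s data.

```python
from datetime import date

# Toy per-country holiday calendars (illustrative dates only; a real
# project would load these from a maintained holiday package).
HOLIDAYS = {
    "US": {date(2019, 7, 4), date(2019, 11, 28)},
    "CA": {date(2019, 7, 1), date(2019, 10, 14)},
}

def holiday_flag(d, country):
    """Seasonal feature for the sales model: 1 on a holiday, else 0.
    Making the country a parameter keeps the feature reusable when a
    new client comes from a different country."""
    return 1 if d in HOLIDAYS.get(country, set()) else 0

print(holiday_flag(date(2019, 7, 4), "US"))  # 1
print(holiday_flag(date(2019, 7, 4), "CA"))  # 0
```

Keeping the country as an argument means onboarding a new client only requires adding a calendar entry, not rewriting the feature code.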
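The modularization described in the last step—turning notebook computations into callable functions that take column names as arguments—might look like the sketch below. The function and column names here are hypothetical, but the pattern is the one described above: the same function serves two clients whose files use different schemas.

```python
import csv
import io

def total_by_group(rows, group_col, value_col):
    """Sum value_col per value of group_col. Because the column names
    are arguments, the same function works for any client's schema."""
    totals = {}
    for row in rows:
        key = row[group_col]
        totals[key] = totals.get(key, 0.0) + float(row[value_col])
    return totals

# One client's schema...
data = io.StringIO("store,revenue\ns1,100\ns1,50\ns2,75\n")
rows = list(csv.DictReader(data))
print(total_by_group(rows, "store", "revenue"))  # {'s1': 150.0, 's2': 75.0}

# ...and a new client's different schema, handled by the same function.
data2 = io.StringIO("location,sales_usd\nnyc,10\nnyc,20\n")
rows2 = list(csv.DictReader(data2))
print(total_by_group(rows2, "location", "sales_usd"))  # {'nyc': 30.0}
```

The design choice is the point: hard-coding a data frame layout ties the notebook to one client, while parameterizing the column names turns it into a reusable pipeline step.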
Big Data Science Demands Collaboration
Through this process I gained a deeper understanding of data science. It’s not just about algorithm modeling.
Behind the algorithm are many details you need to consider. A seemingly easy problem on the surface might require a lot of detailed adjustments and even function restructuring.
That’s why experience is so crucial in this field. To solve a data science problem, it’s important to first have a comprehensive view and an understanding of what problems could arise along the way. Breaking problems down into small tasks, and keeping everyone in the group on the same page, can help in the long run.
My last takeaway from this practicum project is that in data science, you should always pay attention to the flexibility and reusability of your code. The algorithms we create are not always intended to provide one-time answers; instead, try to create a framework and build a pipeline to solve similar problems faster in the future.
On My Way with Walmart
Going into the interview with Walmart, I was admittedly a little nervous. It is a great opportunity to stay in the Bay Area and do something I love. I knew that I had the relevant experience, a proven track record with notable organizations, and a strong educational foundation to get the job done—I just had to show them that.
I’ve worked as an operations, risk, and data analyst for DHL, EY and Engage3. I was able to share a great deal of knowledge from those experiences and parlay it into this job offer.
The MSBA program and my internships have prepared me well. Now I’m ready to help Walmart tackle real-world data science problems and create a more efficient, better shopping experience. Let’s get to work!