eBay’s 3rd University Machine Learning Competition: Predicting Shipping Delivery Dates

For our annual ML competition, we challenged university students to predict how many days a carrier takes to deliver packages.

In ecommerce, different carriers tend to take different lengths of time to deliver packages, which is why this year’s competition asked students to predict carrier delivery times. The competition drew over 1,100 submissions from 843 students at 169 universities and colleges over a six-month period.

Why predict shipping delivery dates? During the COVID-19 pandemic, ecommerce overall has grown by approximately 20% year over year for two years in a row.1 One thing all goods purchased online have in common is that they have to be shipped to their buyers.

The accuracy of shipping estimates plays a significant role in providing a hassle-free and trusted customer experience on a platform such as eBay. The journey of a package from seller to buyer is made up of two parts. The first part is the handling time: the time the seller takes to package the item and hand it over to the carrier. The second part is the transit time: the time the carrier takes to deliver the package. It was this second part, the transit time, that we asked students to predict.
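To make the two-part split concrete, here is a minimal Python sketch that divides one package's journey into handling days and transit days. The function name, the three timestamps, and the choice to count whole calendar days are assumptions made for illustration; they are not the competition's actual schema or scoring rule.

```python
from datetime import datetime


def split_journey(handling_start: datetime,
                  carrier_acceptance: datetime,
                  delivered: datetime) -> tuple[int, int]:
    """Return (handling_days, transit_days) counted in whole calendar days.

    All three arguments are hypothetical fields, not the real dataset columns.
    """
    if not handling_start <= carrier_acceptance <= delivered:
        raise ValueError("timestamps must be in chronological order")
    # Handling time: from the seller starting work until the carrier takes over.
    handling_days = (carrier_acceptance.date() - handling_start.date()).days
    # Transit time: from carrier acceptance until delivery (the prediction target).
    transit_days = (delivered.date() - carrier_acceptance.date()).days
    return handling_days, transit_days


# Hypothetical package: seller starts March 1, carrier accepts March 2,
# buyer receives it March 5.
handling, transit = split_journey(
    datetime(2021, 3, 1, 9, 30),
    datetime(2021, 3, 2, 16, 0),
    datetime(2021, 3, 5, 11, 15),
)
print(handling, transit)  # prints: 1 3
```

In this framing, only the second number is the quantity students were asked to predict.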

The previous two eBay University Machine Learning Competitions highlighted the unstructured data, such as listing titles, descriptions and images, that is common in many parts of eBay’s platform. The data for this year’s challenge, however, was highly structured and lent itself well to supervised methods, showcasing another important side of the machine learning and data science employed at eBay. The dataset size was carefully chosen to be large enough for deep learning models (15 million records in the training set) yet small enough to fit within the computational resources available to students. Because the data columns were well documented and interpretable, the dataset also lent itself to classical feature engineering, which could be combined with non-deep-learning approaches such as gradient-boosted machines or other regression models.
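As an illustration of what classical feature engineering on structured shipping records might look like, the sketch below turns one raw record into a few numeric features. Every field name (`acceptance_time`, `seller_zip`, `buyer_zip`, `weight_kg`) and every derived feature is a hypothetical example, not a column from the competition dataset; the resulting dictionary is the kind of input one could pass to a gradient-boosted regressor.

```python
from datetime import datetime


def engineer_features(record: dict) -> dict:
    """Derive simple numeric features from one hypothetical shipment record."""
    accepted = datetime.fromisoformat(record["acceptance_time"])
    return {
        # Day of week the carrier accepted the package (Monday = 0).
        "accept_weekday": accepted.weekday(),
        "accept_hour": accepted.hour,
        # Weekend acceptance may delay the first transit leg (an assumption).
        "accepted_on_weekend": int(accepted.weekday() >= 5),
        # Matching 3-digit ZIP prefixes serve as a rough proxy for short distance.
        "same_zip3": int(record["seller_zip"][:3] == record["buyer_zip"][:3]),
        "weight_kg": float(record["weight_kg"]),
    }


# Hypothetical record; 2021-03-06 was a Saturday.
example = {
    "acceptance_time": "2021-03-06T14:20:00",
    "seller_zip": "95125",
    "buyer_zip": "95112",
    "weight_kg": "1.2",
}
features = engineer_features(example)
print(features)
```

Features like these are interpretable and cheap to compute, which is part of why tree-based regressors are a common baseline for tabular problems of this kind.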

We hoped that the approachable nature of the data would place this challenge within the reach of a wide range of students, from undergraduates to Ph.D. candidates, and would provoke a variety of approaches. We were thrilled to find that this is exactly what happened, with highly engaged teams from various academic backgrounds making up the leaderboard.

Following a thorough evaluation of the top-scoring teams’ submissions, we were excited to determine the winner of eBay’s 2021 University ML Challenge. The student winner, Yuyang Xu of Carnegie Mellon University, finished at the top of the leaderboard hosted on eval.ai and accepted our offer of an internship this coming summer.

Congratulations to the winner and heartfelt thanks to all the participants for their continued interest, enthusiasm and support!

1 Impact of COVID Pandemic on eCommerce, International Trade Administration, U.S. Department of Commerce, trade.gov/impact-covid-pandemic-ecommerce