
Spotlight: E-Commerce’s Bot and Mouse Game

This week’s podcast is sponsored by DataDome. In it, we speak with DataDome’s Benjamin Fabre about how bot activity has greatly increased as a result of the COVID-19 Pandemic, and how inauthentic activity driven by bots is driving up operating costs for e-commerce companies of all shapes and sizes. We also talk about how bot prevention is complicated by the shift from web pages to mobile applications and APIs.

As always, you can check out our full conversation in our latest Security Ledger podcast at Blubrry. You can also listen to it on iTunes and Spotify. Or, check us out on Google Podcasts, Stitcher, Radio Public and more. Also: if you enjoy this podcast, consider signing up to receive it in your email. Just point your web browser to securityledger.com/subscribe to get notified whenever a new podcast is posted.


Everyone, no matter their age or origin, can tell you what a “robot” is. A simple Google search will tell you that it’s a machine that resembles a human being and performs human-like functions and movements automatically. The word itself is actually derived from the Czech word “robota,” meaning forced labor. It’s no wonder then that “bot,” which is an automated computer program that pretends to be human, is short for “robot.”

Benjamin Fabre is the co-founder and President at DataDome.

Turing Tests To Chatbots

Robot programs have been a staple of modern computing going back to British computer scientist Alan Turing’s famous Imitation Game (the “Turing Test”), which was designed to judge a machine’s ability to exhibit, or mimic, intelligent behavior well enough to fool a human observer.


But in recent decades, as online commerce and activity have grown to account for a bigger and bigger share of the world’s economy, bots have taken on new prominence and importance. No longer the stuff of laboratory experiments in machine intelligence, bots these days perform a dizzying array of tasks: from indexing the contents of web pages, to assisting online shoppers, to providing customer support.

On the other side of the ledger, bot-driven distributed denial of service attacks are a source of website outages and other disruptions online. And bots have increasingly inserted themselves into online commerce. They can drive up ad rates by flooding websites with bogus traffic. And they reap huge profits for shadowy online operators by exploiting online arbitrage: scooping up deeply discounted items at big sale events like Black Friday and Cyber Monday, only to flip them for resale at a steep markup elsewhere online.


A Pandemic Boosts Bots

And then came COVID-19. As it has in other contexts, the pandemic has poured rocket fuel on bot activity, first by driving home-bound shoppers to e-commerce sites. More online traffic fueled increased bot activity, especially on e-commerce websites. Moreover, pandemic-driven shortages of everything from toilet paper to gaming consoles have led to an increase in the use of bots. Pandemic or no, the costs of this kind of fraudulent activity are felt acutely by online retailers, who pay to support the added, inauthentic bot traffic and see customer sentiment sour as promised deals melt away before their eyes.

But the challenges posed by bots and other automated behavior are not going unanswered. In this episode of the podcast, we’re joined by Benjamin Fabre, the Co-Founder and President at DataDome, an international bot detection software service. In this conversation, Benjamin and I talk about the nature of bots and the types of actors behind bad bot activity. He also explains how bots negatively impact e-commerce, through both the customer journey and the check-out process. Ben and I also talk about the evolving risks to organizations as e-commerce shifts from web pages to mobile applications and APIs.



Episode Transcript

[START OF RECORDING]

PAUL: This Spotlight edition of The Security Ledger podcast is sponsored by DataDome. DataDome is a leading bot protection vendor based in New York, Paris and Singapore. DataDome beats illegitimate traffic so that sensitive data remains safe and online platforms can perform at optimum speed. Based on AI and machine learning, DataDome’s cybersecurity solution detects and blocks in real time the most advanced bot attacks. You can check them out at DataDome.co. That’s DataDome.co.

PAUL: Hello and welcome to this Spotlight edition of The Security Ledger podcast. I’m your host, Paul Roberts, Editor In Chief of The Security Ledger. In this episode of the podcast sponsored by DataDome:

NEWS RECORDING: Ticket bots, software that mimics humans to buy large blocks of tickets fast. Last summer, bots were slammed for shutting fans out of tickets to the Tragically Hip’s tour. Earlier this year, Eric Church canceled 25,000 tickets scooped up by bots.

PAUL: Bots aren’t a new phenomenon. Automated robot programs have been a staple of modern computing, going all the way back to the British computer scientist Alan Turing and his famous imitation game. But in recent decades, bots have taken on a new prominence and importance. They’re no longer the stuff of laboratory experiments in machine intelligence. Bots these days perform a dizzying array of tasks, from indexing the contents of webpages to assisting online shoppers and providing customer support. They’re also a frequent nemesis of security groups. Bot-driven distributed denial of service attacks are a source of website outages and other online disruptions, and bots have increasingly inserted themselves into online commerce. They can drive up ad rates by flooding websites with bogus traffic, and they reap huge profits by exploiting online arbitrage, scooping up deeply discounted items at big sale events like Black Friday and Cyber Monday only to flip them at a steep markup elsewhere online. But the challenges posed by bots and other automated behavior are not going unanswered. In this episode of the podcast, we’re joined by Benjamin Fabre, the cofounder and President of DataDome, which provides bot protection services. In this conversation, Benjamin and I go deep on the bot problem, including a discussion of how shortages of everything from toilet paper to gaming consoles led to an increase in the use of bots during the pandemic. We also talk about the growing risks posed by bots as ecommerce shifts from web pages to mobile applications and APIs. To start off, I ask Benjamin to talk a little bit about DataDome and how the company’s technology works.

BENJAMIN: Benjamin Fabre, I’m DataDome co-founder and CTO.

PAUL: So Benjamin, for our listeners who aren’t familiar with DataDome, explain to us what the company, what your company does and kind of what your technology is all about.

BENJAMIN: Yeah, so DataDome is a cybersecurity solution. We are protecting digital businesses against the bad bots: all the automated threats that can hurt ecommerce, classifieds and media sites on the Internet by trying to reach their content to generate data breaches or to scrape precious content.

PAUL: Ben, tell me just a little bit about who you’re working with at DataDome and kind of their superpowers what you are bringing to the table as a company technologically.

BENJAMIN: Yeah, sure. So today, the team is composed of engineers, cybersecurity experts and developers. We have offices here in New York, in Paris and in Singapore, and we are hiring massively: expert data analysts and data scientists to keep looking at the data, improving our machine learning models and to have the fastest response time against the different threats. And on the other side, we have a strong DevOps team because we have deployed our infrastructure in 25 points of presence, to have the best response time possible and to be able to protect our customers without impacting their business.

PAUL: So you guys have a bird’s eye view on bot activity. What can you tell us? In terms of the trend lines?

BENJAMIN: We can see that the bot activity is literally going through the roof and that hackers keep improving their technology to massively distribute their attacks. We are also seeing that the bot activity is moving from the web browser to mobile, because there are many websites that are not seriously protecting their mobile application. And if you don’t protect your mobile application, the future might be dangerous.

PAUL: Some listeners might say, are there any good bots? Because often at least on this podcast, when we’re talking about bots, we’re almost always talking about malicious bots that are out there seeking social media or doing denial of service attacks or so on, I guess. Answer that question. Are there good bots and what defines what characterizes a bad bot?

BENJAMIN: This is a good question. So first, maybe let’s talk a bit about the bots themselves. We are working to split the human traffic from the bot traffic. What is a human first? It’s a real, legitimate user that is using a website or mobile application in a regular way. A bot is an automated software that will run many actions, most of the time super fast, to generate a threat faster than what a human can do. So that’s our job, to detect the human versus the bot. And when we are detecting a bot the question is, is it a good one or a bad one? And you’re right. There are some good bots running. For instance, Google, in order to create the search engine, has to fetch the content on all websites on the internet to be able to create this huge index. So Googlebot is one of the good bots that all ecommerce websites want to give access to their website. You can also use good bots to automate some actions on your website, for instance, making sure that your website is working fine, that it is fast, to check some SEO statistics, et cetera. There are many different good bots, but as soon as a bot is not wanted on your website, because it’s your competitor that is trying to fetch your pricing list, because it’s a hacker that is trying to run a credential stuffing attack, et cetera, that’s the moment when we are classifying this bot as a bad one.

PAUL: Often in cybersecurity, when we talk about or write about bots, it’s in the context of distributed denial of service attacks. That’s just one of the applications of bots that, if you write about cybersecurity or you follow cybersecurity, is really common. But you also talk about the role that they play in ecommerce and all the different types of, I guess, inauthentic activity that is linked to bots. Describe the spectrum of different problems bots end up causing online.

BENJAMIN: DDoS is obviously the most common attack that is generated by a bot. But there is an organization called OWASP that has classified the automated threats in more than a dozen different categories. So for instance, all the bots that are trying to reach the login section are running credential stuffing attacks, account takeover attacks. They are trying every single login and password that has leaked on the dark web to see if it is a valid combination on all ecommerce websites. There are also robots that are trying to find some vulnerability on a website or on a mobile application to generate new data leakage and to try to gain access to a database, for instance, with new logins and passwords, with addresses, etc.
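
To make the credential stuffing pattern Benjamin describes a little more concrete, here is a minimal Python sketch of one naive way to spot it in login logs: a single source that tries many different usernames and almost never succeeds. The function name, the event format and both thresholds are assumptions made for illustration, not something taken from the conversation or from DataDome’s product.

```python
from collections import defaultdict


def flag_credential_stuffing(login_events, min_usernames=20, max_success_rate=0.05):
    """Return the set of source IPs whose logins look like credential stuffing.

    login_events: iterable of (ip, username, succeeded) tuples (assumed format).
    An IP is flagged when it tries many distinct usernames and nearly all fail.
    Both thresholds are illustrative guesses, not tuned values.
    """
    usernames = defaultdict(set)   # ip -> distinct usernames tried
    attempts = defaultdict(int)    # ip -> total login attempts
    successes = defaultdict(int)   # ip -> successful logins

    for ip, username, succeeded in login_events:
        usernames[ip].add(username)
        attempts[ip] += 1
        if succeeded:
            successes[ip] += 1

    flagged = set()
    for ip, count in attempts.items():
        success_rate = successes[ip] / count
        if len(usernames[ip]) >= min_usernames and success_rate <= max_success_rate:
            flagged.add(ip)
    return flagged
```

A per-IP heuristic like this only catches the simplest case. As comes up later in the conversation, attackers spread their attempts across many addresses, so each individual address can stay under any such threshold.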

PAUL: When we talk about bots, when we look at them from the sort of application standpoint, what are they like? I think people think of bots as very sophisticated, but my understanding is it’s actually fairly easy to create them, and you read about people creating them for things like grabbing dinner reservations or movie tickets and stuff like that. So what is a bot kind of under the hood?

BENJAMIN: A bot is trying to generate a request on a mobile application or on a website, so it can be as simple as a command-line curl, for instance, even if that’s pretty easy to detect. So that’s not the best way to work around the bot protection, but that’s an option. And over time we’ve seen the rise of what we are calling bots as a service. It means you don’t necessarily have to be an expert in development or format in order to develop this kind of robot. You can just use your mouse, go on a website, pay $10 and then you get access to an online interface that handles the complex part of the bot for you. They will use headless browser technology that is great for trying to work around the basic protection. They will provide you some proxy addresses to distribute your robot across millions of different IP addresses, et cetera. So today it can be super easy to create a robot, and it’s far more complex to detect and to prevent them.
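
Benjamin’s remark that a plain curl request is easy to detect can be illustrated with a short Python sketch. It checks only the User-Agent request header against a small, hypothetical list of scripted-client prefixes; the list and the function name are assumptions chosen for the example.

```python
import re

# Hypothetical, deliberately short list of clients that announce themselves
# in the User-Agent header. A real deployment would need far more than this.
SCRIPTED_CLIENT_PATTERN = re.compile(
    r"^(curl|wget|python-requests|go-http-client)\b", re.IGNORECASE
)


def looks_scripted(user_agent):
    """Return True when the User-Agent is missing or names a scripted client."""
    if not user_agent or not user_agent.strip():
        return True  # treating a missing header as suspicious is an assumption
    return bool(SCRIPTED_CLIENT_PATTERN.match(user_agent.strip()))
```

This is exactly the kind of basic protection that headless browsers work around: they send a browser-like User-Agent, so a header check alone would wave them through.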

PAUL: I like that you used the term robot because of course, that is the origin of the term bot, right? And robot itself, I think, is a Czech word originally. If I’m not mistaken, it either means “worker” or “slave.”

BENJAMIN: Yeah, interesting.

PAUL: You know, COVID, the pandemic, has really altered everybody’s pattern of living and behavior. And there’s been a huge spike, obviously in the use of and reliance of ecommerce: food delivery, delivering goods versus going into stores. What has DataDome seen in terms of the impact of the pandemic and the increased reliance on ecommerce with the bot activity that’s going on out there?

BENJAMIN: Yeah, so the pandemic has generated a massive move from the physical world to the digital world. We’ve seen a huge increase in the traffic of all our ecommerce customers, and as always, when there is money to make, the hackers have also evolved in the way they are threatening websites, and ecommerce websites in particular. Using our own intelligence, we’ve been able to measure that the bad bot activity on all our ecommerce customers has increased by 50% during the last six months, with more and more attacks, especially on the login page with credential stuffing attacks.

PAUL: So DataDome does detection of bots… Oh, and by the way, I did look up the etymology of robot and it is Czech, and it does mean forced labor: “robota.” So you learned something new on this podcast today. Next time that term comes up, you can say it’s actually from the Czech.

BENJAMIN: Yes.

PAUL: So you’re in the business of detecting good versus bad bots and automated behavior related to bots. What are some of the tells that distinguish a bot from a human actor?

BENJAMIN: In order to split the human beings from the bot traffic, first you have to collect as much data as possible. As always, data is key to train models and to take a decision. So we are collecting today more than 1,000 billion events every day. That can be from the network level: how the request was made, what is, of course, the user agent, the IP address, the IP on, et cetera. So we are collecting more than 100 pieces of information at the network level. And on the other side, we have a mobile SDK and client-side JavaScript to collect the user interaction with the page and the mobile application: how the touch events are done on the mobile device, how the scrolls are done, how the mouse is moving on the page, how the user jumps from one thing to another. So we are collecting a huge amount of data. Then the second part is to update the model in real time and be able to take a decision in under two milliseconds. Because every time someone is trying to get access to a page, we have to decide if it’s a human or if it’s a robot, and if we have to let the request go through or if we have to block it, and we are doing that in under two milliseconds. So that means we have to do machine learning with a huge volume of data in under two milliseconds. And then when we’ve done that, we have to be able to measure the effectiveness of our detection: the false positives on one side and the false negatives on the other.
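
The last point, measuring false positives against false negatives, can be sketched in a few lines of Python. The function below assumes you already have ground-truth labels for a sample of requests, which is itself a strong assumption; the names are invented for this illustration.

```python
def detection_error_rates(actual_is_bot, predicted_is_bot):
    """Compute false positive and false negative rates for a bot detector.

    actual_is_bot / predicted_is_bot: equal-length sequences of booleans,
    where True means "bot". A false positive is a human blocked as a bot;
    a false negative is a bot let through as a human.
    """
    pairs = list(zip(actual_is_bot, predicted_is_bot))
    false_positives = sum(1 for actual, predicted in pairs if predicted and not actual)
    false_negatives = sum(1 for actual, predicted in pairs if actual and not predicted)
    humans = sum(1 for actual, _ in pairs if not actual)
    bots = sum(1 for actual, _ in pairs if actual)
    return {
        "false_positive_rate": false_positives / humans if humans else 0.0,
        "false_negative_rate": false_negatives / bots if bots else 0.0,
    }
```

The two rates pull against each other: a more aggressive detector lets fewer bots through but blocks more real customers, which is the trade-off a retailer would care about most.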

PAUL: And you mentioned ecommerce as one of the I mean, I think we think of bots with like, okay, if you’re an Amazon or Target or one of these huge ecommerce vendors… Yes, this is a big problem, but I know just looking at some of your customer success stories that really I mean, pretty much every business of any size these days probably has some kind of ecommerce branch of what they’re doing and the cost for these businesses, whether it’s having price and part. I know you had a profile of a comment company that sold very specialized, like hydraulic parts for hydraulic systems. So the damage or the threat to them is both that their proprietary pricing information is going to get scraped, and then also that these companies are basically paying out of pocket for the bot traffic. They’re paying to support a whole bunch of illegitimate traffic to their sites. That actually is probably a threat to their business, not a help to it.

BENJAMIN: Yes, exactly. When we are seeing that the bad bot traffic represents 30, 40 or sometimes even 50% of your traffic, you are paying for bandwidth, you are paying for CPU, you are paying for storage to serve content to bots. So that’s a bit crazy. And on the other side, of course, they are hurting your business. They are creating security issues. So regardless of the size of your website, as soon as you are generating revenue with it, you have to be protected.

PAUL: You’re listening to a spotlight edition of the Security Ledger podcast sponsored by DataDome.

PAUL: One of the industries where this has really become an issue is obviously online advertising, because advertisers will base rates on traffic. But if that traffic is bogus, then basically the end customer is paying for a lot of worthless traffic. How big a problem is that today? Are we getting any better at sorting out the bogus automated traffic from the real people sitting at real computers or smartphones?

BENJAMIN: I think the problem is definitely not solved at all. And that’s a complex problem because there are different stakeholders with different objectives. Some want to pay only for legitimate users that are seeing advertisements, some want to maximize their revenue. There are in-between companies that are trying to find the right balance between the aggressiveness of the detection and the quality of the traffic. But it can be up to 30% of the traffic that you are paying for that is not legitimate human traffic. That’s what we are seeing inside.

PAUL: That’s a big number.

BENJAMIN: Yeah, that’s a big number and that’s hundreds of billions of dollars per year of fake traffic that the companies are paying for. That’s a bit insane.

PAUL: Do we know who the cast of characters behind the bot problem is? Obviously, part of it is just straight-up cyber criminals running DDoS campaigns and automating command and control infrastructure for malware and other stuff like that. But I’m guessing they’re not all cyber criminals. Who are some of the other players responsible for some of this bot activity?

BENJAMIN: There are also many kinds of newbies that can just rent a bot with a bot-as-a-service solution. They are in the gray area. They might not be real hackers, real bad guys, but they can generate revenue by running a robot today. You know the famous PS5 topic, where if someone can buy a PS5, it can immediately be sold for three to five times more. So there is revenue that you can generate by just running a very simple bot to try to buy a PS5 or some sneakers, for instance. That is a huge currency.

PAUL: Kind of arbitrage, basically.

BENJAMIN: Exactly.

PAUL: Yeah. There was a big shortage of gaming consoles with the pandemic, and they were getting a premium online. So yeah, if you could find one inexpensively and then upsell it. So people were using bots to do that, that’s interesting.

BENJAMIN: On the dark side, today there are two pieces in creating a bot. The first one is the bot itself, so the technology to get the content, and the second one is to get access to as many IP addresses as possible, because if you are running a bot on just one single laptop or even a web server, then it will be blocked by any basic rate limiting. Right now, the bots have to be distributed across thousands or millions of different IP addresses to try to evade the detection in place, and to do so, there are two or three different techniques. Today, you can rent those boxes, which might be zombie computers or botnet computers, by the hour, so you can distribute your bot across millions of botnet computers for 10 or 15 minutes to try to work around the protection in place. There are also the mobile devices that are using applications that are infected. And while we are talking, maybe you have inside your mobile device an application that is using your Internet connection as a proxy to distribute a robot and to try to work around some protection and run some credential stuffing attack or certain details attack. And the last option is to use a proxy or VPN provider that can rent you access to millions of different devices, also by the hour, with clean ISP-reputation IP addresses, for instance, that might be complex to block without a real-time protection like we are doing.
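
The “basic rate limiting” Benjamin mentions, and the reason distributing a bot across many IP addresses defeats it, can be shown with a small sliding-window limiter in Python. This is a generic textbook sketch under assumed names and limits, not a description of how any particular vendor does it.

```python
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most max_requests per IP within any window_seconds span."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip, now):
        """Record a request from ip at time now (seconds); return True if allowed."""
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

With a limit of five requests per minute, ten quick requests from one address would see the last five refused, while the same ten requests spread over ten different addresses would all pass. That gap is why attackers rent large pools of addresses.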

PAUL: You mentioned in some of your research the coincidence of bot activity around things called “hype” sales or “big” sales. Can you talk about that phenomenon? First of all, what is a “hype” sale?

BENJAMIN: So a hype sale is a very short period of time where an ecommerce website will provide items with a very limited stock. So that’s the PS5 example or the sneaker example. There are just like 100 gaming PS5 that are sold, and then it will be a competition between humans and bots to try to buy as fast as possible. And in that situation, a bot will always be faster than a human.

PAUL: So this would be like Black Friday or any of these kind of limited quantity, limited time type sale events.

BENJAMIN: Yeah, exactly. And we can see the traffic going through the roof, from a few requests per second up to millions of requests per second in just 10 seconds, because the bot creator presses the enter button and then the ecommerce website will start seeing a huge volume of requests coming from all over the Internet to try to add to cart the PS5 or a super rare sneaker on Nike or Adidas, for instance.

PAUL: And from the retailers perspective, I mean, is it a problem for them? Do they care if it’s a bot versus a human being in terms of their promotion? Yeah. Should they care?

BENJAMIN: Yeah. They really care because they want to make sure that the products are directly going to the end users and that they won’t generate massive revenue for hackers, for kind of gray-area, illegitimate activities. And so in terms of reputation, this is really key today for the ecommerce website to make sure that the brand reputation will be preserved by making sure that only real consumers will directly buy the product at the right price.

PAUL: Consumers or customers are frustrated. They say, oh, I went on during the sale and they were already gone by the time I got there.

BENJAMIN: Exactly. And they can be mad against the brand even if it’s not their responsibility.

PAUL: Okay. So having asked that question, what should e commerce companies do to protect their businesses and their customers from this bot activity? What is the checklist, if there is one, for addressing the bot problem?

BENJAMIN: Yeah. So first they have to make sure they are protecting the whole customer journey, from the home page and the product page, to protect against competitors that might try to track their pricing evolution. They also want to protect their login section against all the account takeover and credential stuffing attacks. They have to protect the cart funnel to avoid scalping and inventory holding during those hype sales. And they want to protect the checkout page because card payment is a huge topic today and they want to make sure that they don’t face massive issues with credit card information, for instance. So make sure you are protecting your whole customer journey. Also, when it’s possible, provide second factor authentication on the login section and on the payment, using the authentication solution provider or the credit card payment provider. And make sure that you are protecting not only the web traffic, but also the mobile, because today the business is shifting from the web browser to the mobile, and we are seeing that the API now is key. You have to protect your API first, because if not, it’s as if you are doing nothing.

PAUL: So there’s the ecommerce part of bots. But then there’s also what Facebook refers to as the coordinated inauthentic behavior issue, which is on social networks, Facebook, Twitter, Instagram, these kind of automated networks. I know there’s one that’s been researched in China that they call “Spamouflage,” these kind of global networks of fake profiles and so on that are automated, or maybe semi-automated. Is any of this technology useful for that problem? Or is that something… Is that a different type of automated behavior problem than what we’re talking about with these ecommerce bots?

BENJAMIN: I think it’s really close in terms of technology used, especially today. We are seeing that there is some coordination between bots and humans, so it’s not 100% bot, it’s not 100% human, but they are doing a mix to automate actions. So that’s basically using the same technology to fight against different kinds of fraud and threats. I think one of the differences is that we have stronger signals than what the social networks can have, because they are fighting against fake retweets or fake shares or fake likes. So that’s kind of a weak way to differentiate legitimate and illegitimate.

PAUL: There’s not much for them to go on.

BENJAMIN: Yeah, exactly. On our side, we have logins, we have payments, so we have more signals than what the social networks can have. So it’s a bit different in that way.

PAUL: Final question: is there a role for policymakers and government to address the situation through laws, either new laws or enforcement of existing laws, to crack down on this problem? Or is this something really that businesses are going to have to contend with on their own?

BENJAMIN: So there are different layers here. The first one is the terms of service, so we always recommend that our customers write, inside the terms of service, language to prevent the usage of automated software on their website. On the other side, the governments are working on that. We are seeing in some countries or some states some first jurisprudence around preventing scraping or scalping. And on the other side, to prevent the tools used by the bots, we are seeing strong actions taken by governments and by the ISPs or Microsoft, for instance, to fight against the botnets, because at the end, to run a bot, they have to be able to distribute their technology across millions of devices. So making sure that there are as few infected devices available as possible is also a way to reduce the risk. But at the end, it’s always a cat and mouse topic. So we will always need solutions to protect, because we are seeing that hackers are always finding a new way to work around the different security technologies. And that’s why we are investing massively in R&D and why we are always working to be a step ahead of the hackers.

PAUL: So what’s coming next from DataDome?

BENJAMIN: Today, we are protecting websites against bots, as we said. But the technology that we have developed to protect the mobile applications and the websites is being used by our customers to fight against different threats that are not necessarily carried out by robots, but that can be carried out by humans. So fraudulent payments made by hackers, for instance, or account takeover actions carried out by humans are the new threats that we are working on, and we will be able to protect our customers against them in the next couple of weeks.

PAUL: Ben, thanks so much for coming on and speaking to us on the Security Ledger Podcast. It’s been great speaking with you.

BENJAMIN: Thank you. That was the pleasure.

PAUL: Benjamin Fabre is the cofounder and President of DataDome. You’ve been listening to a Spotlight edition of the Security Ledger Podcast, sponsored by DataDome. DataDome is a leading bot protection vendor based in New York, Paris and Singapore. DataDome beats illegitimate traffic so that sensitive data remains safe and online platforms can perform at optimum speed. Based on AI and machine learning, DataDome’s Cybersecurity Solution detects and blocks in real time the most advanced bot attacks. You can check them out at DataDome.co that’s DataDome.co.

[END OF RECORDING]