Understanding and Mitigating Third Party JS Risks
Introduction
Dominic Lovell introduces himself and the topic of his talk: understanding and mitigating third-party JavaScript risks. He discusses his background in web performance and security, and his role at Akamai.
Akamai and Third-Party JavaScript
Dominic provides an overview of Akamai's services, including their CDN, cloud security solutions, and future vision for cloud computing. He highlights the prevalence of third-party JavaScript on the web and the potential security risks it poses.
Formjacking and its Process
Dominic defines formjacking, also known as clickjacking, web skimming, or Magecart, and outlines the three-step process involved: injecting malicious code, constructing the code, and extracting data. He emphasizes the attackers' goal of obtaining PII and payment information.
The Role of Third-Party Scripts
Dominic acknowledges the benefits of third-party scripts in creating a positive user experience but cautions about the potential vulnerabilities they introduce. He uses Atlassian's website as an example to illustrate the sheer number of third-party domains often integrated into websites.
Supply Chain Attacks and Server Compromise
Dominic explains the concept of supply chain attacks, where attackers exploit vulnerabilities in third-party scripts to compromise websites. He also touches upon the risks of server compromise, including vulnerabilities in build servers like Jenkins.
Limitations of Existing Security Measures
Dominic discusses the limitations of common security measures like X-Frame-Options, Subresource Integrity (SRI), and Content Security Policy (CSP). He argues that these measures, while valuable, are not always practical or effective in protecting against sophisticated attacks.
Advanced Exfiltration Techniques
Dominic delves into advanced exfiltration techniques employed by attackers, including encryption, obfuscation, steganography, and mimicking legitimate services like Google Analytics. He emphasizes the difficulty of detecting these techniques using traditional security tools.
Real-World Examples of Formjacking Attacks
Dominic presents a series of real-world examples showcasing various formjacking techniques. He illustrates how attackers exploit vulnerabilities, bypass security measures, and employ sophisticated methods to steal sensitive data from websites.
The Importance of Real-Time Monitoring
Dominic stresses the need for real-time monitoring of scripts to mitigate the risks associated with third-party JavaScript. He highlights the financial, regulatory, and reputational implications of data breaches for businesses.
Akamai's Client-Side Protection and Compliance
Dominic promotes Akamai's client-side protection and compliance solutions as a way to address the challenges of third-party JavaScript security. He encourages the audience to learn more about Akamai's research and offerings.
Conclusion and Call to Action
Dominic concludes his talk by summarizing the key takeaways and urging the audience to prioritize third-party JavaScript security. He provides his contact information and encourages attendees to connect with him and learn more about the topic.
Alright, there's some really great talks on at the moment, so I appreciate you jumping into my session.
We are here to talk about understanding and mitigating third-party JavaScript risks.
So a little bit about me: I work for Akamai. I've done web performance for a long time, and I now focus a lot on web security. While this is a React track, I'm not a React expert, but hopefully after this you'll be hooked.
I am a dad, so I can make dad jokes.
I have a cavoodle named Bonnie.
She's actually named after the little girl from Toy Story 3.
And I host and run the Sydney Cybersecurity Meetup, SydCyber.
I don't have any animations or AI generated images, but there'll be plenty of memes.
This is my world.
My three kids, two boys and a girl, and my wife there.
We also do BJJ, my boys and I.
So I said Simon and James, if you're here, we can spar in the hallway, right?
So who's got a better talk?
And that's Bonnie.
All right, so I work for Akamai.
I've been there a long time.
It'll be reaching 10 years soon.
We are typically known for our content delivery network.
We pioneered and invented the space 25 years ago.
Over the last decade, we've been focusing a lot on cloud security, so we do cloud firewalls, bot mitigation, and DDoS protection, which is part of what I'm talking about today.
Our future vision is cloud computing.
We acquired a company called Linode.
Some of you may be familiar with it.
The future vision is around multi cloud, edge computing, bringing containers to the edge, and not just in central compute locations.
If you want to scan this QR code, 100 Linode credit for compute.
No questions asked.
See a few people with their phones up.
You failed already.
It's a cyber security talk.
No, I'm joking.
You should be careful about scanning QR codes and be mindful of phishing links, but you can manually go to that link, linode slash web directions, all right?
It's safe.
It's going to be up in the corner of a few slides.
Okay?
I'm here to talk about formjacking.
Sometimes it's known as clickjacking or web skimming, or Magecart, as it's popularly referred to.
It's a technique where attackers inject malicious code into a website and try to extract data from that website.
Typically, that data is then submitted off to another third party or to a server that the attacker has control over, okay?
So there's a three-part process.
You need to get the code in, have the malicious code constructed, and then extract the data.
More often than not, this process is about taking out PII, personally identifiable information, or credit card and payment information, and sending it off somewhere else. The data is collected in full, and then it's usually sold on the dark web.
We've done some studies at Akamai and elsewhere, and we see that most of the web actually uses some third-party JavaScript, right?
Think about all your widgets, all your marketing scripts, all these different things.
The size of those JavaScript packages is increasing exponentially.
And when we do a scan over those packages, about 83 percent or more of them have a known vulnerability.
So this is a visualization of what that looks like, right?
You've got your central website, and then your site calls on third-party, fourth-party, and fifth-party resources, depending on what you're loading.
Scripts are good, right?
Third parties are good.
It creates a good customer ecosystem, good customer experience.
We want to be able to integrate different things on our website, whether it's payment gateways or social media or different things to help us build our websites.
So it's not always bad.
This is a website from a company that has given a talk here. If someone wants to guess, you might be able to recognize it on the next slide as I drill down. You can see here they've got a lot of third-, fourth-, and fifth-party resources. And when I drill down again, this is actually Atlassian. So Atlassian are loading from 62 different third-party domains on their homepage, and that's 62 potential areas of attack.
And this is happening all the time, right?
There's a warning from the FBI, and there are lots of different examples of companies and websites being compromised with this type of attack.
It's not always third party.
Sometimes it's first party, actually.
Sometimes your own scripts are vulnerable to these attacks, or they do get compromised.
But more often than not, we're talking about a supply chain attack, where you're leveraging scripts or resources from these third parties, and they are vulnerable.
And I talked about this server that collects the data.
This is known in the security landscape as a command and control server, or C2 server: an endpoint that the code in the front end can send the collected data to, so it can be siphoned off.
All right, but how is this happening, right?
Let's talk about a few different things and ways in which maybe you could be compromised.
But the first thing is, it's all okay, right?
We're all talking about AI; Web Directions has had a lot of AI focus.
So AI is going to solve everything, right?
Actually, there's a study that came out recently that found out that about 35 percent of all the code that ChatGPT recommends doesn't even exist.
It hallucinates, right?
It makes up that code.
We heard about that hallucination yesterday in the keynote.
A really good example of that is here, where somebody has asked for Node code to connect to a database.
ChatGPT gives back a suggestion of a module for ArangoDB.
That npm package doesn't exist.
You can go and search for it and it never existed.
So somebody was quick to work that out and then created a module under that name to call off to a third party and siphon off some data.
It has a preinstall hook that runs the index file when the package gets installed.
Snyk and these other security tools don't even pick it up, right?
There's no vulnerabilities found.
But if you monitor the traffic, it's actually collecting all this data about the host that it's running on, and all sorts of things are being sent off to a third-party server.
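One way to catch this pattern before installing is to look for install-time lifecycle scripts in a package's manifest. The following is a minimal sketch, not a complete audit: the function name `find_install_hooks` and the sample manifest are assumptions made for illustration, and a hook being present does not by itself mean a package is malicious.

```python
import json

# npm lifecycle scripts that run automatically during `npm install`
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def find_install_hooks(package_json_text: str) -> dict:
    """Return the install-time lifecycle scripts declared in a package.json."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return {name: scripts[name] for name in INSTALL_HOOKS if name in scripts}


# Hypothetical manifest resembling the pattern described in the talk
sample = json.dumps({
    "name": "example-db-helper",
    "version": "1.0.0",
    "scripts": {"preinstall": "node index.js", "test": "jest"},
})

for hook, command in find_install_hooks(sample).items():
    print(f"{hook}: {command}")
```

Anything this reports is a prompt to read the referenced file before installing, since vulnerability scanners only know about issues that have already been reported.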
There's lots of examples of this.
Actually, here's one where UAParser.js was compromised.
Lots of big companies were using this file, and it was getting about 7 million weekly downloads.
And so it's amazing to think files like this are being compromised and being used by these large organizations.
Another really good example of where things might go wrong is the case of a guy called Dominic in New Zealand.
Somebody came to him and said, hey, I can see you're maintaining this open source project.
I'm happy to take it off your hands for a fee.
And in doing so, the person then put some crypto malware into the module and pushed that out to all these different services.
There is a great quote from Jake Williams, who looks at malware online, and he says this is everything that's wrong with open source.
A guy builds an npm module, transfers control of the module to another user, and then that's used to deploy malware, right?
So it happens time and time again.
If your build server, be it Jenkins or Bamboo or others, is publicly facing the internet, chances are that somebody's trying to attack it, right?
And Jenkins in particular has had vulnerabilities where you've been able to bypass the authentication using the Groovy script console.
Once you have that authentication, you can then manipulate files on the file system, right?
Including JavaScript and other bits of code.
Or, you may be compromised in full.
You could have your server compromised.
There's a whole thing around how your application may be compromised through what's known as the OWASP Top 10.
I'm not going to talk about that.
It's a lot of detail here.
There is a whole separate talk I could give around how your backend could be compromised, right?
Attackers could get into your server, manipulate the code, manipulate your JavaScript.
And I know many of you, hardcore devs in here are going, look, there's standards in place already to protect this, right?
We've got things in place to protect these kinds of mechanisms already, particularly on the front end, right?
Let's talk about those, right?
Let me jump in.
X-Frame-Options, right?
This is a header where you can specify that only certain domains can host your pages in frames, and it protects you against cross-site scripting and those kinds of things.
Often what we find now is that websites aren't really using frames much anymore.
Some of the third parties are, but people don't really use frames, so it's not really something that we look at much these days.
And so again, it goes from third party to first party, but there's a really good example of an attack that happened to British Airways a few years ago.
They had an S3 bucket that was open.
An attacker came in, added only a few lines of extra code to Modernizr, and all of a sudden data gets siphoned off.
It was collecting PII and credit card information from the users with only a few lines of code, because an S3 bucket was open and they could come in and make a change to a JavaScript file.
And this is your job, right?
As devs, cloud is a shared responsibility model.
Cloud vendors are going to give you the infrastructure but you need to look after your own code.
What about sub resource integrity?
That's a good option, right?
We create a checksum, we can tell the browser what the checksum is of the contents of the file, and then we know whether it's been tampered with, right?
That's a lot of work.
I think it's really repetitive stress injury.
What we find with a lot of these widgets, particularly like these social widgets, these big tech companies, some of them are doing like thousands of deployments every day.
So how can you create an SRI for this thousands of times every day when these widgets are changing?
It makes it really hard to do.
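To make the checksum idea concrete, here is a small sketch of how an SRI value is derived: a SHA-256, SHA-384, or SHA-512 digest of the file contents, Base64-encoded and prefixed with the algorithm name. The helper name `sri_hash` and the two sample script bodies are assumptions for illustration only.

```python
import base64
import hashlib


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Build a Subresource Integrity value such as 'sha384-<base64 digest>'."""
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError("SRI only allows sha256, sha384 or sha512")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


# Two hypothetical releases of the same third-party widget
original = b"console.log('widget v1');"
updated = b"console.log('widget v2');"

print(sri_hash(original))
print(sri_hash(updated))
```

The two printed values differ even though only one character changed, which is why a vendor that ships many times a day forces the integrity attribute to be regenerated on every release.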
We've done our own research at Akamai, and we found that when we analyzed hundreds of thousands of JavaScript files over a 90-day period, only about 25 percent of them were still in use at the end of that period, across all these different third parties.
And 75 percent of them had turned over completely.
So they no longer were in use.
And so it becomes very hard to maintain some of these things.
Okay, what about content security policy, right?
That's a good option.
This is where, again, you have a header and you can say which particular host names or domains are allowed to do certain things or access certain resources on the page.
Now this is GitHub.
I think we have a talk from GitHub next, but this is crazy, right?
Look at their CSP.
It's massive, right?
So many different hosts.
And if I parse this out, right?
They still have third parties in there, right?
Optimizely, for example: they don't have control over what Optimizely is doing, or whether it gets breached or compromised.
They do have Azure buckets and S3 buckets that potentially are at risk, and so CSP is not always a good option, right?
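Parsing a policy out like this can be sketched in a few lines. The example below splits a CSP header into directives and flags script sources that sit on shared hosting, where anyone can create a bucket. The suffix list, the function names, and the sample header are all assumptions for illustration; this is not GitHub's actual policy, and the host matching is deliberately simplistic (it ignores ports and wildcards).

```python
# Hypothetical suffixes for shared hosting where anyone can create a bucket
SHARED_HOSTING_SUFFIXES = (
    ".amazonaws.com",
    ".blob.core.windows.net",
    ".storage.googleapis.com",
)


def parse_csp(header: str) -> dict:
    """Split a Content-Security-Policy header into {directive: [sources]}."""
    policy = {}
    for part in header.split(";"):
        tokens = part.split()
        if tokens:
            # A repeated directive is ignored after its first occurrence
            policy.setdefault(tokens[0].lower(), tokens[1:])
    return policy


def shared_hosting_sources(policy: dict) -> list:
    """Return script sources that point at shared hosting domains."""
    sources = policy.get("script-src", policy.get("default-src", []))
    flagged = []
    for source in sources:
        # Drop any scheme and path so only the host part is compared
        host = source.split("://")[-1].split("/")[0].lower()
        if host.endswith(SHARED_HOSTING_SUFFIXES):
            flagged.append(source)
    return flagged


# Made-up header for demonstration
header = (
    "default-src 'none'; "
    "script-src 'self' cdn.vendor.example https://assets.s3.amazonaws.com; "
    "img-src *"
)
print(shared_hosting_sources(parse_csp(header)))
```

A flagged source is not necessarily a problem, but it is a host whose contents depend on bucket permissions being right, which is the kind of risk described above.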
It's a lot of work, right?
It's like a full time job on top of my full time job.
Just to do CSP.
But what happens when you do have CSP on and your third parties do get compromised, right?
This is where the real problem lies, right?
You trust a third party and then they become compromised, right?
And so not only are CSPs hard to implement, but you need to trust what those third parties are doing.
There's a really good example here where Google Analytics was used for web skimming to circumvent the CSP, right?
What the attackers worked out was that, even though there was a CSP header in place, it allowed Google Analytics, so people could create code that would override the UA ID and siphon off the data.
This is from Sansec, who discovered this: the attackers are collecting all the different data and sending it off to their own Google Analytics account as a custom field, having overwritten the page's custom field, right?
And so it's quite interesting the way this is done.
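Because the destination host is legitimate, a host allowlist cannot tell these requests apart; the tracking ID is what differs. As a rough sketch, assuming you have a log of outbound request URLs and know your own property IDs, you could flag Google Analytics hits that carry someone else's `tid` parameter. The ID values, URLs, and function name below are hypothetical.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical property ID owned by the site
OWN_TRACKING_IDS = {"UA-11111111-1"}
ANALYTICS_HOSTS = {"www.google-analytics.com", "google-analytics.com"}


def foreign_analytics_hits(request_urls: list) -> list:
    """Return Google Analytics requests whose tracking ID is not ours."""
    flagged = []
    for url in request_urls:
        parsed = urlparse(url)
        if parsed.hostname not in ANALYTICS_HOSTS:
            continue
        tracking_ids = parse_qs(parsed.query).get("tid", [])
        if any(tid not in OWN_TRACKING_IDS for tid in tracking_ids):
            flagged.append(url)
    return flagged


# Made-up request log for demonstration
observed = [
    "https://www.google-analytics.com/collect?v=1&tid=UA-11111111-1&t=pageview",
    "https://www.google-analytics.com/collect?v=1&tid=UA-99999999-9&t=event&cd1=abc",
    "https://cdn.vendor.example/app.js",
]
print(foreign_analytics_hits(observed))
```

This only works on traffic observed at runtime, which is the point: the policy header alone cannot express "this host, but only for my account".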
There are a bunch of tools, like static site scanners and CSP tools and a whole bunch of things, but they don't always work, so we need to be mindful of what we need to check.
Let me show you some more examples.
Some really interesting ways that hackers are getting in and circumventing some of your websites.
Today's advanced exfiltration techniques are things like encryption and obfuscation, which I'll show you. There's a method known as steganography. There's execution on action, anti-debugging, injecting a fake form, checking network requests, and using known services like Google Analytics,
or pretending to be one of those known services, okay?
So these are the ways that people are trying to exfiltrate your data at the moment.
And I'll go through some tangible examples of some of this, right?
You don't even need to do it yourself.
There are groups out there that you can hire.
Some hacker groups that you can go and pay a thousand bucks.
They'll give you 24 7 support for the code, right?
And then they'll go and create this malicious code for you.
And then they'll deploy it, right?
So you can even get these things done for hire.
There was an example with an e-commerce site, Newegg, that worked out that they had some malicious code on their website.
This is it here, 15 lines of code, right?
jQuery: check certain fields, particularly the credit card field, and when there's a mouseup or a touchend, send the field off to a third party.
Very simple code, right?
The problem was, static scanners don't pick up on this kind of thing, right?
Because if a static scanner came and reviewed all the code, nothing would happen unless there was a mouseup or a touchend, so it wouldn't pick up on this.
It needed to know that the network request was being made.
So we don't always pick up on these things.
But we're seeing this time and time again with web skimmers that try to mimic Google Analytics or the Meta pixel and so on, and often what they will do is a Base64 encoding of the hostname.
So if you're looking at the code, you actually can't see the hostname, you don't see what it is.
And then they will decode the hostname before they make the outbound AJAX call.
So the hostnames are hidden.
If you're reviewing the code, the third party code, you may not even pick up what's going on.
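A reviewer can partly undo this trick by decoding suspicious string literals. The sketch below scans JavaScript source for quoted Base64-looking strings and reports the ones that decode to something shaped like a hostname or URL. The regular expressions are rough assumptions, the sample snippet and hostname are made up, and a skimmer that splits or double-encodes its strings would slip past this.

```python
import base64
import binascii
import re

# Quoted string literals made only of Base64 characters, at least 12 long
B64_LITERAL = re.compile(r"""["']([A-Za-z0-9+/]{12,}={0,2})["']""")
# Very rough hostname or URL shape; an assumption for illustration
HOST_LIKE = re.compile(
    r"^(https?://|wss?://|//)?([a-z0-9-]+\.)+[a-z]{2,}(/\S*)?$", re.IGNORECASE
)


def hidden_hosts(js_source: str) -> list:
    """Return decoded Base64 literals that look like hostnames or URLs."""
    found = []
    for literal in B64_LITERAL.findall(js_source):
        try:
            decoded = base64.b64decode(literal, validate=True).decode("ascii")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid Base64, or not text
        if HOST_LIKE.match(decoded):
            found.append(decoded)
    return found


# Build a made-up snippet that hides a hypothetical hostname
encoded = base64.b64encode(b"stats.collector.example").decode("ascii")
snippet = f'var h = atob("{encoded}"); fetch("https://" + h + "/c");'
print(hidden_hosts(snippet))
```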
So often the code is obfuscated, it's really hard to read, you don't know what's going on.
Again, because it's third party, you don't know line by line what it's supposed to be doing.
But once you deobfuscate the code, you can see that it's actually looking for a number of different fields, be it name, first name, and all these different things, right?
And so they're trying to hide what the code is doing, for malicious reasons.
Often it is just this mechanism of trial by fire, right?
It's going to check for a whole bunch of different fields.
It's looking for zip, looking for country fields, do these fields exist on the page?
And if they do, capture the data and send it off somewhere.
These screen grabs were from Malwarebytes.
Sometimes the formjacking technique is that they hide or remove the original form on the page and then pop up their own form that looks and feels exactly like your form.
So this is a mechanism where they will overlay or replace your data capture forms or your payment forms and then try to send the data off first.
This is a live example from one of the analyses we've done at Akamai, where you see that the site will ask you for your payment details, and once you've entered them and hit place order, an error pops up.
But behind the scenes, this has actually sent the data off somewhere else.
Then it shows you the original form and asks the user to re-enter the details because there was an error, and then the order goes through as normal, right?
So the user doesn't know.
The user just thinks maybe there was an error on the site, or there was a hiccup or what have you, and tries again.
But meanwhile, the data's been siphoned off somewhere else.
This is an interesting one here.
So we want to go and check the code and see what's happening.
Maybe open up DevTools. But if you have a look, not only has it created a domain that looks like a CDN domain, like jQuery static, hosting a static version of jQuery that's actually a fake one, it's also got code in there that says: if DevTools is open, don't do anything. Don't send off the data, don't make a network request, don't run any JavaScript, right?
Similarly, if it's Google or maybe some other static scanner that's coming in looking for these kinds of things, don't do anything either.
Just return, right?
And they're getting crafty, so that in some cases you can't go and inspect through the developer tools to see what's going on, right?
It's clever.
This is quite an interesting one.
Where the attackers got access to the back end server and instead of having the data sent off to a remote third party, what they did was they wrote JavaScript that would collect that PII and the payment data, send it to its own server, so it was going back to the server that it was hosting on, but it was encoding that data as an image file and then putting the image file on the same server.
And then it removed the code. So it was capturing the PII data for an amount of time, then removed the malicious code, and those image files were sitting there. Later on they came and crawled all the images to get the data out.
So it's quite interesting to see.
Here's an example where it's calling a PNG file, but it's some JavaScript trying to load in some other JavaScript.
And so what happens is the browser is ignoring the extension.
It doesn't matter that it says .png, because the Content-Type header of the response says it's application/javascript.
So the browser will interpret it as JavaScript.
So when you're looking at the code on face value, it looks like a logo file, but it's JavaScript that's coming back.
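This mismatch is straightforward to check for when you have the response headers. Here is a minimal sketch, assuming you can observe each requested URL together with its `Content-Type`; the extension list, the media type list, and the example URL are illustrative assumptions rather than an exhaustive set.

```python
from urllib.parse import urlparse

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".ico", ".webp")
SCRIPT_TYPES = (
    "application/javascript",
    "text/javascript",
    "application/x-javascript",
)


def image_url_serving_script(url: str, content_type: str) -> bool:
    """True when a URL that ends like an image is served as JavaScript."""
    path = urlparse(url).path.lower()
    # Ignore parameters such as "; charset=utf-8"
    media_type = content_type.split(";")[0].strip().lower()
    return path.endswith(IMAGE_EXTENSIONS) and media_type in SCRIPT_TYPES


# Hypothetical observation resembling the case described above
print(image_url_serving_script(
    "https://cdn.vendor.example/img/logo.png?v=3",
    "application/javascript; charset=utf-8",
))
```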
Again, a similar sort of technique where a logo file is being accessed.
And then if you look at the underlying code, you can actually see at the top there it says PNG.
So a PNG file has been encoded into that file.
But then later on, there's some code that's another block down the bottom.
And if you decode that.
It's some JavaScript doing some malicious things, okay?
It's actually the JavaScript code embedded into the image.
It gets very clever.
What they will do is say, only retain the last 34,905 bytes.
I only want the JavaScript content in that particular image file.
And then it discards the PNG content, which is very interesting.
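A PNG file ends with an IEND chunk, so anything after it is extra payload that an image decoder ignores. The following sketch walks the chunk structure and returns whatever trails the end marker. The helper names are assumptions, the demo file is a structural stub built in memory (signature, header chunk, end chunk) rather than a viewable image, and this does not cover payloads hidden inside chunks or pixel data.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def data_after_png_end(blob: bytes) -> bytes:
    """Return any bytes that follow the IEND chunk of a PNG file."""
    if not blob.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(blob):
        length, chunk_type = struct.unpack(">I4s", blob[offset:offset + 8])
        offset += 8 + length + 4  # chunk header + data + CRC
        if chunk_type == b"IEND":
            return blob[offset:]
    raise ValueError("no IEND chunk found")


def make_chunk(chunk_type: bytes, data: bytes = b"") -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type and data."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


# Structural stub: 1x1 header chunk and an end chunk, no pixel data
header_data = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
clean_png = PNG_SIGNATURE + make_chunk(b"IHDR", header_data) + make_chunk(b"IEND")
tampered_png = clean_png + b"console.log('appended');"

print(len(data_after_png_end(clean_png)))
print(data_after_png_end(tampered_png))
```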
So that's that again.
Here's one where it's WebSockets, so it keeps an open connection to a remote server, and the remote server decides what it should do.
As the WebSocket connection stays open, the server will check what page it's being loaded from, right?
And so in this case, it'll check the referrer header, and it won't do anything unless it sees that it's on a checkout page.
And you could be browsing, doing different things on the site, but until you hit the checkout page, the malicious code won't be sent back down through the WebSocket.
This is another technique where something like a static scanner wouldn't pick it up.
There is JavaScript that's encoded in the error handler of an image tag.
And when the image tag loads and no image is loaded, that error handler will fire off that JavaScript.
So the JavaScript is hidden as a broken image.
It helps bypass some of those scanning tools that may inspect the network up front.
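Inline handlers like this are at least visible in the markup. As a rough sketch, the parser below collects every `<img>` tag that carries an `onerror` attribute so a reviewer can read the handler. The class name and the sample markup are assumptions for illustration, and handlers attached later from script would not show up here.

```python
from html.parser import HTMLParser


class InlineHandlerFinder(HTMLParser):
    """Collect <img> tags that carry an inline onerror handler."""

    def __init__(self) -> None:
        super().__init__()
        self.findings = []  # list of (src, handler) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        handler = attributes.get("onerror")
        if handler:
            self.findings.append((attributes.get("src") or "", handler))


# Made-up markup for demonstration
markup = """
<img src="/logo.png" alt="logo">
<img src="/missing.png" onerror="loadExtra()">
"""

finder = InlineHandlerFinder()
finder.feed(markup)
print(finder.findings)
```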
Often these attackers will purchase a domain that looks similar to a legitimate domain.
So your end users, who maybe are not super technically savvy, may not notice, and even you, on a quick glance, may not notice.
Things like Google Tag Manager dot com without the e, or facedook.host. They're creating these fake domains, and in this example there is the RocketLoader.js file that's hosted under a malicious domain that looks very similar to a normal Cloudflare domain.
It's still being loaded over HTTPS, but because it's on www.http.ps, with the .ps TLD, it looks like a normal URL, and they've issued their own certificate.
So it's still being issued over HTTPS.
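Lookalike names can be caught by measuring how close a hostname is to one you already trust. The sketch below uses plain edit distance on the label just before the TLD. The list of known hosts is a hypothetical allowlist, the helper names are assumptions, and taking the second-to-last label is a simplification that ignores multi-part suffixes such as .com.au.

```python
from typing import Optional

# Hypothetical allowlist of hosts the site really depends on
KNOWN_HOSTS = ("googletagmanager.com", "facebook.com", "cloudflare.com")


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def lookalike_of(hostname: str, max_distance: int = 2) -> Optional[str]:
    """Return the known host this name appears to imitate, or None."""
    hostname = hostname.lower().rstrip(".")
    if hostname in KNOWN_HOSTS:
        return None  # exact match, nothing to flag
    labels = hostname.split(".")
    if len(labels) < 2:
        return None
    brand = labels[-2]  # simplification: label just before the TLD
    for known in KNOWN_HOSTS:
        if edit_distance(brand, known.split(".")[-2]) <= max_distance:
            return known
    return None


for candidate in ("facedook.host", "googltagmanager.com", "example.org"):
    print(candidate, "->", lookalike_of(candidate))
```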
Okay, here's another one, where your favicon is being loaded and the content type is image. But they've actually gone to the trouble of creating a clone of a real website, a real icon website: the malicious version is myicons.net, and the real one is iconarchive.com.
myicons.net doesn't do anything except load a normal image under normal circumstances.
If you're on a checkout page, it looks at the referrer and then sends back a fake form, HTML and JavaScript.
So it's doing the dynamic check to see when it should load, what it's loading, right?
And so if you're just googling for an icon, and then you put that on your website, it may not look like it's doing anything under normal circumstances, but once you hit the checkout page, it's going to show malicious credit card forms to your users.
Interesting one here: it's only going to fire if it sees that Stripe is on the page.
So if Stripe's on the page, then it does its magic.
Otherwise, it doesn't do anything, right?
And in this example, it creates a fake iframe to try to look and feel like a normal payment checkout.
You can see down the bottom, it's actually sending that iframe off to some malicious domain somewhere.
Interestingly, this one will look like Google Analytics.
So even as a developer, if you're looking at this, it looks like normal GA code that needs to fire, but it's actually this hostname that gets loaded, and it's the atob method that decodes the value of this variable into the hostname, right?
So it looks like it's loading Google Analytics, but it's not.
There is another example here, where an icon file is being loaded from GitHub, there are references to it in code, and it actually ends up loading JavaScript under the hood.
And so that's hosted on GitHub, and there might be cases where people are copying code from Stack Overflow, or other things are happening there.
It's a publicly referenceable image on GitHub that's actually loading JavaScript.
This one, again, is for the devs, where you're looking and you're saying, okay, this is my jQuery UI, this looks fine. But the first part of it is jQuery UI, and then inline is this malicious script that sends data off to this third-party Google bucket, and then jQuery UI continues to load, right?
Looking at it at face value, it looks like jQuery UI, but in fact it's something else, doing something else.
And this is a very interesting one actually.
So they created a web skimmer to do domain name generation.
So it would dynamically create a subdomain of the malicious hostname.
So if you're trying to do fingerprinting or domain reputation or something like that, it would randomly generate a C&C server hostname under that domain.
And so typical tools that hard-code the malicious hostname into their checks wouldn't pick this up.
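The difference between matching a hard-coded hostname and matching the parent domain can be shown in a few lines. In this sketch the blocklist entry and the generated subdomain are hypothetical, and the function name is an assumption; it only helps when the generated names share a parent domain, as described above.

```python
# Hypothetical blocklist entry for a known malicious parent domain
BLOCKED_DOMAINS = {"skimmer-collector.example"}


def is_blocked(hostname: str) -> bool:
    """True if the hostname or any of its parent domains is blocklisted."""
    labels = hostname.lower().rstrip(".").split(".")
    # For "a.b.c" this checks "a.b.c", then "b.c", then "c"
    return any(".".join(labels[i:]) in BLOCKED_DOMAINS for i in range(len(labels)))


generated = "x7f3a9.skimmer-collector.example"
print(generated in BLOCKED_DOMAINS)  # exact match misses the random subdomain
print(is_blocked(generated))         # suffix match catches it
```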
And then the last one here is the Pipka attack, and Visa have actually written a white paper about this.
Very interesting piece of code.
It would do web skimming on a bunch of checkout pages, but after it had done some skimming, it would delete itself.
So it would get there, get injected, capture some data and then delete itself off the page.
Very interesting code.
So you can check that out online if you're interested.
And that's how it's done.
Okay?
So there's lots of different ways.
Your site could be compromised.
Your third parties could be compromised.
You've seen the different methods that are available.
And how this can be done, whether it's creating fake domains, encoding JavaScript in images, or loading from third parties that could have malicious code.
And so the question is, should it matter, right?
A lot of it in cybersecurity now is not if you're going to get breached, it's when.
Because it's happening every day.
All of these big companies are being breached all the time.
And so what do you need to do about it as devs here?
There are major implications for some of these breaches.
So with that British Airways example that I talked about earlier, the European Union were proposing to fine British Airways 329 million as a part of that, right?
And that was one of the first companies that was going to be fined under GDPR, because all this PII and payment data went off to a third party.
There's also something that's come out recently called PCI DSS version 4.
And so previously, some companies might say it's not my problem, right?
We just overlay Stripe or we have a PayPal checkout button and that's PayPal's problem, right?
With PCI DSS 4, it now also becomes your problem.
So if you're doing payments, or if you're doing any sort of transactions online, you also need to have those controls in place, and you also could be liable.
So this is a big issue for companies now, they need to make sure that they are compliant.
The deadline for that is 2025, so people are working on it now.
The cost of a breach this year is heading towards an average of 4.45 million, and it's happening all the time.
But as I said, there are regulatory concerns, there's revenue loss, loss of customer trust: big problems.
So what do you do now?
It's easy.
I've got the answer for you.
Don't load JavaScript on your site, right?
No.
You need something that will monitor scripts in real time.
These implications I've talked about are real.
You need to make sure that your third parties are not exfiltrating your data.
Domain reputation plays a massive role.
We're doing a lot of research in this space at Akamai.
We're doing something called client side protection and compliance.
So there's a whole bunch of research that we've published about that.
But ultimately, you want to consider how your JavaScript changes over time.
That you're not getting data exfiltrated.
That third parties don't have access to sensitive fields and there aren't vulnerabilities in that code.
That you have the ability to have domain reputation checks in place.
And you know what your third parties are doing.
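Tracking how your JavaScript changes over time can start very simply: fingerprint every script the page loads and compare against the last snapshot. This is a minimal sketch of that idea only, not a description of any product. The function names, URLs, and script bodies are made up, and it assumes you already have a way to collect the script contents.

```python
import hashlib


def fingerprint(scripts: dict) -> dict:
    """Map each script URL to the SHA-256 hex digest of its body."""
    return {url: hashlib.sha256(body).hexdigest() for url, body in scripts.items()}


def diff_inventory(before: dict, after: dict) -> dict:
    """Report which script URLs were added, removed, or changed."""
    shared = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(url for url in shared if before[url] != after[url]),
    }


# Two made-up snapshots of the scripts loaded by a checkout page
yesterday = fingerprint({
    "https://cdn.vendor.example/widget.js": b"render();",
    "https://pay.vendor.example/checkout.js": b"pay();",
})
today = fingerprint({
    "https://cdn.vendor.example/widget.js": b"render(); send(form);",
    "https://pay.vendor.example/checkout.js": b"pay();",
    "https://stats.unknown.example/t.js": b"track();",
})
print(diff_inventory(yesterday, today))
```

A changed or added entry is not proof of compromise, since third parties ship updates constantly, but it tells you which scripts to look at first.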
We've published a whole bunch of the research that I've talked about today on akamai.com/blog, if you want to check it out, and that is my talk.
Thank you.
That's me on Twitter and LinkedIn, and if you're interested in joining our tech meetup, there's the link as well.
And I think we have a couple of minutes for questions.