Monday, April 27, 2009

Who Cloned the Web Site? Here's How to Tell...

One of the problems regularly encountered when dealing with phishing, fraud and other flavors of counterfeit Web site content is tracing who the original perpetrator of the crime was.

Given the stateless nature of Web application technologies and the abundance of tools capable of conveniently cloning and creating "off-line" copies of popular transactional Web sites, it's damned near impossible to tell where the copy came from unless you can uniquely "seed" the content in some way.

Over the years, dozens of financial institutions around the world have asked me about the best techniques and technologies for seeding Web application content and tagging it in such a way that it's possible to figure out who the original "copier" was - without alerting them to the fact.

There are a number of techniques available to Web application designers and architects, but I've found the best solutions (i.e. least detectable and least prone to tampering or removal) revolve around tagging the images used within the application.

It's not an easy solution to implement, but the principle of "Distribution Tracing" can be applied to Web applications in the form of anti-fraud images.
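
As a rough illustration of the image-tagging idea (not the method described in the whitepaper), here is a minimal Python sketch. It assumes the server can derive a short tag per visitor session and hide it in the least significant bits of an image's decoded pixel bytes. The names `SECRET_KEY`, `make_tag`, `serve_image` and `trace`, the in-memory `issued` table, and the raw byte buffer standing in for a decoded image are all hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"example-server-secret"  # hypothetical key, for illustration only
TAG_BYTES = 8                          # 64-bit tag truncated from the HMAC

issued = {}  # hypothetical server-side record: tag -> (session, IP, timestamp)


def make_tag(session_id: str, client_ip: str, issued_at: int) -> bytes:
    """Derive a short tag the server can later map back to a session."""
    message = f"{session_id}|{client_ip}|{issued_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).digest()[:TAG_BYTES]


def embed_tag(pixels: bytes, tag: bytes) -> bytes:
    """Write the tag bits into the lowest bit of successive pixel bytes."""
    bits = [(byte >> shift) & 1 for byte in tag for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("image too small to hold the tag")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract_tag(pixels: bytes, tag_len: int = TAG_BYTES) -> bytes:
    """Read the tag back out of the lowest bit of each pixel byte."""
    tag = bytearray()
    for byte_index in range(tag_len):
        value = 0
        for bit_index in range(8):
            value = (value << 1) | (pixels[byte_index * 8 + bit_index] & 1)
        tag.append(value)
    return bytes(tag)


def serve_image(pixels: bytes, session_id: str, client_ip: str, issued_at: int) -> bytes:
    """Tag an image for one visitor and remember who received that tag."""
    tag = make_tag(session_id, client_ip, issued_at)
    issued[tag] = (session_id, client_ip, issued_at)
    return embed_tag(pixels, tag)


def trace(pixels: bytes):
    """Look up who was originally served the image found on a counterfeit site."""
    return issued.get(extract_tag(pixels))
```

Calling `trace()` on an image recovered from a counterfeit site would return the session details recorded when that copy was served, or `None` if no known tag is present. A least-significant-bit tag like this one is fragile: it would not survive lossy recompression or resizing, so treat it as a demonstration of the principle only.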

I've finally had a chance to knock up a whitepaper describing the relative merits of the different techniques after all these years, and you can find it on my main Web site under the topic "Anti-fraud Image Solutions".

Now I'm sure some people are going to question the merits of the solution - and rightly so. I'm not an overly strong proponent of this kind of tracking solution. It needs to be used carefully and with an expert eye in order to yield prosecutable results, but for some organizations (particularly financial services organizations) it adds an extra arrow to their quiver in hunting down criminals who try to defraud their customers.

So, here's a question for readers (after they've read the paper of course)... can you name some of the large international banks that have already implemented Distribution Tracing within their secure customer portals?

The whitepaper's abstract:

The Use of Distribution Tracing Within Web Content to Identify Counterfeiting Sources

Many of today’s more successful Internet-based fraud tactics require the counterfeiting of popular transactional Web sites such as financial portals, stock-trading platforms and online retail sites. For the fraud to be successful, the cyber-criminal must typically clone most, if not all, of the targeted site’s content and host the counterfeit site on a Web server under their control. With some minor modifications to the underlying HTML code and changes to the application logic, the cyber-criminal will seek to steal the personal authentication or authorization credentials of unlucky victims who fall for the counterfeit site. Armed with these credentials, the cyber-criminal will subsequently attempt to defraud the accounts of their victims.

The major subclass of this attack is often referred to as “phishing” and typically targets the customers of major financial organizations, with the cyber-criminal’s end goal being the removal of monies from their victims’ bank accounts. However, over time, phishing attacks have increasingly targeted a broader range of online consumers.

One key problem facing organizations targeted by these cyber-criminals is the identification of the perpetrators. While it is sometimes a simple task to shut down or have removed a counterfeit site, it is much more difficult to uncover the identity of those responsible for its creation.

Since the counterfeit sites are predominantly clones of a legitimate site, there are a number of techniques that can be employed by an organization to essentially “embed” a key into the duplicated content that can later be used to trace back to the original source of the content.

This whitepaper provides an overview of the techniques available to organizations that wish to undertake such identification activities – evaluating the pros and cons of the various mechanisms and providing advice on how to employ this class of investigative technology.

<PDF of Anti-Fraud Image Solutions>


  1. have you read ?

  2. looks interesting from a "who's using my image" perspective, but it fails to identify the watermarked images used by some of the online banks - so I don't think it's useful in that context.

    With regard to the book link for "Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Program Protection" by Jasvir Nagra and Christian Collberg - it hasn't been published yet. I hope I haven't stolen their thunder on the topic if they were planning on covering the Web application image perspective - but this is a technique that's been used carefully for a few years now.

  3. Excellent whitepaper - it's now in my "key resources" folder. The techniques may also be useful for code/data leakage tracing: