If you don’t know what VirusTotal is yet, it’s a “Free Online Virus Malware (and URL) scanner”. Although many users of VirusTotal rely on the front-end service to check whether any of a large number of anti-virus engines flags a file as malware, the service is more like Facebook and Google, where the regular users are the ones actually providing the service. On the back end, VirusTotal is a project of Hispasec Sistemas that relies on the public contribution of new files to provide samples to anti-virus companies for analysis. I have a few concerns with how the front-end service is being used and how it is being provided.
I only have anecdotal evidence of this, but I suspect that a large number of malware authors are using VirusTotal to scan their own software. This is risky for the malware author, because VirusTotal indicates that if just one anti-virus engine identifies the software as malware, a sample of the software will be sent to the other anti-virus engine vendors that didn’t identify it as malware.
However, if none of the anti-virus engines identify the software as malware, the author not only knows that they have a great 0-day to release, but also that they have just stored cached results on VirusTotal indicating that the file is clean. Many VirusTotal users will now be led to believe that the software is clean, because they will be presented with those cached results when they scan the file themselves using VirusTotal.
VirusTotal should always reanalyze the file with the current anti-virus definitions and optionally offer historic results to those interested in them.
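To make that suggestion concrete, here is a minimal sketch of the behavior I have in mind. It is not how VirusTotal works internally, and every name in it (`Report`, `current_report`, the `scan` callback, the `definitions_updated_at` timestamp) is hypothetical. The idea is simply that a stored result is reused only if it is at least as new as the latest definitions; otherwise the file is rescanned and the older results are kept as history.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Report:
    verdict: str          # e.g. "clean" or "malware" (hypothetical labels)
    scanned_at: datetime  # when the engines last analyzed the file


def current_report(data, history, definitions_updated_at, scan, now):
    """Return (latest_report, older_reports) for a submitted file.

    history: dict mapping a SHA-256 hex digest to a list of Report objects,
             oldest first.
    scan:    hypothetical callback that runs the engines and returns a verdict.
    now:     timestamp to record if a fresh scan is performed.
    """
    key = hashlib.sha256(data).hexdigest()
    reports = history.setdefault(key, [])

    # Rescan when there is no stored result, or when the stored result
    # predates the most recent definitions update.
    if not reports or reports[-1].scanned_at < definitions_updated_at:
        reports.append(Report(verdict=scan(data), scanned_at=now))

    # The newest report is the answer; everything before it is history.
    return reports[-1], reports[:-1]
```

With something like this, a “clean” result stored by a malware author before the engines caught up would be shown only as a historic result, not as the current answer.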
Since I’ve had to explain to VirusTotal users the problems with trusting VirusTotal’s results, I know that users aren’t always reading the VirusTotal TOS and FAQs. VirusTotal needs to explain exactly what they’re providing and how it might be used, on the front page, or with the results that they provide.
VirusTotal’s file scanning is being offered with no service level agreement, scanning is not being performed in a typical environment, cached/old results are often being displayed, your submission will be stored indefinitely and may be shared with third parties, and results that do not indicate the presence of malware do not indicate the absence of malware. Comments from members of VirusTotal’s “community of trust” can do the opposite of warning users, by offering false assurances.
VirusTotal should clearly provide these warnings up front and rein in their VT Community project.
Not enough transparency
It’s probably not a good idea for the VirusTotal team to detail the environment that they run the anti-virus engines in, because the malware authors would quickly adapt and make VirusTotal useless for Hispasec’s intended purpose. However, it would be nice to at least know which options the anti-virus engines are being run with, such as what heuristic levels are being used.
Without this type of information, VirusTotal is rather useless even as the research project that it should be. Citing VirusTotal is therefore like citing Wikipedia.
Additionally, once a file is submitted, control is lost over that data. There is no way to know how it is stored, for how long, and to whom it will be revealed. If they track an IP address to you, will they be able to show the world all of the files that you submitted? It’s likely that you violated a software license by submitting any files that you didn’t write and that aren’t open source.
Not all bad
VirusTotal is adding new features and new anti-virus vendors. They obviously have some goal in mind, and they are likely meeting their customers’ needs. Unfortunately, those customers are the anti-virus vendors, and not the users. VirusTotal may have a purpose for its users. I just haven’t figured out exactly what it is yet. Please let me know how you currently use VirusTotal and how you would like it to change to meet your needs.