Monday, July 14, 2014

Software development experience

The more related information you see, and the earlier you see it, the better.
The more flexibility and the fewer assumptions you have, the better.
Make the workflow clear up front; don't unnecessarily hide parts of the workflow inside inner functions.
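
A minimal sketch of what "workflow up front" could look like in Python. The function names (parse, is_passing, format_report, main) and the "name, score" input format are made up for illustration; the point is only that main reads as the list of steps, instead of one step secretly calling the next.

```python
def parse(line):
    # Hypothetical input format: "name, score"
    name, score = line.split(",")
    return name.strip(), int(score)


def is_passing(record, threshold):
    return record[1] >= threshold


def format_report(records):
    return ", ".join(name for name, _ in records)


def main(lines, threshold=50):
    # The whole workflow is visible here, one step per line.
    records = [parse(line) for line in lines]
    passing = [r for r in records if is_passing(r, threshold)]
    return format_report(passing)


print(main(["ann, 70", "bob, 40", "cy, 50"]))
```

If parse had called the filter and the report itself, main would be one line and the reader would have to dig through three functions to find out what happens.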

Effective coding
Use the method first, then define it.

1. Sequential coding - Do not worry about syntax errors while typing; fix them at the end. Write the method, then use it.
2. No design time - Code while planning and plan while coding. Code whatever you can first, and add information when necessary.
3. Use first - Use first, then declare.
4. Use the keyboard - it is faster.
5. Use logic - not trial and error.
6. Copy and paste - whenever code can be duplicated.
7. Standardize naming conventions - be disciplined, so you do not need to decide each time.
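
"Use first, then declare" can be sketched in Python roughly like this. The names summarize, normalize and count_unique are hypothetical. The caller is written first, using helpers that do not exist yet; the helpers are filled in afterwards. This works because Python looks the names up when summarize runs, not when it is defined.

```python
def summarize(words):
    # Written first: normalize and count_unique are not defined yet
    # at this point in the file.
    cleaned = normalize(words)
    return count_unique(cleaned)


# Declared afterwards, once the caller has shown what is needed.
def normalize(words):
    return [w.lower().strip() for w in words]


def count_unique(words):
    return len(set(words))


# The call must come after all the definitions above have executed.
result = summarize(["Apple", "apple ", "Pear"])
print(result)
```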

Monday, July 7, 2014

False positives matter more than false negatives for a malware classifier

1. While this suggests that some malicious contexts are not being classified correctly, for most purposes, having
high overall accuracy and low false positive rate are the most important attributes of a malware classifier.

Reference: ZOZZLE: Fast and Precise In-Browser JavaScript Malware Detection
2. A "false positive" is when antivirus software identifies a non-malicious file as a virus. When this happens, it can cause serious problems. For example, if an antivirus program is configured to immediately delete or quarantine infected files, as is common on Microsoft Windows antivirus applications, a false positive in an essential file can render the Windows operating system or some applications unusable.[32] Recovering from such damage to critical software infrastructure incurs technical support costs and businesses can be forced to close whilst remedial action is undertaken.[33][34] For example, in May 2007 a faulty virus signature issued by Symantec mistakenly removed essential operating system files, leaving thousands of PCs unable to boot.[35]
Also in May 2007, the executable file required by Pegasus Mail on Windows was falsely detected by Norton AntiVirus as being a Trojan and it was automatically removed, preventing Pegasus Mail from running. Norton AntiVirus had falsely identified three releases of Pegasus Mail as malware, and would delete the Pegasus Mail installer file when that happened.[36] In response to this Pegasus Mail stated:
On the basis that Norton/Symantec has done this for every one of the last three releases of Pegasus Mail, we can only condemn this product as too flawed to use, and recommend in the strongest terms that our users cease using it in favour of alternative, less buggy anti-virus packages.[36]
In April 2010, McAfee VirusScan detected svchost.exe, a normal Windows binary, as a virus on machines running Windows XP with Service Pack 3, causing a reboot loop and loss of all network access.[37][38]
In December 2010, a faulty update on the AVG anti-virus suite damaged 64-bit versions of Windows 7, rendering it unable to boot, due to an endless boot loop created.[39]
In October 2011, Microsoft Security Essentials (MSE) removed the Google Chrome web browser, rival to Microsoft's own Internet Explorer. MSE flagged Chrome as a Zbot banking trojan.[40]
In September 2012, Sophos' anti-virus suite identified various update mechanisms, including its own, as malware. If it was configured to automatically delete detected files, Sophos Antivirus could render itself unable to update, requiring manual intervention to fix the problem.

Reference: http://en.wikipedia.org/wiki/Antivirus_software

3. A further complication of in vivo filtering is the asymmetry in error costs. Judging a legitimate email to be spam (a false positive error) is usually far worse than judging a spam email to be legitimate (a false negative error). A false negative simply causes slight irritation, i.e., the user sees an undesirable message. A false positive can be critical. If spam is deleted permanently from a mail server, a false positive can be very expensive since it means a (possibly important) message has been discarded without a trace. If spam is moved to a low-priority mail folder for later human scanning, or if the address is only used to receive low priority email, false positives may be much more tolerable. In an essay on developing a Bayesian spam filter, Paul Graham [16] describes the different errors in an insightful comment: "False positives seem to me a different kind of error from false negatives. Filtering rate is a measure of performance. False positives I consider more like bugs. I approach improving the filtering rate as optimization, and decreasing false positives as debugging."

Reference: "In vivo" spam filtering: a challenge problem for KDD


4. Reference: A False Positive Prevention Framework for Non-Heuristic Anti-Virus Signatures
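
The asymmetry in error costs described in the notes above can be sketched as a threshold choice. This is only an illustration, not a method taken from any of the referenced papers: the scores, labels, and the 10-to-1 cost ratio are invented, and error_counts and pick_threshold are hypothetical helper names. A sample is flagged when its score is at or above the threshold, and the threshold with the lowest total cost wins.

```python
def error_counts(scores, labels, threshold):
    # labels: True means the sample really is malicious (or spam).
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    return fp, fn


def pick_threshold(scores, labels, fp_cost=10.0, fn_cost=1.0):
    # Candidate thresholds: every observed score, plus "flag nothing".
    candidates = sorted(set(scores)) + [float("inf")]

    def cost(threshold):
        fp, fn = error_counts(scores, labels, threshold)
        return fp * fp_cost + fn * fn_cost

    # On ties, min() keeps the lowest candidate threshold.
    return min(candidates, key=cost)


# Invented example data, for illustration only.
scores = [0.1, 0.4, 0.6, 0.7, 0.9]
labels = [False, False, True, False, True]

print(pick_threshold(scores, labels, fp_cost=1.0, fn_cost=1.0))   # equal costs
print(pick_threshold(scores, labels, fp_cost=10.0, fn_cost=1.0))  # FP is 10x worse
```

With equal costs the chosen threshold on this toy data is 0.6, which accepts one false positive. Once a false positive is priced at ten times a false negative, the threshold moves up to 0.9: one malicious sample is missed, but nothing legitimate is flagged.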