In programming, anonymity can be both a blessing and a curse…
Not so long ago, the benefits of anonymity outweighed the disadvantages. Users could consume, comment, and converse freely without fear of identification. A 1993 New Yorker cartoon summed it up perfectly with the caption ‘On the Internet, nobody knows you’re a dog’. Then again, nobody knows if you’re a cyberbully or an online troll either.
Apply this to the world of programming, where cybercrime and hacking are rife, and the downsides of anonymity become clear. Amid concerns over data protection and company integrity, it has become all the more important for companies to deliver products and services that are traceable and transparent. The rise of cybercrime has also pushed organisations to be especially diligent. Are the days of programmer anonymity numbered?
Decoding artificial language with artificial intelligence
At this year’s DefCon hacking conference, Professor Rachel Greenstadt of Drexel University and Assistant Professor Aylin Caliskan of George Washington University showcased their stylometry-based research. Stylometry is the study of linguistic style, but Greenstadt and Caliskan’s work took it in a new direction. Using a machine learning program, they applied stylometry to an artificial language… in other words, code. After studying samples of a given programmer’s work, the program could de-anonymise coders with up to 83 per cent accuracy.
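The researchers’ published approach reportedly extracted lexical, layout, and syntactic features (including properties of the abstract syntax tree) and fed them into a classifier. As a rough illustration of the general idea only, not their actual method, the toy sketch below profiles each author by character n-gram frequencies (a crude stand-in for stylistic habits like indentation and naming) and attributes an unknown snippet to the most similar profile via cosine similarity. All function names and sample data here are hypothetical:

```python
from collections import Counter
import math

def ngrams(code, n=3):
    """Count character n-grams: a crude proxy for coding style
    (indentation habits, naming conventions, spacing)."""
    return Counter(code[i:i + n] for i in range(len(code) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_profiles(samples):
    """samples maps author -> list of code snippets; returns one
    aggregate n-gram profile per author."""
    profiles = {}
    for author, snippets in samples.items():
        profile = Counter()
        for snippet in snippets:
            profile += ngrams(snippet)
        profiles[author] = profile
    return profiles

def attribute(code, profiles):
    """Attribute a snippet to the author with the most similar profile."""
    vec = ngrams(code)
    return max(profiles, key=lambda author: cosine(vec, profiles[author]))

# Toy example: two authors with deliberately distinct styles.
samples = {
    "alice": ["def add_items(item_list):\n    total = 0\n"
              "    for item in item_list:\n        total += item\n"
              "    return total\n"],
    "bob": ["def addItems(itemList):\n  t=0\n"
            "  for i in itemList: t+=i\n  return t\n"],
}
unknown = ("def count_words(word_list):\n    total = 0\n"
           "    for word in word_list:\n        total += 1\n"
           "    return total\n")
predicted = attribute(unknown, build_profiles(samples))  # -> "alice"
```

The snake_case names and four-space indentation of the unknown snippet match ‘alice’ far more closely than ‘bob’, so the classifier picks her. Real systems need far richer features than character n-grams, which is precisely why the Drexel work’s syntactic analysis was notable.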
Interestingly, the program was developed with funding and support from the US Army Research Laboratory. Given the threat of international cybercrime, this isn’t surprising. In theory, the system could be used to identify the creators of malicious code or malware, which has obvious advantages for governments and other large organisations. Tech companies, for example, could use it to help their security teams track the origin of malware and computer viruses. Consumer demand for transparency has also made it vital for businesses to know how – and by whom – their digital operations are run. A further application is plagiarism detection, which is equally relevant to the corporate and educational worlds. The problem, however, is that decreased anonymity blurs the already fine line between security and privacy.
Anonymity vs. privacy
Being able to identify specific programmers using the system devised by Greenstadt and Caliskan clearly has its uses. However, there are instances in which remaining anonymous in the tech community is preferable. Take, for example, Satoshi Nakamoto, the creator or creators of Bitcoin. Nobody knows who Nakamoto is, but this is perhaps a good thing. Bitcoin is decentralised by design, and trust should rest in the ledger rather than the creator. Without an identifiable leader, the technology itself remains the focus. Another example comes from the open source community. If organisations or individuals could work out exactly who had contributed code, programmers might be discouraged from publicising their work, and contribution could stagnate. Ironically, a desire to be more open about who is developing what may well have the opposite effect.
It is important for organisations to know where their code has come from, but is it necessary to know exactly who created it? Rather than unveil the identities of individual coders, Greenstadt and Caliskan want to use their machine learning program to understand coding patterns. But the implications of being able to identify programmers are clear, especially for large organisations that need to protect both themselves and their consumers’ data. Businesses need to be accountable, but not to the extent that programmer privacy is compromised. Otherwise, the coding community could quickly face a drought of developers.
For DISRUPTIONHUB’s insights straight to your inbox, sign up here.