Aug 09, 2019 newsletter

Julia Developer Survey results indicate Julia may be the language of the future

The Julia community recently took part in the Julia Developer Survey following JuliaCon 2019, where developers gathered to discuss the future of the language and its growing impact on the world of data science. Julia is generating serious developer interest: according to the TIOBE index, a method for measuring the popularity of programming languages, Julia is now ranked 35th after first breaking the top 50 languages just three years ago. Unsurprisingly, Julia was one of the fastest-growing languages in 2018.

The most popular features of Julia were its performance, ease of use, and open source license. When asked why they first tried Julia, most developers stated that they believed Julia is the language of the future.

Julia is a high-performance language that works particularly well for scientific computing. Stylistically similar to Python but with the performance of lower-level languages like C++, Julia combines high-level code and an intuitive syntax with production-grade performance. In doing so, Julia solves the infamous two-language problem, a common issue in the data world: data scientists prefer to code in high-level languages like R and Python, but performance-critical parts of the codebase must then be rewritten in C or C++. Rewriting code is a costly duplication of effort that wastes time and developer resources. By operating across the entire development stack, Julia can noticeably accelerate the development of complex data-driven software.

While Julia may be the language of the future, it must overcome a number of hurdles before it can seriously compete with Python, the data science behemoth. Julia’s most significant problem is its package ecosystem: the language’s packages aren’t as mature or well maintained as they need to be. Python, by contrast, benefits from a rich ecosystem of tools and relies on a massive community of developers to extend its functionality.

Python is likely to remain the default data science language for many years, but Julia will continue to expand its influence as machine learning and complex data projects become mission-critical for more companies. If successful, Julia can radically streamline the relationship between software development and data science and begin a new era of robust data engineering.


DeepCode raises $4M to uncover bugs using models trained on millions of lines of open source code

DeepCode, a Swiss startup building an intelligent code review tool, raised $4M to expand its machine learning models for automatic vulnerability detection. Unlike other tools that simply review imported packages and dependencies, DeepCode can detect more complex issues, such as cross-site scripting and SQL injection vulnerabilities, by understanding the intent behind the code, not just syntax mistakes. DeepCode currently supports Java, JavaScript, and Python, but hopes to expand to C#, PHP, and C/C++.

DeepCode is trained on thousands of open source projects, mostly public repositories on GitHub. By ingesting GitHub’s rich commit history, DeepCode understands changes in code so that it can infer where bugs may have been in the code and what changes were needed to fix them. According to CEO Boris Paskalev, "On average, developers waste about 30% of their time finding and fixing bugs, but DeepCode can save half of that time now, and more in the future."

DeepCode and similar tools are starting to automate code creation, either through smarter code reviews or more accurate code completion. DeepCode, however, adds an extra level of complexity: whereas code completion tools like TabNine are trained by taking snapshots of code to provide suggestions, DeepCode compares snapshots across commits to build its models. By understanding how open source code changes over time, DeepCode can recommend similar changes to developers.
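
To make the snapshot-versus-commit distinction concrete, here is a minimal Python sketch of the general idea of mining before/after line pairs from two versions of a file. It is not DeepCode's actual pipeline, which is not public in this story: the function name changed_line_pairs and the two sample snapshots are hypothetical, and the diffing uses Python's standard difflib module.

```python
import difflib


def changed_line_pairs(before: str, after: str) -> list[tuple[str, str]]:
    """Return (old, new) pairs for lines that were replaced between two snapshots.

    Hypothetical helper for illustration only; a real system would work on
    parsed code and many commits, not raw lines from a single file.
    """
    old_lines = before.splitlines()
    new_lines = after.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)

    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            # Pair replaced lines positionally; unpaired leftovers are ignored.
            for old, new in zip(old_lines[i1:i2], new_lines[j1:j2]):
                pairs.append((old.strip(), new.strip()))
    return pairs


# Hypothetical snapshots of one file before and after a bug-fix commit.
BEFORE = """def last_item(items):
    return items[len(items)]
"""

AFTER = """def last_item(items):
    return items[len(items) - 1]
"""

pairs = changed_line_pairs(BEFORE, AFTER)
for old, new in pairs:
    print(f"before: {old}")
    print(f"after:  {new}")
```

A completion tool trained on snapshots would only ever see one of these two versions. A tool trained on commits sees the pair, which is what lets it learn that the first line tends to be rewritten into the second.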

DeepCode also joins a growing number of startups that are working to leverage huge swathes of open, public coding data. As the host for many of the developer world’s open source repositories, GitHub holds a powerful position as the aggregator of terabytes of valuable coding data. GitHub is likely to develop its own capabilities or, as it has done with Pull Panda and Dependabot, acquire fledgling companies. Coding data is GitHub’s most valuable asset, and the company will likely try to exploit it as it turns GitHub into a hub for code automation and intelligence.


Class-action lawsuit filed against GitHub for hosting instructions to hack Capital One

Last week GitHub and Capital One were accused of negligence in the recent exposure of 106 million individuals’ personal data. The perpetrator behind the attack, a former AWS engineer, was arrested by the FBI for unlawfully gaining access to Capital One’s AWS S3 bucket to copy customer data, in violation of the US Computer Fraud and Abuse Act. The hacker was able to bypass a misconfigured web application firewall to steal roughly 140,000 US Social Security numbers, 80,000 bank account numbers, and 1 million Canadian social insurance numbers. Capital One only discovered the breach after a GitHub user alerted them to a GitHub post documenting the attack.

The hacker created a GitHub Gist that included instructions on how to download Capital One's customer data and shared those instructions with her friends. GitHub argued that the post contained no Social Security numbers, bank account information, or other stolen data. The Gist did, however, describe the methods used to steal the data, and GitHub took it down once notified by Capital One.

The lawsuit accuses GitHub of "failure to monitor, remove or otherwise recognize and act upon obviously-hacked data that was displayed, disclosed and used on and by GitHub and its website" and, as a result, "the Personal Information sat on GitHub.com for nearly three months."

GitHub takes a passive role in policing content on its platform, preferring to remove content only when requested by a third party. Should GitHub be required to scan for potentially illegal content, similar to how Facebook and Twitter operate, or will that interfere with its reputation as an open platform? GitHub can actively scan repositories for vulnerabilities and private package credentials, leaving open the possibility that code could be analyzed in other ways, such as flagging hackers with malicious intent. As GitHub gets smarter at helping developers patch code vulnerabilities, expect growing pressure from the world beyond the repository platform for it to take more responsibility for containing data breaches and other security issues.


Small bytes

  • Deconstructing the monolith: designing software that maximizes developer productivity [SHOPIFY]
  • Reclaim unreasonable software: fixing post-apocalyptic code [IRRATIONAL EXUBERANCE]
  • How to write clean code that reduces headaches [TOWARDS DATA SCIENCE]
  • A guide to talent stacking as a developer [DEV.TO]
  • An introduction to domain-driven design: a model for building rich, evolving software [KHALIL STEMMLER]
  • Perils of constructors: a philosophical discussion of the complexity of constructors [MATKLAD]

Tools

  • Dolt lets users collaborate on databases in the same way they collaborate on source code [GITHUB]
  • Lefthook is the fastest polyglot Git hooks manager to make sure not a single line of unruly code makes it into production [LEFTHOOK]
  • Serverless Components enables you to deploy entire serverless use-cases, like a blog, a user registration system, a payment system or an entire application — without managing complex cloud infrastructure configurations [GITHUB]
  • Hackathon-starter is a boilerplate for Node.js web applications [GITHUB]
  • Real Dev is a platform for developers to show off real technical skills [REAL DEV]
  • Codelines is a Visual Studio Code extension that lets you create programming articles inside the IDE by integrating rich text with your code [CODELINES]
  • JSONBase provides API-based JSON storage [JSONBASE]
