If you’re a developer, perhaps you’ve landed on one of the many articles at InfoQ, an online publication for software developers. They also organize conferences for the developer community (specifically tech leads, architects, project managers and the like) all around the world, called QCon. This was my first time attending a QCon, which happened to take place in London, March 7-9, 2016. I’ll go straight to my notes and then dwell a bit on my impressions of the conference in general.
- Modules in JDK 9 probably won’t break your code
- Distributed systems are not trivial. Read the research papers and learn from forerunners to get it right. Try testing in production
- Work on your gratitude. It’ll turn crap to gold
- Node.js is not Java. Learn to use async, avoid classes and make use of the npm ecosystem
The Longer Story
Now that we’ve got the executive summary out of the way, let’s dive into what I as a developer learned from QCon London 2016. There were many sessions, and I’ve yet to find out how to hack the space-time continuum, so this is by no means a complete summary of the whole conference.
Modularity in JDK 9
The modularity that Project Jigsaw brings to the upcoming JDK 9 might not be the breaking change it has been feared to be. Yes, it breaks backwards compatibility, but only for non-official APIs like sun.*. If you stick to java.* and javax.* there seem to be no worries. Also, you’re no longer allowed to use a single underscore as an identifier in your source code. If that breaks your code, I’d say you have bigger problems than JDK 9. Modules are even automatically inferred from JAR names if not specified otherwise.
Conclusion: Keep calm and carry on
Microservices

We are still struggling to get microservices right. This is expected, since we have gained scalability and resilience by allowing more complexity. Some pitfalls (and some good advice) were highlighted:
- Starting with a full-scale microservices system may be overwhelming. Start small and then divide into microservices
- Remember that changes in data format (database schemas) are also API changes for the services accessing the data. Consider having a gatekeeper service be the only one accessing the data source. Offer multiple versions of the format to help consumers migrate to newer versions, preferably with semantic versioning (e.g. 1.2.1 = MAJOR.MINOR.PATCH, where a major bump signals a breaking change, a minor bump a new feature, and a patch bump a fix)
- Spiky load on your services means that they have to be scaled for peak load at all times. Smooth it out using queues as buffers between services. The trade-off is that you’ll have to deal with the flow being asynchronous.
- A distributed system is a challenge for operations. Make sure you don’t use hardcoded IP addresses and ports; use a discovery service or a centralised router (proxying traffic). At these points you can also add circuit breakers that keep track of unresponsive services, so that you avoid requests piling up (dog piling)
- Monitor using graphs, alerts and pages. Make sure you have centralised logging where correlation ids are generated by the initial request and travel with the execution through the system. An error deep in the system can then be put in context all the way from the point of origin
- Microservices allow for an explosion of different tech used in each service if you’re not careful. The same goes for server configuration. Make sure you don’t have any unique, irreplaceable servers (snowflakes) that take everything down when they fail. You noticed I didn’t say “_if_ they fail”, right? Have one golden configuration and automate e v e r y t h i n g (last word should be read in a Gary Oldman voice: https://youtu.be/h3ywuv8lJMs)
- Microservices are great for teams. They enable you to work independently on each service, shipping like mad. However, when one service depends on another, the boundary is often sketchy and undocumented. If your tech is somewhat aligned, consider having each team provide a client for accessing their API. The client may even have a mock switch for testing locally. Neat-o!
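To make the versioning bullet above concrete, here is a minimal sketch (all names are mine, not from the talk) that classifies an upgrade between two plain MAJOR.MINOR.PATCH version strings, so a consumer can tell whether migrating is safe:

```javascript
// Classify the jump between two semantic versions (plain MAJOR.MINOR.PATCH,
// no pre-release tags assumed). A major bump means the API broke.
function changeType(from, to) {
  const [fromMajor, fromMinor] = from.split('.').map(Number);
  const [toMajor, toMinor] = to.split('.').map(Number);
  if (toMajor !== fromMajor) return 'BREAKING_CHANGE';
  if (toMinor !== fromMinor) return 'NEW_FEATURE';
  return 'MINOR_PATCH';
}

changeType('1.2.1', '2.0.0'); // a consumer should plan a migration
changeType('1.2.1', '1.2.2'); // safe to pick up
```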
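The queue-as-buffer advice can be sketched like this. `WorkQueue` is a made-up in-memory stand-in for a real message broker; the point is only that producers push at spike rate while the consumer drains at its own steady rate:

```javascript
// A queue between two services: the producer enqueues at whatever rate
// traffic arrives, the consumer drains at a steady rate it can sustain.
class WorkQueue {
  constructor() { this.items = []; }
  enqueue(item) { this.items.push(item); }           // called per incoming request
  drain(rate) { return this.items.splice(0, rate); } // consumer pulls `rate` items per tick
  get depth() { return this.items.length; }
}

const queue = new WorkQueue();
// A spike: ten requests arrive in one tick...
for (let i = 0; i < 10; i++) queue.enqueue({ id: i });
// ...but the consumer only needs capacity for three per tick.
const firstBatch = queue.drain(3);
```

The queue depth is also a useful thing to monitor: a steadily growing depth means the consumer can no longer keep up with average load.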
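And a hedged sketch of the circuit-breaker idea from the bullet above. This is illustrative only, not the API of Hystrix or any real library: after a threshold of consecutive failures the breaker opens and fails fast, so requests stop piling up behind an unresponsive service:

```javascript
// Minimal circuit breaker: open after `threshold` consecutive failures.
class CircuitBreaker {
  constructor(threshold) { this.threshold = threshold; this.failures = 0; }
  get open() { return this.failures >= this.threshold; }
  call(fn) {
    if (this.open) throw new Error('circuit open: failing fast');
    try {
      const result = fn();
      this.failures = 0; // a success resets the count
      return result;
    } catch (err) {
      this.failures++;
      throw err;
    }
  }
}

// Guard a flaky downstream call:
const breaker = new CircuitBreaker(2);
const flakyService = () => { throw new Error('downstream timeout'); };
for (let i = 0; i < 2; i++) {
  try { breaker.call(flakyService); } catch (err) { /* handled per request */ }
}
// breaker.open is now true; further calls fail fast instead of queueing up
```

A production breaker would also half-open after a timeout to probe whether the downstream service has recovered; that part is omitted here.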
Conclusion: There’s no such thing as a free lunch
Testing in Production
The world is chaotic, so we should plan for chaos. A good time to introduce chaos into your systems is when you’re awake and working. Then you’re able to monitor and control the impact. Make sure your tests can be run in a specified scope, e.g. one user, a percentage of users, or all users. So how do you go about doing Chaos Engineering? In short:
- Define what’s considered normal behaviour in your system
- Define a test group (users, servers or clusters) and a control group
- Introduce chaos: server crash, network failure, etc
- Note the differences between the test group and the control group
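The steps above hinge on splitting traffic deterministically into a test group and a control group. A minimal sketch (the hash is a stand-in; any stable function of the user id works):

```javascript
// Assign a user to the chaos test group or the control group.
// Deterministic: the same user always lands in the same group,
// and `testPercent` bounds the blast radius of the experiment.
function bucket(userId, testPercent) {
  let hash = 0;
  for (const ch of String(userId)) {
    hash = (hash * 31 + ch.charCodeAt(0)) % 1000;
  }
  return (hash % 100) < testPercent ? 'test' : 'control';
}

// e.g. inject a server crash only for users where
// bucket(user.id, 1) === 'test'  — roughly 1% of traffic
```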
As a developer, you tend to begin designing for failure after some visits from the chaos monkey.
OK, there’s no novelty in causing havoc in production infrastructure; Netflix have been doing this for almost half a decade now. But how do you find those really nasty bugs that only appear for a certain user in a specific context and go undetected by any monitoring? How do you design and automate such tests? Netflix and the author of the paper Lineage-driven Fault Injection teamed up and are working on a way to automatically inject errors into probable execution paths in distributed systems, to find the ways they may fail. The test tool gathers all execution paths it can find and calculates which lines of execution make sense to test. This results in a very efficient and comprehensive test of the whole system. A couple of challenges were specific to Netflix when building this tool:
- They have circuit breakers (Hystrix) on most execution paths, which confused the test tool, since a failure produces an alternate execution path
- No two requests are the same. The environment and the context are constantly moving, which made it difficult to repeat requests
Conclusion: Chaos is the neighbour of God
Gratitude

Apparently we’ve come a long way when it comes to collaboration. Until recently (the last couple of centuries) we had to gang up and kill or imprison people (bosses) to drive change. Now we use politics, which in many cases has to do with passing around blame. Blame used to run downwards, from management to the factory floor. But since delivery is continuous and everything is measurable nowadays, blame also runs in the other direction, towards management, in the form of data. If management is not capable of processing negative feedback, we’re in for a rough ride, because in a constantly changing/improving world we’re always doing it wrong.
It’s not only management that feels the burn of continuous delivery, though. Let’s say it takes three interactions within a team to deploy a new version of a product. That’s not such a high degree of collaboration when you ship every week. Now imagine how many interactions are needed to ship five times a day.
So continuous delivery generates stress at all levels of the organisation. Stress causes your brain to simplify its thinking and your body to fight or flee, i.e. you become stupid. One way to deal with this is to change our mindset. What if this shit storm of constant (negative) feedback is really a rain of gold nuggets? Buddhists are thrilled to learn that they did something wrong: it’s a golden opportunity to improve. If you blame someone else, you’re left without anything to work with. You make yourself helpless. With gratitude you can face the stream of uncomfortable data.
Some advice for managers (360 Thinking):
- You’re not the expert
- You should aim to improve the situation, not fix it. It’s chaos
- You may not be able to change anything, apart from your own reaction
- Gather data
OK, there’s a lot of feedback going on in all these interactions. How do we make the feedback more effective? One way is to make sure that it leads to something. You need a feedback loop:
Feedback offered/sought > heard > actioned > offered/sought > …
If any of the steps in the loop fails, the loop breaks down and the feedback has little effect. There are models for how to give feedback, e.g. the SBI (Situation-Behaviour-Impact) model. It’s difficult and you need to practise. Receiving feedback, however, is easy to learn: just listen to the feedback and say “thank you”, nothing more. Take the feedback with you and process it later.
Conclusion: Gratitude is fearlessness, “What’s here that I can work with?”
Node.js Pitfalls

* Beware of callback hell, which leads to high memory usage and makes garbage collection wait until all nesting has been executed. Also, don’t ignore callback errors
Solution: Use async or promises
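A minimal sketch of that solution. `readConfig` is a hypothetical callback-style API; wrapping it once in a promise lets you chain or `await` instead of nesting, and keeps the error path explicit:

```javascript
// A typical callback-style API (hypothetical example).
function readConfig(name, callback) {
  setImmediate(() => {
    if (!name) return callback(new Error('no name')); // never ignore the error argument
    callback(null, { name, retries: 3 });
  });
}

// Wrap once, then chain or await instead of nesting callbacks:
function readConfigAsync(name) {
  return new Promise((resolve, reject) => {
    readConfig(name, (err, cfg) => (err ? reject(err) : resolve(cfg)));
  });
}

async function main() {
  const cfg = await readConfigAsync('service'); // flat, no pyramid of doom
  return cfg.retries;
}
```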
* Avoid exporting classes from your modules; juggling `new` and `this` adds complexity and invites subtle bugs
Solution: Export functions with interfaces instead
* Don’t pass a lot of arguments to a function. It’s confusing and not flexible
Solution: Use an options object as the argument. If you also give the object default values, you can avoid a lot of code that just checks argument values
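A sketch of the options-object pattern (the `connect` function and its parameters are invented for illustration):

```javascript
// One options argument with defaults, instead of a long positional list.
function connect(opts = {}) {
  const {
    host = 'localhost',   // callers override only what differs
    port = 5432,
    retries = 3,
    timeoutMs = 1000,
  } = opts;
  return { host, port, retries, timeoutMs };
}

connect();                // all defaults
connect({ port: 6543 });  // self-documenting at the call site
```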
* Using global variables is a common pitfall. It makes the code difficult to read and may lead to unforeseen consequences
Solution: Create a module with the variables and use it with “require”
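A sketch of that module pattern. Node caches modules, so every `require('./settings')` returns the same instance; the state lives behind an explicit interface instead of floating around globally. The module body is shown here as a factory so the sketch is self-contained — in a real project it would live in its own file (e.g. a hypothetical settings.js):

```javascript
// ≈ the contents of settings.js, wrapped in a factory for this sketch.
function settingsModule() {
  let logLevel = 'info'; // private to the module, not a global
  return {
    get logLevel() { return logLevel; },
    setLogLevel(level) { logLevel = level; },
  };
}

const settings = settingsModule(); // ≈ require('./settings')
settings.setLogLevel('debug');     // every consumer sees the same state
```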
* The application runs great in the local development environment, but has performance problems when load increases. This can happen if large amounts of data are processed in one go, delaying the event loop
Solution: Make sure data is processed in small chunks
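One way to sketch that: process a small slice per tick and yield back to the event loop with `setImmediate` in between, so pending I/O and other requests get served (the function name and chunk size are illustrative):

```javascript
// Sum a large array in chunks, yielding to the event loop between chunks.
function sumInChunks(data, chunkSize, done) {
  let total = 0;
  let i = 0;
  function step() {
    const end = Math.min(i + chunkSize, data.length);
    for (; i < end; i++) total += data[i]; // only a small slice per tick
    if (i < data.length) setImmediate(step); // let pending I/O run first
    else done(total);
  }
  step();
}

sumInChunks([1, 2, 3, 4, 5], 2, (total) => {
  // total === 15, computed without ever blocking the loop for long
});
```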
* Tests are written like typical unit tests and don’t capture whether the code behaves as expected
Solution: Test at the right level: the behaviour instead of the internals. Make sure the code is modularised. Use npm modules that are already tested instead of writing your own code
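To illustrate the difference: for a small (invented) `slugify` function, assert on input/output pairs — the behaviour — rather than on which internal helpers get called. The tests then survive refactoring:

```javascript
// Behaviour under test: turn a title into a URL-safe slug.
function slugify(title) {
  return title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, '-') // collapse any non-alphanumeric run
    .replace(/^-|-$/g, '');      // strip leading/trailing hyphens
}

// Behaviour-level assertions, no knowledge of the internals:
slugify('Hello, World!'); // → 'hello-world'
```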
In general, try to take advantage of the npm ecosystem with its thousands of modules. You can use tools like Snyk to check for known vulnerabilities in your dependencies
Conclusion: When in Rome, do as the Romans do
The Conference in General
It was the first time I attended QCon, so it was interesting to see their take on how to organize a developer conference. Here’s what I think they did right:
- Hand out a useful giveaway at the beginning to set the tone. I used the battery pack they gave me throughout the conference. Very timely and practical
- Divide the sessions in tracks or themes, so that you get how a session fits into the big picture. This has become standard in most conferences, but it’s still a good idea
- Have each track host pitch their sessions at the beginning of each day, so that you get a better understanding of each session before you go. I changed my schedule after every such presentation
- Have a nice, modern venue, preferably close to the city centre, so you’re able to take a stroll and see things when not in session. At QCon you even have a nice view of classic landmarks such as Big Ben and the London Eye (Is it moving? I can’t tell)
Conclusion: If they keep this up, and you’re not deterred by English food, I can recommend going to the next QCon in London
Wow, you made it all the way down here. You deserve a treat. So, here are my tweets from QCon London 2016, with extra material not covered above: https://twitter.com/search?q=%40ocklund%20%23qconlondon
I would also like to recommend my colleague Magnus Ljadas’s take on the same conference: https://magnusljadas.wordpress.com/2016/03/10/qcon-london-2016-mars-7-9/
Feedback is always welcome at @ocklund