Fly on the wall at a core development team meeting

When you work alongside a development team every day, as I do, there is very little mystique about how the team functions, communicates or works towards producing new releases. But it strikes me that those who mainly see the end result may be curious about what happens behind the scenes.

At the moment, we are actively working towards a new major release, HPCC Systems 7.0.0. So I thought you might like to hear how that is coming along, while getting a window into the world of one of our meetings.

Our third party tools (build process, security, ECL Watch, documentation etc.) are developed by a team that is predominantly based in the USA, while the core components are developed by a team located in the USA and Europe. Each team meets independently and the whole development team meets weekly by conference call. Maybe I’ll blog as a fly on the wall from one of these meetings sometime, or from one of their week-long offsite retreats. But for this one, we’re eavesdropping on a core development team meeting.

The core development team meets weekly by conference call and in person once a month. This is due mainly to geography, since we all work remotely from home and are dispersed between the USA, UK and Ireland. The meeting location rotates around the homes of the European contingent and those in the USA dial in as needed. It’s as much a social gathering as it is a work meeting. While we’ve all worked remotely from home for a very long time, none of us are robots and it’s good to have this contact. Often there will be a group whiteboard coding session focusing on a project that benefits from a ‘heads together’ approach. It’s clear to me that while much is achieved by these coding sessions, the developers also really enjoy the collaboration.

For me, these meetings are a chance to sit in on discussions about planned features and to check up on progress made. I usually use the journey time to harangue Richard about a few things on my list, since he’s a captive audience for a while. So as we make our way to the train station, we’re already talking about our plans to flip the switch on some new JIRA database changes we’ve been planning to improve the issue triaging and monitoring process. We set up a test database a while ago which the team reviewed. Now that HPCC Systems 6.4.0 has gone gold, it seems like a good time to bite the bullet and make that switch. 

Moving on, we start talking about HPCC Systems 7.0.0. We already know this release is all about improving the user experience. There’s a lot on the provisional list, so this is the chance for a high level update and the team will discuss progress made on the core components later.

Coffee first

This month, we are meeting in London, although John will join us from the USA by phone and Gordon will dial in from Ireland. As is customary, coffee and some kind of second breakfast is provided. Today Jake has provided chocolate biscuits. We go around the room adding to the agenda.

First up is a quick discussion about the HPCC Systems Community Day Summit coming up in October. Trish is organising this and needs details of the presentations. Some nagging is needed. It’s a hardship but, ah well if I must…

Immediately after the conference, the core team are having an offsite meeting and it’s clear that some of the topics on today’s agenda will also feature there and may get coded at the same time.

Previous offsite team building session. From the left: Richard Chapman, Mark Kelly, Shamser Ahmed, Tony Fishbeck, Gavin Halliday, Gordon Smith, Yanrui Ma, Jake Smith, Attila Vamos.

Although our discussions relate to planned features and improvements, some of these are not overtly visible to our users. Richard starts by talking about the work he has been doing that will allow us to generate C++11 code (ECL is translated into C++). He asks the question, is it worth it? The consensus is yes, because the features included make some things more efficient; for example, loading workunits will be faster. But it needs a bit of work and some adjustments will need to be made to our code generator, plus we need to consider the wider impact it may have. Someone cynically reminds the others that C++17 has just been ratified!

We completed some work in the 5.0.x series to improve the workunit statistics in ECL Watch and are extending this work in 7.0.0. With this in mind, Gavin and Shamser start discussing the new scope iterator. This is another behind the scenes change, but it’s significant to the planned UI improvements, including the new ECL Watch Dashboard, which Gordon is working on. It’s already possible to get some useful statistics about your running workunit, but the new workunit Dashboard will make it easier to access and display those statistics. The scope iterator is what makes it possible to do this in a consistent way. There’s a discussion about how far they have got already and what needs to be done to enable users to see sub-graphs instantly, as well as improving the responsiveness for large graphs. Enough statistics improvements have already been completed behind the scenes to allow Gordon to move on with the fast syntax check, another new feature for HPCC Systems 7.0.0.

We break for lunch. Some effort is made to talk about something other than work. For some reason, it’s what jobs will still be done by humans 30 years from now. Someone says beautician. I don’t know what I was expecting, but it wasn’t that.

Lunch break

We reconvene. Gavin is up again for the next topic. People say the worst slot for keeping people’s attention is immediately after lunch. Jake is prepared, providing more good coffee and some sugar in the form of lemon torte for dessert to give us all some zing!

Gavin and Richard have been working on the remote project and filter update feature for dafilesrv (dafilesrv is used to read data remotely from other machines in a cluster that contains stored data files). The idea behind this feature is to push details about which fields and rows are needed all the way back to dafilesrv, rather than pulling all the data back and then filtering it. This will extend dafilesrv, making it a processing engine for local processing, but it will need a new interface because it will be stream based rather than random block. It needs to be Java and external call friendly, and the guys start bouncing around ideas about the types of syntax it needs to support. JSON, XML, plain text and SQL all get a mention as examples of the formats users will want to use to describe the data they want to pull from dafilesrv. Gavin says for now we will support project and filter only, although eventually we should support some aggregation too.
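To make the idea concrete, here is a minimal sketch in Python of the difference between pulling everything back and pushing the project and filter down to the source. It is only an illustration: the names (`REMOTE_ROWS`, `remote_read_all`, `remote_read`) are made up for this example and have nothing to do with the real dafilesrv interface, which was still being designed at the time of the meeting.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, object]

# Hypothetical stand-in for a data file stored on a remote machine.
REMOTE_ROWS: List[Row] = [
    {"id": 1, "name": "Ann", "city": "London", "age": 34},
    {"id": 2, "name": "Bob", "city": "Dublin", "age": 51},
    {"id": 3, "name": "Cat", "city": "London", "age": 27},
]


def remote_read_all() -> Iterator[Row]:
    """Pull style: every row and every field is sent to the caller."""
    for row in REMOTE_ROWS:
        yield dict(row)


def remote_read(fields: List[str], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Push-down style: filter and project at the source, then stream the result."""
    for row in REMOTE_ROWS:
        if predicate(row):
            yield {f: row[f] for f in fields}


def fields_transferred(rows: List[Row]) -> int:
    """Crude measure of how much data crossed the (imaginary) network."""
    return sum(len(r) for r in rows)


wanted = ["id", "name"]


def in_london(row: Row) -> bool:
    return row["city"] == "London"


# Pull everything back, then filter and project locally.
pulled = list(remote_read_all())
local_result = [{f: r[f] for f in wanted} for r in pulled if in_london(r)]

# Ask the source to do the filter and project before sending anything.
pushed = list(remote_read(wanted, in_london))
```

Both routes give the same answer, but in this toy example the pull style moves twelve field values across the wire while the push-down style moves four. The sketch streams rows one at a time through a generator, which loosely mirrors the stream based (rather than random block) interface mentioned in the discussion.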

John is working on the Spark integration which would be a client of this new feature alongside Thor and maybe Roxie. Jake says that perhaps the method they are discussing should be considered as a separate service completely since it’s nothing like dafilesrv.

Silence; but you can hear the thinking going on and see the stroking of chins. Then agreement. There’s been quite a bit of to-ing and fro-ing on this one to make sure everyone is happy with the implementation, particularly John. It will no doubt be revisited at the offsite meeting.

We move on to discuss Dali (the ‘control centre’ for the storage of files, information and how these are accessed). Given the ever increasing amount of data, we need to plan for the future, providing a more scalable storage solution. It’s a huge undertaking affecting a component upon which the whole system depends. We’ve already implemented a Cassandra workunit storage feature to take that load off Dali. We are now looking to move DFS out of Dali in a way that will improve the efficiency of how data is stored. We’ve already done some performance testing, but we need to see how a Cassandra based system works in comparison with a Dali based system. Richard makes the point that ESP services aren’t always efficient or scalable, going on to say that one benefit of using Cassandra is that it doesn’t allow inefficient queries.

While this project won’t be complete for 7.0.0 due to its size and far reaching impact, Jake will be working on some preliminary tasks which need to be done before the final implementation is complete. This is good to know and feed back to our users, particularly those who have been asking about this recently. I’m sure they will be happy to know progress is being made and a solution is in the pipeline.

Attila mentions that he plans to work on the round robin spray for 7.0.0. The current spray implementation processes the source file twice, determining the theoretical partitioning size and then generating the partition table. For CSV files, this can take a while to determine the row/record terminator characters. The round robin approach is designed to eliminate the partition table generation, so the source file is only processed once, which makes it much faster.
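For the curious, the single pass idea can be sketched in a few lines of Python. This is not the HPCC Systems spray code, just a toy illustration under simplifying assumptions: records are newline terminated, quoted newlines inside CSV fields are ignored, and the function name `round_robin_spray` is invented for the example.

```python
import io
from typing import List, TextIO


def round_robin_spray(source: TextIO, num_parts: int) -> List[List[str]]:
    """Deal records out to parts in turn while reading the source exactly once.

    No partition table is built up front: record i simply goes to part
    i % num_parts as it is read.
    """
    parts: List[List[str]] = [[] for _ in range(num_parts)]
    for index, line in enumerate(source):
        parts[index % num_parts].append(line.rstrip("\n"))
    return parts


# Hypothetical five-record CSV source sprayed across three parts.
source = io.StringIO("a,1\nb,2\nc,3\nd,4\ne,5\n")
parts = round_robin_spray(source, 3)
```

The trade-off visible even in this toy version is that each part no longer holds a contiguous slice of the source file, so the original record order is interleaved across the parts.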

Richard and Gavin chime in that if he’s going to do that, he may as well also have a go at the projected despray they have been talking about. This is a work in progress which involves providing the ability to despray a subset of fields, file parts or records of a source file. Attila says, ok, he’ll have a go and report back later.

What else is on the agenda? Hurry up someone and think of something else quick or we have to discuss Lorraine’s. This is a standing joke.

But I get my way in the end and we move on to talk about whether there are any hardware changes coming up that we may need to consider for 7.0.0. I’m thinking about the AWS cloud; there’s still an issue with using static IP addresses instead of floating ones. This will be discussed at the offsite. Also, Jon Burger’s name comes up. I know he was having some problems with his Fusion Roxie but apparently I’m behind the times, as his issues seem to have been resolved. There’s also his Hive 360 project, but all seems in hand with this too as far as we know. I must ask him and check. Jake mentions that we need to think about some long term planning for scenarios where users will want to use even more slaves, channels and cores. Maybe this is something else for the next offsite or the one after that.

When?

So, before they all start to pack up and leave to catch trains, I need an answer to the all important question that I’ve been building up to all day. What timescale are we looking at here then for HPCC Systems 7.0.0? It’s a pre-emptive strike really, because it’s still a bit of a moving target. But let’s just say I have a premonition that I am going to be asked this a lot soon, so I need to be prepared to give an idea of when it may be available. In fact, two days later Flavio asks me the very question!

Looks like we’re aiming for the end of the first quarter in 2018. With that, the meeting is over and we join the regular rush hour commuters as we travel home.

You’ll be able to get a sneak preview slightly earlier by downloading one of the release candidates as we close down towards the gold version. Look out for related announcements in our Community Forum and on our social media channels early in 2018.

Who’s who and what they do…

Lorraine Chapman – New release and intern program co-ordinator
Richard Chapman – Team leader and Roxie guru
Gavin Halliday – Code generator guru
Jake Smith – Anything related to Thor and Dali
Attila Vamos – Expert on DFU spraying and keeps us honest with his automated testing
Shamser Ahmed – Code generator and statistics improvements
Gordon Smith – Anything UI related (e.g. ECL Watch, Visualizations and ECL IDE)
John Holt – Machine Learning guru
Mark Kelly – Thor, networking and performance tweaks
Tony Fishbeck – All things ESP and Roxie packages
Yanrui Ma – ESP, especially ESDL
Flavio Villanustre – HPCC Systems evangelist leading the open source initiative
Trish McCall – Works with corporate and academic partners to build and promote our community
Jon Burger – Infrastructure and cloud guru

The Provisional HPCC Systems 7.0.0 feature list: