Michael Targett, Senior Director for Data at FlightGlobal (www.flightglobal.com), joins us on The Download Podcast. Michael shares how FlightGlobal uses the industry’s most comprehensive fleet data to derive insights that create efficiencies and cost savings.
FlightGlobal provides the industry’s most comprehensive data and insight. It helps customers discover everything about the global fleet of aircraft and the world schedule, gain insight into current and future market dynamics and drivers, and achieve a complete understanding of the aviation industry.
- Provides the industry’s most comprehensive view of the world fleet, from an overall growth view to detailed information on individual aircraft and where they are flying.
- With insight into key market changes and external factors, FlightGlobal’s analysis influences business strategies and aids risk management.
- Reports and tailored advice help users understand the future make-up of the world fleet, the forecast of deliveries and the evolving trends that will influence the industry.
Key questions of the podcast include:
0:15- Tell us about yourself.
- Michael introduces himself as the Senior Director for Data at FlightGlobal, a company that specializes in aviation and aerospace information. Michael oversees data operations and analytics. FlightGlobal's output ranges from editorial content to key proprietary data sets covering aircraft (fleet) information, schedules, and flight status.
1:20- How do you help customers with the information your company provides?
- FlightGlobal provides a range of services and products which include web interfaces, grids, and websites that can be used to answer specific questions and provide data for them. Customers can run queries and filter the data down to answer very technical questions for people who are in business development or analytics in a variety of aviation companies including airlines, aerospace manufacturers, maintenance companies, or airline service partners.
2:25- Is this information important for airlines to maximize their margins?
- Margins are very important in the airline industry. Michael says accuracy of the data matters most: airlines know what is happening in their own operation, but they need accurate data to see how they benchmark against other airlines. It also helps them make important decisions about the company. It’s a margin game that is very dependent on the most accurate information that they can bring in. With reliable and trustworthy information, FlightGlobal makes it easier to answer critical business questions such as whether new routes should be explored or which aircraft can be profitable on certain routes.
3:50- What is the significance of FlightGlobal and what does it add to the industry?
- Michael explains that fleet information requires the aggregation of large amounts of data from many different sources to create the most accurate, up-to-date database. With the recent acquisition of FlightStats, FlightGlobal is now able to take flight status and flight positional information and aggregate it into a status record per flight for all the flights every day. Matching that flight status data against the schedule and fleet data lets FlightGlobal identify which individual aircraft flew each route, and therefore provide exact rather than estimated data on its configuration. This enables airlines to analyze seat capacity more accurately and allows services to be customized for the exact aircraft that will need them. In addition, airlines are able to compare the aircraft flying similar routes across airlines. This lets an airline differentiate itself, for example with a better in-flight entertainment system or wider seats, to gain market advantage.
7:30- How did you go about doing a proof of concept and how did you find HPCC Systems?
- Michael talks about the data that they had on hand and how early matching experiments on a traditional SQL solution showed that they needed a big data solution to build something viable in the long term, because the data would scale and grow. That is where HPCC Systems came into play. It is particularly useful when they are experimenting: tweaking the model, adjusting the features, rerunning it and seeing how that affects the results. HPCC Systems allows them to do that quickly and efficiently rather than wait for that cycle on a traditional SQL solution, which would slow down the R and D velocity.
9:00- What differences did you see when you moved over to HPCC Systems?
- With just under 30,000 aircraft in service in commercial aviation and about 100,000 flights per day, FlightGlobal needs a big data solution with significant scalability. HPCC Systems gives them the scale they need with a significant speed advantage, especially over a traditional SQL solution. Michael reports that the matching runs about 14 times quicker on HPCC Systems than their previous efforts, even on a four-node R and D cluster. With aspirations to build capability around delay prediction using weather datasets, the scalability of HPCC Systems gives them room for future growth.
10:00- Have you been able to get more accuracy and make more connections?
- The analytics are limited by the incomplete nature of the datasets. FlightGlobal uses machine learning to fill gaps in the data and make projections. For instance, where aircraft cross oceans or regions with limited observation coverage, FlightGlobal estimates the missing movements and projects forward what is likely to happen. Further, FlightGlobal receives multiple records for the same flight that differ slightly from one another. Using HPCC Systems features such as rollup and dedup to combine them has improved the efficiency of the process.
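To make the rollup idea concrete, here is a minimal sketch in plain Python. It is an analogue for illustration only: it is not ECL, and it is not FlightGlobal's pipeline. The `Ping` record, its field names, and the merge rule (keep the latest non-empty value of each field) are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Ping:
    """One status update for a flight (hypothetical record layout)."""
    flight_id: str                # assumed key, e.g. flight number plus date
    timestamp: int                # seconds since some reference time
    tail_number: Optional[str]    # aircraft registration, may be missing
    status: Optional[str]         # e.g. "scheduled", "airborne"


def rollup_pings(pings: Iterable[Ping]) -> List[Ping]:
    """Collapse all pings for a flight into a single record.

    Pings are processed in time order per flight. For each field, the
    latest non-empty value wins, so partial updates fill each other's gaps.
    """
    merged = {}
    for p in sorted(pings, key=lambda p: (p.flight_id, p.timestamp)):
        current = merged.get(p.flight_id)
        if current is None:
            # First ping seen for this flight: start a new rolled-up record.
            merged[p.flight_id] = Ping(p.flight_id, p.timestamp,
                                       p.tail_number, p.status)
        else:
            current.timestamp = p.timestamp
            current.tail_number = p.tail_number or current.tail_number
            current.status = p.status or current.status
    return list(merged.values())


# Made-up sample data: three partial pings for one flight, one for another.
sample = [
    Ping("UA888-2016-08-01", 100, None, "scheduled"),
    Ping("UA888-2016-08-01", 200, "N12345", None),
    Ping("UA888-2016-08-01", 300, None, "airborne"),
    Ping("CA985-2016-08-01", 150, "B-2088", "airborne"),
]
rolled = rollup_pings(sample)
for record in rolled:
    print(record)
```

The four sample pings reduce to two records, one per flight, with the missing registration and status filled in from whichever ping carried them.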
11:25- Can you provide us with some examples on how you are providing new value to your customers?
- Michael discusses the concept of true capacity: the exact number of seats on each flight, rather than the seat count the airline submits in the schedule. FlightGlobal analyzed six months of capacity data for airlines flying between the US and China. For one airline, in the month of August alone, the capacity stated in the schedule exceeded what actually flew by 18.5 thousand seats, an overstatement of about 17%. More accurate data of this kind helps customers avoid decisions based on inflated capacity figures, which translates directly into airline efficiency and cost savings.
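In the transcript, Michael puts the 18.5 thousand seat gap in perspective by saying it equals about two full Boeing 777-200s a day. The short sketch below checks that arithmetic. The seat and month figures come from the transcript; the 300-seat aircraft size is an assumption for illustration, since real 777-200 layouts vary by airline.

```python
# Figures quoted in the podcast transcript.
OVERSTATED_SEATS = 18_500   # scheduled seats minus seats actually flown, one airline
DAYS_IN_AUGUST = 31         # the gap was measured over the month of August

# Assumption for illustration only: roughly 300 seats per Boeing 777-200.
ASSUMED_SEATS_PER_777_200 = 300


def phantom_aircraft_per_day(overstated_seats, days, seats_per_aircraft):
    """Convert a monthly seat overstatement into equivalent full flights per day."""
    return overstated_seats / days / seats_per_aircraft


seats_per_day = OVERSTATED_SEATS / DAYS_IN_AUGUST
aircraft_per_day = phantom_aircraft_per_day(
    OVERSTATED_SEATS, DAYS_IN_AUGUST, ASSUMED_SEATS_PER_777_200
)

print(f"{seats_per_day:.0f} overstated seats per day")
print(f"about {aircraft_per_day:.1f} full 777-200s per day")
```

Under that seat assumption the gap works out to just under 600 seats a day, which is consistent with the "two full 777-200s a day" comparison.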
13:30- Who are FlightGlobal’s customers?
- Customers span all areas of aviation and aerospace. For the schedule alone, 800 to 900 airlines supply their data to FlightGlobal, which distributes it on their behalf to global distribution services, search engines, and online travel agencies.
You can reach Michael at FlightGlobal by visiting the FlightGlobal website: www.flightglobal.com
For more information:
See Michael’s presentation at the HPCC Systems Community Day on YouTube Live Stream (25:00 in the presentation)
View Michael’s presentation deck on SlideShare
Listen to Past The Download Podcast Episodes:
Arjuna Chala, Senior Director of Technology for HPCC Systems discusses the Future Technologies Conference that took place in San Francisco on December 6-7, 2016. Arjuna was a keynote speaker at the event and we discuss conference highlights and what he thinks these insights mean for the future of big data technology.
Proagrica utilizes agricultural data in a big data solution to address the challenge of a growing world population by identifying opportunities to increase yields on farms and enabling growth through agricultural efficiencies.
Raj has a background in technology consulting and has led practices on Technology Strategy, Platforms, and Architecture. Currently on his third start-up, Raj shares information on how ClearFunnel is using the open source HPCC Systems big data platform to solve customer big data challenges as well as advice on approaching big data solutions.
Watch Past The Download Webcasts:
- Anirudh Shah, Co-Founder, 3Loq
- How we use HPCC Systems to process more than 500 monthly marketing campaigns at the largest private bank in India across the bank’s entire portfolio.
- Our experience with HPCC Systems in production
- Automation and data sanity frameworks
- Allan Wrobel, Senior Engineer, LexisNexis
- Making full use of Superfiles to make order-of-magnitude improvements to build times on Thor (plus fringe benefits).
- Thor is well known for making short work of processing billions of records, and this promotes a tendency to use brute force in its deployment. Watch how the UK team chose efficiency over brute force to reduce the processing time for a daily build of a billion-record ingest file from 12 hours to 2 hours, and enabled further speed increases in other processes.
- Lorraine Chapman, Consulting Business Analyst, HPCC Systems
- In 2015, HPCC Systems was an accepted organization for Google Summer of Code (GSoC) taking on 2 students involved in this program. However, we had the bandwidth to support more students and so the HPCC Systems summer internship program was born. Four students joined the program in 2015 and four more in 2016. We will apply for GSoC and run our intern program again in 2017. Hear how the programs work, how projects are identified and find out about student successes on these programs.
Prefer to read the transcript? Below is the full podcast transcript:
Jessica: Hi. I'm Jessica Lorti with HPCC Systems Marketing and I'm here with Michael Targett from FlightGlobal.
Michael: Hi, Jessica.
Jessica: All right. You gave a very interesting presentation for us today at the summit about FlightGlobal, and I was hoping that you could tell us a little bit about your company and the challenges that you try and solve for your customers.
Michael: Yeah, sure. I'll start by introducing myself. I'm Michael Targett. I'm the senior director for data [00:00:30] at FlightGlobal. I look after data operations and analytics for our company. FlightGlobal is a company that specializes in aviation and aerospace information, so that could be editorial content, which we were just talking about, but specifically of interest here is around the data and the analytics that FlightGlobal generates for the industry. We've got some key proprietary data sets that we use, and they're really important for the [00:01:00] case study that I know we're going to go on and talk about in a minute. Just to cover those off, so that's around fleet information, so aircraft information, the schedule of where things fly, flight status information, so when you see things mapped in real time on maps, and those are the key data sets here that we're talking about.
Jessica: All right. How does that information, how do you use it to help your customers?
Michael: We provide a range of services and products out to [00:01:30] the industry. They can be web interfaces and grids and websites through which people, they'll run queries and filter the data down. They're usually looking to answer a specific question, How many aircraft are operating on this route? Or, What's the average life expectancy between delivery and retirement of this type of aircraft? Very technical questions for people who are in business development or analytics [00:02:00] in a lot of different aviation companies. That could be in airlines, it could be aerospace manufacturers, maintenance companies, people on the ground that service aircraft. You name it, we're providing data out in some way or form. I always say, "However you want it, we'll give it to you," so if you want data on the back of a napkin I'll give it to you that way. There are a lot of ways in which our customers interact with us.
Jessica: All right. Great. The airline industry, am I correct, that they usually run on fairly [00:02:30] tight margins, especially as the fuel charges go up and down? Is this important to them for maximizing their margins, or is there another reason they use this?
Michael: That's a great question, actually. It is a margin industry, particularly the airlines, we all know that. Accuracy of data here is what we're talking about, and it's very pertinent to this fleet-flight matching concept that we'll go on and talk about in a sec. For airlines it's all about, "Can I get the most accurate information?" They [00:03:00] obviously know what's going on in their own airline, but they don't know how that benchmarks against their competition. Also when they're looking at things like, "Should I go for that new route?" Or, "Should I up the frequency and the amount of times I serve that route?"
They're trying to pull in information to give them a view on, "What's the current capacity there, and what's the current demand? And if I assume I'm going to get an 80% load factor on an aircraft can it be profitable? [00:03:30] What kind of aircraft do I need to put on there to be profitable?" You're right to say it's a margin game that is very dependent on the most accurate information that they can bring in. And so they have a lot of analytics people that are trying to work out whether to trust certain information or not. That's where we come in, usually.
Jessica: All right. You have a bunch of different data sources, and some of them are public, but it's the way you work with the information, right? What is the significance [00:04:00] of what you're doing? How does that add to the industry aside from the margins?
Michael: I'll explain a bit in a second about fleet-flight matching because that's quite pertinent. Yeah, some of the data is out there. Some of it we have direct relationships with those data originators. For example, we are approved by IATA, the industry regulator for the airlines, to collect and distribute the global schedules. There's [00:04:30] only two approved providers of that information, so we've got a right to play in that market there. The fleet information is all about aggregating lots of different data sources down together to build the most accurate and up-to-date database.
Where we're adding new value, and we're doing something brand new, really, and which is why we're here at the HPCC Summit doing a presentation, is we made an acquisition earlier in the summer of a company called FlightStats, which, I think [00:05:00] is fairly well-known. If you've ever used Google to type in your flight number and see if it's on time, that information is coming from FlightStats. They specialize in flight status, so they take a lot of that flight positional information that you can see on maps sometimes and they aggregate that into a status record per flight for all the flights every day.
We acquired FlightStats early in the summer, so they're now part of the FlightGlobal family, which is great news. What we're doing is we're saying, we know what the schedule [00:05:30] is, we know what the airlines plan to do. There's a certain amount of information in the schedule that the airline gives us that says, "We're going to fly between A and B at this time with this kind of aircraft." But critically there, we don't know which particular aircraft it's going to be. Now, you can have two 737s that are configured completely differently. There might be a 20% shift in terms of the number of seats, depending on how the airline's chosen to split down by, a lot of business seats rather [00:06:00] than economy seats, et cetera. But we also know on an individual aircraft-by-aircraft basis what exact seats are on that aircraft. The trick is can we use the flight status information to say, "Ah, it's that aircraft now on that route." So therefore we're able to say to the airlines using our services, "Here's the exact information as opposed to the estimated information on that route."
Jessica: All right, so that would help them to maximize the number of seats [00:06:30] that they sell because they know how many seats there are, essentially?
Michael: Yeah, so that's about a more accurate market analysis. Ultimately, they're trying to say, "If you extend that out what difference does it make to the audience, the passengers?" Here's a good example, so the airline wants to be competitive on that route, so the guy at the airline gets to understand, "What are my competitors offering [00:07:00] to customers on that route? Have they got a wider seat, a better in-flight entertainment system?" because they now know exactly what aircraft is being used. So he gets to say, "Okay, so I know that if I put the exact same aircraft with a worse in-flight entertainment system on at the exact same time I probably wouldn't get as many sales on that because my passengers would be getting a worse experience." We call it quality of service index.
Jessica: Fantastic. Who knew? That's [00:07:30] very interesting. What you guys have done is you've taken all of this data and you've done a proof-of-concept. How did you go about doing that, and how did you find HPCC Systems?
Michael: That's where HPCC comes in, really. We understood the problem, and we had a hypothesis that if we could match this information up it would be valuable to our customers in lots of different ways. We had lots of different use cases that we were exploring. We also knew that when we're looking at just under 30,000 [00:08:00] aircraft that are in service in commercial aviation and we get about 100,000 flights a day, so that's going to grow quite quickly if we want to track that every day. In fact, we've been tracking it for about six months now and we're already up to quite a lot of data.
Initially, we're experimenting in different ways in the more traditional SQL world to do that matching, but it quickly became [00:08:30] evident we needed a big data solution to build something that was more viable in the long term. So, that's where HPCC came in, for two reasons. It's going to scale, the data's going to get big. We need a big data solution, and also it's really good when you're experimenting and you're iterating and you're tweaking the model and adjusting the features and rerunning it and seeing how that affects. It's great to be able to do that really quickly with HPCC rather than wait for that cycle on a traditional sort [00:09:00] of SQL solution, which would slow down your R and D velocity.
Jessica: All right. What differences did you see in terms of your speed or your accuracy when you moved over to HPCC Systems?
Michael: Bear in mind the data's growing every day. It's running about 14 times quicker than our previous efforts, and we've not got the biggest cluster in the world on that. That's just our R and D cluster. I think we've only got four nodes on [00:09:30] that, but the production cluster as we move this stuff in, that's when we'll start to get big. And we haven't even thrown the full range of data sets in at the moment, so at the moment we're just doing that matching of the schedule, the fleet, using the flight status. We have aspirations to build services around delay prediction, for example, and to do that you need to introduce weather data. Weather data is the archetypal big data set. It can be as big as you want to be, so having a system that scales, like HPCC, [00:10:00] is perfect for us at that point.
Jessica: All right. Fantastic. You said it's 14 times faster. Have you seen better accuracy, or is there linking that you're using? Are you able to make more connections?
Michael: Yeah, it's good accuracy. You're still dealing with the limitations of the data itself. This isn't a 100% complete data set. There's the limitations of aircraft observations. You don't see every movement around [00:10:30] the world, for example, in Africa and Asia and across the sea. These are areas which have gaps in them, so we have to use some assumptions to fill those gaps in and then be able to project that forward. That's, again, another use there. We use machine learning to do some of that estimation, some of that approximation, some of that gap-filling, and particularly in terms of projecting forward what's likely to happen based on what's actually happened. Those are some of the advantages we've seen as well. Then using features [00:11:00] like rollup and dedup, which HPCC have, that's been great because when you're dealing with those flight records we get multiple pings across a flight record or with slightly different information, but it's all relating to the same flight record. We want to roll that up and make a more efficient process, so that's helped us out as well.
Jessica: All right. Fantastic. Just one last question, can you give us some examples? You talked about some great examples in your presentation. You talked about true capacity. You talked [00:11:30] about aircraft time-on-ground. Can you give us some examples of the way you've been able to provide value that you couldn't do before?
Michael: Yeah. That's fine. There's a good one we did. We just got back from China and we were presenting out there at a conference for network planners who are very interested in capacity, so if I use that one. True capacity is about us saying, "You might have been using the seating information that the airline submits in the schedule to do your capacity analysis in the past. Well, now we're able to say exactly [00:12:00] how many seats are on each flight." We call that true capacity.
When we were presenting out in China to the network planners we ran some figures across the airlines that serve those routes between the U.S. and China. We thought it'd be interesting. As part of that we took a bunch of carriers, including Air China and United and American to see, actually, if you were running analysis on what's included in the schedule, and then you compared that with what actually happened, what's the difference there?
We looked at that across [00:12:30] six months and the figure that we put out is there was one airline that was obviously submitting some figures around how many flights it's going to fly. I won't name names. I don't want to name and shame, but within one month, the month of August, the difference between the capacity stated in the schedule and that actually flew was 18 and a half thousand seats. That's huge. That's like 17% overstatement for the capacity, and if you're [00:13:00] trying to build a clear picture of what's actually happening and you're including that overestimate you're in trouble because, actually, in the world of aircraft that would be two full Boeing 777-200s a day that you're assuming are flying that route which aren't flying that route. That's quite a technical and industry, geeky answer, but hopefully it gives you a bit of color to what we're doing.
Jessica: All right. We're technology here. We like geeky answers. It's good for us.
Michael: Excellent. We have a hashtag Avgeek. [00:13:30] That's us.
Jessica: Awesome. When I'm listening to you talk I'm thinking that your customers are major airlines. Who are your customers? Am I correct that it's major airlines?
Michael: Yeah. We have customers across the board, across the aviation aerospace. Virtually everyone needs this information and there are relatively few players. Certainly, we feel that we're the premium choice for that. Pretty much we have an interaction either as a distributor of the [00:14:00] products like the schedule or solutions to pretty much everyone out there in all the airlines. For example, if I talk about the schedule, that's eight to 900 airlines that are supplying us with the schedule which we distribute on their behalf to places like global distribution services, the Sabres and the Amadeus, who you book flights through. And, indeed, Google and search engines, people like that and online travel agents. You name it, they're probably [00:14:30] interacting with FlightGlobal in some way or another.
Jessica: All right. Fantastic. How can people reach you? If people are interested in learning more about FlightGlobal or even reaching out to you to learn more about the solutions that you provide, how can they find you?
Michael: Sure. FlightGlobal.com is our website, and that's got information about our products and also the kinds of benefits that customers can expect from using our data, so that's a good place to start. We've got a team of very [00:15:00] knowledgeable commercial people that will set up trials and demos and take people through the benefits they can expect to see, so that's probably a good place to start.
Jessica: All right. Fantastic. Michael, thank you so much for joining me today.
Michael: Thank you.
Jessica: All right.