Following the HPCC Systems Community Day, Jeff Bradshaw, founder of Adaptris (www.adaptris.com), and Graeme McCracken, COO of Proagrica (www.proagrica.com), joined The Download to discuss their implementation of HPCC Systems.
Listen to learn how Proagrica utilizes agricultural data in a big data solution to address the challenge of a growing world population by identifying opportunities to increase yields on farms and enabling growth through agricultural efficiencies.
Proagrica helps drive growth and improve efficiency by delivering high-value insight and data, critical tools and advanced technology solutions, as well as a range of effective channels-to-market. Proagrica provides:
- Some of the most influential media brands in the UK and the Netherlands, helping suppliers and manufacturers to reach an engaged audience through print, digital and face to face.
- Software solutions to help customers effectively manage and optimize their farms and rural businesses. They deliver the market-leading farm management software for both arable and livestock enterprises, as well as accounts software for effective financial management.
- Integration and connectivity solutions used by industry participants across the supply chain.
- Cutting-edge visualization technology that delivers real-time insights from the most comprehensive data sets, enabling customers to react quickly and forecast effectively, driving greater efficiency and maximizing profits.
Adaptris simplifies integration challenges for thousands of businesses seeking efficiencies, especially in their supply chains. For more than fifteen years, Adaptris has combined deep expertise with highly flexible iPaaS and data exchange technology to bolster businesses across the globe, cost-effectively and fast, in the cloud, on premise or in a hybrid environment.
The Adaptris Interlok™ integration engine is part of the HPCC Systems STRIKE platform. More information on Adaptris and Interlok technology can be found on their website, http://www.adaptris.com/
Proagrica and Adaptris are both part of Reed Business Information (RBI) and RELX Group. HPCC Systems is a RELX Group technology.
Key questions of the podcast include:
0:20- Tell us about Proagrica and the challenges you solve for your customers.
Jeff and Graeme discuss how Proagrica is focused on the agricultural sector in the UK and the Netherlands. They work to improve the yields farmers get and to provide greater visibility into what is going on both on the farm and into and out of it. Proagrica covers four main business models in the agricultural sector: media assets, software assets and farm management systems, connectivity solutions, and data and analytics through agility products.
1:50- You deal with an amazing amount of data sources. In general, agricultural institutions around the world are gathering different kinds of data. How do you deal with so many different data sources?
Agricultural data is vast and varied, and there are many challenges in working with some legacy farm systems. Proagrica addresses not just the efficiency of the farm, but also what the farmer or the provider can actually get for their goods in the market.
Jeff and Graeme give some in-depth examples and discuss how Proagrica turns this data into usable information on a global scale. For instance, traditional farm equipment such as tractors, sprayers, and combines don’t communicate with each other or with more modern farm equipment, resulting in a very complicated data environment. The number of sensors in agriculture is astonishing, whether it is sensors that measure the gait of the cow as it walks into the dairy parlor or the chickens that are pecking. With cows, for instance, they can measure where they are going in the field, what they're doing in the field, what the temperatures are, and fat profile.
7:40- How did you decide on HPCC Systems as your big data management and analytics platform?
Jeff and Graeme describe their need for a big data platform that could scale and be able to easily handle the diversity of the data, the diversity of the data sources, and the complexities associated with data accuracy coming from so many different inputs across a farm management system.
Graeme explains that “Thor is absolutely fabulous at taking data from a multitude of different sources and helping you create a canonical version of it, developing the entity models and the attributes around the entities, and then create meaning of that. So that was the thing that really drove us into HPCC, and I think we've come to love Thor and what it can do for us as a business.”
9:40- How does HPCC Systems help you overcome the standardization challenge of your data?
The biggest challenge Jeff and Graeme faced was the volume and cleanliness of the data. They describe how ECL enabled them to get to the data faster and allowed them to focus on the higher-value work of creating the canonical versions, putting the attributes in the entities, and then deriving meaning from their data faster and more efficiently than they could with other big data platforms. They are able to focus on the meaning of the data rather than trying to massage it into a specific form, resulting in massive time savings.
15:40- How has the Interlok™ Integration Framework helped with your big data management?
The Interlok™ Integration Framework is an event-based framework designed to enable architects to rapidly connect different applications, communications standards and data standards to deliver an integrated solution.
Jeff and Graeme discuss how Interlok has enabled central control to help bring data into the system more quickly and cleanly than other data ingestion methods. Hear more about their implementation and how they use Interlok to apply permissioning and trust and rules to sensitive data.
For more information:
See Graeme and Jeff present at the HPCC Systems Community Day on YouTube Live Stream
View Graeme and Jeff’s presentation deck on SlideShare
Prefer to read the transcript? Below is the full podcast transcript:
Jessica: Hi, this is Jessica Lorti [00:00:02], I'm at the HPCC Systems Summit, and I'm here with Graeme and Jeff from Proagrica.
Graeme: Hi, my name's Graeme McCracken and I'm the chief operating officer for Proagrica.
Jeff: My name's Jeff Bradshaw, I'm the group CTO for Adaptris FRF and DBT within Proagrica.
Jessica: All right guys, thank you so much for joining me today. Can you give us a little bit of information about your company and the problems that you're trying to solve for your customers?
Graeme: So, Proagrica is [00:00:30] the Reed Business division that's focused on the agricultural sector, and it covers four main business models: media assets like Farmers Weekly and Boerderij [00:00:41] in the UK and the Netherlands; software assets and farm management systems; connectivity solutions in Adaptris, which is our new product that we acquired, our new business that we acquired last year; and now data and analytics through [00:01:00] our agility products.
Jessica: All right.
Jeff: Hang on, sorry. So using our collective assets we're trying to basically improve the yields that farmers get and provide a greater visibility into what's going on, on the farm, and also everything into and out of the farm.
Graeme: Yeah. So less inputs, more outputs, better productivity. [00:01:30] It really is about trying to face the challenge that our world population is growing, and naturally [00:01:38] arable land is shrinking in volume. So we need to grow more with less, and we need to put less on the ground, and it really is about feeding the world.
Jessica: All right, you guys deal with an amazing amount of data sources. I think you were talking about Fitbits for cows, [00:02:00] and you mentioned that maybe cows in the Netherlands don't know how to communicate with the Scottish highland cow, but just in general agricultural institutions around the world are gathering different kinds of data. So your data is very vast, and it's very different. So how do you deal with your different data sources?
Jeff: It's massively diverse. One of the problems we have is dealing with some legacy systems on farm, you're dealing with tractors that don't [00:02:30] speak to the sprayers that don't speak to the combines, and then also we have more modern IoT devices capturing all data as well and feeding that back in. Also we're overlaying sources of data like soil information, soil temperature, soil type, weather, and things around the manufacturers. So the recommendations, the recipes, and also yield data and everything that goes on is generating a huge amount of data in different forms. So yeah, it's a very complicated data landscape, and one of our biggest [00:03:00] challenges at the start of the project actually was coming up with our entity model to be able to capture all the data we need, which we then obviously tidy up as we bring it in. So yeah, it's quite a complicated environment.
Jessica: You guys are able to measure all sorts of really interesting things. Can you go into some of the, so for instance, I remember you guys had mentioned that you can tell whether the chickens are eating enough by this sound that the chickens are making. What are kind of the things that you can measure that maybe nobody's thought of?
Graeme: So, [00:03:30] just the amount of sensors that there are in agriculture is astonishing. It's got more data than any other market I've ever seen, whether it's sensors that measure the gait of the cow as it walks into the dairy parlor, or the chickens that are pecking. The cows, where they're going in the field, what they're doing in the field, what the temperatures are, whether or not they're fit, the whole [00:04:00] range of things that get measured, and these sensors, these devices, are all measuring it in different ways, and the real challenge is to take that vast plethora of data and consolidate it and create meaning for it on a global basis. So there's a vast number of different farm management systems, and even more different types of sensors. You look at the different tractors and combines, they're starting [00:04:30] to come to like a common set of standards, but in reality farmers have still got tractors that go back fifteen, twenty years easily, and those guys aren't using those standards. So we need to really find ways to consolidate that data and really bring that data from all of those devices, from satellites, from drones, from farm management systems, from tractors, from combines, from sensors in [00:05:00] the field, in the sheds, on the cow, on the sheep. Whether it is a cow, a sheep or a wheat crop, all of that is generating massive amounts of data.
Jeff: You've also got things like waverages [00:05:14] as well and quality data. So things like the farmer will be paid based on the quality of his goods. So for example, the nitrogen content dictates the protein in wheat. The farmers know that, they get paid based on things like the starch content in potatoes. So [00:05:30] there's a whole feedback loop there, so it's not just all on a farm, you're dealing with a whole complex supply chain. So when the food is being processed, when a cow goes to slaughter and all the beef ends up going to the same place, all the parts go, so you've got an end-to-end traceability issue right there, just capturing the data around that cow and all the lineage of the cow. One of our examples was around the animal passports in the UK. A hundred percent of cows in the UK have passports. For us, humans have a [00:05:58] lower proportion [00:06:00] of passports and don't travel so much. So we kind of know everything that's going into the supply chain, and it is just a vast amount of data of diverging types. Image data, sensor data, simple OPOs and OCR type data when we're putting data in, so it's a really, really complicated landscape with a lot of data and a lot of challenges.
Graeme: And then consumers and retailers, some of the really good ones are really interested not just in the quality of the product but in the fact that the animals have been [00:06:30] well looked after, or that the wheat hasn't been mistreated and has been grown in the right way and not too much pesticide applied, so more and more of that is becoming important to them. So for a cow or a sheep you need to know its entire background. You need to know where it came from, who its sire was, and all of that needs to be recorded so that you can provide to the consumer at the end of the day assurances that [00:07:00] this is a good product and that animal welfare and pesticides and all of those things that you want to reduce, not animal welfare obviously, you want to improve that, are really core to where we're going.
Jeff: And also it's a challenge because we're not just doing it in one territory. It's a global problem, it's a global product and it's a global challenge. And then you're dealing with cross border type situation with data, it's a whole raft of different sources of data across the world. And [00:07:30] of course multi language-
Graeme: Well, just the cow is speaking in multi languages.
Graeme: Different types of mooing between the Dutch and the British cows.
Jessica: All right, that's fantastic. So it's not just about the efficiency of the farm, but it's also what the farmer or the provider can actually get for their goods in the market. So that's fantastic. All right, how did you guys decide on HPCC Systems?
Graeme: What we were really looking for [00:08:00] was something that could scale with us, and that was given the diversity of the data, the diversity of the data sources, and the complexities of that. We needed something that was a big data solution, but we needed something that was a big data solution that was really focused on data. Now that sounds like a little bit of a contrary statement when it's all called big data, but the thing about HPCC, in particular [00:08:30] Thor, is absolutely fabulous at taking data from a multitude of different sources and helping you create a canonical version of it, developing the entity models and the attributes around the entities, and then create meaning of that. So that was the thing that really drove us into HPCC, and I think we've come to love Thor and what it can do for us as a business.
Jeff: Well also, the other thing to remember is, with our data [00:09:00] farmers are not typists. Data accuracy is a huge issue, and a lot of the big data platforms we'd previously looked at are much more geared toward storage of data and then letting you run rules across it. Well, I was discussing this with someone the other day, big data's great, but you need a view of all of it to make sense. It's no good looking at a small subset of the data just because a farmer can't type in a particular chemical name or something or spells it slightly wrong, it can skew your results and therefore skew the recipes and skew the quality of what you deliver [00:09:30] back to the market. And for us, our key thing is delivering back quality, clean data to our customers to enable them to act on it.
Jessica: That's an interesting point. One of the things that I hear frequently from people who are looking at big data platforms is bringing that data in and standardizing and normalizing that data. It sounds like that is a massive challenge for you guys. How does HPCC Systems help you overcome that standardization, normalization challenge?
Jeff: [00:10:00] Well it's much more around the ECL and obviously within Thor, a lot of the entity stuff, and a lot of the stuff that's been built previously by LexisNexis in their use cases helped us. And we can inherit a lot of those benefits, which is something when we looked at other big data platforms we just couldn't do. We had to map the data ourselves and write a lot more logic around them. We did a proof of concept with another vendor probably a year and a half ago, certainly before we were acquired by the group, and the biggest challenge was the data volume and the cleanliness [00:10:30] of data. So data hygiene for us now is not such an issue, which is great as we're expanding the platform and on-boarding more farm management systems around the world.
Graeme: We even did a second POC earlier this year where we were playing about with actually Elastic, and it was interesting to see the difference in, you know, we spent like ninety-five percent of the time getting the data into a form that we could use. [00:11:00] When we moved into Thor and SALT, all of those sort of tools, that became a lot easier. So rather than struggling to really manage to get the data and play with it, we could focus on the higher value bits of creating the canonical versions, putting the attributes in the entities, and then taking the meaning from that, that's where you really add the value.
Jeff: Yeah, certainly. It's the focus on [00:11:30] the meaning of the data rather than actually trying to massage it into a form you can then look at. So that for us was a massive time saving. Massive time saving.
Jessica: All right, you've mentioned several different components of our STRIKE Platform, so that's SALT, Thor, ROXIE, Interlok, KEL, and ECL, that's STRIKE. So what aspects of our STRIKE Platform are you using?
Jeff: Well the key for us-
Jessica: And what are you using them for? Sorry to interrupt you.
Jeff: Well the key to us clearly is the Interlok platform, I mean that's our favorite, we've used that [00:12:00] a lot, so all the time. Yeah, all joking aside, it was something we struggled a little bit at the start on the POC, we struggled a little to get the data in, and then obviously blessed in terms of we have our Interlok in our kit bag, so we quite quickly redid things, made some config changes, and actually were able to spray directly into Thor. So we use Interlok a lot, and of course the nature of it lets us federate out to the edge of the network. So we can federate out Interloks to collect data from places where most people wouldn't deploy something. [00:12:30] So we have examples where they've been deployed on Raspberry Pis, sat next to machines gathering data and feeding it back up, but we have a notion of central control with that so it helps bring the data in quicker, cleaner than other people can do it. And then obviously with Thor we've talked extensively about the data quality, so we use a lot of Thor, and of course a fair amount of the ECL side of things as well, and a lot of SALT. A lot of SALT as well.
So I think they're the core components that we use. [00:13:00] Obviously we have ROXIE and our favorite implementation of Interlok which we refer to as the ROXIE Proxy, you may have heard me mention that earlier on, because for us, obviously we talked about data sharing and data security, so what we're able to do now because we have Interlok bring on the ingest side, we can apply permissioning and trust and rules at that level before it even hits Thor. And then also on the extraction of data and the massaging of data we can apply again Interlok via the ROXIE Proxy, [00:13:30] and that's hooked into the core MBS and trust systems that we have here. So we have a permissioning portal that's integrated on the way in and on the way out via Interlok, and then everything else flows straight through. So it's pretty neat the way we've designed the system and very scalable.
Graeme: Because one of the key things is we want to be, we will be that sort of trusted third party, so we have to be very careful about what we do with the data and make sure that it's very secure. Because [00:14:00] it's farmers who are trusting us to share anonymized data, even with anonymized data you want to be careful, but for certain where they have certain partners they want to permission those to the guys to actually have access to them, whether that be the agronomists [00:14:13] or manufacturers that they want to share with, then we want to give them the ability to say that we're going to share with that, you can share that data with that company, so we've built in a whole trusted network for that.
Jeff: And also it's about time models as well, because [00:14:30] fields actually, you know, I might tenant out a field during a particular season or I might loan a crop zone within a field to one of my neighboring farmers, so I may not have data in my farm management system for that field for a particular season. Someone else might give that data, and it's a real complicated model so the trust stuff we put around the edges is really important, and we can actually time box and share data with particular parties for a particular season to make sure that, because we can get all the data, we collect it all anyway, I mean we've got something like a million hectares [00:14:58] in the UK [00:15:00] already feeding data into our system. So we are live, it is a live platform, it's something that's there, so for us certainly the ability to have control on the way in and the way out, and that flexibility that Interlok gives us is core.
Jessica: That's really interesting about the security needs. That's something that I think might even apply to other industries, correct?
Jeff: Yeah, absolutely. Absolutely. And we do, obviously, a lot with security [00:15:30] and data security, it's something we do. When you're integrating, obviously security becomes king.
Jessica: All right, that's fantastic because I don't think a lot of people realize how security is really important within agriculture. So can we talk a little bit about Interlok? That's something that's brand new to our STRIKE Platform, that's the "I" in STRIKE, and that's something that you guys know quite a bit about. Would you be able to talk to us a little bit about Interlok?
Jeff: Yeah, well I mean obviously Interlok is, we kind of brought it to the party ourselves. We've had it, it's been in existence since about [00:16:00] 2000, and we have something like just about three hundred different implementations of packaged applications. So common off the shelf things like an SAP or the different Oracle Applications, Salesforce.com, so there's a whole raft of pre-built, off the shelf things of course. The spray and de-spray into Thor is a key part of the thing that's just in the latest release now. We have a very slick, nice, GUI for configuring the work flows, it supports a notion of kind of central hosted config with remote deployment. [00:16:30] When we first designed the software back in 2000 data sensory was expensive, so our model was we push data out, and of course we feature a lot of security stuff in there because we use the internet predominantly as a transport mechanism. So we encrypt at a payload level with communications.
It's very scalable, it's used in industries outside of ag [00:16:49], there's a number of airlines running it, a number of banks, traders and telcos [00:16:54] also use the same Interlok technology. So it's industry agnostic, it's not just about ag. Some [00:17:00] of our customers in the financial services sector are doing about 25,000 messages a second through a single instance with Interlok so it's incredibly scalable, and we have customers doing images as well across the network. So yeah, it's very powerful, and it just makes life a lot easier onboarding. I mean, again, when you're dealing with a big data project and something like Thor and the STRIKE stack, you don't want to be wasting your time getting data in. Data scientists are not integration guys, they know the data and they want to work on the entities and deal with the data. So [00:17:30] using Interlok to bring data into the platform on the ingest side of things, it lets those guys get to work a lot quicker, and for us that was key. Hopefully we'll see more people rolling out as well as things move forward.
Jessica: Can you give me an, you have hundreds of modules that are already in place, can you give an example of some that are available?
Jeff: Absolutely. So probably one of our biggest selling ones would be SAP, so integrating to SAP systems to [00:18:00] extract data and bring it in. So it could be traditional kind of B2B type data, so orders and those sorts of things, it can be manufacturing data, it can be customer data. We have a number of our customers where they're using call center type applications, so Siebel [00:18:11] for example on premise, and they're using Interlok to connect between Siebel and Salesforce.com in the cloud where they have engineers out on site. So it's quite an all encompassing system, and again, for us it doesn't really matter what data is. Probably my most obscure one is one of the airlines, where they actually use [00:18:30] us with a TCP adapter to extract wind data from the kind of ash cloud alerting system, and we actually then convert that and place into a pilot briefing pack. So I think most people were affected by the volcanic ash alert a few years back, I know I was stranded myself for a few days, so yeah. Hopefully now we have a bit of a better way of living and working around that sort of stuff thanks to our software.
Jessica: All right. You guys have done so much implementation wise already, I know that you're already seeing results. [00:19:00] Can you give us some examples of the results that you've seen and the impact maybe that it's had for some particular farmers or ranchers?
Jeff: We can certainly do that. For one of our, actually manufacturing customers where, and this is an example we mentioned earlier on where they, typically a manufacturer will be using a number of trial farms and it's a small, modest number of trial farms. This particular customer has about seventy trials of a particular chemical and they were recommending [00:19:30] the use of their product versus a competitor's based on the yield. We looked at that using our data in the UK, and actually it was about 3,000 farms we had to play with, and we could tell them quite quickly that actually no, your claims are not quite right at the dose you're recommending. However, if you apply a higher dose then you can get the yield you need. So it's one of these things where, and it's kind of useful for the growers because people don't follow recommendation necessarily and sometimes these guys, clearly they knew something that the manufacturer didn't know, which [00:20:00] is why they applied a higher dose to get the yield they needed. So in farming yield is king, but you've got to keep your cost under control as well.
Graeme: Yeah, I think the other thing is we've started to release something in the UK called the Grower Dashboard, which is really aimed at helping farmers get more insight into what's working in their area, giving them news and various other services to really make their life a little bit easier, [00:20:30] and we'll see a lot more of that as we build out the analytics as part of the platform.
Jeff: Yeah, and that also is key as well with the growers. Grower Dashboard is great because it gives them a single place to look. There are many sources of information. I mean I was speaking to a grower recently, he said, "If I want to know the weather I just look out the window." I said, "Yeah, but it doesn't really help you with the forecast." So now it's giving him forecasts and details for their particular geographic area. I think we do weather down to like one square kilometer, so it's very, very accurate.
Jessica: [00:21:00] All right guys, that's all the questions I have. Thank you so much for joining us, really appreciate your time.
Jeff: Well, thank you for having us.
Graeme: Yeah, thank you for having us.