Establishing Total Cost of Ownership for a Big Data Platform

In today’s enterprise, an optimized big data strategy can mean the difference between success and failure. For example, Netflix reports the company is able to save $1 billion a year from customer retention thanks to its use of big data analytics and AI, and enterprises in every other vertical market are following suit. However, adopting a big data strategy is a significant undertaking for enterprises, and there are a host of questions an IT team must answer before it can decide on the best big data platform for its needs.

Should the enterprise use an on-premises or cloud-based infrastructure for data analytics? Which data analytics software best fits the organization’s use case? Does the IT team have the requisite experience and expertise to implement a big data solution? Will the chosen big data platform still meet an enterprise’s business needs in 12 months? What about in 5 years?

In addition to confirming a potential big data platform has the performance and features needed, IT teams should also consider other factors if they are to gain a true understanding of the capital and operational expenses required to keep a big data platform operational and scalable.

Here’s a list of criteria IT teams should consider when choosing a big data platform.

  1. Fees for seat licenses – Proprietary big data platforms require enterprises to purchase licenses, often sold on a per-seat basis. As the platform grows and matures, additional IT staff and seat licenses may be necessary, so license fees could increase over the long term. To reduce license costs, enterprises should also consider free, open-source big data platforms like HPCC Systems or Spark.
  2. Technical support – If the platform isn’t performing as required, enterprises need access to technical support resources that can quickly and effectively solve the problem. However, identifying which vendor is responsible for providing a technical fix in mixed-vendor IT environments can be difficult, and it’s common for customers to be stuck without a fix for their problem because their vendors can’t agree on which component of the environment is at fault. Technical support is particularly important for security. If a platform is orphaned by its vendor, security patch development ceases, leaving the platform’s users and data at risk.
  3. Third-party software – Many big data platforms are sourced from multiple vendors, with each vendor providing a solution for a specific stage in the big data pipeline. Enterprises need to be sure they understand all the software they’ll need to install and support to get the functionality they require.
  4. Compute resources – The amount of compute processing and storage a big data platform needs will vary over time. Many businesses have seasonal data workloads that rise and fall throughout the year. This can lead to overprovisioning: allocating extra capacity to manage spikes in workloads only to watch that same capacity go unused during periods of low activity. Furthermore, does the platform selected require high-end computing and storage components or can it perform well using lower-tier options as well?
  5. Cloud support – Cloud-based infrastructures allow enterprises to scale their compute and storage capacity up or down in real time to keep overprovisioning under control. Is the platform selected well adapted to this paradigm? Cloud computing also has security risks that may require specific security capabilities to meet regulations and SLAs around data privacy, sovereignty, and security, and IT teams will need to make sure any cloud-based big data solutions comply with those requirements.
  6. Staffing – IT managers need to determine if their choice of big data platform will require adding staff to address any skill gaps or increase code output. What skills will those new hires need? As big data grows in popularity, potential hires with expertise in big data and cloud computing will be in high demand and their salaries will reflect this. IT managers must keep staff payrolls in mind when considering a big data platform’s total cost of ownership.
  7. Implementation time – After a platform is selected, how long will it take to get the platform up and running? Sourcing software and hardware from different vendors can cause compatibility issues that must be addressed before the platform goes live, potentially delaying the platform’s launch date.
  8. Ongoing maintenance – Once a big data platform is operational, how much will it cost in ongoing opex to keep it running? How frequently is the platform updated and how simple is it to install those upgrades? If the platform’s processing and storage capacities need to expand in the future, how long will that expansion take and how much will it cost?
  9. Flexibility – If an IT team requires its big data platform to support a specific feature, what resources are available to provide that feature if the platform’s vendor is unable or unwilling to build it?
  10. Developer ecosystem – Is there a robust, global network of developers working on value-added projects for the platform? Does an enterprise need its big data platform to support a specific vertical industry? Or a particular application? The larger a big data platform’s developer community, the more likely software for specific industries or use cases is already available.
  11. Reliability/maturity – Is the platform’s technology new and without extensive real-world testing? Is the vendor a startup that may not be around to support their technology, or currently not able to scale to meet the demand for support? Do they have good technology and good customer service? Can they provide localized support resources for different regions?
  12. Data support – Does the platform process data in any format? Does data in different formats work well together? Data in different formats often ends up siloed in separate components that don’t communicate with one another, which can lead to inaccurate or incomplete data analysis.
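
To make the list above more concrete, here is a minimal Python sketch of how several of these criteria (seat licenses, support, staffing, maintenance, and compute) might be rolled up into a multi-year cost estimate, along with a rough measure of overprovisioning. The class, function, and field names are hypothetical and chosen only for illustration, the cost model is deliberately simplistic (flat yearly costs, linear seat growth, no discounting), and any figures fed into it are assumptions rather than benchmarks for a real platform.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PlatformCosts:
    """Hypothetical yearly cost inputs for one candidate platform."""
    upfront_capex: float         # one-time hardware and implementation spend
    seats: int                   # licensed seats in the first year
    fee_per_seat: float          # yearly license fee per seat (0 for open source)
    seat_growth_per_year: int    # extra seats added each following year
    support_per_year: float      # technical support contracts
    staff_per_year: float        # payroll attributable to the platform
    maintenance_per_year: float  # upgrades, patching, day-to-day operations
    compute_per_year: float      # compute and storage spend


def total_cost_of_ownership(costs: PlatformCosts, years: int) -> float:
    """Sum the one-time capex and the recurring opex over the given horizon."""
    total = costs.upfront_capex
    for year in range(years):
        # License fees rise as the platform matures and more seats are needed.
        seats = costs.seats + year * costs.seat_growth_per_year
        total += seats * costs.fee_per_seat
        total += (
            costs.support_per_year
            + costs.staff_per_year
            + costs.maintenance_per_year
            + costs.compute_per_year
        )
    return total


def idle_capacity_fraction(monthly_demand: list[float], provisioned: float) -> float:
    """Share of a fixed provisioned capacity that goes unused over the months given."""
    used = sum(min(demand, provisioned) for demand in monthly_demand)
    return 1 - used / (provisioned * len(monthly_demand))
```

Running the same inputs over a 12-month and a 5-year horizon, and once with `fee_per_seat` set to zero, gives a quick side-by-side view of how license fees and seat growth affect the long-term total. Similarly, a seasonal workload that needs full capacity in only half of the months, and half capacity otherwise, leaves a quarter of a fixed allocation idle, which is the kind of waste that elastic cloud capacity is meant to reduce.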

Big data can transform a business. But before rushing into selecting and implementing a big data platform, be sure you take the considerations outlined above into account. By doing so, you can spare your organization from making poor, ill-informed decisions that can lead to significant problems and costs down the road.

Read more about Total Cost of Ownership in the White Paper here.

About the Author

Hugo Watanuki

Manager Community Tech Programs 
LexisNexis Risk Solutions

Hugo is responsible for the HPCC Systems internship program at LexisNexis Risk Solutions. He is part of the team effort responsible for supporting the outreach and growth of the HPCC Systems open source community. Hugo holds a PhD in Information Systems and has worked for more than 15 years in various technical roles in the IT industry.