Hardware and OS
- Processing clusters using commodity hardware and high-speed networking.
- Linux operating system.
- Thor, the Data Refinery, is the extraction, transformation and loading engine.
- Roxie, the Data Delivery Engine, provides high-performance online processing and data warehouse capabilities.
- The two systems work together to provide an end-to-end solution for big data processing and analytics.
- Thor and Roxie both feature distributed file systems.
- Thor distributed file system (Thor DFS) is record-oriented and optimized for big data ETL (extract-transform-load).
- Roxie distributed file system (Roxie DFS) is index-based and optimized for concurrent query processing.
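The difference between a record-oriented layout and an index-based layout can be sketched in a few lines of Python. This is a toy illustration only, not Thor or Roxie code: the sample records and the function names (`scan_transform`, `build_index`, `index_lookup`) are hypothetical. It assumes only that batch ETL reads every record in a pass, while query serving looks up individual keys.

```python
from bisect import bisect_left

# Hypothetical sample data: (person_id, name) records, not from any real system.
records = [(17, "Ada"), (4, "Grace"), (42, "Edsger"), (8, "Barbara")]

def scan_transform(recs):
    """Record-oriented pass: visit every record once, as a batch ETL job would."""
    return [(pid, name.upper()) for pid, name in recs]

def build_index(recs):
    """Sort the records by key once so later lookups can binary-search."""
    return sorted(recs)

def index_lookup(index, key):
    """Index-based access: locate one key without scanning the whole file."""
    i = bisect_left(index, (key,))  # (key,) sorts just before (key, anything)
    if i < len(index) and index[i][0] == key:
        return index[i]
    return None

index = build_index(records)
```

The full scan costs time proportional to the file size on every pass, which suits transformations that touch all records. The sorted index costs one up-front build and then answers each lookup in logarithmic time, which suits many concurrent single-key queries.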
Scalability and Performance
- Horizontal scalability from one node to thousands of nodes.
- Thor Data Refinery is capable of processing up to billions of records per second.
- Roxie Data Delivery Engine is capable of supporting thousands of users with sub-second response time, depending on the application.
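As a rough illustration of horizontal scaling, the sketch below spreads record keys across a configurable number of nodes with a stable hash. It is a generic partitioning example, not a description of how Thor or Roxie actually distribute data. The helper names (`node_for`, `partition`) and the choice of CRC32 are assumptions made for the example.

```python
import zlib

def node_for(key: str, node_count: int) -> int:
    """Map a record key to a node number using a stable hash.

    CRC32 is an illustrative choice; any stable hash would serve here.
    """
    return zlib.crc32(key.encode("utf-8")) % node_count

def partition(keys, node_count):
    """Group keys into one list per node, so each node can work on its own share."""
    parts = [[] for _ in range(node_count)]
    for key in keys:
        parts[node_for(key, node_count)].append(key)
    return parts

# Hypothetical keys; the same code runs unchanged for 1 node or for thousands.
sample_keys = [f"record-{n}" for n in range(1000)]
four_way = partition(sample_keys, 4)
```

Because each key maps to exactly one node, adding nodes reduces the share each node holds. Simple modulo hashing does move most keys when the node count changes, and the sketch does not address that cost.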
Redundancy and Availability
- Thor and Roxie store file part replicas on multiple nodes to protect against disk or node failures.
- Both systems are designed for resiliency and continued availability in the event of hardware failures.
- Enterprise Control Language (ECL) is designed specifically for processing big data.
- Highly efficient: accomplishes big data tasks with far less code.
- Declarative, modular, extensible.
- Graphical IDE supports coding, testing, and debugging.
- Extension modules are available for web log analytics, natural language parsing, machine learning, data encryption, and more.
Web Services Platform
- Enterprise Services Platform (ESP) enables end-user access to Roxie queries via common web services protocols.
- Supports SOAP, XML, HTTP, REST and JSON.
- Tools for environment configuration, job monitoring, system performance management, distributed file system management, and more.