O'Reilly Strata Conference 2012: Making Data Work
Strata Conference offers the nuts-and-bolts of building a data-driven business—the latest on the skills, tools, and technologies you need to make data work.
HPCC Systems is a strategic sponsor and will be presenting at the event. Attendees can receive 30% off registration by using the discount code HPCC30. Be sure to visit booth #706 for more information on the powerful HPCC Systems platform.
Tuesday, February 28
Time: 1:30 to 5:00p.m.
What: Three-hour tutorial by Richard Taylor, Chief Trainer
Room: Ballroom F
Topic: Big Data Entity Extraction With Less Work and Less Code
Do you want to write less code and get more done? This tutorial will demonstrate a natural language parsing technology that extracts entities from all kinds of text using massively parallel clusters. Attendees will gain hands-on experience with the newly released, data-centric cluster programming technology from HPCC Systems to extract entities from semi-structured and free-form text data. Students will leave with all the data and code used in the class along with the latest HPCC Client Tools installation, HPCC documentation, and HPCC’s VMware installation. Prizes, giveaways, and a raffle are included.
Wednesday, February 29
Time: 9:30a.m.
What: Keynote by Flavio Villanustre, Vice President of Infrastructure and Products
Room: Mission City Ballroom
Topic: Trends in machine learning
Machine Learning and Big Data: Sustainable Value or Hype?
Back in the late 80s, artificial intelligence was set to take over the world; it didn’t happen. In 2012, AI has been stripped down, dressed up, and reborn as machine learning. Will it take over the world this time?
What makes a Big Data - Machine Learning solution "better"? Can machine learning happen with legacy tools? What exactly does it mean to be fully parallel? Do I care? Will I be any better if I get it right?
Time: 1:30p.m.
What: 40-minute session on SALT (on main agenda) by Tony Middleton, Sr Architect and Data Scientist
Room: Mission City B1
Topic: Scalable Automated Linking Technology
Attendees will learn about the Scalable Automated Linking Technology (SALT) tool, which is available with the open source HPCC Enterprise Edition, and how it can simplify data integration and save significant development time by automatically generating ECL code for processes such as data profiling, data cleansing, and record linkage. A case study will be included in which a complex Big Data linking application for insurance data was converted to HPCC using SALT, reducing 20,000+ lines of source code to a 48-line SALT specification.
Thursday, March 1
Time: 11:30a.m. to 12:10p.m.
What: 40-minute sponsored session by David Miller, Sr Architect
Room: Ballroom G
Topic: Solving big data analytics with an emerging data-centric language
Learn the simplicity of processing Big Data with an emerging data-centric language called Enterprise Control Language (ECL) and its use for ETL, data analytics, and query processing. ECL is the programming language of HPCC Systems, an open source, enterprise-proven Big Data analytics platform. Queries will focus on: Have the demographics of inventors changed over the last 40 years, with the Midwest’s industrial center becoming the Rust Belt while Silicon Valley has grown to dominate the high-tech industry? Has the number of inventors declined in Michigan and grown in California? Which states had the most then versus now?









