It’s one thing to collect data. It’s another altogether to transform that data into intelligence the business can actually use. As companies realize the power of being able to query their data and uncover valuable insights, the inclination is to collect greater volumes and more types of data. For data analytics companies, it’s imperative to stay ahead of customers’ growing data sets so that those customers can continue to leverage their competitive advantage.
NITS Solutions prides itself on being a leader in data analytics. The company takes on challenges and everyday improvements with innovation and creativity. So administrators didn’t hesitate when they noticed that performance was degrading on a real-world data set. Three years of vehicle repair record data for car companies such as Toyota and Volkswagen could no longer be efficiently processed in Oracle. Queries that once took minutes had grown to two-to-three hours.
Like most traditional relational databases, Oracle scales vertically. It analyzes data with a single large process, which limits its power to the maximum memory and processing power available on a single server. As a result, queries take longer as data sets grow and queries become more complex. With actual client data sets being impacted, NITS Solutions decided it was time for a new approach.
An early cloud adopter, NITS runs its software on Amazon Web Services (AWS). Late in 2016, NITS’ lead Architect and Database Analyst (DBA) attended Michigan’s AWS Meetup group, which was started and is led by RightBrain Networks (RBN), a cloud development firm specializing in AWS. After meeting RBN and comparing the firm to several other AWS partners, NITS chose to consult with RightBrain on how to achieve the scalability and performance its customers need to analyze complex and growing data sets.
The project’s main goal was to prove out a horizontal scaling technology capable of outperforming the existing Oracle database, and to enable NITS engineers to work with it. RightBrain Networks picked Spark, running on AWS’ Elastic MapReduce (EMR) service, to run computations on large data sets and provide analytics and answers to interesting business questions.
Instead of scaling vertically like a relational database, Spark scales horizontally, parallelizing computations across many smaller processes. By splitting large data sets among many servers, Spark can add capacity by adding machines rather than being capped by the memory and processing power of a single server. With faster computation and run times, NITS and its customers no longer have to wait for key business insights. When queries performed against the vehicle repair record data were rewritten in Spark, processing time was reduced from two-to-three hours to just over one minute. Queries that weren’t possible on a 128+ GB Oracle machine are now easily handled by Spark.
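The split-compute-merge idea described above can be illustrated with a minimal sketch. This is not the code RightBrain or NITS wrote, and it does not use Spark itself: it uses only the Python standard library, with threads standing in for the separate servers a Spark cluster would use. The sample records and the function names (`split`, `count_partition`, `merge`, `count_repairs`) are hypothetical and exist only for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical repair records as (make, repair_type) pairs; not real NITS data.
RECORDS = [
    ("Toyota", "brakes"),
    ("Volkswagen", "battery"),
    ("Toyota", "battery"),
    ("Toyota", "brakes"),
    ("Volkswagen", "brakes"),
    ("Volkswagen", "battery"),
]


def split(records, n_partitions):
    """Deal the records round-robin into n_partitions smaller lists."""
    return [records[i::n_partitions] for i in range(n_partitions)]


def count_partition(partition):
    """Count repairs per (make, repair_type) within a single partition."""
    return Counter(partition)


def merge(partials):
    """Combine the per-partition counts into one overall result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def count_repairs(records, n_partitions=3):
    """Split the data, count each partition on its own worker, then merge."""
    partitions = split(records, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(count_partition, partitions))
    return merge(partials)


result = count_repairs(RECORDS)
for (make, repair_type), count in sorted(result.items()):
    print(make, repair_type, count)
```

Because each partition is counted independently and the partial results are only combined at the end, the answer does not depend on how many partitions are used. That property is what lets this style of computation spread across more machines as the data grows.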
The final piece of the puzzle was data visualization. RightBrain chose the ELK stack (Elasticsearch, Logstash, Kibana) for this purpose. The stack provides rich real-time charts and graphs, as well as the ability to generate new visualizations on demand through convenient UIs.
Over the course of six weeks, RightBrain gave NITS the means to reach its goals. With technology built for the future, NITS can maintain its competitive advantage and offer new insights from analyses that were previously out of reach.
“RightBrain exceeded our expectations both with its knowledge of AWS and big data technologies. Our developers are now making great progress with Spark. We look forward to engaging with RightBrain again in the near future.”