
BIG DATA reporting system

Project summary: the product is a real-time reporting system that processes large volumes of varied telecommunications data. The customer needed it because the existing system was slow, taking several minutes to return a result to the user.

JazzTeam engineers built a high-performance system that generates the required reports almost instantly. The functionality of the reporting system was also significantly expanded.

Technical description:

The customer uses an Oracle Database that contains a large number of records. Because of the database's complex structure and the large amount of data, generating reports from it was slow. Our team's main task was to reduce report generation time without changing the existing data structure.

During technical analysis, we decided to take an OLAP approach and build a separate analytical data store, leaving the existing Oracle Database unchanged. Three candidates were selected:

  • MongoDB
  • Cassandra
  • Druid

Each system was loaded with real data and benchmarked with performance tests. Apache Druid showed the best results; its architecture is well suited to generating reports quickly from a large database.

At the next stage, we developed a prototype based on the following stack:

Oracle + Java + Spring + Spring Batch + JUnit + Apache Druid
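The comparison described above can be pictured with a small timing harness. This is only a minimal sketch, not the benchmark the team actually ran: the class name `QueryBenchmarkSketch`, the warm-up and run counts, and the choice of median latency as the metric are all assumptions made for illustration. Each candidate store is represented by a `Runnable` that would execute the same report query against that store.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical harness: all names here are illustrative, not taken from the project.
final class QueryBenchmarkSketch {

    /** Returns the median of the samples (the upper median for even sizes). */
    static long median(List<Long> samples) {
        if (samples.isEmpty()) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        List<Long> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        return sorted.get(sorted.size() / 2);
    }

    /** Runs the query unmeasured for warm-up, then returns the median latency in nanoseconds. */
    static long medianLatencyNanos(Runnable query, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            query.run();
        }
        List<Long> samples = new ArrayList<>();
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            query.run();
            samples.add(System.nanoTime() - start);
        }
        return median(samples);
    }

    /** Benchmarks every candidate with identical settings, keeping insertion order. */
    static Map<String, Long> compare(Map<String, Runnable> candidates,
                                     int warmupRuns, int measuredRuns) {
        Map<String, Long> results = new LinkedHashMap<>();
        candidates.forEach((name, query) ->
                results.put(name, medianLatencyNanos(query, warmupRuns, measuredRuns)));
        return results;
    }
}
```

In such a setup, the map passed to `compare` would hold one entry per candidate (for example MongoDB, Cassandra, and Druid), each wrapping the same logical report query. The median is used here because it is less sensitive to occasional slow runs than the mean.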

The prototype performed the following operations:

  1. Exporting data from Oracle Database.
  2. Uploading data into Apache Druid.
  3. Report generation.
  4. Supplementing and adjusting existing reports.

The customer was satisfied with the successful testing and with the results demonstrated using the prototype.
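The hand-off between the export and upload steps can be sketched as follows. The source does not say how data was moved into Druid, so this example assumes batch ingestion of newline-delimited JSON, a format that Druid's JSON input format can read. The class `NdjsonExportSketch` and the column names used with it are hypothetical; in the prototype, rows would come from the Oracle export rather than from in-memory maps.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: turns flat exported rows into newline-delimited JSON.
final class NdjsonExportSketch {

    /** Escapes a string for use inside a JSON string literal. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    /**
     * Serializes one flat row as a JSON object on a single line.
     * Numbers and booleans stay unquoted, null becomes JSON null, and
     * everything else is written as a string. Assumes numbers are finite.
     */
    static String toJsonLine(Map<String, Object> row) {
        StringJoiner fields = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> e : row.entrySet()) {
            Object v = e.getValue();
            String value = (v == null || v instanceof Number || v instanceof Boolean)
                    ? String.valueOf(v)
                    : "\"" + escape(v.toString()) + "\"";
            fields.add("\"" + escape(e.getKey()) + "\":" + value);
        }
        return fields.toString();
    }

    /** Writes one JSON object per line, each terminated by a newline. */
    static String toNdjson(List<Map<String, Object>> rows) {
        StringBuilder out = new StringBuilder();
        for (Map<String, Object> row : rows) {
            out.append(toJsonLine(row)).append('\n');
        }
        return out.toString();
    }
}
```

A production pipeline would more likely use an established JSON library and stream rows to a file instead of building one string in memory; the hand-written serializer is used here only to keep the sketch free of dependencies.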

Technologies:

Programming language: Java.
Frameworks: Spring, Spring Batch, JUnit.
Databases: Oracle Database, Cassandra, MongoDB, Apache Druid.
Cloud service: Amazon Web Services.

Project features:

  • Working with production data while maintaining GDPR compliance.
  • Creation of unit tests for various NoSQL databases (Cassandra, MongoDB, and Apache Druid) in limited time.
  • Real-time report generation.
  • Ability to update existing reports with new information.
  • Limited development time.
  • Adherence to Scrum processes throughout the project.

Project results:

  • A comparative analysis of several Big Data systems for a specific practical implementation has been performed.
  • A high-performance interactive system for generating the required reports has been developed.
  • The functionality of the implemented system fully meets the customer’s requirements.
  • The ability to view and update existing reports has been implemented.
  • The customer deemed all iterations of the project successful.

The company’s achievements on the project:

JazzTeam staff examined existing research on the project subject and conducted their own benchmarking analysis of several data transformation and storage methods, while preserving the integrity and security of the customer’s data.