Case
Content & Behavior Analytics
Technology Stack:
Infrastructure:
CDNSun, Cloudflare, DigitalOcean, Servers.com, Hetzner
Backend:
Node.js, Express.js, PHP, RabbitMQ
Databases:
Vertica, ClickHouse, Redshift, MySQL, MongoDB, Redis
Frontend:
React, TailwindCSS
Project Overview:
The service gives media outlets detailed insight into how their editorial teams perform and how their audiences engage with content. Key functionalities include:
Tracking metrics such as the average time a reader spends on an article and the volume of content consumed
Analyzing homepage layout effectiveness
Running A/B tests on article headlines
Monitoring conversion rates from visitors to subscribers via offerwalls
Generating analytical reports segmented by content categories, publication authors, visitor sources, and many other dimensions
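As an illustration of the headline A/B testing item above, the sketch below shows one common way to split visitors between headline variants and compare them. It is not taken from the platform: the function names, the FNV-1a-style hash, and the click-through metric are all assumptions chosen for the example.

```typescript
// Hypothetical helper: FNV-1a-style 32-bit hash over UTF-16 code units.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Deterministic assignment: the same visitor always sees the same
// headline for a given article, without storing any per-visitor state.
function assignHeadline(visitorId: string, articleId: string, headlines: string[]): string {
  if (headlines.length === 0) {
    throw new Error("at least one headline is required");
  }
  const bucket = fnv1a(`${articleId}:${visitorId}`) % headlines.length;
  return headlines[bucket];
}

// Assumed per-variant counters; the real metric set is not described in the text.
interface VariantStats {
  impressions: number;
  clicks: number;
}

// Click-through rate, guarding against division by zero.
function clickThroughRate(stats: VariantStats): number {
  return stats.impressions === 0 ? 0 : stats.clicks / stats.impressions;
}
```

Hashing the article and visitor identifiers together keeps assignments stable across page views while still varying them from one article to the next.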
01.
Architectural Evolution
The platform has undergone significant development and scaling. As major media portals from various countries were integrated, the load grew sharply, and the platform now routinely processes more than 10 billion HTTP requests per day. To address both performance and cost efficiency, the decision was made to develop a proprietary cloud architecture based on a microservices approach.

02.
Data Collection & Processing
Client-Side Script Distribution
Visitor data is collected by custom JavaScript snippets distributed via a Content Delivery Network (CDN). Because each client website has its own requirements, the tooling is tailored to that site's markup, so each site receives its own JavaScript implementation.
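A minimal sketch of what the core of such a per-site script could look like, assuming a small configuration object per site and an engaged-time counter. The names SiteConfig, EngagedTimer and buildPayload, and the payload fields, are hypothetical and not part of the described product.

```typescript
// Hypothetical per-site configuration: selectors differ because markup differs.
interface SiteConfig {
  siteId: string;
  articleSelector: string;
  headlineSelector: string;
}

// Accumulates the time a reader actively spends on a page.
// Timestamps are passed in so the logic stays independent of the browser.
class EngagedTimer {
  private totalMs = 0;
  private activeSince: number | null = null;

  // Call when the page becomes visible or the reader interacts.
  resume(nowMs: number): void {
    if (this.activeSince === null) {
      this.activeSince = nowMs;
    }
  }

  // Call when the tab is hidden or the reader goes idle.
  pause(nowMs: number): void {
    if (this.activeSince !== null) {
      this.totalMs += nowMs - this.activeSince;
      this.activeSince = null;
    }
  }

  elapsedMs(nowMs: number): number {
    const running = this.activeSince === null ? 0 : nowMs - this.activeSince;
    return this.totalMs + running;
  }
}

// Serializes one measurement for transmission to a front-end server.
function buildPayload(config: SiteConfig, url: string, timer: EngagedTimer, nowMs: number): string {
  return JSON.stringify({ site: config.siteId, url, engagedMs: timer.elapsedMs(nowMs) });
}
```

In a browser, resume and pause would typically be wired to the visibilitychange event, and the payload could be sent with navigator.sendBeacon so that it survives page unloads. Both are standard web APIs, but whether the real scripts use them is an assumption.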
Front-End Servers
User activity data is transmitted from client websites to a distributed network of dozens of front-end servers, which write the raw data to files on disk. The number of active front-end servers adjusts dynamically to the current load.
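The on-disk format is not described here, so the following is only a sketch under stated assumptions: events are written as newline-delimited JSON, and files are bucketed by UTC minute. The RawEvent fields and both function names are hypothetical.

```typescript
// Hypothetical shape of one collected event; field names are assumptions.
interface RawEvent {
  siteId: string;
  url: string;
  type: string;
  timestampMs: number;
}

// One JSON object per line keeps appends cheap and lets parsers
// read the file as a stream without loading it whole.
function toRawLine(event: RawEvent): string {
  return JSON.stringify(event) + "\n";
}

// Files are bucketed by UTC minute, so a parser can safely pick up
// a file once its minute has passed and no more writes are expected.
function rawFileName(serverId: string, timestampMs: number): string {
  const minute = new Date(timestampMs).toISOString().slice(0, 16).replace(/:/g, "-");
  return `${serverId}-${minute}.ndjson`;
}
```

On a Node.js server, each line could then be appended to the current file with fs.appendFile. An append-only layout like this keeps the request path free of database round trips, which matters at billions of requests per day.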
Data Transformation & Loading
Raw-data parsers continuously aggregate the files received from the front-end servers and push the parsed data to dedicated database servers. To support fast access, long-term data aggregation, and custom report generation, the service uses columnar databases (initially Vertica, later supplemented with ClickHouse). These databases are particularly effective at executing complex queries that scan millions of records per request. Their performance degrades little as data volumes grow, and new entries are indexed quickly, but they are less efficient when data is inserted in small batches or when only a few rows are retrieved.
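Because columnar databases handle small inserts poorly, a loader would normally buffer parsed rows and write them in large batches. The sketch below illustrates that idea only; the BatchBuffer class, its row limit, and the flush callback are assumptions, and a real loader would also flush on a timer and handle write failures.

```typescript
// Collects rows in memory and hands them to a writer in large batches.
class BatchBuffer<T> {
  private rows: T[] = [];
  private readonly maxRows: number;
  private readonly flushFn: (rows: T[]) => void;

  // flushFn is a placeholder for the actual bulk insert into the database.
  constructor(maxRows: number, flushFn: (rows: T[]) => void) {
    this.maxRows = maxRows;
    this.flushFn = flushFn;
  }

  // Adds one row and flushes automatically once the batch is full.
  add(row: T): void {
    this.rows.push(row);
    if (this.rows.length >= this.maxRows) {
      this.flush();
    }
  }

  // Writes whatever is buffered; safe to call when the buffer is empty.
  flush(): void {
    if (this.rows.length === 0) {
      return;
    }
    const batch = this.rows;
    this.rows = [];
    this.flushFn(batch);
  }

  get pending(): number {
    return this.rows.length;
  }
}
```

With a limit in the tens of thousands of rows, each write touches the database once per batch rather than once per event, which is the access pattern columnar stores are built for.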
Report Viewing
Once the data is loaded into the databases, it is made available for report viewing through a user interface built with modern web technologies, including Node.js, Express.js, React, and TailwindCSS.

03.
Maintenance and Monitoring
Given the high operational loads and the vast number of physical servers involved, specialized monitoring tools have been developed to ensure system reliability and performance. Key monitoring capabilities include:
Verifying the global availability of front-end servers
Tracking resource utilization on individual servers (memory, CPU, disk space)
Monitoring disk wear and overall hardware health
Parsing and analyzing error logs
Reporting the replication status of database servers
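The resource-utilization check from the list above can be sketched as a simple threshold comparison. The sample fields, threshold values, and the findAlerts function are hypothetical; the actual monitoring tools are not described in detail.

```typescript
// Hypothetical utilization sample reported by one server, in percent.
interface ServerSample {
  host: string;
  cpuPercent: number;
  memoryPercent: number;
  diskPercent: number;
}

// Alerting limits, in percent; values are chosen by the operator.
interface Thresholds {
  cpuPercent: number;
  memoryPercent: number;
  diskPercent: number;
}

// Returns one human-readable alert per metric that exceeds its limit.
function findAlerts(sample: ServerSample, limits: Thresholds): string[] {
  const alerts: string[] = [];
  if (sample.cpuPercent > limits.cpuPercent) {
    alerts.push(`${sample.host}: CPU at ${sample.cpuPercent}%`);
  }
  if (sample.memoryPercent > limits.memoryPercent) {
    alerts.push(`${sample.host}: memory at ${sample.memoryPercent}%`);
  }
  if (sample.diskPercent > limits.diskPercent) {
    alerts.push(`${sample.host}: disk at ${sample.diskPercent}%`);
  }
  return alerts;
}
```

Keeping the comparison separate from how samples are gathered makes the same check reusable across every server type in the fleet.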