Squash Logs

Posted in Cloud, Mobile, Web

Squash games are played daily, and the system is designed to store the game log data. The aim is to let users enter match logs as quickly as possible after they play their games. Thousands of log records are expected to be uploaded daily. The site can be viewed on iPhone, iPad, desktop browsers and the Android browser, and is hosted as static HTML files. Ajax calls are used for dynamic functionality and asynchronous updates.

A very simple login form is required to start with. If the user does not exist, the login form performs the role of registration.
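The combined login/registration behaviour can be sketched as follows. This is a minimal in-memory illustration, not the site's actual backend; the function names, the dict-based user store, and the salted-hash scheme are assumptions for the sake of the example.

```python
# Hypothetical sketch: one form handles both login and registration.
# An unknown username registers the user; a known one checks the password.
import hashlib
import os

users = {}  # stand-in for the real user store

def hash_password(password, salt):
    return hashlib.sha256(salt + password.encode()).hexdigest()

def login_or_register(username, password):
    """Log the user in; if the username is unknown, register it instead."""
    if username not in users:
        salt = os.urandom(16)
        users[username] = (salt, hash_password(password, salt))
        return "registered"
    salt, stored = users[username]
    if hash_password(password, salt) == stored:
        return "logged in"
    return "bad password"
```

Folding registration into the login form removes one screen from the flow, which matters when the goal is the quickest possible entry after a game.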

Once a user logs in, he can enter the logs of the game he played. Extensive client-side validations are implemented to keep the form extremely simple for the player to use.


You can read more about the validations implemented here.

The same validations are implemented on the server side, to ensure irrelevant data is not pushed through. XSS and CSRF protection is implemented.
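A server-side re-validation step could look like the sketch below. The field names (`opponent`, `games_won`) and the allowed ranges are illustrative assumptions, not the site's actual schema; the point is that the server never trusts the client-side checks, and escapes text before it is stored or rendered.

```python
# Sketch: re-validate a submitted match log on the server and
# HTML-escape free text so injected markup cannot execute (XSS).
import html
import re

def validate_log_entry(entry):
    """Return a cleaned entry dict, or raise ValueError on bad input."""
    errors = []
    opponent = entry.get("opponent", "").strip()
    if not re.fullmatch(r"[A-Za-z .'-]{1,50}", opponent):
        errors.append("invalid opponent name")
    games_won = None
    try:
        games_won = int(entry.get("games_won", -1))
        if not 0 <= games_won <= 5:
            errors.append("games_won out of range")
    except ValueError:
        errors.append("games_won is not a number")
    if errors:
        raise ValueError("; ".join(errors))
    return {"opponent": html.escape(opponent), "games_won": games_won}
```

CSRF protection is a separate concern and is typically handled with a per-session token checked on every state-changing request.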

For speed of response, all master data is saved in JSON format in Amazon S3 files at various locations across the globe. These data files are loaded as static content during log entry. Amazon DynamoDB is used for fast access to log data.
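As a sketch of the DynamoDB side, a log record has to be shaped into DynamoDB's typed attribute-value format before it can be written. The helper below only builds that payload; the table name and field names are assumptions, and the actual write would go through a client such as boto3.

```python
# Sketch: map a plain Python dict to DynamoDB's attribute-value format.
# Note: bool is checked before int/float because bool subclasses int.
def to_dynamodb_item(record):
    item = {}
    for key, value in record.items():
        if isinstance(value, bool):
            item[key] = {"BOOL": value}
        elif isinstance(value, (int, float)):
            item[key] = {"N": str(value)}  # DynamoDB transmits numbers as strings
        else:
            item[key] = {"S": str(value)}
    return item

# With boto3 this item could then be written via, e.g.:
#   boto3.client("dynamodb").put_item(TableName="squash_logs", Item=item)
```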

You can read more about the Amazon implementation here.

Adserver – Pretargeting

Posted in Big Data, Cloud

Handling millions of ad requests per hour

The diagram below summarizes the implementation of the ad-targeting system. Ad requests and responses are handled by an Apache server. Each ad request is saved in a log file on the filesystem, and a new log file is generated every hour. The system is designed to handle at least a million requests per hour. Maximum server response time must not exceed 50 milliseconds.
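The hourly rotation can be as simple as deriving the log file name from the current hour, so the file rolls over automatically with no separate rotation process. The path and naming format below are illustrative assumptions.

```python
# Sketch: one request log file per hour, named by the hour stamp.
from datetime import datetime

def log_path_for(ts, base_dir="/var/log/adserver"):
    """The name changes every hour, so a new file starts at each rollover."""
    return f"{base_dir}/requests-{ts.strftime('%Y%m%d-%H')}.log"

def append_request(ts, line):
    # Appending by hour-stamped name keeps writes cheap and rotation implicit.
    with open(log_path_for(ts), "a") as f:
        f.write(line + "\n")
```

Keeping the request path down to an append to a local file is one way to stay under a tight response-time budget; heavier processing happens downstream.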

Kafka is used to handle high loads. Cassandra is used as a scalable NoSQL solution. MySQL is used to store summary data in the system.


Synchronization of Hadoop Tasks

Summary data is also generated daily using the Amazon EMR implementation of Hadoop, with Amazon SWF for task synchronization. For reporting needs, recent data that has not yet been summarized is fetched from Cassandra. The workflow below explains the implemented flow.
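The reporting merge described above can be sketched as combining the daily summaries with the raw rows that are newer than the last summarized day. In-memory lists stand in for the summary store and Cassandra here; the field names and the per-campaign counting are illustrative assumptions.

```python
# Sketch: report totals = daily summaries + recent not-yet-summarized rows.
from collections import Counter

def report_counts(summarized, recent, last_summarized_day):
    """Per-campaign counts, avoiding double counting at the boundary."""
    totals = Counter()
    for row in summarized:
        totals[row["campaign"]] += row["count"]
    for row in recent:
        if row["day"] > last_summarized_day:  # only rows the summary missed
            totals[row["campaign"]] += 1
    return dict(totals)
```

The boundary check is the important part: rows up to `last_summarized_day` are already inside the summary counts, so only strictly newer raw rows are added.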

workflow diagram