Cytoscape Web Graphs -> Force Directed Network Graph

Posted in API, Big Data, Web

The graph is used for analysis of “Ad Targeting Data”. Relevant data is presented at campaign level. Targeted keywords and URLs are related in the graph.

The challenge is that the graph sometimes has to show about 176,000 data points, which makes rendering extremely slow. The calculations for 176,000 points are also extremely time-consuming.

The proposed solution was to perform the calculations in a back-end process. The calculations can be refreshed at any time at the click of a button. Background processing was sufficient for the data requirements.

To give an idea of the calculations involved, the background process took 8 hours to complete on a dedicated server with 4 GB RAM. The server is configured to allow long-running background processes.

Force Directed Network Graph

The data on the graph is restricted by a minimum number of related nodes, which the administrator can specify. For example, only data points related to at least 5 other keywords are shown. The top matching results can also be filtered.
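The minimum-related-nodes restriction can be sketched as a simple degree filter. This is only an illustration in Python: the function name, the edge-list input, and the sample keywords are assumptions, not the project's actual code.

```python
from collections import defaultdict


def filter_by_min_related(edges, min_related):
    """Keep only nodes that are linked to at least `min_related` other nodes.

    `edges` is a list of (node_a, node_b) pairs (hypothetical input format).
    Returns the set of kept nodes and the edges between them.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-links
            neighbours[a].add(b)
            neighbours[b].add(a)

    kept = {node for node, others in neighbours.items()
            if len(others) >= min_related}
    kept_edges = [(a, b) for a, b in edges if a in kept and b in kept]
    return kept, kept_edges


# Made-up sample data for illustration.
edges = [("shoes", "boots"), ("shoes", "socks"),
         ("shoes", "laces"), ("boots", "socks")]
nodes, visible_edges = filter_by_min_related(edges, min_related=2)
```

The filter is applied in a single pass, so a kept node may end up with fewer visible neighbours than the threshold once its weakly connected neighbours are dropped.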

Indoor Soccer Sports Club

Posted in Mobile, UI, Web

It is a system to manage soccer matches at “Sydney Sports Club”.

New teams can register on the portal and add their players. Solo players can also register; they are assigned to teams looking for new players. A back-end admin can assign players to teams. The admin can create and update competition tournaments, each of which lasts about 11 rounds (11 weeks). Some of the created competitions are “Monday Evening Mixed”, “Tuesday Evening Mixed”, etc. There are multiple matches each day. The admin can schedule matches and update scores for the registered teams.

Register Teams and Players

Each tournament has a team standings table, in which each team's position and statistics are updated. The statistics shown are wins, losses, draws, games played, goals for (F), goals against (A), and points. Lead player statistics are shown on the front end.
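As a rough sketch of how such a standings table can be derived from match results, the Python below tallies the statistics listed above. The points values (3 for a win, 1 for a draw), the tie-break order, and all names are assumptions made for illustration; the post does not state the rules the club uses.

```python
from dataclasses import dataclass

WIN_POINTS = 3   # assumption: not stated in the post
DRAW_POINTS = 1  # assumption: not stated in the post


@dataclass
class Standing:
    team: str
    played: int = 0
    wins: int = 0
    draws: int = 0
    losses: int = 0
    goals_for: int = 0
    goals_against: int = 0

    @property
    def points(self):
        return self.wins * WIN_POINTS + self.draws * DRAW_POINTS


def build_table(results):
    """results: list of (team_a, goals_a, team_b, goals_b) tuples."""
    table = {}
    for team_a, goals_a, team_b, goals_b in results:
        for team, scored, conceded in ((team_a, goals_a, goals_b),
                                       (team_b, goals_b, goals_a)):
            row = table.setdefault(team, Standing(team))
            row.played += 1
            row.goals_for += scored
            row.goals_against += conceded
            if scored > conceded:
                row.wins += 1
            elif scored == conceded:
                row.draws += 1
            else:
                row.losses += 1
    # Order by points, then goal difference, then goals scored (assumed tie-breaks).
    return sorted(table.values(),
                  key=lambda r: (r.points, r.goals_for - r.goals_against, r.goals_for),
                  reverse=True)


# Made-up results for illustration.
results = [("Lions", 3, "Tigers", 1), ("Tigers", 2, "Bears", 2), ("Bears", 0, "Lions", 1)]
table = build_table(results)
```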

Competition Events and Team Standings

The admin can manage corporate events and upcoming events, and can update competition, team, and match information. The admin can also configure the email address that receives contact messages. All contact messages are sent via email. However, a copy of each message is also saved in the database, so that the admin can check it at any time in case the incoming mail server is down.

Contact The Team

The system has been developed to be web, mobile and tablet ready.

Squash Logs – Amazon Web Services

Posted in Big Data, Cloud

Right from the start, this project required a design that scales in the cloud.

The data-store needed to be cloud based.

The concept is that on a web request, the HTML page is served as a static file from CloudFront, and JavaScript then populates the Opponents and Locations. Users therefore do not hit the web server or the database at all unless they add a new Opponent/Location or save a game result. On saving, a call to DynamoDB is made to store the squash game data.

Each user has 2 sets of data: an opponent list and a location list. The user can add values to these data sets at any time through the log form.

For each user, we created a JSON location file and a JSON opponent file to store the data on S3/CloudFront. User data therefore loads very fast, without any server load. Whenever the user edits this data, the JSON files are updated via an Amazon Web Services API call.

An example of the implemented URL format would be: https://s3.amazonaws.com/bucket_name/opponent_list/user1.json Here “user1.json” stands for a random name, so users cannot loop through other users' opponents or locations by guessing names.
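One way to produce such unguessable file names is a cryptographically random token. The Python sketch below is illustrative only: the function name and the "location_list" folder are assumptions (the post only shows "opponent_list"), and "bucket_name" is the placeholder from the example URL.

```python
import secrets

S3_BASE = "https://s3.amazonaws.com/bucket_name"  # placeholder bucket from the example URL
KINDS = ("opponent_list", "location_list")        # "location_list" is an assumed folder name


def new_data_file_url(kind):
    """Return a URL with a random, hard-to-guess file name for a user's JSON file."""
    if kind not in KINDS:
        raise ValueError(f"unknown data set: {kind}")
    token = secrets.token_urlsafe(24)  # 24 random bytes, URL-safe text
    return f"{S3_BASE}/{kind}/{token}.json"


url = new_data_file_url("opponent_list")
```

The generated name would be stored against the user's account once, so the page can fetch the file directly without the name being derivable from the user ID.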

Once the user clicks save, the data is sent to DynamoDB datastore via Amazon Web Services API. Admin can also update these logs or delete them via API calls.
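A minimal sketch of the save step follows, in Python. The attribute names, the table name, and the helper functions are hypothetical; only `put_item(Item=...)` mirrors the boto3 DynamoDB Table interface. The `table` argument is passed in so the sketch runs without AWS credentials.

```python
import uuid
from decimal import Decimal


def build_game_item(user_id, played_on, opponent, location,
                    my_score, opponent_score, weight=None):
    """Build one game-log record (attribute names are assumptions)."""
    item = {
        "user_id": user_id,
        "log_id": str(uuid.uuid4()),  # unique id per log entry
        "played_on": played_on,       # ISO date string, e.g. "2012-03-11"
        "opponent": opponent,
        "location": location,
        "my_score": my_score,
        "opponent_score": opponent_score,
    }
    if weight is not None:
        # boto3 rejects Python floats for DynamoDB numbers, so use Decimal.
        item["weight"] = Decimal(str(weight))
    return item


def save_game(table, **fields):
    """Store the record; `table` is any object exposing put_item(Item=...)."""
    item = build_game_item(**fields)
    table.put_item(Item=item)
    return item
```

With boto3, `table` could be obtained as `boto3.resource("dynamodb").Table("squash_logs")`, where "squash_logs" is a made-up table name.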

You can read more about this project here.

Squash Logs – Client Side Validations

Posted in UI, Web

For Squash Logs, validations need to run as the player fills in the form, so that it is very easy to enter log data after each game. We designed the form to be extremely simple for the end user to update.

The following client-side validations are implemented:

(*) Save button is hidden till all mandatory data is entered.

(*) Date control is shown. Future date cannot be selected for match logs. Date displayed is as per the format required.

(*) Total Score cannot exceed 5. Hence if one player enters a score of 3, the other player can only enter a score of 0, 1 or 2.

My Score is 3, So Opponent Score can be only 0,1 or 2. Drop-down updated accordingly.

My Score is less than 3, So Opponent Score can be only 0,1, 2 or 3. Drop-down updated accordingly.

Opponent Score is 3, So My Score will be less than 3. Drop-down updated accordingly.

Opponent Score is less than 3, So My Score can be 3. Drop-down updated accordingly.

(*) If the Scoring System is “HiHo”, Games To is always set to 9. If the Scoring System is “PAR”, Games To can be selected as either 11 or 15.

With the HiHo system, Games To is always 9, so there is nothing to select here.

With the PAR system, Games To can be 11 or 15. The player selects accordingly.

(*) Notes cannot exceed 500 characters. The number of remaining characters is shown while typing.

Characters restricted to 500.

(*) Weight must be a positive number.

(*) All of the user's Locations are detected and loaded in the drop-down box.

(*) Any Location can be selected from all the Locations created. The select box also has a “New Location” option. If it is selected, an input box is shown to enter the location name. This input is mandatory only if “New Location” is selected from the drop-down. The save logic is implemented accordingly.

(*) Either entering a new location or selecting an existing one is mandatory.

(*) All of the user's Opponents are detected and loaded in the select box.

(*) Any Opponent can be selected from all the Opponents created. The select box also has a “New Opponent” option. If it is selected, an input box is shown to enter the opponent name. This input is mandatory only if “New Opponent” is selected from the drop-down. The save logic is implemented accordingly.

(*) Either entering a new opponent or selecting an existing one is mandatory.

(*) The user's last-entered details are retrieved from a cookie and auto-filled next time, so common details do not have to be re-entered. If the user accesses the dashboard from a different system, the last-entered details are fetched from the server, so they are auto-filled even without a cookie.

The same validations were implemented on server side, to ensure no one can push through any irrelevant data.
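The rules listed above can be sketched as a single server-side check. The Python below is an illustration under assumptions: the field names, error messages, and function names are made up, weight is treated as optional, and the opponent check (which would mirror the location check) is left out for brevity.

```python
from datetime import date

NOTES_LIMIT = 500
GAMES_TO = {"HiHo": (9,), "PAR": (11, 15)}
NEW_LOCATION = "New Location"


def opponent_score_options(my_score):
    """Scores the other player may pick, given that the total cannot exceed 5."""
    if my_score not in range(4):
        raise ValueError("score must be between 0 and 3")
    highest = 2 if my_score == 3 else 3
    return list(range(highest + 1))


def notes_remaining(notes):
    """Characters still available in the notes field."""
    return NOTES_LIMIT - len(notes)


def validate_log(log, today=None):
    """Return a list of error messages; an empty list means the log is valid."""
    today = today or date.today()
    errors = []
    if log["played_on"] > today:
        errors.append("date cannot be in the future")
    my, opp = log["my_score"], log["opponent_score"]
    if my not in range(4) or opp not in opponent_score_options(my):
        errors.append("scores are not a valid combination")
    if log["games_to"] not in GAMES_TO.get(log["scoring_system"], ()):
        errors.append("Games To does not match the scoring system")
    if notes_remaining(log.get("notes", "")) < 0:
        errors.append("notes exceed 500 characters")
    weight = log.get("weight")
    if weight is not None and weight <= 0:
        errors.append("weight must be a positive number")
    if log["location"] == NEW_LOCATION and not log.get("new_location", "").strip():
        errors.append("a name is required for a new location")
    return errors


# Made-up sample log for illustration.
good = {"played_on": date(2024, 1, 10), "my_score": 3, "opponent_score": 1,
        "scoring_system": "PAR", "games_to": 11, "notes": "ok",
        "weight": 72.5, "location": "Club A"}
```

The same `opponent_score_options` result is what a client would use to repopulate the other score drop-down whenever one score changes.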

You can read more about this project here.

Squash Logs

Posted in Cloud, Mobile, Web

Squash games are played daily. The system is designed to store the game log data. The aim is to let users enter match logs as quickly as possible after they play. Thousands of log records are expected to be uploaded daily. The site can be viewed on iPhone, iPad, desktop browsers, and Android browsers, and is hosted as static HTML files. Ajax calls are used for dynamic functionality and asynchronous updates.

A very simple login form is all that is required to start. If the user does not exist, the login form also performs registration.

Once a user logs in, he can enter the logs of the game that he played. Extensive client side validations are implemented to make the form extremely simple, for the player to use.

You can read more about the validations implemented here.

The same validations are implemented on the server side, to ensure irrelevant data is not pushed through. Protection against XSS and CSRF is implemented.

For speed of response, all master data is saved in JSON format in Amazon S3 files in various locations across the globe. These data files are loaded as static content during log entry. Amazon DynamoDB is used for fast access to log data.

You can read more about the amazon implementation here.

Adserver – Pretargeting

Posted in Big Data, Cloud

Handling millions of ad requests per hour

The diagram below summarizes the implementation of the ad targeting system. Ad requests and responses are handled via an Apache server. Each ad request is saved in a log file on the filesystem. A new log file is generated each hour. The system is designed to handle at least a million requests per hour. Maximum server response time must not exceed 50 milliseconds.
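The hourly log rotation can be illustrated by deriving the file name from the request time, so that a new file starts automatically when the hour changes. This Python sketch is not the Apache configuration used in the project; the file-name pattern and function names are assumptions.

```python
from datetime import datetime, timezone
from pathlib import Path


def hourly_log_path(log_dir, when):
    """One file per hour, e.g. ad_requests_20120311_15.log (hypothetical pattern)."""
    return Path(log_dir) / f"ad_requests_{when:%Y%m%d_%H}.log"


def append_request(log_dir, line, when=None):
    """Append one ad request to the log file for the current hour."""
    when = when or datetime.now(timezone.utc)
    path = hourly_log_path(log_dir, when)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(line.rstrip("\n") + "\n")
    return path
```

Because each closed hour maps to exactly one file, a downstream job can pick up completed files without coordinating with the web server.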

Kafka is used to handle high loads. Cassandra is used as a scalable NoSQL solution. MySQL is used to store summary data in the system.

ad_flow_architecture

Synchronization of Hadoop Tasks.

Summary data is also generated daily using the Amazon EMR implementation of Hadoop, with Amazon SWF for task synchronization. For reporting needs, recent data that is not yet summarized is fetched from Cassandra. The workflow below explains the implemented flow.

workflow diagram

Alchemy API Integration

Posted in API

Alchemy API is used to analyze the content of URLs which are targeted for ads. It returns keywords that are relevant to the specified URL. For each keyword, a relevance parameter and a sentiment score are returned.

Data is requested in XML format via a cURL call to their API. The response is processed as per the analysis algorithm. The entire functionality runs in the background. Alchemy has a limit of 1000 API calls within a 24-hour period. The limit is tracked, and URLs that have been recently processed are not reprocessed.
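The call-limit tracking and the skip-if-recent rule can be sketched as a small in-memory tracker. This Python example is an assumption-laden illustration: the class and method names are made up, the 24 hours are treated as a rolling window, and the 7-day "recently processed" threshold is a hypothetical value the post does not specify.

```python
import time


class CallTracker:
    """Tracks API calls in a rolling window and remembers when each URL was processed."""

    def __init__(self, limit=1000, window=24 * 3600, reprocess_after=7 * 24 * 3600):
        self.limit = limit
        self.window = window                    # seconds
        self.reprocess_after = reprocess_after  # hypothetical threshold, in seconds
        self.calls = []                         # timestamps of calls made
        self.last_processed = {}                # url -> timestamp

    def should_process(self, url, now=None):
        now = time.time() if now is None else now
        # Forget calls that have fallen out of the rolling window.
        self.calls = [t for t in self.calls if now - t < self.window]
        if len(self.calls) >= self.limit:
            return False
        last = self.last_processed.get(url)
        if last is not None and now - last < self.reprocess_after:
            return False
        return True

    def record(self, url, now=None):
        now = time.time() if now is None else now
        self.calls.append(now)
        self.last_processed[url] = now
```

A background job would call `should_process(url)` before each request and `record(url)` after it; a persistent store would replace the in-memory fields in a real deployment.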

XML output from Alchemy API is in the below format.

<results>
<status>REQUEST_STATUS</status>
<url>REQUESTED_URL</url>
<language>DOCUMENT_LANGUAGE</language>
<text>DOCUMENT_TEXT</text>
<keywords>
<keyword>
<text>DETECTED_KEYWORD</text>
<relevance>DETECTED_RELEVANCE</relevance>
<sentiment>
<type>SENTIMENT_LABEL</type>
<score>SENTIMENT_SCORE</score>
</sentiment>
</keyword>
</keywords>
</results>
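A response in this format can be read with a standard XML parser. The Python sketch below uses `xml.etree.ElementTree`; the function name is an assumption and the sample values are made up to fit the template above, not real API output.

```python
import xml.etree.ElementTree as ET


def _to_float(text, default=0.0):
    """Convert element text to float, tolerating missing or empty values."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return default


def parse_keywords(xml_text):
    """Return one dict per <keyword> with its relevance and sentiment."""
    root = ET.fromstring(xml_text)
    keywords = []
    for node in root.findall("./keywords/keyword"):
        keywords.append({
            "text": node.findtext("text", default=""),
            "relevance": _to_float(node.findtext("relevance")),
            "sentiment_type": node.findtext("sentiment/type"),
            "sentiment_score": _to_float(node.findtext("sentiment/score")),
        })
    return keywords


# Made-up sample that follows the documented structure.
SAMPLE = """<results>
<status>OK</status>
<url>http://example.com/</url>
<keywords>
<keyword>
<text>squash racket</text>
<relevance>0.91</relevance>
<sentiment><type>positive</type><score>0.42</score></sentiment>
</keyword>
<keyword>
<text>court booking</text>
<relevance>0.55</relevance>
<sentiment><type>neutral</type></sentiment>
</keyword>
</keywords>
</results>"""

keywords = parse_keywords(SAMPLE)
```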

Pretargeting – SEMRush API

Posted in API, Web

Objective

Automate the aggregation of URL metadata, including basic scoring and exporting of output.

Implemented Points

The API is used to generate a list of keywords for a specific URL and the weightage of those keywords.

The below output fields are parsed.
(1) Dn – Sites competing with this site in search results
(2) Np – The number of keywords for which the site is displayed in search results next to the analyzed site
(3) Or – Keywords this site has in the TOP 20 organic results
(4) Ot – Estimated number of visitors coming from the first 20 search results (per month)
(5) Oc – Estimated cost of purchasing the same number of visitors through Ads
(6) Ad – Keywords this site has in the TOP 20 Ads results
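Parsing these fields might look like the Python sketch below. It assumes the report arrives as semicolon-delimited text with one header row and with columns in the order they were requested; that response layout, the header labels, and the sample row are assumptions for illustration and should be checked against the API documentation.

```python
import csv
import io

# Column codes in the order listed above (assumed to be the requested order).
COLUMNS = ("Dn", "Np", "Or", "Ot", "Oc", "Ad")


def parse_report(text, columns=COLUMNS):
    """Turn a delimited report into a list of dicts keyed by column code."""
    rows = list(csv.reader(io.StringIO(text), delimiter=";"))
    records = []
    for row in rows[1:]:  # skip the header row
        if len(row) != len(columns):
            continue      # ignore blank or malformed lines
        record = dict(zip(columns, row))
        for code in columns[1:]:
            record[code] = float(record[code])  # everything except Dn is numeric
        records.append(record)
    return records


# Made-up sample report.
SAMPLE = ("Domain;Common Keywords;Organic Keywords;Organic Traffic;Organic Cost;Adwords Keywords\n"
          "example.org;12;340;5600;789.5;3\n")

records = parse_report(SAMPLE)
```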

SEMRush sets a daily limit of 5000 API calls. The admin can see the up-to-date number of calls made in a 24-hour period, and can stop processing URLs once the call limit is reached.

The generated data is further used for keyword based ad targeting.

OpenAmplify API

Posted in API

OpenAmplify is used for analyzing the content of targeted URLs.

Top topics and results are analyzed using the polarity output from the API. A specific algorithm assigns a weight to each result. The Positive, Negative and Neutral mean polarity values are taken into consideration for the analysis.

The analysis output is used by adserver for targeting ads for specific keywords.

The XML API response is as below:

<ns1:AmplifyResponse xmlns:ns1="http://amplify.hapax.com">
<AmplifyReturn>
<Topics>
<Domains>
<DomainResult>
<Domain>
<Name>Sports</Name>
<Value>10.000000</Value>
</Domain>
<Subdomains>
<DomainResult>
<Domain>
<Name>Baseball</Name>
<Value>10.000000</Value>
</Domain>
<Subdomains>
<DomainResult>
<Domain>
<Name>baseball</Name>
<Value>10.000000</Value>
</Domain>
<Subdomains xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="1"/>
</DomainResult>
</Subdomains>
</DomainResult>
</Subdomains>
</DomainResult>
</Domains>
<TopTopics>
<TopicResult>
<Topic>
<Name>baseball</Name>
<Value>10.000000</Value>
</Topic>
<NamedEntityType xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="1"/>
<Polarity>
<Min>
<Name>Neutral</Name>
<Value>0.000000</Value>
</Min>
<Mean>
<Name>Positive</Name>
<Value>0.600000</Value>
</Mean>
<Max>
<Name>Positive</Name>
<Value>0.600000</Value>
</Max>
</Polarity>
</TopicResult>
.
.
.
</AmplifyReturn>
</ns1:AmplifyResponse>
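To show how a weight could be derived from one TopicResult, here is a hedged Python sketch. The actual weighting algorithm is not described in this post, so the formula below (topic value multiplied by the signed mean polarity) is purely a hypothetical stand-in, as are the function and constant names. The fragment is a single TopicResult taken from the response above.

```python
import xml.etree.ElementTree as ET

# Sign applied per mean-polarity label (hypothetical scheme).
SIGN = {"Positive": 1.0, "Neutral": 0.0, "Negative": -1.0}


def topic_weight(topic_result):
    """Hypothetical weight: topic value scaled by the signed mean polarity."""
    value = float(topic_result.findtext("Topic/Value"))
    label = topic_result.findtext("Polarity/Mean/Name")
    magnitude = abs(float(topic_result.findtext("Polarity/Mean/Value")))
    return value * SIGN.get(label, 0.0) * magnitude


FRAGMENT = """<TopicResult>
<Topic><Name>baseball</Name><Value>10.000000</Value></Topic>
<Polarity>
<Min><Name>Neutral</Name><Value>0.000000</Value></Min>
<Mean><Name>Positive</Name><Value>0.600000</Value></Mean>
<Max><Name>Positive</Name><Value>0.600000</Value></Max>
</Polarity>
</TopicResult>"""

node = ET.fromstring(FRAGMENT)
name = node.findtext("Topic/Name")
weight = topic_weight(node)
```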

Yahoo Content Analysis API

Posted in API

Pretargeting requires analyzing the list of URLs to find the most relevant keyword for a URL. Yahoo Content Analysis is one of the APIs used for analysis.

A keyword is extracted from the Yahoo API, and the corresponding weightage (score) for that keyword is analysed. In the API call, enable_categorizer is set, to find the category of the URL along with its relevance score.

Below is the sample XML output, extracted and processed:

<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="2" yahoo:created="2012-03-11T15:30:34Z" yahoo:lang="en-US">
<diagnostics>
<publiclyCallable>true</publiclyCallable>
<user-time>116</user-time>
<service-time>90</service-time>
<build-version>25587</build-version>
</diagnostics>
<results>
<yctCategories xmlns="urn:yahoo:cap">
<yctCategory score="0.999337">Politics &amp; Government</yctCategory>
<yctCategory score="0.721854">Government</yctCategory>
</yctCategories>
<entities xmlns="urn:yahoo:cap">
<entity score="0.993251">
<text end="6656" endchar="6656" start="6642" startchar="6642">Lucas Papademos</text>
<wiki_url>http://en.wikipedia.com/wiki/Lucas_Papademos</wiki_url>
<related_entities>
<wikipedia>
<wiki_url>http://en.wikipedia.com/wiki/George_Papandreou</wiki_url>
<wiki_url>http://en.wikipedia.com/wiki/Debt_restructuring</wiki_url>
<wiki_url>http://en.wikipedia.com/wiki/European_Central_Bank</wiki_url>
</wikipedia>
</related_entities>
</entity>
<entity score="0.938968">
<text end="2735" endchar="2735" start="2720" startchar="2720">unity government</text>
</entity>
<entity score="0.927013">
<text end="2707" endchar="2707" start="2699" startchar="2699">Papademos</text>
<types>
<type region="us">/person</type>
</types>
</entity>
<entity score="0.914795">
<text end="7148" endchar="7148" start="7128" startchar="7128">European Central Bank</text>
<wiki_url>http://en.wikipedia.com/wiki/European_Central_Bank</wiki_url>
<types>
<type region="us">/organization</type>
<type region="us">/organization/company/other</type>
</types>
<related_entities>
<wikipedia>
<wiki_url>http://en.wikipedia.com/wiki/Jean-Claude_Trichet</wiki_url>
<wiki_url>http://en.wikipedia.com/wiki/Eurozone</wiki_url>
<wiki_url>http://en.wikipedia.com/wiki/Axel_Weber</wiki_url>
<wiki_url>http://en.wikipedia.com/wiki/Euro</wiki_url>
<wiki_url>http://en.wikipedia.com/wiki/Currency_sign</wiki_url>
</wikipedia>
</related_entities>
</entity>
<entity score="0.843647">
<text end="10362" endchar="10362" start="10338" startchar="10338">national unity government</text>
<wiki_url>http://en.wikipedia.com/wiki/Zimbabwe_Government_of_National_Unity_of_2009</wiki_url>
</entity>
<entity score="0.794965">
<text end="7461" endchar="7461" start="7452" startchar="7452">government</text>
</entity>
<entity score="0.748128">
<text end="11218" endchar="11218" start="11187" startchar="11187">Prime Minister George Papandreou</text>
</entity>
<entity score="0.680339">
<text end="7417" endchar="7417" start="7399" startchar="7399">Evangelos Venizelos</text>
<wiki_url>http://en.wikipedia.com/wiki/Evangelos_Venizelos</wiki_url>
<types>
<type region="us">/person</type>
</types>
<related_entities>
<wikipedia>
<wiki_url>http://en.wikipedia.com/wiki/Dimitrios_Droutsas</wiki_url>
<wiki_url>http://en.wikipedia.com/wiki/Virginia_Mayo</wiki_url>
<wiki_url>http://en.wikipedia.com/wiki/United_States_Secretary_of_Defense</wiki_url>
<wiki_url>http://en.wikipedia.com/wiki/Greek_language</wiki_url>
<wiki_url>http://en.wikipedia.com/wiki/ThyssenKrupp</wiki_url>
</wikipedia>
</related_entities>
</entity>
<entity score="0.648517">
<text end="10053" endchar="10053" start="10036" startchar="10036">austerity measures</text>
</entity>
<entity score="0.613155">
<text end="6001" endchar="6001" start="5994" startchar="5994">CNN Wire</text>
<types>
<type region="us">/organization</type>
</types>
</entity>
<entity score="0.609287">
<text end="7061" endchar="7061" start="7043" startchar="7043">financial stability</text>
</entity>
<entity score="0.609287">
<text end="11102" endchar="11102" start="11084" startchar="11084">financial stability</text>
</entity>
<entity score="0.593508">
<text end="9820" endchar="9820" start="9805" startchar="9805">Greek Parliament</text>
<types>
<type region="us">/organization</type>
</types>
</entity>
<entity score="0.590604">
<text end="9701" endchar="9701" start="9681" startchar="9681">Dimitris Avramopoulos</text>
<wiki_url>http://en.wikipedia.com/wiki/Dimitris_Avramopoulos</wiki_url>
<types>
<type region="us">/person</type>
</types>
</entity>
</entities>
</results>
</query>
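Extracting the top category and the highest-scoring keywords from such a response can be sketched as follows, in Python with `xml.etree.ElementTree`. The function name and the `limit` parameter are assumptions, and the sample is a trimmed copy of the response above (XML declaration and most entities omitted).

```python
import xml.etree.ElementTree as ET

NS = {"cap": "urn:yahoo:cap"}  # default namespace used by the result elements


def top_category_and_entities(xml_text, limit=3):
    """Return the best-scoring category and the `limit` best-scoring entities."""
    root = ET.fromstring(xml_text)
    categories = [(node.text, float(node.get("score")))
                  for node in root.iterfind(".//cap:yctCategory", NS)]
    entities = [(node.findtext("cap:text", namespaces=NS), float(node.get("score")))
                for node in root.iterfind(".//cap:entity", NS)]
    categories.sort(key=lambda pair: pair[1], reverse=True)
    entities.sort(key=lambda pair: pair[1], reverse=True)
    best_category = categories[0] if categories else None
    return best_category, entities[:limit]


SAMPLE = """<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="2">
<results>
<yctCategories xmlns="urn:yahoo:cap">
<yctCategory score="0.721854">Government</yctCategory>
<yctCategory score="0.999337">Politics &amp; Government</yctCategory>
</yctCategories>
<entities xmlns="urn:yahoo:cap">
<entity score="0.938968">
<text end="2735" endchar="2735" start="2720" startchar="2720">unity government</text>
</entity>
<entity score="0.993251">
<text end="6656" endchar="6656" start="6642" startchar="6642">Lucas Papademos</text>
</entity>
</entities>
</results>
</query>"""

category, entities = top_category_and_entities(SAMPLE, limit=1)
```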