Wednesday, September 30, 2015

New Community TDS (NCETDS... just kidding we have enough acronyms!)

Video Tutorial - [ Here ]
Developer Documentation - [ Here ]
Community TDS interface - [ Here ]

This blog post (one of the few by Andrew) is here to tell you about the new public TDS (technically an update to the community TDS so that it is in line with the private TDS code base). For those who aren't interested in reading all the words, we have a great video covering this below:

Let's start off with an introduction to the TDS. It provides an easy-to-use, distributable means of writing and sharing transforms (and, essentially, the data that users can turn into intelligence). All the transforms in the Transform Hub are built on either the free public TDS or a private one.

When a "normal" transform (one on the public/private CTAS) runs what happens in the back is that a message is sent to the server containing the entity details (like its value and other properties) as well as the transform that needs to run. For example it could be the domain and the transform "to MX record". This would then be run on the server (the code would execute - performing an MX lookup on and the result would be returned to the client.

Previously people used local transforms, which had a number of setup and distribution pain points:
  1. Local transforms require people to set up code and environments on the end user's system.
  2. Code updating was painful as you would need to send all your users new code to run.
  3. Code containing all the API calls, passwords and other sensitive information needed to be obfuscated.
Our solution to this was the original TDS. Essentially, it provides a way for users to write and create transforms that they host on a web server. This is all done through a simple and intuitive web interface (the TDS).

What happens with the TDS is that when it receives the call that includes the entity and transform to run (as described previously), instead of executing code on the machine it will simply make a call over HTTP or HTTPS to a web server. This web server then receives the call and can do whatever it needs to - be that talking to a database, an API or something else... literally anything that you can write a program for.
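To make that flow concrete, here is a minimal sketch of what such a transform web server could look like. It assumes Flask and dnspython, uses a made-up /mxlookup route, and simplifies the request parsing - a real transform should follow the TDS developer documentation linked below.

# A minimal sketch of a TDS transform endpoint, assuming Flask and dnspython.
# The /mxlookup route and the simplified request parsing are illustrative only.
from flask import Flask, request
import xml.etree.ElementTree as ET
import dns.resolver

app = Flask(__name__)

@app.route("/mxlookup", methods=["POST"])
def mx_lookup():
    # The TDS POSTs a MaltegoTransformRequestMessage; pull out the entity value.
    root = ET.fromstring(request.data)
    domain = root.find(".//Entity/Value").text

    # Build a MaltegoTransformResponseMessage with one entity per MX record found.
    msg = ET.Element("MaltegoMessage")
    response = ET.SubElement(msg, "MaltegoTransformResponseMessage")
    entities = ET.SubElement(response, "Entities")
    for record in dns.resolver.query(domain, "MX"):
        entity = ET.SubElement(entities, "Entity", Type="maltego.MXRecord")
        ET.SubElement(entity, "Value").text = str(record.exchange).rstrip(".")

    return ET.tostring(msg)

if __name__ == "__main__":
    app.run(port=8080)

On the TDS side you then simply point the transform at the URL where an endpoint like this is hosted.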

You can read more about this over at our developer portal. It's got explanations, code samples and more. It will quickly get you up to date with the aspects of coding transforms.

This update of the community TDS brings it in line with the private version and adds a range of new features, including the following:

OAuth Integration

OAuth Integration allows transform writers to utilise existing OAuth integration connectors (such as Twitter) or write their own, whether to control who uses their transforms or simply to gather usage statistics.

Paired Configuration

Developers can now pair exported Maltego configurations with their transforms, which means they no longer need to ship MTZ files containing entities, machines, sets and seeds. These configuration files can simply be uploaded to the web interface and when end users discover the transforms they will automatically get these items added to their client!

Bug fixes, interface tweaks

A number of bug fixes and updates have been made to the interface, and the whole experience should hopefully be more usable and intuitive for everyone :)

What are you waiting for? Head on over to the new community TDS now!

Pink fluffy unicorns, dancing on rainbows

Thursday, September 3, 2015

Jumping on the Website Tracking Code bandwagon

Services like Google Analytics allow you to easily add functionality to your website simply by pasting a bit of JavaScript into your page's HTML. Often this JavaScript includes a tracking code that uniquely identifies the site owner's account with that service. Searching for this tracking code with a search engine that indexes JavaScript allows you to find other sites that belong to the same user. There are quite a few web services that require you to add a tracking code to your webpage in order to use them. For analysts this provides a great way of making connections between websites that may seem unrelated when using other OSINT techniques.

Recently there was an interesting project write-up called Automatically Discover Website Connections Through Tracking Codes by @jms_dot_py and @LawrenceA_UK. They used the source code search engine Meanpath to search for websites with a specific tracking code and Gephi to visualize the relationships from their results. We've been toying with the same idea for a while now and decided to release two new transforms today. This means you can use this technique from within Maltego.

The first transform is called To Tracking Codes and runs on a website entity in Maltego. The transform will parse the home page of the specified site for tracking codes from services including Google Analytics, PayPal donate buttons, the Amazon Affiliate program, Google AdSense and AddThis. The image below shows the different tracking codes that can be found with this transform as well as the Detail View that is returned with each entity, which includes a source code snippet of where the tracking code was found. The second transform is called To Other Sites With Same Code and is used to find other websites that have the same tracking code.
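As a rough illustration of the first transform's idea, the snippet below pulls Google Analytics IDs out of a page's HTML with a simple regular expression. The function name and regex are ours and only approximate what the transform does (which also covers the other services listed above).

# Rough sketch of extracting Google Analytics tracking codes (UA-XXXXXXX-X)
# from a page's HTML; the regex and function name are illustrative only.
import re
import requests

def google_analytics_codes(url):
    html = requests.get(url, timeout=10).text
    # UA-<account number>-<property index>, as embedded in the GA JavaScript snippet
    return sorted(set(re.findall(r"UA-\d{4,10}-\d{1,4}", html)))

print(google_analytics_codes("http://www.example.com"))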

Let's see what can be done with these transforms with a quick example using the Google Analytics code found on Ashley Madison's home page from the graph above. Running the transform To Other Sites With Same Code returns 100* different sites that all use a tracking code from the same Google account as the one from Ashley Madison. The resultant graph is shown below. (*Currently this transform is limited to returning a maximum of 100 results so there could actually be far more sites).

Most of these sites are just variations of the name and all redirect to Ashley Madison's home page. There are also a few other online dating sites here, as well as a couple of completely unexpected results: pages that you would not expect to be related to Ashley Madison in any way. These sites have piqued our interest, so let's look a little deeper.

Taking all the websites from the previous step and running the transform To Tracking Codes again only finds one new code on the sites, and running To Other Sites With Same Code on this new code does not result in any new sites being found. This looks like it could be a dead end, so let's use another tool we have in the Maltego workbench. Resolving all the websites in the graph above to IP addresses shows that most of these sites sit on the same IP address, except for a couple of outliers as shown below:

(only a portion of the full graph)
We are looking for something out of the ordinary that is seemingly unrelated to Ashley Madison. We next remove all the sites with titles that are obviously related to Ashley Madison. This results in the graph below with just a couple of IP addresses that are scattered across the globe.

Finally let's see what else resolved to these IP addresses by running the transform To DNS Name [Other DNS names]. This transform will return historical DNS records for these IP addresses. Doing this results in some really interesting NSFW sites specifically found on the IP address that also host and

The image below summarizes the connection found between Ashley Madison and our somewhat unsurprisingly very much not safe for work (VMNSFW) websites that won't be listed here.

These two new transforms for working with website tracking codes are now available in the PATERVA CTAS seed on both the commercial and community editions. Simply hit the Update Transforms button in the Transform Hub and they will be added to your Maltego client.

As always, enjoy responsibly,

Friday, August 14, 2015

We talk to Allan about NewsLink

This blog post presents NewsLink, a new item that we have just released on the Transform Hub. NewsLink aims to assist in identifying and monitoring patterns in information posted on the Internet from a wide range of sources including Twitter, blog posts and news articles.

Every day millions of news articles, blog posts, Tweets, pastes, etc. are posted online. With this continuous stream of information it becomes difficult to identify what is important to us and should be focused on, and what can simply be ignored. One approach to picking out important information is to look at when multiple sources all mention the same people, locations, company names (and a slew of other entity types) within a certain time period. This is the basis for NewsLink.

The image of the graph below is a small piece of a graph that was monitoring news articles related to Defcon. The snippets on the right list the news articles that mention both Samy Kamkar and Defcon on the same page. This is an example of what we will be working towards in this blog post.

This blog post is broken down into a couple of sections. Firstly we'll look at the transforms that are used in NewsLink to gather your information from different sources. We'll then move on to the transforms that are used to extract entities and keywords from these web pages as well as calculate the page's sentiment towards that topic. The last step is to automate this process with the use of Machines. Using Machines you can continuously monitor your search term and only be alerted by email when something of interest occurs.

Transforms that gather information
We have four new transforms for gathering information from different sources - two of these transforms get information from Twitter and the other two get information from websites using search engines.

Search for News Articles [using Bing]

The first transform we have is called Search for News Articles [using Bing] and is used to gather recent news articles relating to a specific search term from unspecified news sources. The transform uses Bing's news search API and will return articles from a wide range of news websites that are indexed by the search engine. The starting point for this transform is a phrase entity where you will enter your search term as seen in the image below (1). After running the transform, a transform settings window will pop up allowing you to limit your results according to the age of the articles and their news category (2). Defcon is in the news currently so let's see some articles relating to the con that have been posted in the last 7 days. You can use a numerical value followed by 'd' for days, 'h' for hours, 'w' for weeks and 'm' for minutes.
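As a small illustration of that age format (value first, then the unit letter), here is how such a setting could be turned into a time window - the function name is ours:

# Sketch of converting an age setting such as "7d" or "12h" into a timedelta,
# following the value-then-letter convention described above.
from datetime import timedelta

UNITS = {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}

def parse_age(setting):
    value, unit = int(setting[:-1]), setting[-1].lower()
    return timedelta(**{UNITS[unit]: value})

print(parse_age("7d"))   # 7 days, 0:00:00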

The next image shows the results from this search. Each entity that is returned represents a website that has posted about Defcon in the last 7 days. Clicking on one of these websites and having a look in the Detail View will provide you with more information on the article as seen below:

(Dated: 24 Jul 2015)
Search for Websites [Using GCSE]

The next new transform we have is called Search for Websites [Using GCSE] and is slightly more flexible than the former as it allows the user to specify a list of sites to search and only returns results from those sites. The transform uses a [Google Custom Search Engine] (GCSE) ID as a transform setting to specify the list of sites that you want it to search. To use this transform you first need to create a GCSE with the list of websites that you want to monitor. This list could really be anything from your favorite security blogs to a list of influential financial news services. Once created you will receive a unique ID for your GCSE, which is what you will use as a transform setting when running the Search for Websites [Using GCSE] transform. The example image below shows the list of websites we have included in our GCSE (1) as well as the settings that are displayed when running this transform. These settings include the GCSE ID as well as the maximum age of the pages you want returned (3). In this case the setting can be populated with 'd' for days, 'w' for weeks and 'y' for years followed by a numeric value (hours and minutes are unfortunately not supported by the API).
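Under the hood a GCSE can be queried with Google's Custom Search JSON API; a minimal sketch is below. The API key and cx value are placeholders for your own credentials, and dateRestrict is the API's letter-then-number age parameter mentioned above.

# Minimal sketch of querying a Google Custom Search Engine for pages from the
# last week. API_KEY and CSE_ID are placeholders for your own credentials.
import requests

API_KEY = "your-api-key"
CSE_ID = "your-gcse-id"

params = {
    "key": API_KEY,
    "cx": CSE_ID,
    "q": "defcon",
    "dateRestrict": "d7",   # letter-then-number age format: d7 = the last 7 days
}
results = requests.get("https://www.googleapis.com/customsearch/v1", params=params).json()
for item in results.get("items", []):
    print(item["title"], item["link"])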

The next image shows the results from this search in which you will notice that only sites included in our list were returned. By clicking one of these entities and looking in its Detail View you'll see that the different pages from the relevant websites are displayed.

(Dated: 24 Jul 2015)
Expanding websites to the actual web pages

Next we want to get all these webpages out into their own URL entities to work with them separately. To do this we run To Pages from Website which results in the graph below:

(Dated: 24 Jul 2015)
Clicking on one of these URL entities shows details of the webpage including the sentiment of the text as seen above.

Before we start running further transforms to process these articles we should speak about our Twitter transforms that can be used to get Tweets on specific topics or from specific users.

To Tweets [Search Twitter]

The first Twitter transform is called To Tweets [Search Twitter] which has actually been available in Maltego for quite some time and can be found in the PATERVA CTAS transform seed. This transform simply searches for Tweets that mention your search term. The image below shows running the transform on the hashtag Defcon with the transform slider set to 50:

24 Jul 2015

This transform is a very general search as it will search all of Twitter for Tweets made by any user. Most of the time you won't actually be interested in what the common folk on Twitter have to say about your search term; instead you would like to search for your topic only within a specific list of Twitter accounts.

Fortunately Twitter allows users to create lists of accounts and then search for Tweets by users in these lists. You can create your own lists of Twitter users from your Twitter profile and then access a list in Maltego by finding the owner's Twitter profile and running the transform To User Lists [That this person owns]. Paul's Twitter account contains a public list of Twitter accounts belonging to news sites that he believes to be quite influential/popular. To find this list, first find his Twitter account by searching for his alias, then run the transform To User Lists [That this person owns] to see his lists. From the user list entity you can see which Twitter accounts are included in the list by running the transform To Twitter Affiliation. The image below shows the steps to get the list and the users in the list:
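For context, reading Tweets from such a public list via the Twitter API (v1.1) could look roughly like the sketch below using tweepy. The list owner, slug and credentials are placeholders - this is not the transform's actual code.

# Rough sketch of reading Tweets from a public Twitter list with tweepy.
# The list owner/slug and the credentials are placeholders.
import tweepy

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth)

for tweet in api.list_timeline(owner_screen_name="some_user", slug="news-sites", count=50):
    print(tweet.user.screen_name, tweet.text)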

To Tweet [Written by user list member]

Next up we want to monitor this list of accounts and return Tweets to our graph whenever our search term is mentioned by any of these users. To do this we run the transform To Tweet [Written by user list member] on the user list entity (1). A transform settings window will pop up allowing you to specify your search term as well as a term to ignore Tweets by (2). You can also specify the maximum age of the Tweets that you want returned. This is entered in seconds in the first transform setting field as shown in the image below:

The search above results in only ten Tweets by users in Paul's list that mentioned Greece in the last week (604 800 seconds) (3). You can see the details of each Tweet by having a look in the Detail View.

If you didn't want to search for a specific topic but instead wanted all the Tweets by the users in your list, you could run the same transform leaving the two transform settings, Tweets that don't contain and Tweets that contain, blank, which will return all the users' Tweets within the specified time period.
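Conceptually, the filtering that these settings describe boils down to something like the sketch below (the field names and defaults are ours, and Tweets are assumed to carry a UTC datetime):

# Sketch of the filtering described by the transform settings: keep Tweets that
# mention a term, skip Tweets containing an ignore term, and drop anything older
# than max_age seconds. Tweets are assumed to be dicts with a UTC datetime.
from datetime import datetime, timedelta

def filter_tweets(tweets, contains="greece", ignore=None, max_age=604800):
    cutoff = datetime.utcnow() - timedelta(seconds=max_age)
    kept = []
    for tweet in tweets:
        text = tweet["text"].lower()
        if contains and contains not in text:
            continue
        if ignore and ignore.lower() in text:
            continue
        if tweet["created_at"] < cutoff:
            continue
        kept.append(tweet)
    return kept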

These four transforms are what we use to gather our information from the web and from Twitter, with two of them allowing you to get results from very specific sources (e.g. from your Twitter lists or from your GCSE) and the other two allowing you to get results from a wide range of sources (e.g. all the users on Twitter or all pages indexed by Bing's news search). The table below summarizes how these four transforms can be categorized:

Processing the information we've mined
Now that we have our information collected we can do some interesting operations on the data to find where different sources mention a common entity (like a person's name, a company and plenty more). We will also look at the sentiment towards an entity across the different sources to determine that entity's "average sentiment".

Let's return to our previous Defcon graph where we got related news articles by running the transform Search for News Articles [using Bing]. We've run the transform To Pages from Website to get the different news articles out into their own URL (webpage) entities.

From here there are a few options for transforms to run on these URL entities. The first transform is called To Related Words with Sentiment and is used to extract uncommon words from webpages. The words need to be within a certain distance of your search term in order to be returned; this distance between the extracted word and our search term is specified in a transform setting. The same transform can also be run on our Tweet entities, although you won't need to specify a sentence distance as the transform will look at the entire Tweet. There are two other settings for this transform: one to specify a list of words to ignore and another to specify a list of words that should always be returned if found on the webpage or in the Tweet.

The next transform we have for processing our information is called To Entities with Sentiment and uses Named Entity Recognition (NER) to identify different entities that are mentioned anywhere on the webpage. The transform will look for things such as people's names, company names, countries, cities, etc. It will also extract the sentiment towards that entity and return it in its Detail View. This same transform can be run on our Tweet entities too.

If you want to be more specific and only return entities that are found within a certain sentence distance of your search term you can use the To Related Entities set of transforms. These transforms take a transform setting that specifies a maximum sentence proximity between the found entity and your original search term on the page, which reduces the number of irrelevant results returned to your graph. Running the To Related Entities transform set on our URLs that mention the term 'Defcon' and specifying a maximum sentence proximity of "1" results in the graph below.

All the nodes at the bottom of the tree are entities extracted from the various webpages and appear within 1 sentence of the word ‘Defcon’ on our page.
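The sentence-proximity idea can be sketched roughly as follows; the sentence splitting here is deliberately naive and the function is only an illustration of the concept:

# Sketch of the sentence-proximity check: split a page's text into sentences and
# keep candidate entities that appear within max_distance sentences of the term.
import re

def near_term(text, term, candidates, max_distance=1):
    sentences = [s.lower() for s in re.split(r"[.!?]+", text)]
    term_idx = [i for i, s in enumerate(sentences) if term.lower() in s]
    hits = set()
    for cand in candidates:
        for i, s in enumerate(sentences):
            if cand.lower() in s and any(abs(i - t) <= max_distance for t in term_idx):
                hits.add(cand)
    return hits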

Viewing this type of information in the Main View is not ideal as it is very difficult to see where multiple pages link to the same entity, which is what we are looking for. The next image is of the same graph but in Bubble View using the new DiverseSentiment Viewlet (included in this post, but not available by default - please install it manually). This viewlet will be explained next:

In this view entities are sized according to how many incoming links they have, making it easier to identify entities that are mentioned across multiple news sources. Entities relating to a common topic will also cluster together on your graph. For the NewsLink Hub Item we created a new Viewlet called DiverseSentiment which colours nodes on the graph according to their average sentiment - the redder the entity, the more negative it is, and the greener it is, the more positive.

The sentiment for an entity is calculated by taking each sentence that the entity was mentioned in from the various sources and then averaging the sentiment across all the articles. To calculate this sentiment we use a great service from [AlchemyAPI] which gets the targeted sentiment of each entity in each information source. The image below shows an entity from this search in more detail. It has quite a negative "average sentiment" from the three articles it was mentioned in (this graph was created on 24 Jul 2015):

(Dated: 24 Jul 2015)
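The averaging itself is simple; the sketch below also shows one way sentiment could be mapped to a red-to-green colour in the spirit of the DiverseSentiment Viewlet (scores are assumed to be in the range -1 to 1, and the colour mapping is our own illustration):

# Sketch of the "average sentiment" calculation: average the targeted sentiment
# scores an entity received across the sources that mention it (scores in [-1, 1]).
def average_sentiment(scores):
    return sum(scores) / len(scores) if scores else 0.0

def sentiment_colour(avg):
    # Map [-1, 1] to a red-to-green hex colour, illustrative of how a viewlet
    # might shade entities by average sentiment.
    red = int(255 * (1 - avg) / 2)
    green = int(255 * (1 + avg) / 2)
    return "#{:02x}{:02x}00".format(red, green)

# An entity mentioned mostly negatively in three articles:
avg = average_sentiment([-0.62, -0.45, 0.10])
print(avg, sentiment_colour(avg))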

Automating the process with machines
So far what we have done has been a manual process, but what we really want is to build a Machine that automatically fetches information from various sources every [n] minutes, runs our language processing transforms on the data and then only alerts us when something interesting happens on our graph - by sending us an email, bookmarking the entity or performing some other action to alert the user.

For each of these new transforms we have a new perpetual Machine that automates the process of running them and can be used to continuously monitor websites for activity. Each Machine is essentially broken down into three phases. Initially your information is collected with one of the "information gathering" transforms discussed earlier. Transforms are then run to pull out related entities and uncommon words that are mentioned on the webpage in close proximity to your search term. The last phase of the Machine deletes old entities from your graph that fall outside your monitor's time window and then sets up email alerts for when a new topic is being mentioned by multiple sources.

Another new transform we have is called Email Alert Message, which takes an email address (or list of email addresses) as a transform setting and sends an email alert message to those addresses when the transform is run. This transform is used in our new Machines to alert the user when a specific event happens on their graphs. By default the email alerts are commented out in the Machine scripts.

The machines also use different coloured bookmarks to indicate which iteration of the Machine an entity was returned in - red bookmarks indicate that the entity was returned in the most recent iteration, orange for the previous iteration and so on.

The names and descriptions of the four machines are below:

  • General News Source Monitor - This machine will search for news articles relating to a certain topic using the Search for News Articles [using Bing] transform. It will then run the language processing transforms on the results to extract related words and entities.
  • GCSE Term Monitor - This machine uses the transform Search for Websites [Using GCSE] to search a list of websites for a specific term. It will then run the language processing transforms on the results to extract related words and entities.
  • Twitter Monitor V2 - This machine will start by searching a specific phrase on Twitter and then extract entities found in the Tweet, uncommon words, hashtags, links and Twitter handles.
  • Twitter List Monitor - This machine is similar to the former however it will only return tweets from a specific list of Twitter users by using the transform To Tweet [Written by user list member].

Opening up the script for any of these new Machines you will see at the top that there are a couple of variables you can configure for your monitor. They are explained below, and a rough sketch of how they drive the alerting follows the list:
  • incoming_link_count - This variable specifies how many incoming links an entity will need before an email alert is sent or before the entity is bookmarked.
  • ignore_words - This is a comma separated list of words/entities that you want the transforms to ignore in results. For instance, if you were monitoring Defcon you wouldn't want to be alerted every time terms like 'BlackHat', 'hacker' or 'Las Vegas' were mentioned close to your search term. You can achieve this by including these in your ignore list.
  • through_words - There are some words that you will always want returned if they are mentioned close to your search term somewhere on the web; these words should be included in the through_words list. For instance, if you were monitoring a stock you could include the words 'buy', 'sell' or 'hold' in the through_words list.
  • timer - This specifies the time between iterations of your Machine and is measured in seconds.
  • max_age - This specifies the maximum age an entity can be on your graph before it is deleted.
  • email_address - An email alert will be sent to this address when an alert is triggered. 
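The real Machines are written in Maltego's machine scripting language, but purely as an illustration of how these variables fit together, a rough Python sketch of the alerting logic might look like this (all names and thresholds below are examples, not the actual script):

# Illustrative sketch only: how the configuration variables above could drive
# alerting. The real logic lives in the Machine scripts, not in Python.
incoming_link_count = 3
ignore_words = ["blackhat", "hacker", "las vegas"]
through_words = ["buy", "sell", "hold"]
email_address = "analyst@example.com"

def should_alert(entity_value, links):
    value = entity_value.lower()
    if value in ignore_words:
        return False          # never alert on ignored terms
    if value in through_words:
        return True           # always surface through-words
    return links >= incoming_link_count   # otherwise require enough sources

print(should_alert("Samy Kamkar", 4))   # True: mentioned by enough sources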

One last note about DiverseSentiment: the new Viewlet won't be downloaded when you install the NewsLink hub item, but you can get it here and manually import it into your Maltego client.

NewsLink aims to provide a flexible way of monitoring news, websites and Tweets, and then alerts the user to what is most important by identifying where multiple sources mention the same words or entities.

As always, enjoy responsibly,

Transforms reference

For gathering information:

  • Search for News Articles [using Bing] - This transform will search for news articles indexed by Bing relating to a specific topic. The transform has two transform settings: one for specifying the maximum age of articles that should be returned and one for specifying the news category of the results. The age of the articles should start with a numeric value and be followed by either 'm' for minutes, 'h' for hours, 'd' for days or 'w' for weeks.
  • Search for Websites [Using GCSE] - This transform will search for a specific term on a custom list of websites specified in a Google Custom Search Engine (GCSE). The transform has three transform settings: one to specify the age of results, one to specify your GCSE and another to specify whether or not pages without a publish date should be returned. The maximum page age should begin with either a 'd' for days, an 'm' for months or a 'y' for years followed by a numeric value.
  • To Tweets [Search Twitter] - This transform searches Twitter for a specific phrase.
  • To Tweet [Written by user list member] - This transform returns Tweets from a specific list of Twitter users. It has three transform settings: one to specify the age of the Tweets (in seconds), one to specify a search word and one to specify words to ignore Tweets by.

For extracting information:

  • To Entities with Sentiment - This transform will return all entities found on the entire page, including each entity's targeted sentiment. [This transform can be run on URL and Tweet entities.]
  • To Related Entities - This is a transform set with transforms that will only return entities found in a specific sentence proximity to your search term. This sentence proximity is specified in a transform setting. The transforms in this set include: To Related Companies, To Related Countries, To Related Cities, To Related People, To Related Financial Market Index, To Related States Or Counties, To Related Organizations, To Related Technologies and To Related Field Terminology.
  • To Related Words with Sentiment - This transform will look for uncommon words that are mentioned in close proximity to your search term. [This transform can be run on URLs and Tweet entities].

For alerting the user:

  • SendEmailAlert - This transform will alert the user by sending an email when multiple sites point to the same term.

Wednesday, April 29, 2015

Maltego Chlorine Community Edition is ready for download

Hi there,

We're pleased to announce the release of the Maltego Chlorine community edition. This release should hopefully solve most of the Java compatibility issues. It comes bundled with Java 8u45 and is available for download on our website [HERE].

The Chlorine release brings (almost) all the goodness of the commercial release with a $0 price tag. If you're interested in the changes made from Carbon to Chlorine we suggest you watch our Chlorine release video [HERE].

One of the main differences between the commercial and community editions is that the community edition features only free items in its Transform Hub.

When Kali Linux 2 is released we'll also release a Maltego for Kali build. In the meantime, Kali Linux users can simply install the .deb on their Kali Linux systems.

Additionally we've made a new 'Intro to Maltego' video that will replace the first video in our tutorial series. It was about time - the previous version was made in Oct 2011 and used version 3.0. We've also had lots of complaints about the quality of the audio. The new video should be crisp and clear at 1080p with awesome sound. You can view it by clicking on the image below:

As always - please enjoy responsibly.

Friday, March 13, 2015

Connecting the links

Hello there,

Today I am going to talk a bit about our new Linkedin transforms that we have been working on. Linkedin is all about finding connections between people so what better way to visualize this information than in Maltego. I set out to build some Linkedin transforms that could help show connections between Linkedin users, their shares and company profiles that may not be easy to identify on Linkedin itself. All the transforms that I built here use the Linkedin developer API so you can log into your own Linkedin account from Maltego and start visualizing your Linkedin network.

Linkedin's API provides awesome search functionality for finding people and companies by allowing you to refine your searches with additional search parameters, making it a lot easier to find profiles with common names. Our Linkedin transforms allow you to enter these additional search parameters using transform settings (transform pop-ups). To search for a Linkedin company profile from within Maltego you start with a phrase entity and run the transform Linkedin Company Search; a transform setting will pop up asking you if you want to specify a country code. Running this transform on the phrase 'KPMG' without specifying a country code results in the graph below:
The results returned from the Linkedin Company Search transform are weighted according to relevance, meaning that the entity in the top left-hand corner is the most relevant result for your search. In the Detail View there are links to the company's Linkedin profile page and to their website as shown in the image above. There is a range of transforms that you can now run on the Linkedin company profile entity, which are listed in our shiny new context menu also shown in the image above. One of the highlights of these transforms is To Email Domain, which returns domains that the company has specified they receive email on. This transform often returns loads of results, which is great if you are looking for sub-domains for that company. Running the To Email Domain transform on the first company profile from our 'KPMG' search results in 34 different email domains, many of them being sub-domains of The result is shown below:
If you are ever looking to mine email addresses for a company this is probably a good place to start but that is a bit off topic for this post so I will leave that for you to try on your own. 
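For the curious, the calls behind these transforms go against Linkedin's REST API (v1 at the time of writing). The sketch below shows roughly what a company search plus an email-domains lookup could look like; the endpoints and field names are assumptions based on the v1 documentation, and the OAuth token is a placeholder.

# Rough sketch (not the transforms' actual code) of Linkedin v1 REST API calls:
# a company search followed by a lookup of that company's email domains.
# Endpoint paths and field names are assumptions based on the v1 docs.
import requests

TOKEN = "your-oauth-access-token"
headers = {"Authorization": "Bearer " + TOKEN}

search = requests.get(
    "https://api.linkedin.com/v1/company-search:(companies:(id,name,website-url))",
    params={"keywords": "KPMG", "format": "json"},
    headers=headers,
).json()

companies = search.get("companies", {}).get("values", [])
for company in companies:
    print(company.get("id"), company.get("name"))

if companies:
    # 'email-domains' as a field on the v1 Company resource (assumption).
    domains = requests.get(
        "https://api.linkedin.com/v1/companies/{0}:(email-domains)".format(companies[0]["id"]),
        params={"format": "json"},
        headers=headers,
    ).json()
    print(domains)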

To search for a person's Linkedin profile from Maltego you run the Linkedin People Search transform on a person entity; three transform settings will pop up allowing you to specify the person's company name, the country code of their home country and a past school of theirs. These transform settings are really useful when searching for common names. For example, when searching for the name John Doe while specifying the country code IR (Iran) you will receive only two Linkedin profiles; if you were to exclude the country code from this search you would be flooded with results. The image below shows this search result as well as the context menu which shows all the transforms that can be run on a Linkedin Affiliation entity:
The Detail View in this image shows additional information about the user that is selected which includes their Linkedin headline, location and the industry they work in.

Currently the Linkedin People Search transform returns the 25 most relevant results for your search while the Linkedin Company Search transform will return the 20 most relevant company profiles for your search.

Okay, enough with the details, let's move on to an example of how this can be used: imagine you wanted to inform as many Linkedin users from a particular company as possible of something, without directly messaging them and without them being aware that they are being fed targeted information. We could do this as follows: start by finding our target company's Linkedin profile, then run the transform To Affiliations [only in your Network] on it. This transform returns all the users in your network who work (or worked) at that specific company, which results in the following graph:

From all these users we then want to see what shares are currently showing in their news feeds. To do this we run the transform To Shares in User's Network, which results in the following graph (shown in bubble view on the left):

This graph is quite large but by selecting all the shares and ordering them according to their number of incoming links we find that there is a single share that is currently on 23 news feeds belonging to users at our target company. Taking this share plus its incoming links to a new graph results in the following:

Now if we were to post a comment on this share we know that our comment would show up on the news feeds of 23 Linkedin users that work (or worked) at our target company.

Next we want to find who authored this share. To do so we run the transform To Share's Author on the share, which reveals who it was initially posted by. Finally we run the To Companies transform on this user, which reveals the company that they work for:

This user's Linkedin profile seems to be quite popular amongst users from our target company, so its owner may be a person of interest if we were really targeting this organization. The next step would be to find this profile owner's email address, which could be done by finding the company's email domain and then their email address naming format, but again this is out of the scope of this blog post.

I have one last highlight from our new Linkedin transforms that I want to mention before it's time to go. The To Entities [Using AlchemyAPI] transform can be run on a Linkedin share entity; this transform will extract people's names, places and company names that are mentioned in the shared article. It is a nice way to easily identify topics that are being discussed across multiple shares in your Linkedin network.

A quick word about rate limits on the Linkedin API: to use these transforms you will need to log into your Linkedin account from Managed Services in Maltego. Most of the API calls that these transforms use are limited to around 300 calls per day per user. When you reach your limit for the day you will receive a message in your transform output notifying you, and you will have to wait until midnight UTC for the limit on your account to be reset. The Linkedin People Search and To Affiliations [in your network] transforms have a much stricter limit, so you might find that you reach the limits for these transforms a lot quicker.
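If you are scripting around these limits yourself, the wait is easy to compute - a small sketch of working out how many seconds remain until the midnight UTC reset:

# Small sketch: seconds remaining until midnight UTC, when the daily quota resets.
from datetime import datetime, timedelta

def seconds_until_utc_midnight():
    now = datetime.utcnow()
    tomorrow = (now + timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
    return int((tomorrow - now).total_seconds())

print(seconds_until_utc_midnight())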

For those of you who have upgraded to Maltego Chlorine, the Linkedin transforms will be arriving in your Transform Hub shortly; you will be able to add them to your Maltego client by simply hitting the install button. For those of you who are still running Carbon, here is the seed URL:

Enjoy responsibly